Description
NVIDIA's Llama 3.1 Nemotron 70B is a language model designed to generate precise, helpful responses. It builds on the [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture, is trained with Reinforcement Learning from Human Feedback (RLHF), and performs strongly on automatic alignment benchmarks. It is suited to applications that need accurate, helpful answers to diverse user queries across multiple domains. Use of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).
API Usage Examples
OpenAI Compatible Endpoint
Use this endpoint with any OpenAI-compatible library. Model: NVIDIA: Llama 3.1 Nemotron 70B Instruct (nvidia/llama-3.1-nemotron-70b-instruct)
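For callers working in Python rather than on the command line, the following is a minimal sketch of a request to this endpoint using only the standard library. The helper names `build_request` and `send` are illustrative and not part of any SDK. The URL, model ID, and parameters are taken from this page. The response format is assumed to follow the OpenAI chat completions shape.

```python
import json
import urllib.request

API_URL = "https://api.ridvay.com/v1/chat/completions"
MODEL_ID = "nvidia/llama-3.1-nemotron-70b-instruct"


def build_request(prompt, api_key, temperature=0.7, max_tokens=1024):
    """Build (but do not send) a chat completion request (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON body (hypothetical helper)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send(build_request("Hello", "YOUR_API_KEY"))` would perform the network call. If the response does follow the OpenAI chat completions shape, the reply text should be found under `choices[0]["message"]["content"]`.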
curl https://api.ridvay.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "nvidia/llama-3.1-nemotron-70b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Explain the capabilities of the NVIDIA: Llama 3.1 Nemotron 70B Instruct model"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
Supported Modalities
- Text
API Pricing
- Input: $1.20 / 1M tokens
- Output: $1.20 / 1M tokens
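As a rough illustration of how these rates translate into per-request cost, the sketch below multiplies token counts by the listed prices. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes billing is strictly proportional to token counts, with no minimums, rounding rules, or tier discounts.

```python
# Rates listed above, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = 1.20
OUTPUT_USD_PER_MILLION = 1.20


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request, assuming purely linear pricing."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A 2,000-token prompt with a 1,024-token reply.
    print(f"${estimate_cost_usd(2_000, 1_024):.6f}")
```

Under that assumption, a 2,000-token prompt with a 1,024-token reply comes to roughly a third of a cent.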
Token Limits
- Max Output: 16,384 tokens
- Max Context: 131,072 tokens
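These limits can be checked before sending a request. The sketch below caps a requested `max_tokens` value so that it respects both numbers. It assumes the context limit covers prompt and completion tokens together, which is common for chat APIs but is not stated on this page. The function name `allowed_max_tokens` is hypothetical, and the prompt token count must come from a tokenizer of your choice.

```python
MAX_CONTEXT_TOKENS = 131_072  # listed max context
MAX_OUTPUT_TOKENS = 16_384    # listed max output


def allowed_max_tokens(prompt_tokens, requested_max_tokens):
    """Return the largest usable max_tokens value for a given prompt size.

    Assumes prompt and completion tokens share the context window.
    """
    if prompt_tokens < 0 or requested_max_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = MAX_CONTEXT_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room in the context window")
    return min(requested_max_tokens, MAX_OUTPUT_TOKENS, remaining)
```

For example, a 130,000-token prompt leaves 1,072 tokens of room, so a request for 4,096 output tokens would be capped at 1,072.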
Subscription Tiers
- free
- pro
- ultimate