NVIDIA: Llama 3.1 Nemotron 70B Instruct

nvidia/llama-3.1-nemotron-70b-instruct

Description

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed to generate precise and useful responses. It is built on the [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture and tuned with Reinforcement Learning from Human Feedback (RLHF), and it scores well on automatic alignment benchmarks. The model is intended for applications that need accurate, helpful responses to diverse user queries across multiple domains. Usage of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

API Usage Examples

OpenAI Compatible Endpoint

Use this endpoint with any OpenAI-compatible library, passing nvidia/llama-3.1-nemotron-70b-instruct as the model ID for NVIDIA: Llama 3.1 Nemotron 70B Instruct.

curl https://api.ridvay.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "nvidia/llama-3.1-nemotron-70b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Explain the capabilities of the NVIDIA: Llama 3.1 Nemotron 70B Instruct model"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
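
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_request` and `send` are hypothetical, and the response parsing assumes the endpoint returns the usual OpenAI chat-completions shape (`choices[0].message.content`), which this page does not confirm.

```python
import json
import urllib.request

API_URL = "https://api.ridvay.com/v1/chat/completions"
MODEL_ID = "nvidia/llama-3.1-nemotron-70b-instruct"


def build_request(api_key, prompt, temperature=0.7, max_tokens=1024):
    """Build the POST request shown in the curl example (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the reply text.

    Assumption: the response follows the OpenAI chat-completions format.
    """
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

To try it, call `send(build_request("YOUR_API_KEY", "Hello"))` with a real key. Keeping request construction separate from sending makes the payload easy to inspect without touching the network.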

Supported Modalities

  • Text

API Pricing

  • Input: $1.20 / 1M tokens
  • Output: $1.20 / 1M tokens
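
As a rough illustration of how these rates translate into a per-request cost, the sketch below multiplies token counts by the listed prices. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes the listed rates apply uniformly, with no tier discounts, minimum charges, or rounding rules, none of which this page specifies.

```python
# Listed rates in US dollars per one million tokens.
INPUT_PRICE_PER_M = 1.20
OUTPUT_PRICE_PER_M = 1.20


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request in US dollars (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 2,000-token prompt with a 1,024-token reply.
print(f"{estimate_cost_usd(2_000, 1_024):.6f} USD")
```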

Token Limits

  • Max Output: 16,384 tokens
  • Max Context: 131,072 tokens
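
One way to stay within these limits is to cap `max_tokens` before sending a request. The sketch below assumes that the context limit covers prompt and output tokens combined, a common convention that this page does not state explicitly. The function name `clamp_max_tokens` is hypothetical, and the prompt token count must come from a tokenizer of your choice.

```python
MAX_OUTPUT_TOKENS = 16_384
MAX_CONTEXT_TOKENS = 131_072


def clamp_max_tokens(prompt_tokens, requested_max_tokens):
    """Return the largest usable max_tokens value (hypothetical helper).

    Assumption: prompt and output tokens share the context window.
    """
    if prompt_tokens >= MAX_CONTEXT_TOKENS:
        raise ValueError("prompt alone exceeds the context window")
    remaining = MAX_CONTEXT_TOKENS - prompt_tokens
    return min(requested_max_tokens, MAX_OUTPUT_TOKENS, remaining)
```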

Subscription Tiers

  • free
  • pro
  • ultimate