Inception: Mercury

inception/mercury

Description

Mercury is the first diffusion large language model (dLLM). Built on a discrete diffusion approach, it runs 5-10x faster than speed-optimized models such as GPT-4.1 Nano and Claude 3.5 Haiku while matching their quality. Mercury's speed lets developers build responsive user experiences, including voice agents, search interfaces, and chatbots. Read more in the announcement blog post.

API Usage Examples

OpenAI Compatible Endpoint

Use this endpoint with any OpenAI-compatible library. Model: Inception: Mercury (inception/mercury)

curl https://api.ridvay.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "inception/mercury",
    "messages": [
      {
        "role": "user",
        "content": "Explain the capabilities of the Inception: Mercury model"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
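
Because the endpoint is OpenAI-compatible, the same request can be made from the official OpenAI Python SDK by overriding the base URL. The sketch below mirrors the curl example above; the RIDVAY_API_KEY environment variable name is an assumption for illustration, not something the API requires.

# Minimal sketch using the OpenAI Python SDK (openai >= 1.0) against the
# OpenAI-compatible endpoint shown above.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ridvay.com/v1",     # endpoint from the curl example
    api_key=os.environ["RIDVAY_API_KEY"],     # assumed env var name; supply your own key handling
)

response = client.chat.completions.create(
    model="inception/mercury",
    messages=[
        {"role": "user", "content": "Explain the capabilities of the Inception: Mercury model"}
    ],
    temperature=0.7,
    max_tokens=1024,
)

print(response.choices[0].message.content)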

Supported Modalities

  • Text

API Pricing

  • Input: $0.25 / 1M tokens
  • Output: $1.00 / 1M tokens
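
As a rough illustration of the rates above, a request that consumes 10,000 input tokens and generates 2,000 output tokens costs about 10,000 × $0.25 / 1M + 2,000 × $1.00 / 1M = $0.0025 + $0.0020 = $0.0045.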

Token Limits

  • Max Output: 16,384 tokens
  • Max Context: 128,000 tokens

Subscription Tiers

  • free
  • pro
  • ultimate