MoonshotAI: Kimi Linear 48B A3B Instruct

moonshotai/kimi-linear-48b-a3b-instruct

Description

Kimi Linear is a hybrid linear attention architecture that outperforms traditional full attention methods across short-context, long-context, and reinforcement learning (RL) scaling regimes. At its core is Kimi Delta Attention (KDA), a refinement of Gated DeltaNet that introduces a more efficient gating mechanism to make better use of finite-state RNN memory. Kimi Linear improves both model quality and hardware efficiency, especially on long-context tasks: it reduces KV cache requirements by up to 75% and raises decoding throughput by up to 6x for contexts as long as 1M tokens.

API Usage Examples

OpenAI Compatible Endpoint

Use this endpoint with any OpenAI-compatible library. Model: MoonshotAI: Kimi Linear 48B A3B Instruct (moonshotai/kimi-linear-48b-a3b-instruct)

curl https://api.ridvay.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "moonshotai/kimi-linear-48b-a3b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Explain the capabilities of the MoonshotAI: Kimi Linear 48B A3B Instruct model"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
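
The same request can be sketched in Python using only the standard library. The URL, model ID, headers, and body fields mirror the curl command above; the helper names `build_request` and `send` are hypothetical, and the response is assumed to follow the usual OpenAI-compatible chat completion shape, which this page does not spell out.

```python
import json
import urllib.request

API_URL = "https://api.ridvay.com/v1/chat/completions"
MODEL = "moonshotai/kimi-linear-48b-a3b-instruct"


def build_request(api_key, prompt, temperature=0.7, max_tokens=1024):
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To run it, call `send(build_request("YOUR_API_KEY", "Hello"))`. If the response follows the OpenAI-compatible convention, the reply text would be under `choices[0].message.content`; treat that path as an assumption and check the actual payload.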

Supported Modalities

Text

API Pricing

Input: $0.50 / 1M tokens
Output: $0.60 / 1M tokens
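
As a rough illustration of how these rates translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` is hypothetical, the rates are copied from the listing above, and the calculation assumes billing is strictly proportional to token counts, with no minimums, rounding rules, or tier discounts.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.5")
OUTPUT_USD_PER_MILLION = Decimal("0.6")


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request, assuming purely linear billing."""
    million = Decimal(1_000_000)
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / million
```

For example, a 200,000-token prompt with a 1,024-token reply works out to 200,000 × 0.5 / 1,000,000 + 1,024 × 0.6 / 1,000,000 = 0.1006144 USD under these assumptions.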

Token Limits

Max Output: 1,048,576 tokens
Max Context: 1,048,576 tokens
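
When setting `max_tokens` on long prompts, it can help to clamp the requested value against these limits. The sketch below assumes the prompt and the completion share the same context window, which is common for OpenAI-compatible APIs but is not stated on this page. The function name `clamp_max_tokens` is hypothetical, and the caller is expected to supply the prompt token count from its own tokenizer.

```python
# Limits copied from the listing above.
MAX_CONTEXT_TOKENS = 1_048_576
MAX_OUTPUT_TOKENS = 1_048_576


def clamp_max_tokens(prompt_tokens, requested_max_tokens):
    """Return the largest max_tokens value that fits the listed limits.

    Assumes prompt tokens and output tokens count against one shared window.
    """
    if prompt_tokens < 0 or requested_max_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = MAX_CONTEXT_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested_max_tokens, MAX_OUTPUT_TOKENS, remaining)
```

With a 1,000,000-token prompt, a request for 100,000 output tokens would be clamped to the 48,576 tokens still available in the window.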

Subscription Tiers

free
pro
ultimate

More from moonshotai

MoonshotAI: Kimi K2 0711

MoonshotAI: Kimi K2 0905

MoonshotAI: Kimi K2 Thinking

MoonshotAI: Kimi K2 0905 (exacto)