Description
Qwen3-32B is a dense, 32.8B-parameter causal language model from the Qwen3 series, built for both complex reasoning and efficient dialogue. It can switch between a "thinking" mode for tasks such as math, coding, and logical inference, and a "non-thinking" mode for faster, general-purpose conversation. The model performs strongly in instruction following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K-token contexts and can be extended to 131K tokens using YaRN-based scaling.
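As a rough illustration of mode switching, the upstream Qwen3 documentation describes a soft switch in which `/think` or `/no_think` is appended to a user message. The sketch below only builds such a message; the helper name `with_mode` is hypothetical, and whether this endpoint honors the tags is an assumption that should be checked against actual responses.

```python
def with_mode(user_text, thinking):
    """Return an OpenAI-style user message carrying a Qwen3 soft-switch tag.

    Assumption: the serving endpoint passes the tag through to the model's
    chat template unchanged. This is not confirmed by this page.
    """
    tag = "/think" if thinking else "/no_think"
    return {"role": "user", "content": f"{user_text} {tag}"}


# A reasoning-heavy prompt and a quick conversational prompt.
messages_reasoning = [with_mode("Prove that the sum of two even numbers is even.", True)]
messages_chat = [with_mode("Suggest a name for a coffee shop.", False)]
```

Either list can be used as the "messages" field of a chat completion request.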
API Usage Examples
OpenAI Compatible Endpoint
Use this endpoint with any OpenAI-compatible client library. The model's display name is "Qwen: Qwen3 32B"; the model ID to pass in requests is qwen/qwen3-32b.
curl https://api.ridvay.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "qwen/qwen3-32b",
    "messages": [
      {
        "role": "user",
        "content": "Explain the capabilities of the Qwen: Qwen3 32B model"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
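The same request can be sketched in Python using only the standard library. The URL, model ID, and parameters are taken from the curl example; the helper names, the `RIDVAY_API_KEY` environment variable, and the assumption that the response follows the OpenAI chat completion shape (`choices[0].message.content`) are illustrative and not confirmed by this page.

```python
import json
import os
import urllib.request

API_URL = "https://api.ridvay.com/v1/chat/completions"  # from the curl example
MODEL_ID = "qwen/qwen3-32b"


def build_payload(prompt, temperature=0.7, max_tokens=1024):
    """Build the same JSON body that the curl example sends."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def build_request(prompt, api_key):
    """Wrap the payload in a POST request with the required headers."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat(prompt, api_key):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body.
    """
    with urllib.request.urlopen(build_request(prompt, api_key), timeout=60) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]


if __name__ == "__main__":
    key = os.environ.get("RIDVAY_API_KEY")  # hypothetical variable name
    if key:
        print(chat("Explain the capabilities of the Qwen: Qwen3 32B model", key))
    else:
        print("Set RIDVAY_API_KEY to send a live request.")
```

Keeping payload construction separate from the network call makes it easy to inspect or log the request body before sending it.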
Supported Modalities
- Text
API Pricing
- Input: $0.018 / 1M tokens
- Output: $0.072 / 1M tokens
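To make the per-token rates concrete, here is a small cost estimate based on the listed prices. The function name is hypothetical, and the sketch assumes billing is strictly linear in token counts with no minimums, rounding, or tier discounts.

```python
from decimal import Decimal

# Listed prices in USD per one million tokens.
INPUT_USD_PER_M = Decimal("0.018")
OUTPUT_USD_PER_M = Decimal("0.072")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request, assuming purely linear pricing."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / ONE_MILLION


# A 2,000-token prompt with a 1,024-token reply.
print(estimate_cost_usd(2_000, 1_024))
```

Decimal is used instead of float so that very small amounts do not pick up binary rounding noise.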
Token Limits
- Max Output: 40,960 tokens
- Max Context: 40,960 tokens
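The limits above can be checked on the client side before a request is sent. This sketch assumes the context limit covers prompt and completion tokens combined, which is common for OpenAI-compatible APIs but is not stated on this page. The function name is hypothetical, and the prompt token count must come from a tokenizer of your choice.

```python
MAX_CONTEXT_TOKENS = 40_960  # listed Max Context
MAX_OUTPUT_TOKENS = 40_960   # listed Max Output


def clamp_max_tokens(prompt_tokens, requested_max_tokens):
    """Return a max_tokens value that fits within the listed limits.

    Assumes prompt and completion tokens share the same context window.
    """
    if prompt_tokens < 0 or requested_max_tokens < 1:
        raise ValueError("token counts must be positive")
    remaining = MAX_CONTEXT_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills the context window")
    return min(requested_max_tokens, MAX_OUTPUT_TOKENS, remaining)


# A 40,000-token prompt leaves room for at most 960 output tokens.
print(clamp_max_tokens(40_000, 1_024))
```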
Subscription Tiers
- free
- pro
- ultimate