
Nous: DeepHermes 3 Mistral 24B Preview

nousresearch/deephermes-3-mistral-24b-preview

Description

DeepHermes 3 (Mistral 24B Preview) is an instruction-tuned language model by Nous Research based on Mistral-Small-24B, designed for chat, function calling, and advanced multi-turn reasoning. Fine-tuned via distillation from R1, it supports structured output (JSON mode) and function call syntax for agent-based applications.

DeepHermes 3 supports a **reasoning toggle via system prompt**, allowing users to switch between fast, intuitive responses and deliberate, multi-step reasoning. When activated with the specific system instruction below, the model enters a *"deep thinking"* mode, generating extended chains of thought wrapped in `<think></think>` tags before delivering a final answer.

System Prompt:

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
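The reasoning toggle described above could be wired into client code roughly as follows. This is a minimal sketch, not an official client: the helper names `build_messages` and `split_reasoning` are hypothetical. It also assumes the model emits at most one `<think>...</think>` block ahead of its final answer.

```python
import re

# System prompt quoted from the model description.
DEEP_THINKING_PROMPT = (
    "You are a deep thinking AI, you may use extremely long chains of thought "
    "to deeply consider the problem and deliberate with yourself via systematic "
    "reasoning processes to help come to a correct solution prior to answering. "
    "You should enclose your thoughts and internal monologue inside <think> "
    "</think> tags, and then provide your solution or response to the problem."
)


def build_messages(user_text, deep_thinking=False):
    """Build a chat message list, adding the system prompt only in deep-thinking mode."""
    messages = []
    if deep_thinking:
        messages.append({"role": "system", "content": DEEP_THINKING_PROMPT})
    messages.append({"role": "user", "content": user_text})
    return messages


# Non-greedy match so only the first <think> block is captured.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is empty when no <think> block is present."""
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

Calling `build_messages(question, deep_thinking=True)` produces the message list for the reasoning mode, and `split_reasoning` can then separate the chain of thought from the text shown to the end user.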

API Usage Examples

OpenAI Compatible Endpoint

Use this endpoint with any OpenAI-compatible library.

Model: Nous: DeepHermes 3 Mistral 24B Preview (nousresearch/deephermes-3-mistral-24b-preview)

curl https://api.ridvay.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "nousresearch/deephermes-3-mistral-24b-preview",
    "messages": [
      {
        "role": "user",
        "content": "Explain the capabilities of the Nous: DeepHermes 3 Mistral 24B Preview model"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
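
The same request can be sketched in Python using only the standard library. The URL, headers, model ID, and parameters mirror the curl example. The function names `build_request` and `chat` are hypothetical, and the response shape (`choices[0].message.content`) is an assumption based on the endpoint being described as OpenAI compatible.

```python
import json
import urllib.request

API_URL = "https://api.ridvay.com/v1/chat/completions"
MODEL_ID = "nousresearch/deephermes-3-mistral-24b-preview"


def build_request(api_key, prompt, temperature=0.7, max_tokens=1024):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat(api_key, prompt):
    """Send the request and return the reply text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(build_request(api_key, prompt)) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.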

Supported Modalities

  • Text

API Pricing

  • Input: $0.05 / 1M tokens
  • Output: $0.20 / 1M tokens
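
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes billing is strictly linear in tokens with no minimum charge or rounding, which the listing does not state.

```python
# Rates taken from the pricing list above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 0.05
OUTPUT_USD_PER_MILLION = 0.20


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate request cost in USD, assuming purely linear per-token billing."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens would cost about $0.0002.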

Token Limits

  • Max Output: 32,768 tokens
  • Max Context: 32,768 tokens

Subscription Tiers

  • free
  • pro
  • ultimate