
DeepSeek R1 Distill Qwen 32B

DeepSeek

DeepSeek-R1-Distill-Qwen-32B is a 32.8B-parameter dense model distilled from DeepSeek-R1, DeepSeek's first-generation reasoning model built on DeepSeek-V3 (671 billion total parameters, 37 billion activated per token). DeepSeek-R1 incorporates large-scale reinforcement learning (RL) to improve chain-of-thought reasoning and logical thinking, and the distilled model inherits these capabilities, delivering high performance in math, coding, and multi-step reasoning tasks.

Key Specifications

Parameters
32.8B
Context
128.0K
Release Date
January 20, 2025
Average Score
74.2%

Timeline

Key dates in the model's history
Announcement
January 20, 2025
Last Update
July 19, 2025

Technical Specifications

Parameters
32.8B
Training Tokens
14.8T tokens
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$0.12
Output (per 1M tokens)
$0.18
Max Input Tokens
128.0K
Max Output Tokens
128.0K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
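Given the per-1M-token prices listed above ($0.12 input, $0.18 output), the cost of a single request can be estimated with a small helper. This is an illustrative sketch; the function name and example token counts are hypothetical:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 0.12, output_price: float = 0.18) -> float:
    """Estimate request cost in USD from per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical request: 10K input tokens, 2K output tokens
cost = request_cost(10_000, 2_000)  # ~$0.00156
```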

Benchmark Results

Model performance metrics across various tests and benchmarks

Reasoning

Logical reasoning and analysis
GPQA
Diamond, Pass@1. For GPQA we use the Diamond subset and a structured solution process: 1. Task analysis: break the problem into components and identify the key quantities, constraints, and the question being asked. 2. Strategy selection: consider several possible solution approaches and choose the methods best suited to the given problem. 3. Solution: carry out the solution in clearly delineated stages, executing the necessary mathematical steps in sequence. 4. Verification: check the solution from several viewpoints, including edge cases and alternative approaches, for reliability. 5. Final-answer evaluation: assess the result for correctness and confirm that the answer matches what the task asked for. 6. Independent review: rework the entire solution from scratch, as if seeing the task for the first time, to expose errors or oversights. This structured approach allows complex tasks to be solved thoroughly while reducing the probability of error; the independent review at the end is critical for catching mistakes. (Self-reported)
62.1%

Other Tests

Specialized benchmarks
AIME 2024
Cons@64 (consensus@64). Unlike standard single-shot evaluation, this technique samples 64 independent chain-of-thought solutions per problem and takes the most common final answer by majority vote, yielding higher accuracy. Each reasoning step should contain three key elements: 1) the context for the step, 2) the step itself, grounded in the preceding analysis, and 3) its contribution toward the solution. This keeps the reasoning anchored in logical steps, with each output building directly on prior context and advancing progress toward the answer. Cons@64 is especially effective on complex mathematical tasks, programming assignments, logical proofs, and other problems that require step-by-step reasoning. (Self-reported)
83.3%
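The consensus vote described above can be sketched as a majority vote over sampled final answers. The sample data here is hypothetical:

```python
from collections import Counter

def cons_at_k(answers: list[str]) -> str:
    """Consensus answer: majority vote over k sampled final answers."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical: 64 sampled final answers to one AIME problem
samples = ["113"] * 40 + ["112"] * 15 + ["131"] * 9
final = cons_at_k(samples)  # "113"
```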
LiveCodeBench
Pass@1. In testing large language models (LLMs), the Pass@1 metric measures the proportion of tasks the model solves on its first attempt. It reflects the model's ability to generate a correct answer "in one shot," without needing multiple attempts or iterations. Pass@1 is especially important for evaluating model performance in real-world usage scenarios, where an exact answer is usually expected the first time. A high Pass@1 score indicates that the model is reliable and accurate at solving tasks without additional retries or corrections. Under Pass@1, a test case counts as passed if the model's first answer meets the success criteria (for example, it correctly solves the task, answers the question, or the generated function runs successfully). The metric is commonly used to evaluate LLMs on accuracy-critical tasks such as mathematical computation, programming, and logical reasoning. (Self-reported)
57.2%
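Computing Pass@1 directly is a simple fraction over per-task first-attempt outcomes. The outcome list below is hypothetical:

```python
def pass_at_1(first_attempt_correct: list[bool]) -> float:
    """Pass@1: fraction of tasks whose first sampled answer is correct."""
    return sum(first_attempt_correct) / len(first_attempt_correct)

# Hypothetical per-task outcomes (first attempt only)
results = [True, True, False, True]
score = pass_at_1(results)  # 0.75
```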
MATH-500
Pass@1. In mathematical reasoning tasks it is often useful to measure how often the model solves a problem on its first attempt; this is the Pass@1 score. The standard method runs the model once on each problem in a set (for example, 100 or 1,000 problems) and counts the share solved, but a single sample per problem requires little computation yet gives a high-variance estimate. A more robust method samples k responses per problem and averages the per-sample correctness: pass@1 = (1/k) Σᵢ pᵢ, where pᵢ is the correctness of the i-th sample. This gives a lower-variance estimate at the cost of more computation. For an even stronger evaluation with a larger number of samples, the "self-consistency" method generates several answers per problem and takes the most frequently occurring one; this can improve accuracy, especially when the model is more consistent in its correct answers than in its errors. (Self-reported)
94.3%
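The k-sample estimator described above averages per-sample correctness within each task, then across tasks. A minimal sketch with hypothetical data:

```python
def estimate_pass_at_1(per_task_correct: list[list[bool]]) -> float:
    """Estimate pass@1 as (1/k) * sum of per-sample correctness,
    averaged over tasks (lower variance than a single attempt)."""
    per_task = [sum(c) / len(c) for c in per_task_correct]
    return sum(per_task) / len(per_task)

# Hypothetical: 3 tasks, k=4 samples each
runs = [[True, True, True, False],
        [True, False, False, False],
        [True, True, True, True]]
est = estimate_pass_at_1(runs)  # (0.75 + 0.25 + 1.0) / 3 ≈ 0.667
```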

License & Metadata

License
MIT
Announcement Date
January 20, 2025
Last Updated
July 19, 2025
