DeepSeek R1 Distill Qwen 32B
DeepSeek-R1-Distill-Qwen-32B is a dense 32B model distilled from DeepSeek-R1, DeepSeek's first-generation reasoning model built on DeepSeek-V3 (671 billion total parameters, 37 billion activated per token). DeepSeek-R1 uses large-scale reinforcement learning (RL) to improve chain-of-thought reasoning and logical thinking; the distilled model, fine-tuned from Qwen2.5-32B on reasoning traces generated by R1, inherits much of that performance in math, coding, and multi-step reasoning tasks.
Key Specifications
Parameters
32.8B
Context
128.0K
Release Date
January 20, 2025
Average Score
74.2%
Timeline
Key dates in the model's history
Announcement
January 20, 2025
Last Update
July 19, 2025
Technical Specifications
Parameters
32.8B
Training Tokens
14.8T tokens
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval
Pricing & Availability
Input (per 1M tokens)
$0.12
Output (per 1M tokens)
$0.18
Max Input Tokens
128.0K
Max Output Tokens
128.0K
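As a quick sanity check on the rates above, per-request cost follows directly from the per-million-token prices. The following is a minimal Python sketch; the token counts in the example are made-up illustrative numbers, not values from this page.

```python
# Per-million-token rates from the pricing table above.
INPUT_PER_M = 0.12   # USD per 1M input tokens
OUTPUT_PER_M = 0.18  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the listed rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# Hypothetical example: 4,000-token prompt, 12,000-token reasoning answer.
print(f"${request_cost(4_000, 12_000):.6f}")  # $0.002640
```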
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
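Since function calling and structured output are listed as supported, tool use through an OpenAI-compatible chat endpoint might look like the sketch below. The base URL, API key, model identifier, and get_weather tool are illustrative assumptions, not values taken from this page; check your provider's documentation for the actual model id.

```python
from openai import OpenAI

# Hypothetical OpenAI-compatible endpoint serving the model.
client = OpenAI(base_url="https://example-provider/v1", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not part of the model card
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek-r1-distill-qwen-32b",  # provider-specific model id (assumed)
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```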
Benchmark Results
Model performance metrics across various tests and benchmarks
Reasoning
Logical reasoning and analysis
GPQA
Diamond, Pass@1. GPQA results are reported on the Diamond subset, scored with Pass@1. The model describes its solving procedure in six steps: 1. Task analysis: decompose the task into its components and identify the key facts, constraints, and the question being asked. 2. Strategy generation: consider several possible solution approaches and select the methods best suited to the given problem. 3. Solution: carry the chosen approach through clearly defined stages, executing the required sequence of mathematical operations. 4. Verification: check the solution from several angles, including edge cases and alternative approaches, to confirm its reliability. 5. Final-answer evaluation: assess the result for plausibility and correctness and confirm that the answer actually addresses the task. 6. Independent re-check: rework the entire solution from scratch, as if seeing the task for the first time, to catch residual errors or oversights. This structured approach allows complex tasks to be solved thoroughly while reducing the probability of errors; the independent re-check at the end is critically important for catching mistakes. • Self-reported
Other Tests
Specialized benchmarks
AIME 2024
Cons@64. Consensus@64 samples 64 chain-of-thought reasoning paths per problem and takes the majority answer. Unlike standard single-pass decoding, this technique bases its conclusion on many reasoning chains, producing solutions with higher accuracy. Each step of a chain should contain 3 key elements: 1. the context for the step, 2. the step itself, derived from the preceding analysis, 3. a transition to the next step. This keeps the reasoning anchored in logical steps: each output builds directly on its context and makes progress toward the solution. Cons@64 is especially effective for complex mathematical tasks, programming assignments, logical proofs, and other hard problems that require step-by-step reasoning. • Self-reported
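To make the voting step concrete, here is a minimal Python sketch of consensus@k scoring. The sample_fn callable and the assumption that it returns only the final extracted answer are illustrative, not part of the benchmark definition.

```python
from collections import Counter

def consensus_at_k(sample_fn, problem: str, k: int = 64) -> str:
    """Majority-vote ('consensus@k') answer over k sampled reasoning chains.

    sample_fn(problem) is assumed to run one stochastic chain-of-thought
    generation and return only the final extracted answer string.
    """
    answers = [sample_fn(problem) for _ in range(k)]
    # The most frequent final answer across all chains wins the vote.
    return Counter(answers).most_common(1)[0][0]
```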
LiveCodeBench
Pass@1. In testing large language models (LLMs), the Pass@1 metric measures the proportion of tasks the model solves on its first attempt. It reflects the model's ability to produce a correct answer "out of the box", without multiple attempts or iterations. Pass@1 is especially important for judging model performance in real-world usage scenarios, where an exact answer is usually expected on the first try; a high Pass@1 score indicates that the model solves tasks reliably and accurately without additional retries or corrections. Under Pass@1, a test case counts as "passed" if the model's first answer meets the success criteria (for example, it correctly solves the task, answers the question, or produces a working function). The metric is commonly used in LLM analysis for tasks requiring precision, such as mathematical computation, programming, and logical reasoning. • Self-reported
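A minimal sketch of computing Pass@1 over a task set, assuming hypothetical first_answer_fn and is_correct callables (for LiveCodeBench-style code tasks, is_correct would typically run the generated code against unit tests):

```python
def pass_at_1(tasks, first_answer_fn, is_correct) -> float:
    """Share of tasks solved on the first attempt (Pass@1).

    first_answer_fn(task) returns the model's single first answer;
    is_correct(task, answer) is a hypothetical checker, e.g. a unit-test
    runner for code tasks or exact-match comparison for math answers.
    """
    solved = sum(1 for t in tasks if is_correct(t, first_answer_fn(t)))
    return solved / len(tasks)
```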
MATH-500
Pass@1. For mathematical reasoning tasks it is often useful to measure how often the model solves a problem on the first attempt; this is the Pass@1 score. The standard way to estimate Pass@1 is to run the model once on each of a set of problems (for example, 100 or 1,000) and count the share it solves; however, a single sample per task makes the estimate noisy, and repeating whole runs is computationally expensive. A more robust method is to estimate Pass@1 from k samples per task: Pass@1 = c / k, where c is the number of correct solutions among the k sampled attempts, i.e. the average correctness across the samples. This gives a lower-variance estimate at the cost of extra generation. For an even more robust evaluation one can use "self-consistency": the model generates several answers to the same task and the most frequently occurring answer is taken. This approach can improve accuracy, especially when the model's correct answers recur across samples. • Self-reported
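A small sketch of the k-sample estimator described above; the sample counts in the example are made-up illustrative numbers:

```python
def estimate_pass_at_1(samples_correct: list[bool]) -> float:
    """Pass@1 estimated from k stochastic samples of one task:
    the average correctness c / k across the samples."""
    return sum(samples_correct) / len(samples_correct)

# Hypothetical example: 16 samples of one MATH-500 problem, 12 of them correct.
print(estimate_pass_at_1([True] * 12 + [False] * 4))  # 0.75
```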
License & Metadata
License
MIT
Announcement Date
January 20, 2025
Last Updated
July 19, 2025
Similar Models
DeepSeek R1 Distill Llama 70B
DeepSeek
70.6B
Best score: 0.7 (GPQA)
Released: Jan 2025
Price: $0.10/1M tokens
DeepSeek R1 Distill Qwen 14B
DeepSeek
14.8B
Best score: 0.6 (GPQA)
Released: Jan 2025
DeepSeek-R1-0528
DeepSeek
671.0B
Best score: 0.8 (GPQA)
Released: May 2025
Price: $0.70/1M tokens
DeepSeek-V3 0324
DeepSeek
671.0B
Best score: 0.7 (GPQA)
Released: Mar 2025
Price: $0.28/1M tokens
Llama-3.3 Nemotron Super 49B v1
NVIDIA
49.9B
Best score: 0.7 (GPQA)
Released: Mar 2025
Jamba 1.5 Mini
AI21 Labs
52.0B
Best score: 0.9 (ARC)
Released: Aug 2024
Price: $0.20/1M tokens
Mistral Small 3 24B Instruct
Mistral AI
24.0B
Best score: 0.8 (HumanEval)
Released: Jan 2025
Price: $0.10/1M tokens
Gemma 2 27B
Google
27.2B
Best score: 0.8 (MMLU)
Released: Jun 2024
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.