DeepSeek-R1-Zero
DeepSeek-R1-Zero, a model trained with large-scale reinforcement learning (RL) without a prior supervised fine-tuning (SFT) stage, demonstrates remarkable reasoning performance. Through RL, DeepSeek-R1-Zero naturally developed numerous powerful and interesting reasoning behaviors. However, it faces challenges such as endless repetition, poor readability, and language mixing. To address these issues and further improve reasoning performance, DeepSeek introduced DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 on math, coding, and reasoning tasks.
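DeepSeek-R1-Zero's RL stage is reported to rely on simple rule-based rewards rather than a learned reward model: a format reward for enclosing reasoning in designated tags, plus an accuracy reward for a verifiable final answer. The sketch below illustrates that idea; the tag names, weights, and exact-match check are assumptions for illustration, not the official implementation.

```python
import re


def rule_based_reward(response: str, reference_answer: str) -> float:
    """Illustrative rule-based reward: format bonus + accuracy bonus.

    Weights (0.1 / 1.0) and the <think>/<answer> tag convention are
    assumptions for this sketch, not DeepSeek's exact implementation.
    """
    reward = 0.0
    # Format reward: reasoning should appear inside <think>...</think>.
    if re.search(r"<think>.*?</think>", response, flags=re.DOTALL):
        reward += 0.1
    # Accuracy reward: extract the final answer and compare it to the
    # reference (exact string match stands in for a task-specific verifier).
    match = re.search(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    if match and match.group(1).strip() == reference_answer.strip():
        reward += 1.0
    return reward


resp = "<think>2 + 2 equals 4.</think><answer>4</answer>"
print(rule_based_reward(resp, "4"))  # prints 1.1
```

Because the reward is computed by deterministic rules, it sidesteps the reward-hacking risks of a learned reward model, at the cost of only working on tasks (math, coding) where answers can be mechanically verified.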
Similar Models
- DeepSeek-R1-0528 (DeepSeek)
- DeepSeek-V3 0324 (DeepSeek)
- DeepSeek-V3.2-Exp (DeepSeek)
- DeepSeek-V3.1 (DeepSeek)
- DeepSeek-V3 (DeepSeek)
- DeepSeek-R1 (DeepSeek)
- DeepSeek-V3.2 (Non-thinking) (DeepSeek)
- DeepSeek-V3.2-Speciale (DeepSeek)
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.