DeepSeek-V3
A powerful Mixture-of-Experts (MoE) language model with 671 billion total parameters, of which 37 billion are activated per token. It features Multi-head Latent Attention (MLA), an auxiliary-loss-free load-balancing strategy, and a multi-token prediction training objective. Pre-trained on 14.8 trillion tokens, it delivers strong performance on logical reasoning, math, and coding tasks.
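The auxiliary-loss-free load balancing mentioned above replaces the usual auxiliary balancing loss with a per-expert bias: the bias is added to each expert's affinity score only when choosing which experts receive a token, the output weights still come from the raw scores, and the bias is nudged up for under-used experts and down for over-used ones. The following PyTorch sketch illustrates that idea under stated assumptions; the class name ToyMoERouter, the dimensions, and the sign-based bias update are illustrative choices, not DeepSeek's actual implementation.

import torch
import torch.nn as nn

class ToyMoERouter(nn.Module):
    # Toy top-k MoE router with a bias-based (auxiliary-loss-free) balancing step.
    # Hypothetical sketch: dimensions and update rule are simplified for illustration.
    def __init__(self, hidden_dim=64, num_experts=8, top_k=2, bias_lr=0.01):
        super().__init__()
        self.top_k = top_k
        self.bias_lr = bias_lr
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)
        # Routing-only bias; stored as a buffer, so it is not updated by gradients.
        self.register_buffer("expert_bias", torch.zeros(num_experts))

    def forward(self, x):  # x: (num_tokens, hidden_dim)
        scores = torch.sigmoid(self.gate(x))                  # affinity per expert
        _, top_idx = (scores + self.expert_bias).topk(self.top_k, dim=-1)
        top_scores = scores.gather(-1, top_idx)               # weights use raw scores
        weights = top_scores / top_scores.sum(-1, keepdim=True)

        # Count how many tokens each expert received in this batch.
        load = torch.zeros_like(self.expert_bias)
        load.scatter_add_(0, top_idx.reshape(-1),
                          torch.ones(top_idx.numel(), device=x.device))
        # Raise the bias of under-loaded experts, lower it for over-loaded ones.
        self.expert_bias += self.bias_lr * torch.sign(load.mean() - load)
        return top_idx, weights

router = ToyMoERouter()
idx, w = router(torch.randn(16, 64))   # route 16 tokens to 2 of 8 experts each
print(idx.shape, w.shape)              # torch.Size([16, 2]) torch.Size([16, 2])

Because the bias only influences expert selection and never enters the loss, experts stay roughly balanced without the gradient interference that an auxiliary balancing loss would introduce.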
Key Specifications: timeline, technical specifications, and pricing & availability.
Benchmark Results: model performance metrics across general knowledge, programming, reasoning, and other test categories.
License & Metadata
Articles about DeepSeek-V3

The Two Loops: How China's Open-Source AI Strategy Is Outpacing America
A new USCC report warns that China's open AI models now dominate global downloads. 80% of US startups use Chinese models. Washington is scrambling.
DeepSeek Core Researcher Daya Guo Rumored to Have Left
Reports suggest Daya Guo, a key researcher behind DeepSeek's code intelligence work, has resigned from the Chinese AI lab.
Similar Models
DeepSeek-R1 (DeepSeek)
DeepSeek-R1-0528 (DeepSeek)
DeepSeek-V3.2 (Thinking) (DeepSeek)
DeepSeek-V3.2-Exp (DeepSeek)
DeepSeek-V3.1 (DeepSeek)
DeepSeek R1 Zero (DeepSeek)
DeepSeek-V3 0324 (DeepSeek)
DeepSeek-V2.5 (DeepSeek)
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.