
DeepSeek-V3.1

Developed by DeepSeek

DeepSeek-V3.1 is a hybrid model that supports both thinking and non-thinking modes, selected through different chat templates. Built on DeepSeek-V3.1-Base with a two-phase long-context extension (32K phase: 630B tokens; 128K phase: 209B tokens), it has 671B total parameters with 37B activated per token. Key improvements include smarter tool calling, thinking efficiency that reaches answer quality comparable to DeepSeek-R1-0528 while responding faster, and weights released in FP8 format.
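The two modes described above can be selected per request. As a minimal sketch, assuming DeepSeek's published OpenAI-compatible API convention in which the thinking and non-thinking modes are exposed as separate model names (`deepseek-reasoner` and `deepseek-chat`):

```python
# Sketch: choosing DeepSeek-V3.1's thinking vs. non-thinking mode via the
# OpenAI-compatible DeepSeek API. The model-name mapping below is an
# assumption based on DeepSeek's published API convention.

def model_for_mode(thinking: bool) -> str:
    """Map the desired mode to the API model identifier."""
    # "deepseek-reasoner" serves the thinking mode; "deepseek-chat" the non-thinking mode.
    return "deepseek-reasoner" if thinking else "deepseek-chat"

# Usage with the openai client (requires a real API key; shown for illustration):
# from openai import OpenAI
# client = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")
# resp = client.chat.completions.create(
#     model=model_for_mode(thinking=True),
#     messages=[{"role": "user", "content": "Why is the sky blue?"}],
# )
# print(resp.choices[0].message.content)
```

Under the hood, the two model names apply the two different chat templates mentioned above; the weights served are the same.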

Key Specifications

Parameters
671.0B
Context
163.8K
Release Date
January 9, 2025
Average Score
82.4%

Timeline

Key dates in the model's history
Announcement / Last Update
January 9, 2025

Technical Specifications

Parameters
671.0B
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$0.27
Output (per 1M tokens)
$1.00
Max Input Tokens
163.8K
Max Output Tokens
163.8K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
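The listed per-token prices make request costs easy to estimate. A minimal sketch, using the $0.27 per 1M input tokens and $1.00 per 1M output tokens quoted above (the helper name is illustrative, not part of any API):

```python
# Sketch: estimating the USD cost of a request from the listed prices.
INPUT_PRICE_PER_M = 0.27   # $ per 1M input tokens
OUTPUT_PRICE_PER_M = 1.00  # $ per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: 100K input tokens and 10K output tokens:
cost = estimate_cost(100_000, 10_000)
print(f"${cost:.4f}")  # 0.027 + 0.010 = $0.0370
```

Note that output tokens cost roughly 3.7x more than input tokens here, so long generations (especially in thinking mode) dominate the bill.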

Benchmark Results

Model performance metrics across various tests and benchmarks

Reasoning

Logical reasoning and analysis
GPQA
Pass@1, non-thinking mode (self-reported)
75.0%
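The GPQA score is reported as Pass@1. As a reference point, a sketch of the standard unbiased pass@k estimator (Chen et al., 2021, commonly used for such reporting; whether DeepSeek used this exact estimator is an assumption), of which Pass@1 is the k=1 case:

```python
# Sketch: unbiased pass@k estimator from n samples with c correct.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples drawn (without
    replacement) from n generated samples, c of them correct, is correct."""
    if n - c < k:
        return 1.0  # too few incorrect samples to fill a draw of k
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 10 samples and 7 correct, pass@1 reduces to the plain success rate:
print(pass_at_k(10, 7, 1))  # 0.7
```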

Other Tests

Specialized benchmarks
SimpleQA
Thinking mode (self-reported)
93.0%
MMLU-Redux
Non-thinking mode (self-reported)
92.0%
MMLU-Pro
Non-thinking mode (self-reported)
84.0%
Aider-Polyglot
Non-thinking mode (self-reported)
68.0%

License & Metadata

License
MIT
Announcement Date
January 9, 2025
Last Updated
January 9, 2025

Similar Models


Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.