Qwen2.5-Coder 7B Instruct
Qwen2.5-Coder is a specialized coding model trained on 5.5 trillion tokens of code data, supporting 92 programming languages with a 128K context window. It excels at code generation, completion, and fixing while maintaining high performance in math and general tasks. The model demonstrates exceptional capabilities in multi-language programming tasks and code reasoning.
Key Specifications
Parameters
7.0B
Context
128K tokens
Release Date
September 19, 2024
Average Score
58.0%
Timeline
Key dates in the model's history
Announcement
September 19, 2024
Last Update
July 19, 2025
Today
March 25, 2026
Technical Specifications
Parameters
7.0B
Training Tokens
5.5T tokens
Knowledge Cutoff
-
Family
-
Fine-tuned from
qwen-2.5-7b-instruct
Capabilities
Multimodal, ZeroEval
Benchmark Results
Model performance metrics across various tests and benchmarks
General Knowledge
Tests on general knowledge and understanding
HellaSwag
accuracy • Self-reported
MMLU
accuracy • Self-reported
TruthfulQA
accuracy • Self-reported
Winogrande
accuracy • Self-reported
Programming
Programming skills tests
HumanEval
pass@1 The model generates a single completion per problem; that completion scores 1 if it is correct and 0 otherwise. The reported value is the fraction of problems solved on this single first attempt, rather than aggregating over multiple samples per problem. • Self-reported
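The first-attempt scoring described above can be sketched as a small evaluation loop. This is an illustrative sketch, not the official HumanEval harness; the field names (`completion`, `test`) and the execution-based checker are assumptions for the example:

```python
# Minimal pass@1 sketch: one candidate solution per problem,
# correctness decided by running the benchmark's unit tests.

def passes_tests(candidate: str, test_code: str) -> bool:
    """Execute a candidate solution against its unit tests."""
    namespace: dict = {}
    try:
        exec(candidate, namespace)   # define the candidate function
        exec(test_code, namespace)   # run the benchmark's assertions
        return True
    except Exception:
        return False

def pass_at_1(problems: list[dict]) -> float:
    """pass@1 = fraction of problems solved by the single first sample."""
    solved = sum(passes_tests(p["completion"], p["test"]) for p in problems)
    return solved / len(problems)

# Example: two toy problems, one correct and one buggy completion.
problems = [
    {"completion": "def add(a, b):\n    return a + b",
     "test": "assert add(2, 3) == 5"},
    {"completion": "def add(a, b):\n    return a - b",
     "test": "assert add(2, 3) == 5"},
]
print(pass_at_1(problems))  # 0.5
```

In real harnesses the candidate code is sandboxed before execution; the bare `exec` here is only for illustration.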
MBPP
pass@1 Pass on first attempt (pass@1) means the model must solve the task correctly the first time it sees it. Unlike metrics such as pass@k, pass@1 gives the model no opportunity to generate several candidate solutions and keep the best one. It is a strict metric because it measures the model's ability to find a correct solution in a single try. A high pass@1 score indicates that the model understands the problem domain well enough to produce exact solutions without needing additional attempts. • Self-reported
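The contrast with pass@k drawn above can be made concrete. When a harness generates n samples per task and c of them pass, the commonly used unbiased estimator of pass@k is 1 − C(n−c, k)/C(n, k). A minimal sketch (function name is ours, not from any specific harness):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n -- total samples generated per task
    c -- samples that passed the tests
    k -- attempt budget being scored
    """
    if n - c < k:
        # Too few failing samples to fill a k-subset: success is certain.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=10, c=3, k=1))  # ~0.3: reduces to c/n when k == 1
```

With k = 1 the estimator collapses to the plain success rate c/n, which is why pass@1 is the strictest setting of the family.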
Mathematics
Mathematical problems and computations
GSM8k
accuracy • Self-reported
MATH
accuracy • Self-reported
Other Tests
Specialized benchmarks
Aider
pass@1 We define "pass@1" as the probability that the model produces a correct answer on its first attempt. Some evaluation setups allow several attempts per task (for example, sampling with rating or majority voting), which can improve performance, but for this metric we count only a single attempt. • Self-reported
ARC-C
accuracy • Self-reported
BigCodeBench
accuracy • Self-reported
CRUXEval-Input-CoT
accuracy • Self-reported
CRUXEval-Output-CoT
accuracy • Self-reported
LiveCodeBench
pass@1 Measures the proportion of tasks an AI model answers correctly on its first attempt, with no preliminary iterations. It reflects the model's baseline ability to reach a correct solution immediately, which matters both for efficiency and for practical application. A high pass@1 score indicates the model can reason its way to a working solution without needing several attempts or additional refinement, making it more reliable in real-world scenarios. • Self-reported
MMLU-Base
accuracy • Self-reported
MMLU-Pro
accuracy • Self-reported
MMLU-Redux
accuracy • Self-reported
STEM
accuracy • Self-reported
TheoremQA
accuracy • Self-reported
License & Metadata
License
Apache 2.0
Announcement Date
September 19, 2024
Last Updated
July 19, 2025
Similar Models
Qwen3.5 9B
Alibaba
9.0B
Released: Mar 2026
Qwen2.5 7B Instruct
Alibaba
7.6B
Best score: 0.8 (HumanEval)
Released: Sep 2024
Price: $0.30/1M tokens
Qwen2 7B Instruct
Alibaba
7.6B
Best score: 0.8 (HumanEval)
Released: Jul 2024
Qwen3-Coder 480B A35B Instruct
Alibaba
480.0B
Best score: 0.8 (TAU)
Released: Jan 2025
Qwen2.5-Coder 32B Instruct
Alibaba
32.0B
Best score: 0.9 (HumanEval)
Released: Sep 2024
Price: $0.09/1M tokens
Qwen2.5 32B Instruct
Alibaba
32.5B
Best score: 0.9 (HumanEval)
Released: Sep 2024
QwQ-32B-Preview
Alibaba
32.5B
Best score: 0.7 (GPQA)
Released: Nov 2024
Price: $1.20/1M tokens
Qwen2.5 72B Instruct
Alibaba
72.7B
Best score: 0.9 (HumanEval)
Released: Sep 2024
Price: $1.20/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.