
Qwen2.5-Coder 7B Instruct

Alibaba

Qwen2.5-Coder is a specialized coding model trained on 5.5 trillion tokens of code data, supporting 92 programming languages with a 128K context window. It excels at code generation, completion, and fixing while maintaining high performance in math and general tasks. The model demonstrates exceptional capabilities in multi-language programming tasks and code reasoning.
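As an instruct model, Qwen2.5-Coder 7B Instruct is prompted with a chat-formatted input. The sketch below assembles a ChatML-style prompt of the kind Qwen's chat template produces; the `<|im_start|>`/`<|im_end|>` delimiter strings are an assumption based on the Qwen template (in practice `tokenizer.apply_chat_template` builds this for you), and the helper name is illustrative.

```python
def build_chatml_prompt(messages):
    """Assemble a ChatML-style prompt from (role, content) pairs.

    Assumes the <|im_start|>/<|im_end|> delimiters used by Qwen's chat
    template; normally tokenizer.apply_chat_template handles this step.
    """
    parts = []
    for role, content in messages:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    # Trailing generation header: the model continues as "assistant".
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    ("system", "You are a helpful coding assistant."),
    ("user", "Write a function that reverses a string."),
])
print(prompt)
```

The resulting string is what gets tokenized and fed to the model; generation stops when the model emits its own end-of-turn marker.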

Key Specifications

Parameters
7.0B
Context
128K tokens
Release Date
September 19, 2024
Average Score
58.0%

Timeline

Key dates in the model's history
Announcement
September 19, 2024
Last Update
July 19, 2025

Technical Specifications

Parameters
7.0B
Training Tokens
5.5T tokens
Knowledge Cutoff
-
Family
-
Fine-tuned from
qwen-2.5-7b-instruct
Capabilities
Multimodal, ZeroEval

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding
HellaSwag (accuracy, self-reported): 76.8%
MMLU (accuracy, self-reported): 67.6%
TruthfulQA (accuracy, self-reported): 50.6%
Winogrande (accuracy, self-reported): 72.9%

Programming

Programming skills tests
HumanEval (pass@1, self-reported): 88.4%
MBPP (pass@1, self-reported): 83.5%

pass@1 gives the model a single attempt per problem and scores 1 if that first answer is correct, 0 otherwise. Unlike pass@k, the model cannot generate several candidate solutions and pick the best one, so pass@1 is a strict measure of whether the model finds a correct solution on the first try; a high score indicates it understands the domain well enough to produce exact solutions without additional attempts.
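The pass@1 figures reported here can be reproduced with the standard unbiased pass@k estimator from the Codex evaluation setup, pass@k = 1 - C(n-c, k)/C(n, k), where n solutions are sampled per problem and c of them pass the tests. A minimal sketch (function and variable names are illustrative):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations (c correct), passes.

    pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0  # too few failures to fill k samples: guaranteed pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# With one sample per problem (n = k = 1), pass@1 reduces to the
# fraction of problems solved on the first attempt.
first_attempt_ok = [True, True, False, True]
pass_at_1 = sum(pass_at_k(1, int(ok), 1) for ok in first_attempt_ok) / len(first_attempt_ok)
```

Averaging this estimator over every problem in a benchmark yields the percentage shown on lines like HumanEval and MBPP above.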

Mathematics

Mathematical problems and computations
GSM8k (accuracy, self-reported): 83.9%
MATH (accuracy, self-reported): 46.6%

Other Tests

Specialized benchmarks
Aider (pass@1, self-reported): 55.6%
ARC-C (accuracy, self-reported): 60.9%
BigCodeBench (accuracy, self-reported): 41.0%
CRUXEval-Input-CoT (accuracy, self-reported): 56.5%
CRUXEval-Output-CoT (accuracy, self-reported): 56.0%
LiveCodeBench (pass@1, self-reported): 18.2%
MMLU-Base (accuracy, self-reported): 68.0%
MMLU-Pro (accuracy, self-reported): 40.1%
MMLU-Redux (accuracy, self-reported): 66.6%
STEM (accuracy, self-reported): 34.0%
TheoremQA (accuracy, self-reported): 34.0%

pass@1 here likewise counts only the model's first attempt per problem; sample-and-rank schemes such as majority voting, which can raise scores by generating several candidates and selecting the best, are excluded from this metric.
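The 58.0% average score reported at the top of this page appears to be the simple arithmetic mean of the 19 self-reported benchmark scores listed above; a quick check of that assumption (pure arithmetic over the values on this page):

```python
# The 19 self-reported benchmark scores from this model card.
scores = {
    "HellaSwag": 76.8, "MMLU": 67.6, "TruthfulQA": 50.6, "Winogrande": 72.9,
    "HumanEval": 88.4, "MBPP": 83.5,
    "GSM8k": 83.9, "MATH": 46.6,
    "Aider": 55.6, "ARC-C": 60.9, "BigCodeBench": 41.0,
    "CRUXEval-Input-CoT": 56.5, "CRUXEval-Output-CoT": 56.0,
    "LiveCodeBench": 18.2, "MMLU-Base": 68.0, "MMLU-Pro": 40.1,
    "MMLU-Redux": 66.6, "STEM": 34.0, "TheoremQA": 34.0,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.1f}%")
```

The unweighted mean rounds to 58.0%, matching the headline figure, which suggests no per-category weighting is applied.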

License & Metadata

License
Apache 2.0
Announcement Date
September 19, 2024
Last Updated
July 19, 2025
