
Qwen2.5-Coder 32B Instruct

Alibaba

Qwen2.5-Coder is a specialized coding model trained on 5.5 trillion tokens of code data, supporting 92 programming languages with a 128K token context window. The model excels at code generation, autocomplete, bug fixing, and multilingual programming tasks while maintaining high performance in math and general tasks.

Key Specifications

Parameters
32.0B
Context
128.0K
Release Date
September 19, 2024
Average Score
64.9%

Timeline

Key dates in the model's history
Announcement
September 19, 2024
Last Update
July 19, 2025

Technical Specifications

Parameters
32.0B
Training Tokens
5.5T tokens
Knowledge Cutoff
-
Family
-
Fine-tuned from
qwen-2.5-32b-instruct
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$0.09
Output (per 1M tokens)
$0.09
Max Input Tokens
128.0K
Max Output Tokens
128.0K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
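With identical input and output rates, estimating request cost is a single weighted sum. A minimal Python sketch using the listed prices; the token counts in the example are illustrative values, not figures from this card:

```python
# Listed Qwen2.5-Coder 32B Instruct rates: $0.09 per 1M tokens, input and output.
INPUT_PRICE_PER_M = 0.09   # USD per 1,000,000 input tokens
OUTPUT_PRICE_PER_M = 0.09  # USD per 1,000,000 output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a single request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 4,000-token prompt producing a 1,000-token completion.
print(f"${request_cost(4_000, 1_000):.6f}")  # → $0.000450
```

Because both rates are equal, only the total token count matters here; with asymmetric pricing the split between prompt and completion would change the result.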

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding
HellaSwag
accuracy (self-reported)
83.0%
MMLU
accuracy (self-reported)
75.1%
TruthfulQA
accuracy (self-reported)
54.2%
Winogrande
accuracy (self-reported)
80.8%

Programming

Programming skills tests
HumanEval
pass@1 (self-reported) — checks whether the model solves a task on its first attempt: a task counts as solved only if the first generated answer is correct, with no credit for later retries. The metric ignores any improvement from additional attempts and reflects the model's base single-shot ability.
92.7%
MBPP
pass@1 (self-reported) — for each task the model gets one attempt and scores 1 if that first answer contains a correct solution, 0 otherwise; the benchmark score is the average over all tasks. Unlike pass@k, which credits a correct answer anywhere among k samples, pass@1 demands correctness on the first try, making it a strict measure of reliability for settings where users cannot verify multiple candidate answers.
90.2%
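Scores like these are commonly computed with the unbiased pass@k estimator: draw n samples per task, count the c correct ones, and estimate the chance that at least one of k drawn samples passes. A minimal sketch; the sample counts in the example are illustrative, not the ones used for Qwen2.5-Coder:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k given n generated samples per task,
    of which c are correct: 1 - C(n-c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer than k incorrect samples: any k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples for one task, 9 of them correct, evaluated at k=1.
print(pass_at_k(10, 9, 1))  # → 0.9
```

For k=1 the formula reduces to c/n, the fraction of correct first attempts, which is why pass@1 can also be read as plain first-try accuracy averaged over tasks.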

Mathematics

Mathematical problems and computations
GSM8k
accuracy (self-reported)
91.1%
MATH
accuracy (self-reported)
57.2%

Other Tests

Specialized benchmarks
ARC-C
accuracy (self-reported)
70.5%
BigCodeBench-Full
accuracy (self-reported)
49.6%
BigCodeBench-Hard
accuracy (self-reported)
27.0%
LiveCodeBench
pass@1 (self-reported) — measures performance when the model is allowed only one attempt per task; for example, answering 75 of 100 questions correctly on the first try gives pass@1 = 75%. This is stricter than pass@k, where k answers are generated and a single correct one suffices, and it matters most for applications that need one definitive answer rather than several candidates.
31.4%
MMLU-Pro
accuracy (self-reported)
50.4%
MMLU-Redux
accuracy (self-reported)
77.5%
TheoremQA
accuracy (self-reported)
43.1%

License & Metadata

License
Apache 2.0
Announcement Date
September 19, 2024
Last Updated
July 19, 2025
