Qwen2-VL-72B-Instruct
Multimodal
An instruction-tuned large multimodal model that excels at visual understanding and step-by-step reasoning. It supports image and video input with dynamic resolution processing and improved positional embeddings (M-RoPE), enabling advanced capabilities such as complex problem-solving, multilingual text recognition in images, and agentic interaction in video contexts.
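To make "dynamic resolution processing" concrete, the sketch below estimates how many visual tokens an image of a given size might produce. It assumes 14x14 pixel patches whose features are merged 2x2, so that one token covers a 28x28 pixel area, which is how Qwen2-VL is commonly described. The function names and the pixel budget defaults are illustrative assumptions, not values taken from this page, and the count excludes any special tokens that mark the start and end of the image.

```python
import math

PATCH = 14               # assumed ViT patch size in pixels
MERGE = 2                # assumed 2x2 merge of adjacent patch features
FACTOR = PATCH * MERGE   # under these assumptions, one token covers 28x28 pixels


def resize_to_grid(height, width, min_pixels=56 * 56, max_pixels=1280 * 28 * 28):
    """Round (height, width) to multiples of FACTOR within an assumed pixel budget."""
    h = max(FACTOR, round(height / FACTOR) * FACTOR)
    w = max(FACTOR, round(width / FACTOR) * FACTOR)
    if h * w > max_pixels:
        # Too large: shrink both sides by the same ratio, rounding down.
        scale = math.sqrt(height * width / max_pixels)
        h = max(FACTOR, math.floor(height / scale / FACTOR) * FACTOR)
        w = max(FACTOR, math.floor(width / scale / FACTOR) * FACTOR)
    elif h * w < min_pixels:
        # Too small: enlarge both sides by the same ratio, rounding up.
        scale = math.sqrt(min_pixels / (height * width))
        h = math.ceil(height * scale / FACTOR) * FACTOR
        w = math.ceil(width * scale / FACTOR) * FACTOR
    return h, w


def visual_token_count(height, width, **budget):
    """Number of visual tokens for an image, excluding any special tokens."""
    h, w = resize_to_grid(height, width, **budget)
    return (h // FACTOR) * (w // FACTOR)


print(visual_token_count(224, 224))    # 64
print(visual_token_count(1080, 1920))  # capped by the assumed max_pixels budget
```

Under these assumptions the token count scales with image area rather than being fixed, so small images stay cheap and large images are capped by the pixel budget.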
Key Specifications
Parameters
73.4B
Context
-
Release Date
August 29, 2024
Average Score
75.8%
Timeline
Key dates in the model's history
Announcement
August 29, 2024
Last Update
July 19, 2025
Technical Specifications
Parameters
73.4B
Training Tokens
-
Knowledge Cutoff
June 30, 2023
Family
-
Capabilities
Multimodal, ZeroEval
Benchmark Results
Model performance metrics across various tests and benchmarks
Multimodal
Working with images and visual data
ChartQA
score • Self-reported
Other Tests
Specialized benchmarks
DocVQA_test
score • Self-reported
EgoSchema
score • Self-reported
InfoVQA_test
Evaluation • Self-reported
MathVista-Mini
score • Self-reported
MMBench_test
Evaluation indicates how well the model solves a problem or task. Solution quality can be judged on several criteria:
1. **Correctness**: Is the model's answer correct? In some cases the model can earn points even if it uses a different method than the expected solution; in other cases it must follow a prescribed approach (such as a particular manner of verification).
2. **Completeness**: Does the model solve the task fully, or only part of it? Does it cover all possible cases, or only some of them?
3. **Efficiency**: Is the model's approach to solving the task efficient? Does the model take unnecessary steps?
4. **Clarity**: Is the model's solution clear and easy to understand? The model scores lower if it does not explain its steps or does not verify its work when this is necessary. It receives a higher evaluation if its solution is free of errors and its approach is presented in a clear manner.
• Self-reported
MMMU-Pro
score • Self-reported
MMMU_val
score • Self-reported
MMVet_GPT4Turbo
score • Self-reported
MTVQA
score • Self-reported
MVBench
score • Self-reported
OCRBench
Evaluation
AI: ChatGPT 4o • Self-reported
RealWorldQA
score • Self-reported
TextVQA
score • Self-reported
VCR_en_easy
Evaluation
AI: ChatGPT (GPT-4) • Self-reported
License & Metadata
License
tongyi_qianwen
Announcement Date
August 29, 2024
Last Updated
July 19, 2025
Similar Models
Qwen3 VL 32B Thinking
Alibaba
MM 33.0B
Released: Sep 2025
Qwen2.5 VL 72B Instruct
Alibaba
MM 72.0B
Released: Jan 2025
Qwen2.5 VL 32B Instruct
Alibaba
MM 33.5B
Best score: 0.9 (HumanEval)
Released: Feb 2025
QvQ-72B-Preview
Alibaba
MM 73.4B
Released: Dec 2024
Qwen3.5-397B-A17B
Alibaba
MM 397.0B
Released: Feb 2026
Qwen2.5 VL 7B Instruct
Alibaba
MM 8.3B
Released: Jan 2025
Qwen2.5-Omni-7B
Alibaba
MM 7.0B
Best score: 0.8 (HumanEval)
Released: Mar 2025
Qwen3.5 35B A3B
Alibaba
35.0B
Released: Mar 2026
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.