Qwen3 VL 32B Thinking

Multimodal
Alibaba

Qwen3-VL is a large multimodal model that combines vision, language, and reasoning, with the stated goal of human-level perception and cognitive abilities. The 33B-parameter Thinking version is positioned for multimodal reasoning and STEM tasks, and supports OCR, video understanding, and agentic interaction.

Key Specifications

Parameters: 33.0B
Context: -
Release Date: September 21, 2025
Average Score: 90.6% (mean of the six self-reported benchmark results, each rescaled to 0-100)

Timeline

Key dates in the model's history
Announcement: September 21, 2025
Last Update: February 17, 2026

Technical Specifications

Parameters: 33.0B
Training Tokens: -
Knowledge Cutoff: -
Family: -
Capabilities: Multimodal, ZeroEval

Benchmark Results

Model performance metrics across various tests and benchmarks

Other Tests

Specialized benchmarks
OCRBench (self-reported): 855 out of 1000
MM-MT-Bench (self-reported): 8.3 out of 10
DocVQA test (self-reported): 96.0%
ScreenSpot (self-reported): 96.0%
MMLU-Redux (self-reported): 92.0%
MMBench-V1.1, EN_V1.1 (self-reported): 91.0%
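
These benchmarks are not all reported on the same scale, so a single average only makes sense after each result is rescaled. The sketch below shows one way to do that. It takes OCRBench as 855 out of 1000 and MM-MT-Bench as 8.3 out of 10; both scale maxima are assumptions, and the helper names are hypothetical. The source does not state how its own average is computed.

```python
from statistics import mean

# (score, assumed full-scale maximum) for each self-reported benchmark.
# The maxima of 1000 (OCRBench) and 10 (MM-MT-Bench) are assumptions.
SCORES = {
    "OCRBench": (855.0, 1000.0),
    "MM-MT-Bench": (8.3, 10.0),
    "DocVQA test": (96.0, 100.0),
    "ScreenSpot": (96.0, 100.0),
    "MMLU-Redux": (92.0, 100.0),
    "MMBench-V1.1": (91.0, 100.0),
}


def normalized_percent(score: float, scale_max: float) -> float:
    """Rescale a score to the 0-100 range."""
    return 100.0 * score / scale_max


def normalized_average(scores: dict) -> float:
    """Mean of all scores after rescaling each one to 0-100."""
    return mean(normalized_percent(s, m) for s, m in scores.values())


def naive_average(scores: dict) -> float:
    """Mean when percent-scale results are stored as fractions and every
    stored value is then multiplied by 100, including the scores that
    were never fractions (855 becomes 85500, 8.3 becomes 830)."""
    displayed = [
        (s / 100.0 if m == 100.0 else s) * 100.0
        for s, m in scores.values()
    ]
    return mean(displayed)


if __name__ == "__main__":
    print(f"normalized: {normalized_average(SCORES):.1f}")  # normalized: 90.6
    print(f"naive:      {naive_average(SCORES):.1f}")       # naive:      14450.8
```

The naive variant illustrates how a uniform "multiply by 100" display rule inflates any benchmark that is not stored as a fraction, which in turn dominates the mean.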

License & Metadata

License: apache-2.0
Announcement Date: September 21, 2025
Last Updated: February 17, 2026
