GPT-5.1 Codex High vs Qwen3 VL 32B Thinking: Specs & Benchmark Comparison
| Characteristic | GPT-5.1 Codex High | Qwen3 VL 32B Thinking |
|---|---|---|
| Company | OpenAI | Alibaba |
| Release Date | November 11, 2025 | September 21, 2025 |
| Parameters | — | 33B |
| Multimodal | Yes | Yes |
| Context (input, tokens) | 400K | — |
| Context (output, tokens) | 128K | — |
| Input Price / 1M tokens | $1.25 | — |
| Output Price / 1M tokens | $10.00 | — |
| Average Score | 1.0 | 144.5 |
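
The per-token prices in the table can be turned into a rough per-request cost. The sketch below is a minimal illustration, assuming the listed prices are in USD per 1M tokens and that billing is strictly linear in token count. The function name `estimate_cost` and the token counts in the example are hypothetical and do not come from any provider's SDK.

```python
from decimal import Decimal

# Prices for GPT-5.1 Codex High as listed in the table (assumed USD per 1M tokens).
INPUT_PRICE_PER_1M = Decimal("1.25")
OUTPUT_PRICE_PER_1M = Decimal("10.00")
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return an estimated request cost in USD (hypothetical helper).

    Assumes simple linear pricing with no caching discounts or other fees.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_1M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_1M
    )
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Example workload: 200,000 input tokens and 20,000 output tokens.
    print(f"Estimated cost: ${estimate_cost(200_000, 20_000):.2f}")
```

The table lists no prices for Qwen3 VL 32B Thinking, so the same estimate cannot be made for that model from this data.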
Verdict
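There is no clear overall winner: Qwen3 VL 32B Thinking has the higher listed average score, while GPT-5.1 Codex High is the newer release. The choice depends on your specific use case.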
Overall Performance
Qwen3 VL 32B Thinking has the higher listed average benchmark score: 144.5 vs 1.0 for GPT-5.1 Codex High.
Recency
GPT-5.1 Codex High is newer: released November 11, 2025 vs September 21, 2025.
This comparison of GPT-5.1 Codex High and Qwen3 VL 32B Thinking is updated for 2026. The data covers benchmark results, API pricing, context window size, and other specifications.