DeepSeek VL2 Small

Name: DeepSeek VL2 Small
Author: DeepSeek

Multimodal

DeepSeek-VL2 is a series of large Mixture-of-Experts (MoE) vision-language models that improves significantly on its predecessor, DeepSeek-VL. It shows strong capabilities across tasks including visual question answering, optical character recognition, document/table/chart understanding, and visual grounding.

Key Specifications

Parameters

16.0B

Context

Release Date

December 13, 2024

Average Score

69.6%

Timeline

Key dates in the model's history

Announcement

December 13, 2024

Last Update

July 19, 2025

Technical Specifications

Parameters

16.0B

Training Tokens

Knowledge Cutoff

Family

Capabilities

Multimodal, ZeroEval

Benchmark Results

Model performance metrics across various tests and benchmarks

Multimodal

Working with images and visual data

AI2D

test • Self-reported

80.0%

ChartQA

test • Self-reported

84.5%

DocVQA

test • Self-reported

92.3%

MathVista

testmini • Self-reported

60.7%

MMMU

val • Self-reported

48.0%

Other Tests

Specialized benchmarks

InfoVQA

test • Self-reported

75.8%

MMBench

ru test • Self-reported

80.3%

MMBench-V1.1

cn test • Self-reported

79.3%

MME

Standard evaluation • Self-reported

21.2%

MMStar

Standard evaluation • Self-reported

57.0%

MMT-Bench

Standard evaluation • Self-reported

62.9%

OCRBench

Standard evaluation • Self-reported

83.4%

RealWorldQA

Standard evaluation • Self-reported

65.4%

TextVQA

val • Self-reported

83.4%
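
The page does not say how the 69.6% Average Score under Key Specifications is derived. It is consistent with an unweighted mean of the fourteen self-reported scores listed above. The sketch below checks that arithmetic; the `scores` dictionary and the `average_score` helper are illustrative names, and the unweighted-mean aggregation is an assumption rather than a documented method.

```python
# Self-reported benchmark scores exactly as listed on this page (percent).
scores = {
    "AI2D": 80.0,
    "ChartQA": 84.5,
    "DocVQA": 92.3,
    "MathVista": 60.7,
    "MMMU": 48.0,
    "InfoVQA": 75.8,
    "MMBench": 80.3,
    "MMBench-V1.1": 79.3,
    "MME": 21.2,
    "MMStar": 57.0,
    "MMT-Bench": 62.9,
    "OCRBench": 83.4,
    "RealWorldQA": 65.4,
    "TextVQA": 83.4,
}


def average_score(results):
    """Unweighted mean of the benchmark scores, in percent (assumed aggregation)."""
    return sum(results.values()) / len(results)


print(f"{average_score(scores):.1f}%")  # prints 69.6%
```

Because every benchmark carries equal weight under this assumption, a single low entry such as MME pulls the average down noticeably.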

License & Metadata

License

deepseek

Announcement Date

December 13, 2024

Last Updated

July 19, 2025

Similar Models

DeepSeek VL2

DeepSeek

MM 27.0B

Released: Dec 2024

Price: $9.50/1M tokens

DeepSeek VL2 Tiny

DeepSeek

MM 3.0B

Released: Dec 2024

DeepSeek R1 Distill Qwen 14B

DeepSeek

14.8B

Best score: 0.6 (GPQA)

Released: Jan 2025

DeepSeek R1 Distill Llama 70B

DeepSeek

70.6B

Best score: 0.7 (GPQA)

Released: Jan 2025

Price: $0.10/1M tokens

DeepSeek R1 Distill Qwen 32B

DeepSeek

32.8B

Best score: 0.6 (GPQA)

Released: Jan 2025

Price: $0.12/1M tokens

Gemma 3 12B

Google

MM 12.0B

Best score: 0.9 (HumanEval)

Released: Mar 2025

Price: $0.05/1M tokens

GPT OSS 20B

OpenAI

MM 20.0B

Best score: 0.9 (MMLU)

Released: Aug 2025

Price: $0.10/1M tokens

Qwen3 VL 32B Thinking

Alibaba

MM 33.0B

Released: Sep 2025

Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.