DeepSeek VL2 Tiny
Multimodal
An advanced member of the DeepSeek-VL2 series of large multimodal Mixture-of-Experts (MoE) vision-language models, which significantly surpasses its predecessor, DeepSeek-VL. DeepSeek-VL2 demonstrates superior capabilities across a range of tasks, including visual question answering, optical character recognition, document/table/chart understanding, and visual grounding.
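The Mixture-of-Experts design mentioned above can be illustrated with a minimal top-k routing sketch. This is a generic MoE layer in NumPy; the expert count, dimensions, and routing details are illustrative assumptions, not DeepSeek-VL2's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, experts_w, gate_w, top_k=2):
    """Minimal top-k MoE routing sketch (illustrative, not DeepSeek-VL2's code).

    x: (d,) token representation
    experts_w: (n_experts, d, d) one weight matrix per expert
    gate_w: (n_experts, d) router weights
    """
    logits = gate_w @ x                # router score per expert
    top = np.argsort(logits)[-top_k:]  # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the selected experts only
    # Combine the selected experts' outputs, weighted by router probabilities.
    return sum(w * (experts_w[i] @ x) for w, i in zip(weights, top))

d, n_experts = 8, 4
x = rng.normal(size=d)
experts = rng.normal(size=(n_experts, d, d))
gate = rng.normal(size=(n_experts, d))
y = moe_layer(x, experts, gate)
print(y.shape)  # -> (8,)
```

The point of the design is that only `top_k` of the experts run per token, so a model like this one can hold 3.0B total parameters while activating far fewer per forward pass.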
Key Specifications
Parameters
3.0B
Context
-
Release Date
December 13, 2024
Average Score
63.1%
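The "Average Score" above is presumably an arithmetic mean over the model's individual benchmark scores. A minimal sketch with made-up placeholder values (not DeepSeek-VL2 Tiny's actual results):

```python
# Placeholder scores for illustration only -- NOT DeepSeek-VL2 Tiny's real results.
scores = {"AI2D": 0.70, "ChartQA": 0.60, "DocVQA": 0.65}

# Unweighted arithmetic mean across benchmarks, reported as a percentage.
average = sum(scores.values()) / len(scores)
print(f"{average:.1%}")  # -> 65.0%
```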
Timeline
Key dates in the model's history
Announcement
December 13, 2024
Last Update
July 19, 2025
Today
March 25, 2026
Technical Specifications
Parameters
3.0B
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal • ZeroEval
Benchmark Results
Model performance metrics across various tests and benchmarks
Multimodal
Working with images and visual data
AI2D
test • Self-reported
ChartQA
test • Self-reported
DocVQA
test • Self-reported
MathVista
testmini • Self-reported
MMMU
val • Self-reported
Other Tests
Specialized benchmarks
InfoVQA
test • Self-reported
MMBench
test • Self-reported
MMBench-V1.1
cn test • Self-reported
MME
Standard evaluation • Self-reported
MMStar
Standard evaluation • Self-reported
MMT-Bench
Standard evaluation • Self-reported
OCRBench
Standard evaluation • Self-reported
RealWorldQA
Standard evaluation • Self-reported
TextVQA
Standard evaluation • Self-reported
License & Metadata
License
deepseek
Announcement Date
December 13, 2024
Last Updated
July 19, 2025
Similar Models
DeepSeek VL2
DeepSeek
MM • 27.0B
Released: Dec 2024
Price: $9.50/1M tokens
DeepSeek VL2 Small
DeepSeek
MM • 16.0B
Released: Dec 2024
DeepSeek R1 Distill Qwen 1.5B
DeepSeek
1.8B
Best score: 0.3 (GPQA)
Released: Jan 2025
DeepSeek R1 Distill Qwen 7B
DeepSeek
7.6B
Best score: 0.5 (GPQA)
Released: Jan 2025
DeepSeek R1 Distill Llama 8B
DeepSeek
8.0B
Best score: 0.5 (GPQA)
Released: Jan 2025
Phi-3.5-vision-instruct
Microsoft
MM • 4.2B
Released: Aug 2024
Granite 3.3 8B Base
IBM
MM • 8.2B
Best score: 0.9 (HumanEval)
Released: Apr 2025
Gemma 3 4B
Google
MM • 4.0B
Best score: 0.7 (HumanEval)
Released: Mar 2025
Price: $0.02/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.