Key Specifications
Parameters
12.0B
Context
128.0K
Release Date
July 18, 2024
Average Score
64.3%
Timeline
Key dates in the model's history
Announcement
July 18, 2024
Last Update
July 19, 2025
Technical Specifications
Parameters
12.0B
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval
Pricing & Availability
Input (per 1M tokens)
$0.15
Output (per 1M tokens)
$0.15
Max Input Tokens
128.0K
Max Output Tokens
128.0K
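At the listed rates ($0.15 per 1M tokens for both input and output), the cost of a request is simple arithmetic. A minimal sketch; the function name and token counts are illustrative, not part of any official SDK:

```python
# Listed pricing from the table above: $0.15 per 1,000,000 tokens,
# same rate for input and output.
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.15  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token completion:
print(f"${request_cost(2_000, 500):.6f}")  # → $0.000375
```

At these rates a full 128K-token input costs about $0.0192, which is useful for budgeting long-context workloads.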
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
Benchmark Results
Model performance metrics across various tests and benchmarks
General Knowledge
Tests on general knowledge and understanding
HellaSwag
0-shot evaluation: the model answers each item directly, with no worked examples in the prompt. • Self-reported
MMLU
5-shot evaluation: before being asked to solve the target task, the model is shown five worked examples of similar tasks, chosen to match the target's format and difficulty. This measures the model's ability to learn in context from a few demonstrations. Both the final answer and the correctness of the solution process are assessed, which gives a fuller picture of capability than a zero-shot test and better matches how the model is used in real scenarios. • Self-reported
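The 5-shot setup amounts to assembling a prompt with five demonstrations ahead of the target question. A minimal sketch in Python; `build_few_shot_prompt` and the example Q/A pairs are hypothetical illustrations, not taken from the benchmark:

```python
# Sketch of 5-shot prompt assembly (illustrative only; the example
# question/answer pairs are placeholders, not real benchmark items).
def build_few_shot_prompt(examples, question, k=5):
    """Prepend k worked (question, answer) pairs to the target question."""
    lines = [f"Q: {q}\nA: {a}" for q, a in examples[:k]]
    lines.append(f"Q: {question}\nA:")
    return "\n\n".join(lines)

examples = [(f"example question {i}", f"example answer {i}")
            for i in range(1, 6)]
prompt = build_few_shot_prompt(examples, "target question")
# The model completes the final "A:" by imitating the demonstrated format.
```

Scoring then compares the model's completion of the final "A:" against the reference answer.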
TruthfulQA
0-shot evaluation: the model receives the task immediately, without additional examples, prompts, or explanations. This measures how well the model understands and performs a task "out of the box", relying only on what it learned during pretraining, without extra help or instructions that could influence the result. This approach is especially useful for gauging the model's ability to generalize and apply knowledge in new situations. • Self-reported
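By contrast with the few-shot setting, a 0-shot prompt contains only the task itself. A minimal sketch; the function name and prompt format are illustrative assumptions:

```python
# Sketch of a 0-shot prompt: no demonstrations precede the question,
# so the model must answer from pretrained knowledge alone.
def build_zero_shot_prompt(question: str) -> str:
    """Return a bare question prompt with no worked examples."""
    return f"Q: {question}\nA:"

prompt = build_zero_shot_prompt("target question")
```

The absence of demonstrations is what makes this setting a test of baseline capability rather than format imitation.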
Winogrande
0-shot evaluation: a testing format in which the model receives no special instructions or examples for solving the specific task. Instead, it solves each item relying exclusively on knowledge acquired during pretraining. Each task is presented to the model directly, without additional instructions, examples, or prompts, which most accurately measures the model's baseline capabilities rather than its ability to follow demonstrations. • Self-reported
Other Tests
Specialized benchmarks
CommonSenseQA
Zero-shot (0-shot) evaluation: the model is given a task without any prior examples or instructions on how to solve it. This contrasts with few-shot evaluation, where the model sees several examples demonstrating the expected format or reasoning. In the zero-shot scenario the model must rely only on its pretrained knowledge and abilities to work out how to approach the task. This is considered more difficult but also more representative of real usage, since users often provide no examples before a query. Zero-shot evaluations are widely used in benchmarks and research to assess a model's baseline abilities; however, they can understate a model's capabilities if the task is phrased ambiguously or the model does not fully grasp what is required without additional context. • Self-reported
Natural Questions
5-shot evaluation: the prompt contains five example question-answer pairs, followed by the target question for the model to answer. • Self-reported
OpenBookQA
Zero-shot ("no examples") evaluation: the model is asked to perform the task without being given any examples. It must understand the instructions and complete the task relying exclusively on capabilities and knowledge acquired during training. This is especially useful for assessing how well the model understands and performs tasks it has not encountered, and for measuring its ability to generalize. It can also reveal gaps in the model's knowledge or limitations in how it interprets instructions. Unlike few-shot evaluation, where the model is given samples to establish the expected format or result, the zero-shot method more directly probes the model's pretrained abilities. • Self-reported
TriviaQA
5-shot evaluation • Self-reported
License & Metadata
License
Apache 2.0
Announcement Date
July 18, 2024
Last Updated
July 19, 2025
Similar Models
Mistral Small 3 24B Instruct
Mistral AI
24.0B
Best score: 0.8 (HumanEval)
Released: Jan 2025
Price: $0.10/1M tokens
Magistral Small 2506
Mistral AI
24.0B
Best score: 0.7 (GPQA)
Released: Jun 2025
Devstral Small 1.1
Mistral AI
24.0B
Released: Jul 2025
Price: $0.10/1M tokens
Mistral Small
Mistral AI
22.0B
Released: Sep 2024
Price: $0.20/1M tokens
Codestral-22B
Mistral AI
22.2B
Best score: 0.8 (HumanEval)
Released: May 2024
Price: $0.20/1M tokens
Mistral Small 3.2 24B Instruct
Mistral AI
23.6B (Multimodal)
Best score: 0.9 (HumanEval)
Released: Jun 2025
Mistral Small 3 24B Base
Mistral AI
23.6B (Multimodal)
Best score: 0.9 (ARC)
Released: Jan 2025
Pixtral-12B
Mistral AI
12.4B (Multimodal)
Best score: 0.7 (HumanEval)
Released: Sep 2024
Price: $0.15/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.