Codestral-22B

Mistral AI

A 22 billion parameter code generation model trained on over 80 programming languages, including Python, Java, C, C++, JavaScript, and Bash. Supports both instruction execution and fill-in-the-middle (FIM) functionality for code autocomplete and generation tasks.
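In fill-in-the-middle mode, the model receives the code before and after a gap and generates the missing middle. As a minimal sketch of how such a prompt is assembled: the control-token names `[SUFFIX]` and `[PREFIX]` below are assumptions about Codestral's FIM template, not confirmed by this page; consult the official Mistral documentation before relying on them.

```python
# Hedged sketch of assembling a fill-in-the-middle (FIM) prompt.
# The [SUFFIX]/[PREFIX] control tokens are assumptions about Codestral's
# template; the official API may expose this via a dedicated endpoint instead.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Place the suffix before the prefix so the model generates the middle."""
    return f"[SUFFIX]{suffix}[PREFIX]{prefix}"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(2, 3))\n",
)
```

In practice an autocomplete client sends the text before the cursor as the prefix and the text after it as the suffix, and inserts the model's completion in between.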

Key Specifications

Parameters
22.2B
Context
32.8K
Release Date
May 29, 2024
Average Score
65.9%

Timeline

Key dates in the model's history
Announcement
May 29, 2024
Last Update
July 19, 2025

Technical Specifications

Parameters
22.2B
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$0.20
Output (per 1M tokens)
$0.60
Max Input Tokens
32.8K
Max Output Tokens
32.8K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
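The per-token prices above translate into request costs by simple arithmetic. A minimal sketch, using only the rates listed in the pricing table (the token counts in the example are hypothetical):

```python
# Estimating request cost from the listed per-1M-token prices.
INPUT_PRICE_PER_M = 0.20   # USD per 1M input tokens (from the pricing table)
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 10K-token prompt with a 2K-token completion:
cost = estimate_cost(10_000, 2_000)  # 0.002 + 0.0012 = 0.0032 USD
```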

Benchmark Results

Model performance metrics across various tests and benchmarks

Programming

Programming skills tests
HumanEval
pass@1: the proportion of tasks the model solves correctly on its first and only attempt, with no opportunity to revise the solution or pick the best of several generated answers. It reflects the model's baseline ability in settings where users rely on a single response and cannot iterate.
Self-reported
81.1%
MBPP
pass@1: the probability that the model produces a correct solution on its first attempt. Unlike a simple correct/incorrect accuracy score over one sample, pass@1 is typically estimated from several independent generations per task: if k of n samples are correct, pass@1 = k/n. For code-generation tasks this measures both whether the model can find a correct solution and how reliably it does so.
Self-reported
78.2%
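The pass@1 computation described above can be sketched in a few lines. This is an illustrative sketch, not the exact harness used for these benchmarks:

```python
# Sketch of pass@1 as described above: the fraction of tasks whose
# first-attempt solution passes all test cases.
def pass_at_1(results: list[bool]) -> float:
    """results[i] is True if the first attempt on task i passed all tests."""
    return sum(results) / len(results)

# 75 of 100 tasks solved on the first try -> pass@1 = 0.75
outcomes = [True] * 75 + [False] * 25
rate = pass_at_1(outcomes)  # 0.75
```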

Other Tests

Specialized benchmarks
CruxEval-O
pass@1: the model makes a single attempt, with no tools, retries, or answer verification (a "one-shot" evaluation). The model receives the task and returns one answer. For some task types, such as mathematical equations or puzzles, this is an efficient measure; for others, a single attempt may understate the model's ability.
Self-reported
51.3%
HumanEval-Average
Pass@1: the proportion of tasks the model solves on its first attempt; it reflects the probability that a single generated answer is correct. The model produces one answer per task, and the task counts as solved only if that answer is correct. This is a strict metric, since it allows no corrections or re-sampling, making it especially useful for gauging baseline accuracy in scenarios where the user relies on the first answer.
Self-reported
61.5%
HumanEvalFIM-Average
pass@1: the proportion of programming tasks solved correctly on the first attempt. To compute it: (1) the model generates an answer for each task, (2) the answer is run against the task's test cases, (3) the task counts as solved if all tests pass. For example, if the model correctly solves 75 of 100 tasks on the first try, pass@1 = 0.75 (75%). Unlike pass@k, which allows several attempts and takes the best result, pass@1 measures the ability to produce a correct answer in a single shot, matching single-response usage scenarios.
Self-reported
91.6%
RepoBench
pass@1: the system attempts each question once; a correct first answer scores 1, an incorrect one scores 0. The pass@1 score is the proportion of questions answered correctly on the first attempt.
Self-reported
34.0%
Spider
pass@1: whether the model produces a correct solution on its first attempt.
Self-reported
63.5%

License & Metadata

License
MNPL-0.1 (Mistral AI Non-Production License)
Announcement Date
May 29, 2024
Last Updated
July 19, 2025

Similar Models

Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.