
Magistral Small 2506

Mistral AI

Based on Mistral Small 3.1 (2503) with added reasoning capabilities — SFT on Magistral Medium traces followed by additional reinforcement learning — this is a small, efficient reasoning model with 24 billion parameters. Magistral Small can be deployed locally, fitting on a single RTX 4090 or a MacBook with 32 GB of RAM once quantized.
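The page does not name a serving stack for local deployment. As one illustrative option, a model of this family can be served with vLLM; the model ID `mistralai/Magistral-Small-2506` and the Mistral-format flags below are assumptions based on common Hugging Face naming and vLLM usage, not details from this page:

```shell
# Illustrative sketch: serve Magistral Small locally with vLLM.
# Model ID and flags are assumed, not taken from this page; a quantized
# build is what makes the single-RTX-4090 / 32 GB MacBook footprint possible.
pip install -U vllm

vllm serve mistralai/Magistral-Small-2506 \
  --tokenizer_mode mistral \
  --config_format mistral \
  --load_format mistral
```

Once the server is up, it exposes an OpenAI-compatible API on port 8000 by default.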

Key Specifications

Parameters
24.0B
Context
-
Release Date
June 10, 2025
Average Score
63.2%

Timeline

Key dates in the model's history
Announcement
June 10, 2025
Last Update
July 19, 2025

Technical Specifications

Parameters
24.0B
Training Tokens
-
Knowledge Cutoff
June 1, 2025
Family
-
Capabilities
Multimodal
ZeroEval

Benchmark Results

Model performance metrics across various tests and benchmarks

Reasoning

Logical reasoning and analysis
GPQA Diamond
Self-reported
68.2%

Other Tests

Specialized benchmarks
AIME 2024
Self-reported
70.7%
AIME 2025
Self-reported
62.8%
LiveCodeBench
Self-reported
51.3%

License & Metadata

License
Apache 2.0
Announcement Date
June 10, 2025
Last Updated
July 19, 2025

Similar Models

All Models

Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.