
Kimi k1.5

Multimodal
Moonshot AI

Kimi k1.5 is a next-generation multimodal language model developed by Moonshot AI. It combines advanced reinforcement learning (RL) with scalable multimodal reasoning, delivering top-tier performance on mathematics, coding, computer vision, and long-context reasoning tasks.

Key Specifications

Parameters
-
Context
-
Release Date
January 20, 2025
Average Score
81.7%

Timeline

Key dates in the model's history
Announcement
January 20, 2025
Last Update
July 19, 2025

Technical Specifications

Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding
MMLU
Accuracy (Self-reported)
87.4%

Multimodal

Working with images and visual data
MathVista
Pass@1 (Self-reported): the fraction of test samples the model answers correctly on its first attempt. The model is run once on each sample and the percentage of correct responses is measured. Pass@1 does not let the model refine its solution across attempts, so it can understate usefulness on problems that require exploration, but it is a straightforward metric that has been widely used to assess the reasoning and problem-solving abilities of large language models.
74.9%
MMMU
Pass@1 (Self-reported)
70.0%
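Several scores in this table are Pass@1. A minimal sketch of the computation in Python, where `samples` and `is_correct` are hypothetical names for the graded test set and the answer checker:

```python
def pass_at_1(samples, is_correct):
    """Fraction of samples answered correctly on the model's first attempt.

    `samples` is a list of (problem, model_answer) pairs and `is_correct`
    is a grading function; both names are illustrative.
    """
    solved = sum(1 for problem, answer in samples if is_correct(problem, answer))
    return solved / len(samples)
```

Each problem is attempted exactly once, so the score directly reflects first-try accuracy.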

Other Tests

Specialized benchmarks
AIME 2024
Pass@1 (Self-reported)
77.5%
C-Eval
Exact-match accuracy (Self-reported): the model's final answer is compared with the known correct answer and marked correct only on an exact match. The method is simple, objective, and well suited to questions with unique, standardized answers such as multiple choice, but it is sensitive to formatting differences, may penalize valid alternative notations, and cannot assess reasoning quality.
88.3%
CLUEWSC
Self-reported
91.4%
IFEval
Self-reported
87.2%
LiveCodeBench v5 24.12-25.2
Pass@1 (Self-reported)
62.5%
MATH-500
Exact match (Self-reported): scored 1 if the model's answer exactly matches the reference, 0 otherwise. Strict matching gives no credit for semantically equivalent answers expressed differently (e.g. "forty-two" for a reference of "42").
96.2%
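The exact-match protocol used for C-Eval and MATH-500 above reduces to a string comparison; a minimal sketch in Python (the whitespace trimming is an assumption, since graders differ in how much normalization they apply):

```python
def exact_match(reference, answer):
    """Return 1 if the answer matches the reference exactly after trimming
    surrounding whitespace, else 0 (the strict 1/0 scoring described above)."""
    return int(answer.strip() == reference.strip())

# Strict matching penalizes equivalent formulations:
print(exact_match("42", "42"))         # 1
print(exact_match("42", "forty-two"))  # 0
```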

License & Metadata

License
proprietary
Announcement Date
January 20, 2025
Last Updated
July 19, 2025

Similar Models

Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.