Key Specifications
Parameters
-
Context
-
Release Date
March 28, 2024
Average Score
63.9%
Timeline
Key dates in the model's history
Announcement
March 28, 2024
Last Update
July 19, 2025
Technical Specifications
Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal
ZeroEval
Benchmark Results
Model performance metrics across various tests and benchmarks
General Knowledge
Tests on general knowledge and understanding
MMLU
5-shot • Self-reported
Programming
Programming skills tests
HumanEval
Self-reported
Mathematics
Mathematical problems and computations
GSM8k
8-shot • Self-reported
MATH
4-shot • Self-reported
Reasoning
Logical reasoning and analysis
GPQA
0-shot: the model performs the task without any examples or task-specific demonstrations. This is the most direct test of a model's abilities, since the only information the model receives is the task it must complete. Zero-shot evaluation is especially useful for assessing a model's general capabilities, but it can be difficult for tasks that require a specific answer format or instructions that were not explicitly provided. • Self-reported
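The zero-shot vs. few-shot distinction described above can be sketched in a few lines. This is a minimal, hypothetical illustration (the helper name and prompt format are assumptions, not any specific harness's API):

```python
# Sketch: 0-shot vs. k-shot prompt construction.
# The format below (Q:/A: pairs) is a common convention, not a fixed standard.

def build_prompt(question: str, examples: list[tuple[str, str]] = ()) -> str:
    """Build an evaluation prompt. With no examples this is 0-shot;
    with k (question, answer) pairs it becomes k-shot."""
    parts = []
    for q, a in examples:  # demonstrations, if any
        parts.append(f"Q: {q}\nA: {a}")
    parts.append(f"Q: {question}\nA:")  # the target question, answer left open
    return "\n\n".join(parts)

zero_shot = build_prompt("What is 2 + 2?")
five_shot = build_prompt(
    "What is 2 + 2?",
    examples=[(f"What is {i} + {i}?", str(2 * i)) for i in range(1, 6)],
)
print(zero_shot.count("Q:"))  # 1 — only the target question
print(five_shot.count("Q:"))  # 6 — five demonstrations plus the target
```

In the 0-shot case the model sees only the bare question, which is why this mode stresses generalization rather than pattern-following.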
Multimodal
Working with images and visual data
DocVQA
shot: in this mode we test the model's ability to answer questions correctly without examples; the model is given the question with a simple prompt and must produce the answer directly. This lets us better gauge the model's question-answering capability in this domain. • Self-reported
MathVista
0-shot: in 0-shot testing the model must solve the task relying only on its training, without any specific examples or solutions to similar problems. This lets us evaluate the model's ability to solve tasks without special instructions. It is arguably the most demanding test mode, since the model is given no additional prompts or context for solving the problems. In our context, 0-shot testing probes the model's basic capabilities in mathematical reasoning and its ability to apply previously acquired knowledge to new tasks without additional training. • Self-reported
MMMU
An analysis by Anthropic of the performance of Claude 3 Opus on these tasks. Claude 3 Opus was compared with Claude 2 and Claude 3 Sonnet to evaluate their performance on mathematical tasks and to gauge how much of an improvement the step from Claude 2 to Claude 3 Opus delivered. Using the same tasks and instruction ("Solve step-by-step"), all models were queried. The results showed that Claude 3 Opus significantly outperforms both Claude 2 and Claude 3 Sonnet in accuracy on complex tasks: Claude 2 made errors on the test tasks and solutions; Claude 3 Sonnet showed some improvement over Claude 2 but still made errors on complex tasks; Claude 3 Opus solved practically all tasks correctly. These results support Anthropic's claims about the mathematical abilities of the new Claude 3 models, especially Opus. • Self-reported
Other Tests
Specialized benchmarks
MMLU-Pro
0-shot: the zero-shot approach means the model is given no examples of how to perform the specific task before solving it; instead it receives only the instructions for the task to be performed. For example, a model might receive the instruction "state whether this statement is true or false" without any examples or demonstrations. The zero-shot approach is especially important when evaluating the abilities of language models, because it measures their understanding and generalization rather than simply their ability to follow patterns from examples. It also makes the evaluation more realistic, since in real usage scenarios the model does not receive examples in advance. When language models such as GPT-4 are evaluated zero-shot, the results show how well they can apply their general knowledge and understanding to new tasks without additional training or tuning. • Self-reported
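A self-reported benchmark score, and an aggregate like the "Average Score" listed at the top of this card, are typically computed as the fraction of correct answers per benchmark followed by a plain mean. This is a hypothetical sketch (the data, the exact-match grading, and the unweighted average are assumptions; real benchmarks use their own graders and weighting):

```python
# Sketch: per-benchmark accuracy and a simple average across benchmarks.

def accuracy(predictions: list[str], references: list[str]) -> float:
    """Exact-match accuracy: fraction of predictions equal to the reference."""
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)

# Toy data for illustration only.
scores = {
    "MMLU": accuracy(["Paris", "4"], ["Paris", "5"]),  # 1 of 2 correct -> 0.5
    "GSM8k": accuracy(["12", "7"], ["12", "7"]),       # 2 of 2 correct -> 1.0
}
average = sum(scores.values()) / len(scores)
print(f"{average:.1%}")  # 75.0%
```

In practice each benchmark has its own grading rules (multiple-choice letter matching, numeric tolerance, unit-test pass rates for code), so exact-match is only the simplest case.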
License & Metadata
License
proprietary
Announcement Date
March 28, 2024
Last Updated
July 19, 2025
Similar Models
Grok Code Fast 1
xAI
Released: Aug 2025
Price: $0.20/1M tokens
Mercury 2
Inception
Best score: 0.7 (GPQA)
Released: Feb 2026
Gemini Diffusion
Best score: 0.9 (HumanEval)
Released: May 2025
Qwen3 Max
Alibaba
Best score: 0.6 (GPQA)
Released: Dec 2025
Grok-3 Mini
xAI
MM
Best score: 0.8 (GPQA)
Released: Feb 2025
Price: $0.30/1M tokens
Grok-3
xAI
MM
Best score: 0.8 (GPQA)
Released: Feb 2025
Price: $3.00/1M tokens
Grok-4 Heavy
xAI
MM
Best score: 0.9 (GPQA)
Released: Jul 2025
Grok 4 Fast
xAI
MM
Best score: 0.9 (GPQA)
Released: Aug 2025
Price: $0.20/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.