Grok-2

Multimodal
xAI

Grok-2 is a language model from xAI with strong capabilities in chat, coding, and logical reasoning. In xAI's self-reported results it performs well on document question answering (93.6% on DocVQA) and visual math reasoning (69.0% on MathVista), and it is reported to outperform other models on academic benchmarks covering logical reasoning, reading comprehension, math, and science.

Key Specifications

Parameters
-
Context
128.0K
Release Date
August 13, 2024
Average Score
76.5%

Timeline

Key dates in the model's history
Announcement
August 13, 2024
Last Update
July 19, 2025

Technical Specifications

Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$2.00
Output (per 1M tokens)
$10.00
Max Input Tokens
128.0K
Max Output Tokens
8.0K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
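
As a rough illustration of how the per-token prices above translate into a request cost, here is a minimal sketch. The function name `estimate_cost_usd` is hypothetical, and reading "128.0K" and "8.0K" as 128,000 and 8,000 tokens is an assumption; the page does not say whether the limits are decimal or binary thousands.

```python
# Prices and limits copied from the listing above.
INPUT_USD_PER_MILLION = 2.00
OUTPUT_USD_PER_MILLION = 10.00
MAX_INPUT_TOKENS = 128_000   # assumption: "128.0K" means 128,000
MAX_OUTPUT_TOKENS = 8_000    # assumption: "8.0K" means 8,000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars (hypothetical helper)."""
    if not 0 <= input_tokens <= MAX_INPUT_TOKENS:
        raise ValueError("input_tokens is outside the listed input limit")
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens is outside the listed output limit")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(100_000, 2_000):.2f}")
```

At these rates an output token costs five times as much as an input token, so long completions affect the bill more than long prompts of the same length.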

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding
MMLU
Accuracy (self-reported)
87.5%

Programming

Programming skills tests
HumanEval
Pass@1 (self-reported). Pass@1 is an evaluation metric that measures how well a language model (LLM) solves tasks: it is the probability that the model's first attempt at a task is correct. To estimate it, the model is given a set of tasks and generates n different solutions for each one. A task counts as solvable if at least one of the n solutions is correct, and Pass@1 uses that set of solutions to estimate the probability that the first solution is correct. The metric is useful because it: 1. measures precisely the model's ability to solve tasks on the first attempt; 2. gives a more reliable performance estimate than judging a single generated solution; 3. allows different models to be compared by how effectively they solve tasks. It is widely used in research to evaluate LLMs in areas such as programming, mathematical reasoning, and complex problem solving.
88.4%
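
The page does not say how the HumanEval score was computed. As a sketch of the idea described for Pass@1, the block below uses the unbiased pass@k estimator commonly used with HumanEval, 1 - C(n-c, k) / C(n, k), where n is the number of samples per task and c the number that pass. The function names and the sample data are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples drawn from n is correct,
    given that c of the n samples are correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: any draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; each entry is (n_samples, n_correct)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Made-up results for three tasks, ten samples each.
    example = [(10, 10), (10, 3), (10, 0)]
    print(mean_pass_at_k(example, k=1))
```

For k = 1 the estimator reduces to c / n, the fraction of sampled solutions that pass, which matches the "first attempt is correct" reading.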

Mathematics

Mathematical problems and computations
MATH
maj@1 (self-reported). Majority-vote metric: the model generates an answer, generation is then repeated two more times for a total of three answers, and the final answer is the one that occurs most often.
76.1%
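
The majority-vote step described for this metric can be sketched as follows. The function name `majority_answer` and the sample answers are hypothetical, and breaking ties in favor of the answer generated first is an assumption; the page does not describe tie handling.

```python
from collections import Counter


def majority_answer(answers: list[str]) -> str:
    """Return the most frequent answer among the sampled generations.

    Ties go to the answer that appeared first (an assumption).
    """
    if not answers:
        raise ValueError("at least one answer is required")
    # Counter.most_common keeps first-seen order among equal counts.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Three generations for one problem, as in the description above.
    print(majority_answer(["42", "41", "42"]))
```

With a single sample, the vote simply returns that sample, so the metric then behaves like plain accuracy.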

Reasoning

Logical reasoning and analysis
GPQA
Accuracy (self-reported)
56.0%

Multimodal

Working with images and visual data
DocVQA
Accuracy (self-reported)
93.6%
MathVista
Accuracy (self-reported)
69.0%
MMMU
Accuracy (self-reported)
66.1%

Other Tests

Specialized benchmarks
MMLU-Pro
Accuracy (self-reported)
75.5%
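
The page does not explain how the Average Score under Key Specifications is derived. As a quick check, the sketch below takes the unweighted mean of the eight benchmark scores listed above; that this is the page's actual method is an assumption, although the result rounds to the same figure.

```python
# Self-reported scores copied from the Benchmark Results section (percent).
SCORES = {
    "MMLU": 87.5,
    "HumanEval": 88.4,
    "MATH": 76.1,
    "GPQA": 56.0,
    "DocVQA": 93.6,
    "MathVista": 69.0,
    "MMMU": 66.1,
    "MMLU-Pro": 75.5,
}


def unweighted_mean(scores: dict[str, float]) -> float:
    """Plain arithmetic mean over all listed benchmarks."""
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    print(f"{unweighted_mean(SCORES):.1f}%")
```

Because the mean mixes text, code, and multimodal benchmarks with different metrics, it is best read as a coarse summary rather than a single measure of capability.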

License & Metadata

License
proprietary
Announcement Date
August 13, 2024
Last Updated
July 19, 2025
