Grok-2

Name: Grok-2
Author: xAI

Multimodal

xAI

Grok-2 is a language model from xAI with advanced capabilities in chat, coding, and logical reasoning. According to xAI's self-reported results, it performs strongly on visual math reasoning and document question answering, and it outperforms other models on academic benchmarks covering logical reasoning, reading comprehension, math, and science.

Key Specifications

Parameters

Context

128.0K

Release Date

August 13, 2024

Average Score

76.5%

Timeline

Key dates in the model's history

Announcement

August 13, 2024

Last Update

July 19, 2025

Technical Specifications

Parameters

Training Tokens

Knowledge Cutoff

Family

Capabilities

Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)

$2.00

Output (per 1M tokens)

$10.00

Max Input Tokens

128.0K

Max Output Tokens

8.0K
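
As a rough illustration of how the per-token prices and limits above combine, the following sketch estimates the cost of a single request. The function name `estimate_cost_usd` is hypothetical, and reading "128.0K" and "8.0K" as 128,000 and 8,000 tokens is an assumption; the exact limits may differ.

```python
# Prices and limits as listed on this page; exact token limits are assumed.
INPUT_USD_PER_MILLION = 2.00
OUTPUT_USD_PER_MILLION = 10.00
MAX_INPUT_TOKENS = 128_000   # assumption: "128.0K" means 128,000
MAX_OUTPUT_TOKENS = 8_000    # assumption: "8.0K" means 8,000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if not 0 <= input_tokens <= MAX_INPUT_TOKENS:
        raise ValueError("input_tokens is outside the listed input limit")
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens is outside the listed output limit")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # 100,000 input tokens and 2,000 output tokens
    print(f"${estimate_cost_usd(100_000, 2_000):.2f}")
```

Under these assumptions, a request that uses the full input and output limits would cost about $0.34.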

Supported Features

Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding

MMLU

accuracy • Self-reported

87.5%

Programming

Programming skills tests

HumanEval

Pass@1: a metric for evaluating how well a language model (LLM) solves tasks. It is the probability that the model's first attempt at a task is correct. In the general pass@k setting, the model generates n different solutions for each task, and the task counts as solved if at least one of the sampled solutions is correct; Pass@1 is the case where only the first solution counts. The metric is useful because it (1) measures precisely how often a model solves a task on its first attempt, (2) gives a stricter view of performance than allowing multiple attempts, and (3) lets different models be compared by how efficiently they solve tasks. It is widely used in research to evaluate LLMs in areas such as programming, mathematical reasoning, and complex problem solving. • Self-reported

88.4%
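
To make the Pass@1 description concrete, here is a minimal sketch of the unbiased pass@k estimator commonly used with HumanEval-style benchmarks. This page does not say how the reported score was computed, so treat this as an illustration of the metric rather than xAI's evaluation code; the function name `pass_at_k` is a label chosen for the example.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task.

    n: number of solutions sampled for the task
    c: number of those solutions that passed the tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Every size-k subset must contain at least one passing solution.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # With k = 1 the estimate reduces to the fraction of passing samples, c / n.
    print(pass_at_k(n=10, c=3, k=1))
```

A benchmark-level score is then the mean of the per-task estimates across all tasks.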

Mathematics

Mathematical problems and computations

MATH

maj@1 (majority voting): the model answers once, then generation is repeated 2 more times for a total of 3 answers. The final answer is the one that occurs most often. • Self-reported

76.1%
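
The majority-voting idea behind maj@ metrics can be sketched as follows. The helper name `majority_vote` and the tie-breaking rule (the earliest-seen answer wins) are assumptions made for this example; the page does not describe how ties or answer normalization were handled.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most frequent answer among the sampled answers.

    Ties go to the answer that appeared first (an assumption of this sketch).
    """
    if not answers:
        raise ValueError("at least one answer is required")
    counts = Counter(answers)
    # most_common keeps first-seen order among answers with equal counts.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Three sampled answers to the same problem; "4" appears twice.
    print(majority_vote(["4", "5", "4"]))
```

With a single sampled answer, the vote simply returns that answer.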

Reasoning

Logical reasoning and analysis

GPQA

accuracy • Self-reported

56.0%

Multimodal

Working with images and visual data

DocVQA

Accuracy • Self-reported

93.6%

MathVista

accuracy • Self-reported

69.0%

MMMU

accuracy • Self-reported

66.1%

Other Tests

Specialized benchmarks

MMLU-Pro

Accuracy • Self-reported

75.5%
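
The page does not state how the Average Score of 76.5% is derived. As a quick check, the unweighted mean of the eight benchmark results listed above comes out to the same figure, which suggests (but does not confirm) that this is how it is computed.

```python
# Benchmark results as listed on this page (self-reported, in percent).
scores = {
    "MMLU": 87.5,
    "HumanEval": 88.4,
    "MATH": 76.1,
    "GPQA": 56.0,
    "DocVQA": 93.6,
    "MathVista": 69.0,
    "MMMU": 66.1,
    "MMLU-Pro": 75.5,
}

# Unweighted mean across all listed benchmarks.
average = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    print(f"{average:.1f}%")
```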

License & Metadata

License

proprietary

Announcement Date

August 13, 2024

Last Updated

July 19, 2025

Similar Models

Grok-2 mini

xAI

Best score: 0.9 (MMLU)

Released: Aug 2024

Grok-3 Mini

xAI

Best score: 0.8 (GPQA)

Released: Feb 2025

Price: $0.30/1M tokens

Grok-3

xAI

Best score: 0.8 (GPQA)

Released: Feb 2025

Price: $3.00/1M tokens

Grok-1.5V

xAI

Released: Apr 2024

Grok 4.20

xAI

Released: Mar 2026

Grok-4.1 Fast Non-Reasoning

xAI

Released: Nov 2025

Price: $0.20/1M tokens

Grok-4.1 Fast Reasoning

xAI

Released: Nov 2025

Price: $0.20/1M tokens

Grok-4 Fast Non-Reasoning

xAI

Released: Aug 2025

Price: $0.20/1M tokens

Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.