Grok-2
Multimodal
Grok-2 is a state-of-the-art language model with cutting-edge reasoning capabilities and advanced abilities in chat, coding, and logical reasoning. It demonstrates superior performance in visual math reasoning and document question answering, and outperforms other models across various academic benchmarks, including logical reasoning, reading comprehension, math, and science.
Key Specifications
Parameters
-
Context
128.0K
Release Date
August 13, 2024
Average Score
76.5%
Timeline
Key dates in the model's history
Announcement
August 13, 2024
Last Update
July 19, 2025
Technical Specifications
Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval
Pricing & Availability
Input (per 1M tokens)
$2.00
Output (per 1M tokens)
$10.00
Max Input Tokens
128.0K
Max Output Tokens
8.0K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
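The per-token prices and token limits listed above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any xAI SDK, and it assumes that "128.0K" and "8.0K" mean 128,000 and 8,000 tokens and that billing is linear in token count.

```python
# Prices and limits copied from the listing above; everything else is an assumption.
INPUT_PRICE_PER_M = 2.00     # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 10.00   # USD per 1M output tokens
MAX_INPUT_TOKENS = 128_000   # listed as 128.0K (assumed to mean 128,000)
MAX_OUTPUT_TOKENS = 8_000    # listed as 8.0K (assumed to mean 8,000)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > MAX_INPUT_TOKENS:
        raise ValueError("input exceeds the listed 128.0K limit")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed 8.0K limit")
    # Linear pricing: tokens times the per-million price, scaled to USD.
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(100_000, 2_000):.2f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so long replies dominate the bill faster than long prompts do.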
Benchmark Results
Model performance metrics across various tests and benchmarks
General Knowledge
Tests on general knowledge and understanding
MMLU
accuracy • Self-reported
Programming
Programming skills tests
HumanEval
Pass@1: a metric for evaluating how effectively large language models (LLMs) solve tasks. It is the probability that the model's first attempt at a task is correct. To compute it, the model is run on a set of tasks and generates n different solutions for each task; Pass@1 then estimates, from that set of solutions, the probability that a single first solution is correct. (In the general pass@k form, a task counts as solved if at least one of the k solutions considered is correct.) Pass@1 is a useful metric because it: 1. measures precisely how well a model solves tasks on the first attempt; 2. reflects performance when only a single solution is submitted; 3. allows different models to be compared by how effectively they solve tasks. The metric is often used in research to evaluate LLM performance in areas such as programming, mathematical reasoning, and complex problem solving. • Self-reported
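As an illustration of the Pass@1 description above, the sketch below uses the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of sampled solutions and c the number that pass. Whether the self-reported Grok-2 figure was computed this way is not stated in the listing, so treat this as an assumption; the function names are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: solutions generated for the task, c: how many of them passed.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks, given (n, c) pairs per task."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical data: three tasks, 10 samples each, with 3, 0 and 10 passes.
    print(mean_pass_at_k([(10, 3), (10, 0), (10, 10)], k=1))
```

For k = 1 the estimator reduces to c / n, the fraction of sampled solutions that pass, which matches the "first attempt" reading of the metric.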
Mathematics
Mathematical problems and computations
MATH
maj@1: majority voting, in which the final answer is the one that occurs most often among the sampled answers (with a single sample, as in maj@1, that is simply the model's one answer) • Self-reported
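Majority voting over sampled answers can be sketched in a few lines. This is a minimal illustration, not the evaluation code behind the reported score: `majority_vote` is a hypothetical helper, answers are assumed to be already normalized strings, and ties are broken in favor of the answer seen first.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most frequent answer among the samples.

    With a single sample (maj@1) this is just that sample.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    # most_common orders equal counts by first occurrence, so ties go to
    # the earliest answer in the list.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    print(majority_vote(["42", "41", "42"]))  # three samples, "42" wins
    print(majority_vote(["7"]))               # maj@1: one sample
```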
Reasoning
Logical reasoning and analysis
GPQA
accuracy • Self-reported
Multimodal
Working with images and visual data
DocVQA
Accuracy • Self-reported
MathVista
accuracy • Self-reported
MMMU
accuracy • Self-reported
Other Tests
Specialized benchmarks
MMLU-Pro
Accuracy • Self-reported
License & Metadata
License
proprietary
Announcement Date
August 13, 2024
Last Updated
July 19, 2025
Similar Models
Grok-2 mini
xAI
MM
Best score: 0.9 (MMLU)
Released: Aug 2024
Grok-3 Mini
xAI
MM
Best score: 0.8 (GPQA)
Released: Feb 2025
Price: $0.30/1M tokens
Grok-3
xAI
MM
Best score: 0.8 (GPQA)
Released: Feb 2025
Price: $3.00/1M tokens
Grok-1.5V
xAI
MM
Released: Apr 2024
Grok 4.20
xAI
MM
Released: Mar 2026
Grok-4.1 Fast Non-Reasoning
xAI
MM
Released: Nov 2025
Price: $0.20/1M tokens
Grok-4.1 Fast Reasoning
xAI
MM
Released: Nov 2025
Price: $0.20/1M tokens
Grok-4 Fast Non-Reasoning
xAI
MM
Released: Aug 2025
Price: $0.20/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.