Claude 3 Opus
Multimodal
Claude 3 Opus is Anthropic's most intelligent model, with market-leading performance on highly complex tasks. It handles open-ended prompts and unforeseen scenarios with remarkable fluency and human-like understanding, demonstrating the cutting edge of generative AI.
Key Specifications
Parameters
-
Context
200.0K
Release Date
February 29, 2024
Average Score
81.6%
Timeline
Key dates in the model's history
Announcement
February 29, 2024
Last Update
July 19, 2025
Technical Specifications
Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval
Pricing & Availability
Input (per 1M tokens)
$15.00
Output (per 1M tokens)
$75.00
Max Input Tokens
200.0K
Max Output Tokens
200.0K
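As a quick check on the pricing above, per-request cost follows directly from the token counts. A minimal sketch at the list prices in this table (the token counts in the example are illustrative, not from any real request):

```python
# Claude 3 Opus list pricing (per the table above): USD per 1M tokens.
INPUT_PRICE_PER_M = 15.00
OUTPUT_PRICE_PER_M = 75.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at list prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 10,000-token prompt with a 2,000-token completion.
print(request_cost(10_000, 2_000))  # → 0.3
```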
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
Benchmark Results
Model performance metrics across various tests and benchmarks
General Knowledge
Tests on general knowledge and understanding
HellaSwag
10-shot. In 10-shot prompting, the model is given 10 worked examples of the task and answer format before being asked to complete a new instance. Compared with prompting styles that use fewer examples, 10 examples give the model more context and usually cover a more diverse range of cases and scenarios. 10-shot evaluation is especially useful for complex tasks with a specific answer format, for tasks that admit several logical approaches, and when 0-shot or few-shot prompts underperform. Under this methodology, the score reflects how well the model extracts patterns from the examples and applies them to a new task, which often yields better results than prompting with fewer examples. • Self-reported
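A few-shot prompt of this kind can be assembled mechanically; a minimal sketch, where the (question, answer) pairs are placeholders rather than actual benchmark items:

```python
def build_few_shot_prompt(examples, query, k=10):
    """Concatenate k worked (question, answer) pairs, then the new query."""
    shots = examples[:k]
    parts = [f"Q: {q}\nA: {a}" for q, a in shots]
    parts.append(f"Q: {query}\nA:")  # model completes from here
    return "\n\n".join(parts)

# Placeholder examples standing in for real benchmark items.
examples = [(f"question {i}", f"answer {i}") for i in range(10)]
prompt = build_few_shot_prompt(examples, "new question")
print(prompt.count("Q:"))  # → 11  (10 shots + 1 query)
```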
MMLU
5-shot. The model is first shown several example solutions, and then evaluated on how well it applies what it picked up from those examples to a new task. Concretely, the model is given 5 (task, solution) pairs and then asked to solve a new task. This measures the model's ability to extract templates and solution strategies from a small number of examples. Evaluation runs over 10 different task sets, each containing 5 examples and 1 held-out task, with both the correctness of the final answer and the reasoning leading to it assessed. This shows how well the model adapts to new tasks without prior training on large sets of similar tasks. The reported score is the percentage of test tasks solved correctly across all 10 sets. • Self-reported
Programming
Programming skills tests
HumanEval
0-shot. 0-shot means giving the model only the task instructions, without any examples of inputs, outputs, or results. The model must rely entirely on its pretrained knowledge to interpret the task and generate an answer. For example, a 0-shot prompt simply states the request directly; the model must understand it and respond without any demonstrations of what is wanted. 0-shot is the simplest prompting setup for LLMs, and it is usually the first approach to try before moving on to more complex prompts. • Self-reported
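In the 0-shot HumanEval setting the prompt is just a function signature plus docstring, and scoring executes the completed function against tests. A minimal sketch of that check, where the completion string is hand-written and merely stands in for model output:

```python
# A HumanEval-style task: the prompt is only a signature plus docstring.
prompt = '''def add(a, b):
    """Return the sum of a and b."""
'''

# Stand-in for the model's 0-shot completion (no examples were shown).
completion = "    return a + b\n"

# HumanEval scores by executing the completed function against unit tests.
namespace = {}
exec(prompt + completion, namespace)
print(namespace["add"](2, 3))  # → 5
```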
Mathematics
Mathematical problems and computations
GSM8k
**0-shot CoT** Zero-shot Chain-of-Thought (0-shot CoT) is a method that encourages the model to reason through a problem without any examples. Unlike prompts that ask the model for an immediate answer, 0-shot CoT instructs the model to "think step by step," so it lays out its line of reasoning before giving an answer. This is usually done with a simple phrase such as "Let's solve this step by step" or "Let's think it through," appended after the task description. The approach lets the model generate intermediate reasoning steps, which often leads to higher accuracy, especially on tasks requiring multiple computation steps or logical inferences. Research has shown that 0-shot CoT can significantly improve language-model performance across a range of tasks, including general reasoning, with no examples or additional training required. • Self-reported
MATH
Zero-shot Chain-of-Thought (0-shot CoT) is a technique that encourages step-by-step reasoning without examples, first presented in Kojima et al., "Large Language Models are Zero-Shot Reasoners" (2022). Unlike few-shot CoT, which requires worked reasoning examples, 0-shot CoT uses simple prompts such as "Let's think step by step" or "Let's solve this problem carefully." These phrases lead the model to generate a chain of intermediate reasoning before its final answer. The appeal of 0-shot CoT lies in its simplicity and efficiency: no reasoning examples need to be written for each task. Research has shown that even such simple prompts can significantly improve model performance on reasoning-heavy tasks, especially mathematical and logical ones. Although 0-shot CoT is not as effective as few-shot CoT on complex tasks, it is a practical option when examples are unavailable or hard to construct. • Self-reported
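Appending the trigger phrase is all 0-shot CoT requires at the prompt level; a minimal sketch of the transformation (the task text is an illustrative example):

```python
def zero_shot_cot(task: str) -> str:
    """Append the Kojima et al. (2022) trigger phrase to a bare task."""
    return f"{task}\n\nLet's think step by step."

prompt = zero_shot_cot("If 3 pencils cost $0.75, how much do 8 pencils cost?")
print(prompt.endswith("Let's think step by step."))  # → True
```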
MGSM
0-shot. Example problem and solution in this setting. Task: We have 10 cards numbered 1 to 10 and draw 4 at random. What is the probability that at least one drawn card is numbered higher than 8? Solution: To find the probability that "at least one drawn card is higher than 8," compute the probability that "all drawn cards are at most 8" and subtract it from 1. The number of ways to choose 4 cards from 10 is C(10,4) = 10!/(4!×6!) = 210. The number of ways to choose 4 cards only from those numbered 1 to 8 is C(8,4) = 8!/(4!×4!) = 70. Thus the probability that all drawn cards are at most 8 is 70/210 = 1/3, and the probability that at least one is higher than 8 is 1 − 1/3 = 2/3. Answer: 2/3 • Self-reported
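The complement-counting argument in that worked example can be verified directly:

```python
from math import comb
from fractions import Fraction

total = comb(10, 4)        # ways to choose 4 cards from 10 → 210
none_above_8 = comb(8, 4)  # ways to choose 4 cards all numbered ≤ 8 → 70

p_none = Fraction(none_above_8, total)  # 70/210 = 1/3
p_at_least_one = 1 - p_none             # complement rule
print(p_at_least_one)  # → 2/3
```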
Reasoning
Logical reasoning and analysis
BIG-Bench Hard
3-shot CoT. Chain-of-Thought (CoT) prompting with three examples: the model is given three worked examples showing how to break a complex task into sequential reasoning steps. Each example demonstrates a step-by-step solution, which helps the model structure its own reasoning. Applied to a new task, 3-shot CoT guides the model to follow the demonstrated format and work through the solution in logical stages, which is especially useful for mathematical and logical tasks. The method requires curated examples, but shows improved performance compared with prompting without demonstrated reasoning. • Self-reported
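A 3-shot CoT prompt differs from a plain few-shot prompt in that each example carries its reasoning chain alongside the answer; a minimal sketch, with placeholder shots standing in for real benchmark items:

```python
def build_cot_prompt(examples, query):
    """Each shot shows question, step-by-step reasoning, and answer."""
    parts = [
        f"Q: {q}\nReasoning: {steps}\nA: {a}"
        for q, steps, a in examples
    ]
    parts.append(f"Q: {query}\nReasoning:")  # model reasons, then answers
    return "\n\n".join(parts)

# Placeholder shots (not actual BIG-Bench Hard items).
shots = [
    ("q1", "step 1; step 2", "a1"),
    ("q2", "step 1; step 2", "a2"),
    ("q3", "step 1; step 2", "a3"),
]
prompt = build_cot_prompt(shots, "new question")
print(prompt.count("Reasoning:"))  # → 4  (3 shots + 1 query)
```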
DROP
3-shot, F1 Score • Self-reported
GPQA
0-shot CoT (Diamond subset). To evaluate the model's reasoning on these tasks, the 0-shot Chain-of-Thought (CoT) method is used: the model solves each task without any reasoning examples, with only the instruction "Let's think step by step" appended at the end of the task text. This encourages the model to reason sequentially instead of answering immediately. Using this standard 0-shot CoT setup, the evaluation measures the model's ability to reason without example demonstrations or hints about the answer format, giving a picture of its basic reasoning capability across different scenarios. • Self-reported
Other Tests
Specialized benchmarks
ARC-C
25-shot • Self-reported
MMLU-Pro
0-shot CoT. Chain-of-thought (CoT) is a method in which the model produces intermediate reasoning before its answer, allowing it to keep track of complex, multi-step problems. A prompt such as "let's think step by step" triggers this reasoning during problem solving. Unlike few-shot CoT, which supplies example reasoning chains, 0-shot CoT provides no such examples. The effectiveness of 0-shot CoT can vary substantially with the task, the model, and the specific prompt used. Although few-shot CoT often gives better results, 0-shot CoT can be an effective method for certain task types, especially when examples are unavailable or difficult to construct. • Self-reported
License & Metadata
License
proprietary
Announcement Date
February 29, 2024
Last Updated
July 19, 2025
Similar Models
Claude 3.5 Sonnet
Anthropic
MM
Best score: 0.9 (HumanEval)
Released: Oct 2024
Price: $3.00/1M tokens
Claude Haiku 4.5
Anthropic
MM
Best score: 0.8 (TAU)
Released: Oct 2025
Price: $1.00/1M tokens
Claude Opus 4.5
Anthropic
MM
Best score: 0.9 (TAU)
Released: Nov 2025
Price: $5.00/1M tokens
Claude 3.5 Sonnet
Anthropic
MM
Best score: 0.9 (HumanEval)
Released: Jun 2024
Price: $3.00/1M tokens
Claude Sonnet 4.5
Anthropic
MM
Best score: 0.9 (TAU)
Released: Sep 2025
Price: $3.00/1M tokens
Claude Opus 4.1
Anthropic
MM
Best score: 0.8 (TAU)
Released: Aug 2025
Price: $15.00/1M tokens
Claude Opus 4
Anthropic
MM
Best score: 0.8 (GPQA)
Released: May 2025
Price: $15.00/1M tokens
Claude 3.7 Sonnet
Anthropic
MM
Best score: 0.8 (GPQA)
Released: Feb 2025
Price: $3.00/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.