
Claude 3.5 Sonnet

Multimodal
Anthropic

Claude 3.5 Sonnet is a powerful AI model with industry-leading software development skills. It excels at coding, planning, and problem-solving, demonstrating significant improvements in agentic coding and tool use. The model includes computer use capabilities in public beta, enabling it to interact with computer interfaces like a human user.

Key Specifications

Parameters
-
Context
200.0K
Release Date
October 22, 2024
Average Score
73.3%

Timeline

Key dates in the model's history
Announcement
October 22, 2024
Last Update
July 19, 2025
Today
March 25, 2026

Technical Specifications

Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$3.00
Output (per 1M tokens)
$15.00
Max Input Tokens
200.0K
Max Output Tokens
200.0K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding
MMLU
5-shot CoT: the model is shown five worked examples, each consisting of a task, step-by-step reasoning, and the final answer, and is then given a new task to solve by applying the same reasoning process. Because the examples demonstrate the reasoning process rather than just the answers, the model effectively "thinks aloud" while solving the problem; this tends to work better than a bare "solve step by step" instruction and helps the model internalize the structure of a solution. 5-shot CoT is especially effective for mathematical problems, logic puzzles, and other tasks that require extended reasoning. Self-reported
90.4%
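The few-shot CoT setup described above can be sketched as a simple prompt-assembly step. This is a minimal illustration, not the actual MMLU evaluation harness; the example task and wording are hypothetical placeholders:

```python
# Minimal sketch of building a few-shot Chain-of-Thought prompt.
# The worked example below is an illustrative placeholder, not
# taken from any real benchmark harness.
EXAMPLES = [
    {
        "question": "What is 17 + 26?",
        "reasoning": "17 + 26 = 17 + 20 + 6 = 37 + 6 = 43.",
        "answer": "43",
    },
    # A real 5-shot prompt would include four more worked examples here.
]

def build_few_shot_cot_prompt(examples, new_question):
    """Concatenate worked examples, then append the new question,
    leaving the reasoning for the model to complete."""
    parts = []
    for ex in examples:
        parts.append(
            f"Question: {ex['question']}\n"
            f"Reasoning: {ex['reasoning']}\n"
            f"Answer: {ex['answer']}\n"
        )
    parts.append(f"Question: {new_question}\nReasoning:")
    return "\n".join(parts)

prompt = build_few_shot_cot_prompt(EXAMPLES, "What is 34 + 58?")
```

The prompt ends at "Reasoning:" so the model continues with its own chain of thought before stating an answer.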

Programming

Programming skills tests
HumanEval
0-shot: in a zero-shot setup the model receives the task without any examples or demonstrations and must solve it using only the knowledge acquired during pretraining. For instance, on a math problem the model is given only the problem itself, with no sample solutions to similar problems. This measures the model's baseline knowledge and abilities without additional guidance; performance is usually lower than with a few-shot approach. Self-reported
93.7%
SWE-Bench Verified
Standard. Self-reported
49.0%

Mathematics

Mathematical problems and computations
GSM8k
0-shot CoT: the model is prompted to reason step by step, without being shown any examples of such reasoning, by adding a phrase such as "Let's think about this step by step" to the query. Research has shown that this prompt significantly improves a language model's performance on reasoning tasks compared with asking for a direct answer. Although 0-shot CoT typically trails few-shot CoT, where the model is given sample step-by-step solutions, it still improves performance substantially without requiring additional examples. The method is especially effective for larger language models, which already have the ability to reason but may not apply it unless prompted to do so. Self-reported
96.4%
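The zero-shot CoT trigger described above amounts to appending a fixed phrase to the task. A minimal sketch, with a hypothetical example task:

```python
# Minimal sketch of zero-shot Chain-of-Thought prompting: no worked
# examples, just a trigger phrase appended to elicit step-by-step
# reasoning. The task below is an illustrative placeholder.
COT_TRIGGER = "Let's think step by step."

def build_zero_shot_cot_prompt(task):
    """Append the CoT trigger phrase after the task statement."""
    return f"{task}\n{COT_TRIGGER}"

prompt = build_zero_shot_cot_prompt(
    "A farmer has 12 sheep and buys 7 more. How many sheep are there now?"
)
```

Unlike few-shot CoT, no demonstrations are needed; the single trailing phrase is what nudges the model into explicit reasoning.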
MATH
Standard. Self-reported
78.3%
MGSM
0-shot CoT: zero-shot Chain-of-Thought has the model break a task into sequential reasoning steps without access to any example reasoning chains. The model generates intermediate reasoning that leads to the answer, but without ever being shown what a chain of thought should look like. The method is usually triggered by including a phrase such as "Let's think step by step" in the query, which encourages the model to reason explicitly before answering and often yields more accurate results than a direct answer. Self-reported
91.6%

Reasoning

Logical reasoning and analysis
BIG-Bench Hard
3-shot CoT: a standard few-shot Chain-of-Thought setup in which the model is given several worked examples (here, three) with reasoning before the new task; "few-shot" refers to the number of examples and "CoT" to the reasoning chains. When the model receives a new task, it can mirror the reasoning process shown in the examples. Since the common variant includes three examples, it is called "3-shot CoT". An advantage of this method is that it requires no complex instructions or prompt engineering: simply providing example solutions is enough. It is especially useful for mathematical and logical tasks, where step-by-step reasoning is critical to reaching the correct answer. Self-reported
93.1%
DROP
3-shot, F1 score. Self-reported
87.1%
GPQA
Maj@32, 5-shot CoT: a method for improving performance on reasoning and decision-making tasks that combines several approaches: 1. Chain-of-Thought: the model breaks a complex task into a sequence of intermediate steps, making its thinking process explicit. 2. Few-shot examples: the model is given several (here, 5) examples with correct reasoning and answers, which helps it learn the expected solution format. 3. Majority voting: the model generates many independent solutions to the same task (here, 32), and the most common answer is taken as final. This combination significantly improves accuracy on hard tasks: chain-of-thought structures the solution process, few-shot examples anchor the format, and majority voting averages out errors across attempts. Maj@32 with 5-shot CoT is especially effective for mathematical problems, logic puzzles, and other reasoning-heavy tasks. Self-reported
67.2%
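The majority-voting step of Maj@k can be sketched with a frequency count over sampled answers. This is a minimal illustration; the sampled answers below are stubbed placeholders standing in for 32 independent model completions:

```python
from collections import Counter

# Minimal sketch of Maj@k (self-consistency) voting: sample k
# independent answers from the model, then keep the most common one.
# The `samples` list below is a stub standing in for real model output.
def majority_vote(answers):
    """Return the most frequent answer among the sampled completions."""
    counts = Counter(answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# In Maj@32, 32 such answers would be drawn; 8 are shown for brevity.
samples = ["42", "42", "41", "42", "40", "42", "41", "42"]
final = majority_vote(samples)  # "42" wins with 5 of 8 votes
```

Voting works because independent reasoning chains tend to make uncorrelated errors, so the correct answer accumulates the most votes even when no single chain is reliable.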

Multimodal

Working with images and visual data
AI2D
test. Self-reported
94.7%
ChartQA
test, accuracy. Self-reported
90.8%
DocVQA
test, ANLS evaluation. Self-reported
95.2%
MathVista
testmini. Self-reported
67.7%
MMMU
Standard evaluation. Self-reported
68.3%

Other Tests

Specialized benchmarks
MMLU-Pro
5-shot: a few-shot prompting method in which the model's context includes five worked example solutions, with steps, before the task to be solved. This lets the model infer the expected solution format and apply the same approach to the new task without any additional tuning; the model is expected to follow the same format and reasoning demonstrated in the examples. Self-reported
77.6%
OSWorld Extended
Standard mode: the model is evaluated in the form in which it is typically used in real situations, receiving the prompt without special instructions on how to approach the task. This baseline mode measures out-of-the-box performance. Self-reported
22.0%
OSWorld Screenshot-only
Standard. Self-reported
14.9%
TAU-bench Airline
Standard: the model directly generates solutions to tasks without any additional instructions, which also allows comparison across prompting methods. The prompt format used was "Task: [task]. Please solve the task step by step." For some tasks the format was adjusted so the model could follow instructions specific to the data; for example, GPQA tasks were presented as the task text alone. Self-reported
46.0%
TAU-bench Retail
Standard, same setup as above. Self-reported
69.2%

License & Metadata

License
proprietary
Announcement Date
October 22, 2024
Last Updated
July 19, 2025

Similar Models

All Models

Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.