Key Specifications
Parameters
-
Context
128.0K
Release Date
November 20, 2024
Average Score
67.0%
Timeline
Key dates in the model's history
Announcement
November 20, 2024
Last Update
July 19, 2025
Technical Specifications
Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval
Pricing & Availability
Input (per 1M tokens)
$0.03
Output (per 1M tokens)
$0.14
Max Input Tokens
128.0K
Max Output Tokens
128.0K
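Given the listed rates, per-request cost is a linear function of input and output token counts. A minimal sketch using the prices above (rates hard-coded from this page; actual billing may differ):

```python
# Rates taken from the pricing table above ($ per token).
INPUT_RATE = 0.03 / 1_000_000   # $0.03 per 1M input tokens
OUTPUT_RATE = 0.14 / 1_000_000  # $0.14 per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 10K-token prompt with a 1K-token completion.
print(f"${estimate_cost(10_000, 1_000):.6f}")  # $0.000440
```

At these rates, even a full 128K-token input costs well under one cent.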
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
Benchmark Results
Model performance metrics across various tests and benchmarks
General Knowledge
Tests on general knowledge and understanding
MMLU
0-shot Chain-of-Thought • Self-reported
Programming
Programming skills tests
HumanEval
pass@1 accuracy • Self-reported
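HumanEval is conventionally scored with pass@k: the probability that at least one of k sampled completions passes the unit tests. The standard unbiased estimator, given n samples of which c pass, is 1 − C(n−c, k)/C(n, k); whether this page's pass@1 figure was computed this way is not stated, so the sketch below is illustrative:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n sampled completions, c of which pass."""
    if n - c < k:  # every size-k subset must contain a passing sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# pass@1 reduces to the plain pass rate c / n.
print(pass_at_k(200, 50, 1))  # 0.25
```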
Mathematics
Mathematical problems and computations
GSM8k
0-shot Chain-of-Thought • Self-reported
MATH
0-shot Chain-of-Thought • Self-reported
Reasoning
Logical reasoning and analysis
DROP
6-shot Chain-of-Thought, accuracy • Self-reported
GPQA
0-shot Chain-of-Thought • Self-reported
Other Tests
Specialized benchmarks
ARC-C
0-shot • Self-reported
BBH
3-shot Chain-of-Thought • Self-reported
BFCL
accuracy • Self-reported
CRAG
accuracy • Self-reported
FinQA
0-shot accuracy • Self-reported
IFEval
0-shot • Self-reported
SQuALITY
ROUGE-L (longest common subsequence) • Self-reported
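ROUGE-L scores a candidate against a reference by the length of their longest common subsequence (LCS), reported as an F-measure over LCS-based precision and recall. A minimal token-level sketch (illustrative only, not the exact scorer used for this page):

```python
def lcs_len(a: list, b: list) -> int:
    """Length of the longest common subsequence, via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over whitespace-separated tokens."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

print(rouge_l("the cat sat", "the cat sat on the mat"))  # ~0.667
```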
Translation en→Set1 COMET22
COMET22 • Self-reported
Translation en→Set1 spBleu
spBleu • Self-reported
Translation Set1→en COMET22
COMET22 • Self-reported
Translation Set1→en spBleu
spBleu • Self-reported
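spBleu is BLEU computed over SentencePiece subword tokens rather than language-specific word tokenization, which makes scores comparable across languages. The core BLEU computation is a brevity-penalized geometric mean of clipped n-gram precisions; the simplified sentence-level sketch below omits the SentencePiece step and uses crude smoothing, so it is illustrative rather than the scorer behind these numbers:

```python
from collections import Counter
from math import exp, log

def bleu(candidate: list, reference: list, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU: geometric mean of clipped n-gram
    precisions, times a brevity penalty for short candidates."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i+n]) for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i+n]) for i in range(len(reference) - n + 1))
        overlap = sum(min(cnt, ref[g]) for g, cnt in cand.items())  # clipped counts
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smooth zero matches
    bp = 1.0 if len(candidate) >= len(reference) else exp(1 - len(reference) / len(candidate))
    return bp * exp(sum(log(p) for p in precisions) / max_n)

tokens = "the cat sat on the mat".split()
print(round(bleu(tokens, tokens), 3))  # identical sequences score 1.0
```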
License & Metadata
License
proprietary
Announcement Date
November 20, 2024
Last Updated
July 19, 2025
Similar Models
Nova Lite
Amazon
Multimodal
Best score: 0.9 (ARC)
Released: Nov 2024
Price: $0.06/1M tokens
Nova Pro
Amazon
Multimodal
Best score: 0.9 (ARC)
Released: Nov 2024
Price: $0.80/1M tokens
GPT-4 Turbo
OpenAI
Best score: 0.9 (HumanEval)
Released: Apr 2024
Price: $10.00/1M tokens
o1-mini
OpenAI
Best score: 0.9 (HumanEval)
Released: Sep 2024
Price: $3.00/1M tokens
o1
OpenAI
Best score: 0.9 (MMLU)
Released: Dec 2024
Price: $15.00/1M tokens
KAT-Coder-Pro V1
KwaiKAT
Released: Mar 2026
MiniMax M2.1
MiniMax
Best score: 0.8 (GPQA)
Released: Dec 2025
Price: $0.30/1M tokens
Grok Code Fast 1
xAI
Released: Aug 2025
Price: $0.20/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.