Nova Pro
Multimodal
Amazon Nova Pro is a high-performance multimodal model that balances accuracy, speed, and cost across a wide range of tasks. It processes text, image, and video inputs and supports agentic workflows, making it well suited to complex enterprise tasks that require in-depth analysis, multi-step reasoning, and content generation.
Key Specifications
Parameters
-
Context
300.0K
Release Date
November 20, 2024
Average Score
73.2%
Timeline
Key dates in the model's history
Announcement
November 20, 2024
Last Update
July 19, 2025
Technical Specifications
Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval
Pricing & Availability
Input (per 1M tokens)
$0.80
Output (per 1M tokens)
$3.20
Max Input Tokens
300.0K
Max Output Tokens
300.0K
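For a concrete sense of the listed rates, the sketch below estimates the cost of a single request; the token counts are hypothetical illustrations, and actual billing follows Bedrock's metering.

# Back-of-the-envelope cost estimate at the listed Nova Pro rates.
# Token counts below are hypothetical, for illustration only.
INPUT_RATE = 0.80 / 1_000_000   # USD per input token
OUTPUT_RATE = 3.20 / 1_000_000  # USD per output token

input_tokens = 10_000   # e.g., a long document plus the prompt
output_tokens = 1_000   # e.g., a one-page summary

cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"${cost:.4f}")   # -> $0.0112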
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
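As a minimal sketch of how the model is typically invoked, the snippet below makes a single-turn call through the Amazon Bedrock Converse API with boto3; the model ID "amazon.nova-pro-v1:0" and the region are assumptions to verify against your own Bedrock catalog.

import boto3

# Minimal sketch: one request to Nova Pro via the Bedrock Converse API.
# Assumes AWS credentials are configured; model ID and region are
# assumptions, not guaranteed for every account or region.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-pro-v1:0",
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize this report in three bullet points."}],
    }],
    inferenceConfig={"maxTokens": 512, "temperature": 0.3},
)

print(response["output"]["message"]["content"][0]["text"])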
Benchmark Results
Model performance metrics across various tests and benchmarks
General Knowledge
Tests on general knowledge and understanding
MMLU
0-shot Chain-of-Thought • Self-reported
Programming
Programming skills tests
HumanEval
0-shot pass@1 • Self-reported
Mathematics
Mathematical problems and computations
GSM8k
0-shot Chain-of-Thought • Self-reported
MATH
0-shot Chain-of-Thought • Self-reported
Reasoning
Logical reasoning and analysis
DROP
0-shot • Self-reported
GPQA
6-shot Chain-of-Thought • Self-reported
Multimodal
Working with images and visual data
ChartQA
relaxed accuracy • Self-reported
DocVQA
ANLS (Average Normalized Levenshtein Similarity) • Self-reported
For each answer, ANLS takes the normalized edit-distance similarity between prediction and ground truth (1 = exact match, 0 = entirely different), zeroes out scores below a 0.5 threshold, and averages over the dataset.
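A minimal sketch of the ANLS computation as described above, assuming a single ground-truth answer per question (the official DocVQA metric takes the maximum over several references):

def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def anls(predictions, references, threshold=0.5):
    # Normalized Levenshtein similarity per pair, zeroed below threshold.
    scores = []
    for pred, ref in zip(predictions, references):
        nls = 1 - levenshtein(pred.lower(), ref.lower()) / max(len(pred), len(ref), 1)
        scores.append(nls if nls >= threshold else 0.0)
    return sum(scores) / len(scores)

print(anls(["42 dollars"], ["42 dollars"]))  # 1.0 for an exact match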
MMMU
Chain-of-Thought • Self-reported
Chain-of-thought (CoT) prompting helps large language models tackle challenging problems by breaking their reasoning into manageable steps. First introduced in "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" (Wei et al., 2022), CoT prompting has become one of the most important techniques for improving the reasoning capabilities of LLMs. It can be implemented in several ways:
- Few-shot CoT: the prompt includes examples that demonstrate step-by-step reasoning for similar problems
- Zero-shot CoT: the model is simply instructed to "think step by step", with no examples provided (sketched below)
- Self-consistency with CoT: the model generates multiple reasoning paths and selects the most consistent answer
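A minimal zero-shot CoT prompt, to make the idea concrete; the question is a hypothetical example:

# Zero-shot CoT: append a step-by-step instruction, no worked examples.
prompt = (
    "A train travels 120 km in 1.5 hours. "
    "What is its average speed in km/h?\n"
    "Let's think step by step."
)
# Expected behavior: the model writes out intermediate steps
# (120 / 1.5 = 80) before stating the final answer, 80 km/h.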
Other Tests
Specialized benchmarks
ARC-C
0-shot Chain-of-Thought • Self-reported
BBH
3-shot Chain-of-Thought • Self-reported
BFCL
accuracy • Self-reported
CRAG
accuracy • Self-reported
EgoSchema
accuracy • Self-reported
FinQA
0-shot accuracy • Self-reported
GroundUI-1K
accuracy • Self-reported
IFEval
0-shot • Self-reported
LVBench
accuracy • Self-reported
MM-Mind2Web
Step accuracy (%) • Self-reported
The percentage of steps in multi-step traces that are correct: the proportion of traces that get step 1 right, the proportion that get step 2 right given that step 1 is right, and so on. This is useful for identifying where models make errors in multi-step reasoning and how error rates change over the course of a trace.
SQuALITY
ROUGE-L (Recall-Oriented Understudy for Gisting Evaluation) • Self-reported
ROUGE-L scores a candidate summary by its longest common subsequence (LCS) with a reference: Recall = LCS / reference length, Precision = LCS / candidate length, and F = (1 + β²) × Precision × Recall / (Recall + β² × Precision), where β > 1 weights recall more heavily than precision.
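A compact sketch of the LCS-based scoring defined above, over whitespace tokens; production evaluations typically use the rouge-score package with stemming:

def lcs_len(x, y):
    # Longest common subsequence length via dynamic programming.
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xi in enumerate(x, 1):
        for j, yj in enumerate(y, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if xi == yj else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l(candidate: str, reference: str, beta: float = 1.2):
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)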
TextVQA
weighted accuracy • Self-reported
Translation en→Set1 COMET22
COMET22 Score • Self-reported
Translation en→Set1 spBleu
spBLEU • Self-reported
spBLEU is BLEU computed over SentencePiece subword tokens rather than whitespace words, which makes scores comparable across languages without language-specific tokenizers.
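A hedged sketch of an spBLEU computation with sacrebleu; the tokenize="flores101" setting and the sentence pair are assumptions for illustration, not the documented evaluation setup for these self-reported scores:

import sacrebleu

# spBLEU: BLEU over SentencePiece pieces via sacrebleu's FLORES tokenizer.
# Requires sacrebleu's sentencepiece extra; sentences are illustrative.
hyps = ["The cat sits on the mat."]
refs = [["The cat is sitting on the mat."]]
score = sacrebleu.corpus_bleu(hyps, refs, tokenize="flores101")
print(round(score.score, 1))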
Translation Set1→en COMET22
COMET22 Score • Self-reported
Translation Set1→en spBleu
spBLEU (SentencePiece-tokenized BLEU, as above) • Self-reported
VATEX
CIDEr (Consensus-based Image Description Evaluation) • Self-reported
CIDEr measures the consensus between a candidate caption and a set of reference captions as TF-IDF-weighted n-gram similarity, so n-grams that are distinctive for a given clip count more than generic ones.
VisualWebBench
Standard evaluation • Self-reported
License & Metadata
License
proprietary
Announcement Date
November 20, 2024
Last Updated
July 19, 2025
Similar Models
Nova Lite
Amazon
MM
Best score:0.9 (ARC)
Released:Nov 2024
Price:$0.06/1M tokens
Nova Micro
Amazon
Best score:0.9 (ARC)
Released:Nov 2024
Price:$0.03/1M tokens
ERNIE 5.0
Baidu
MM
Best score:0.8 (GPQA)
Released:Jan 2025
Kimi-k1.5
Moonshot AI
MM
Best score:0.9 (MMLU)
Released:Jan 2025
Gemini 2.0 Flash Thinking
Google
MM
Best score:0.7 (GPQA)
Released:Jan 2025
GPT-4
OpenAI
MM
Best score:1.0 (ARC)
Released:Jun 2023
Price:$30.00/1M tokens
GPT-4o
OpenAI
MM
Best score:0.9 (HumanEval)
Released:May 2024
Price:$2.50/1M tokens
Gemini 1.5 Pro
Google
MM
Best score:0.9 (MMLU)
Released:May 2024
Price:$2.50/1M tokens
Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance.