
Gemini 3.1 Pro

Multimodal
Google

Gemini 3.1 Pro is the latest model in the Gemini 3 series. It excels at complex tasks requiring extensive world knowledge and advanced multimodal reasoning. Gemini 3.1 Pro uses dynamic thinking by default to process prompts and has a 1 million token input context window with 64k output tokens.

Key Specifications

Parameters
-
Context
1.0M
Release Date
February 19, 2026
Average Score
80.3%

Timeline

Key dates in the model's history
Announcement
February 19, 2026
Last Update
February 20, 2026
Today
March 25, 2026

Technical Specifications

Parameters
-
Training Tokens
-
Knowledge Cutoff
January 1, 2025
Family
-
Capabilities
Multimodal, ZeroEval

Pricing & Availability

Input (per 1M tokens)
$2.50
Output (per 1M tokens)
$15.00
Max Input Tokens
1.0M
Max Output Tokens
65.5K
Supported Features
Function Calling, Structured Output, Code Execution, Web Search, Batch Inference, Fine-tuning
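The listed rates and token limits can be turned into a quick per-request cost estimate. This is an illustrative sketch only: the function name and structure are ours, while the prices ($2.50 / $15.00 per 1M tokens) and limits (1M input, 65.5K output) come from the table above.

```python
# Cost estimator for the rates listed on this page (illustrative sketch).
INPUT_PRICE_PER_M = 2.50    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 15.00  # USD per 1M output tokens
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 65_536  # "65.5K" / 64k

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the listed rates."""
    if input_tokens > MAX_INPUT_TOKENS:
        raise ValueError(f"input exceeds the {MAX_INPUT_TOKENS}-token context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError(f"output exceeds the {MAX_OUTPUT_TOKENS}-token limit")
    return (input_tokens / 1_000_000 * INPUT_PRICE_PER_M
            + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M)

# Example: a 200k-token prompt with an 8k-token response
# costs 0.2 * 2.50 + 0.008 * 15.00 = 0.62 USD.
print(round(estimate_cost(200_000, 8_000), 2))
```

Output-token pricing dominates for long responses, so the input/output split matters more than total token count when budgeting.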

Benchmark Results

Model performance metrics across various tests and benchmarks

Programming

Programming skills tests
SWE-bench Verified
SWE-bench Verified — benchmark evaluating a model's ability to solve real software engineering tasks from GitHub issues. Self-reported.
80.6%

Reasoning

Logical reasoning and analysis
GPQA
GPQA Diamond — benchmark evaluating a model's ability to answer PhD-level science questions. Self-reported.
94.3%

Other Tests

Specialized benchmarks
ARC-AGI v2
ARC-AGI v2 — benchmark evaluating abstract reasoning and generalization abilities. Self-reported.
77.1%
MMMLU
MMMLU — multilingual version of MMLU evaluating a model's knowledge across languages. Self-reported.
92.6%
CharXiv-R
CharXiv-R — benchmark evaluating a model's ability to understand and reason about charts. Self-reported.
85.9%
MMMU-Pro
MMMU-Pro — more robust version of MMMU evaluating expert-level multimodal understanding. Self-reported.
80.5%
HLE
HLE (Humanity's Last Exam) — benchmark of expert-written questions designed to test the limits of AI knowledge. Self-reported.
51.4%

License & Metadata

License
Proprietary
Announcement Date
February 19, 2026
Last Updated
February 20, 2026

Compare Gemini 3.1 Pro


Similar Models


Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.