
Gemini Diffusion

Google

Gemini Diffusion is an experimental text diffusion model from Google DeepMind. It explores a new kind of language model intended to give users greater control, creativity, and speed in text generation. Instead of predicting text token by token, it generates output by gradually refining noise, which enables rapid iteration and error correction during generation. Key capabilities include fast sampling (a claimed 1,479 tokens/sec, excluding overhead), more coherent text from emitting entire blocks of tokens at once, and iterative refinement for consistent results. It is reported to excel at editing tasks, including in math and coding contexts.
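The generation process described above can be illustrated with a toy sketch. This is not Gemini Diffusion's actual architecture: the `toy_denoiser` below is a hypothetical stand-in for the learned denoising network, and `TARGET` stands in for whatever output the model converges toward. The point is only the control flow, starting from noise over a whole block and refining every position at each step rather than appending tokens one at a time.

```python
import random

random.seed(0)

VOCAB = list("abcdefgh")
TARGET = list("deadbeef")  # hypothetical stand-in for the model's intended output


def toy_denoiser(tokens, step, total_steps):
    """Hypothetical stand-in for a learned denoising network: at each
    step, every position has a growing chance of being corrected, so the
    block converges over the refinement schedule."""
    out = tokens[:]
    for i in range(len(out)):
        if out[i] != TARGET[i] and random.random() < (step + 1) / total_steps:
            out[i] = TARGET[i]
    return out


def generate(block_len=8, steps=8):
    # Start from pure noise over the whole block (not token by token).
    tokens = [random.choice(VOCAB) for _ in range(block_len)]
    for step in range(steps):
        # Refine all positions at once; earlier "mistakes" can still be
        # corrected at later steps (iterative error correction).
        tokens = toy_denoiser(tokens, step, steps)
    return "".join(tokens)


print(generate())
```

Because every position is revisited at every step, the final refinement pass can fix any remaining mismatch, which is the property the model card attributes to iterative refinement.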

Key Specifications

Parameters
-
Context
-
Release Date
May 20, 2025
Average Score
46.9%

Timeline

Key dates in the model's history
Announcement
May 20, 2025
Last Update
July 19, 2025

Technical Specifications

Parameters
-
Training Tokens
-
Knowledge Cutoff
-
Family
-
Capabilities
Multimodal, ZeroEval

Benchmark Results

Model performance metrics across various tests and benchmarks

Programming

Programming skills tests
HumanEval
pass @1 (self-reported)
89.6%
MBPP
pass @1: each task from the set is attempted once, and the model's first generated solution is evaluated. Where a prompting strategy is used (e.g., Chain-of-Thought or tool-augmented Chain-of-Thought), the model is prompted to apply that method and the final answer is then verified (self-reported)
76.0%
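The pass @1 metric used throughout these benchmark entries is a special case of pass@k. A common way to compute it, sketched here under the assumption that `n` samples are drawn per task and `c` of them pass the tests, is the standard unbiased estimator (the function name `pass_at_k` is illustrative, not from this model card):

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n generated samples per task,
    of which c pass, estimate the probability that at least one of k
    randomly drawn samples passes."""
    if n - c < k:
        # Fewer failing samples than draws: at least one draw must pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# pass@1 reduces to the raw per-sample pass rate c/n.
print(pass_at_k(10, 3, 1))
```

With a single sample per task (n = 1, k = 1), as in the self-reported numbers above, the estimator is simply the fraction of tasks whose first solution passes.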
SWE-Bench Verified
pass @1 evaluation, 32K prompt (self-reported)
22.9%

Reasoning

Logical reasoning and analysis
GPQA
pass @1 (self-reported)
40.4%

Other Tests

Specialized benchmarks
AIME 2025
pass @1 (self-reported)
23.3%
BIG-Bench Extra Hard
pass @1 (self-reported)
15.0%
BigCodeBench
pass @1 (self-reported)
45.4%
Global-MMLU-Lite
First-attempt accuracy: measures whether the model gives a correct answer on its first attempt. In real use, users want a correct answer to their task on the first try, without additional prompting or follow-up questions, so the ability to answer correctly on the first attempt is a practically relevant metric (self-reported)
69.1%
LBPP (v2)
pass @1 (self-reported)
56.8%
LiveCodeBench
pass @1: a solution is judged solely by the correctness of its final answer; intermediate errors in the solution do not count against it. The system scores 1 if it generates the correct final answer and 0 otherwise. This scoring suits applications where the user cares only about the final answer, not the correctness of each step (self-reported)
30.9%
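The final-answer-only scoring described in the LiveCodeBench entry can be sketched as follows. This is an assumed, minimal implementation; the function name, the normalization (strip and lowercase), and the sample data are all illustrative, not taken from the benchmark itself:

```python
def final_answer_score(predicted: str, reference: str) -> int:
    """Binary score: 1 if the final answers match after light
    normalization, 0 otherwise. Intermediate steps are not checked."""
    norm = lambda s: s.strip().lower()
    return int(norm(predicted) == norm(reference))


# Hypothetical (predicted, reference) final-answer pairs.
results = [
    ("42", "42"),    # correct final answer; reasoning steps ignored
    (" 42 ", "42"),  # whitespace differences tolerated by normalization
    ("41", "42"),    # wrong final answer scores 0
]
accuracy = sum(final_answer_score(p, r) for p, r in results) / len(results)
print(accuracy)
```

Aggregating these 0/1 scores over a task set yields the percentage figures reported in this section.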

License & Metadata

License
proprietary
Announcement Date
May 20, 2025
Last Updated
July 19, 2025
