
Gemma 3n E2B Instructed

Multimodal
Google

Gemma 3n is a multimodal model designed for local hardware deployment, supporting image, text, audio, and video inputs. It comprises a language decoder, an audio encoder, and a vision encoder, and is available in two sizes: E2B and E4B. The model is optimized for efficient memory usage, allowing it to run on devices with limited GPU RAM. Gemma is a family of lightweight, state-of-the-art open models from Google, built on the same research and technology used to create the Gemini models. Gemma models are well suited to a range of content-understanding tasks, including question answering, summarization, and reasoning, and their relatively small size enables deployment in resource-constrained environments such as laptops, desktops, or private cloud infrastructure, democratizing access to state-of-the-art AI and fostering innovation for everyone. Gemma 3n models process multimodal input (text, images, video, and audio) and generate text output, with open weights available for the instruction-tuned variants. They were trained on data covering over 140 spoken languages.
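
As a sketch of local deployment, the snippet below loads the instruction-tuned E2B weights through the Hugging Face transformers pipeline. The checkpoint id google/gemma-3n-E2B-it, the pipeline task, and the generation settings are assumptions for illustration, not taken from this page; a recent transformers release with Gemma 3n support is required.

```python
# Minimal local-inference sketch for Gemma 3n E2B (assumptions: the
# Hugging Face checkpoint id "google/gemma-3n-E2B-it" and a transformers
# release that supports the Gemma 3n architecture).
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/gemma-3n-E2B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # places weights automatically, useful on limited GPU RAM
)

messages = [
    {"role": "user", "content": "Summarize this paragraph in one sentence: ..."}
]
result = pipe(messages, max_new_tokens=128)
# The pipeline returns the chat transcript with the model's reply appended.
print(result[0]["generated_text"][-1]["content"])
```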

Key Specifications

Parameters
8.0B
Context
-
Release Date
June 26, 2025
Average Score
33.7%

Timeline

Key dates in the model's history
Announcement
June 26, 2025
Last Update
July 19, 2025

Technical Specifications

Parameters
8.0B
Training Tokens
11.0T tokens
Knowledge Cutoff
June 1, 2024
Family
-
Capabilities
Multimodal, ZeroEval

Benchmark Results

Model performance metrics across various tests and benchmarks

General Knowledge

Tests on general knowledge and understanding
MMLU
Accuracy, 0-shot. Self-reported
60.1%

Programming

Programming skills tests
HumanEval
pass@1, 0-shot. Self-reported
66.5%
MBPP
pass@1, 3-shot (three in-context examples). Self-reported
56.6%
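
Both coding scores above use the pass@1 metric. As a reference point, the sketch below implements the standard unbiased pass@k estimator from the Codex paper (Chen et al., 2021); whether the self-reported scores were computed from a single greedy sample or from many samples is not stated on this page, so the sample counts in the example are hypothetical.

```python
# pass@k estimator sketch (Chen et al., 2021): given n sampled completions
# per problem, of which c pass the unit tests, the unbiased estimate is
#   pass@k = 1 - C(n - c, k) / C(n, k).
# With one sample per problem (greedy 0-shot pass@1), this reduces to the
# plain fraction of problems solved.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for n samples with c correct."""
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 200 samples, 68 pass the tests -> estimated pass@1
print(pass_at_k(n=200, c=68, k=1))  # ~0.34
```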

Mathematics

Mathematical problems and computations
MGSM
0-shot. Zero-shot means the model is evaluated on a task without any worked examples: mathematical questions are posed with no prior demonstrations, reasoning is tested on new puzzles, and text is analyzed without samples of how to do so. This is the strictest form of evaluation, since it measures how well the model applies its existing knowledge directly to a new task; results are usually lower than with few-shot prompting (see the prompt-construction sketch below). Self-reported
53.1%
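
Several entries on this page describe 0-shot evaluation. The sketch below contrasts zero-shot and few-shot prompt construction for a grade-school math question; the templates and the worked example are hypothetical, since the prompts behind the self-reported scores are not published here.

```python
# Sketch contrasting 0-shot and few-shot prompt construction for a math
# benchmark such as MGSM. Templates and demonstrations are invented for
# illustration only.

FEW_SHOT_EXAMPLES = [  # hypothetical worked examples for few-shot evaluation
    ("If a pen costs 2 dollars, how much do 3 pens cost?",
     "3 * 2 = 6. The answer is 6."),
]

def zero_shot_prompt(question: str) -> str:
    # The model sees only the question: the strictest setting, measuring
    # how well it applies its knowledge to an unseen task directly.
    return f"Question: {question}\nAnswer:"

def few_shot_prompt(question: str) -> str:
    # The model first sees worked examples, which usually raises scores.
    demos = "\n\n".join(f"Question: {q}\nAnswer: {a}"
                        for q, a in FEW_SHOT_EXAMPLES)
    return f"{demos}\n\nQuestion: {question}\nAnswer:"
```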

Reasoning

Logical reasoning and analysis
GPQA
Diamond, 0-shot. No tools, external knowledge, or prompt engineering are used: the model's direct answer must itself be correct. This reflects the model's baseline ability to generate answers without relying on tools, extended thinking, or additional context, and is the strictest form of evaluation. Self-reported
24.8%

Other Tests

Specialized benchmarks
AIME 2025
Accuracy, with prompts. Self-reported
6.7%
Codegolf v2.2
pass@1, 0-shot. Self-reported
11.0%
ECLeKTic
0-shot. In zero-shot evaluation, the LLM solves the task without any examples of how to perform it; no demonstrations or task-specific instructions are provided, so the model must rely entirely on its internal representations to generate answers. This method evaluates the model's basic ability to use its internal knowledge without additional training or prompting. Self-reported
2.5%
Global-MMLU
0-shot. The model answers the user's query directly, without any examples of how to solve the problem. This is the most general setting: it shows what the model can do without any task-specific help and tests its ability to follow instructions. However, it usually yields the lowest scores among evaluation methods on complex tasks, does not demonstrate the model's full ability, and may not reflect how users actually interact with LLMs, since users can supply examples. Self-reported
55.1%
Global-MMLU-Lite
Accuracy, 0-shot. Self-reported
59.0%
HiddenMath
Accuracy, 0-shot. Self-reported
27.7%
Include
0-shot (without examples). Self-reported
38.6%
LiveCodeBench
pass@1 (first attempt). Self-reported
13.2%
LiveCodeBench v5
pass@1, 0-shot. Self-reported
18.6%
MMLU-Pro
Accuracy, 0-shot. Self-reported
40.5%
MMLU-ProX
0-shot. The model is given the problem without examples or preliminary context. This is the most common baseline evaluation mode: it measures performance when the model relies exclusively on knowledge acquired during pre-training. Because no additional examples, context, or prompts are supplied, results in this mode can be lower than with other methods, especially for tasks that require specialized knowledge or a specific output format. Self-reported
8.1%
OpenAI MMLU
0-shot. The model solves the task using only the instructions given, with no worked examples or follow-up corrections. Self-reported
22.3%
WMT24++
Character-level F-score (chrF), 0-shot. Self-reported
42.7%
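
WMT24++ is scored with a character-level F-score, i.e. chrF. A minimal sketch using the sacrebleu library is shown below; the hypothesis and reference strings are invented for illustration, and the exact chrF configuration behind the self-reported score is not stated on this page.

```python
# Sketch of the character-level F-score (chrF) used for WMT24++, computed
# with the sacrebleu library. chrF matches character n-grams between
# hypothesis and reference, so it is more forgiving of morphological
# variation than word-level metrics.
from sacrebleu.metrics import CHRF

chrf = CHRF()  # defaults: character 6-grams, beta=2 (recall-weighted F-score)

hypotheses = ["The cat sat on the mat."]
references = [["The cat is sitting on the mat."]]  # one reference stream

# Prints a score object, e.g. "chrF2 = 64.4" (illustrative value only).
print(chrf.corpus_score(hypotheses, references))
```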

License & Metadata

License
proprietary
Announcement Date
June 26, 2025
Last Updated
July 19, 2025

Similar Models


Recommendations are based on similarity of characteristics: developer organization, multimodality, parameter size, and benchmark performance. Choose a model to compare or go to the full catalog to browse all available AI models.