Gemma 3 12B
Gemma 3 12B is a multimodal vision-language model from Google with 12 billion parameters. It accepts text and visual input and generates text output. The model has a 128K-token context window, multilingual support, and open weights, and it is suited to question answering, summarization, reasoning, and image understanding tasks.
Articles about Gemma 3 12B

Where Is Gemma 4? The Community Is Getting Impatient
Google hasn't said a word about Gemma 4, and the open-source AI community is growing restless. Prediction markets are open, Reddit is debating, and competitors aren't waiting.

Google's TurboQuant Compresses AI Models to 2.5 Bits Without Breaking Them
A new quantization method from Google Research achieves 4.9x KV cache compression with zero accuracy loss. No training required — it works on any model instantly.
Similar Models
Gemma 3 27B
Gemma 3n E4B
Gemini 1.5 Flash
Gemini 2.0 Flash
Gemma 2 27B
Llama 3.2 90B Instruct (Meta)
Mistral Small 3.1 24B Base (Mistral AI)
Gemma 3n E2B Instructed LiteRT (Preview)
Recommendations are based on similarity in developer organization, multimodality, parameter count, and benchmark performance.