Google DeepMind has introduced Gemma 3, an update to the company’s family of generative AI models. Gemma 3 adds multi-modality, allowing the models to analyze images, answer questions about them, identify objects, and perform other tasks that involve understanding visual data.
The update was announced March 12 and can be tried out in Google AI Studio. Gemma 3 also significantly improves math, coding, and instruction-following capabilities, according to Google DeepMind.
Gemma 3 supports vision-language inputs and text outputs, handles context windows of up to 128K tokens, and understands more than 140 languages. Improvements were also made for math, reasoning, and chat, including structured outputs and function calling. Gemma 3 comes in four “developer-friendly” sizes (1B, 4B, 12B, and 27B parameters), each available in pre-trained and general-purpose instruction-tuned versions. “The 128k-token context window allows Gemma 3 to process and understand massive amounts of information, easily tackling complex tasks,” Google DeepMind’s announcement said.