
Gemma 4 introduces a new family of open-source models featuring significant architectural advancements and expanded multimodal capabilities. The lineup includes two on-device models and two larger variants, notably a 26B mixture-of-experts (MoE) model and a 31B dense model, both of which rank among the top open-source models on the LLM Arena leaderboard. Key technical improvements include grouped-query attention, per-layer embeddings (PLE) for memory-efficient on-device performance, and native support for vision and audio. The move to an Apache 2.0 license makes the models easier for developers to adopt, smoothing the path from testing to deployment. The models support advanced reasoning, autonomous workflows, and structured JSON output, and handle flexible resolutions and aspect ratios for vision tasks. By optimizing inference efficiency and expanding multimodal functionality, Gemma 4 sets a new performance benchmark for both small-scale local applications and large-scale reasoning tasks.
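Structured JSON output means a model's response can be parsed and validated programmatically rather than scraped from free text. Below is a minimal consumer-side sketch using only the Python standard library; the response string is a hard-coded stand-in, and the schema fields are illustrative assumptions, not part of any Gemma API:

```python
import json

# Stand-in for a response requested in JSON mode; in practice this string
# would come from whatever inference runtime serves the model.
raw_response = '{"title": "Quarterly summary", "sentiment": "positive", "topics": ["sales", "hiring"]}'

def parse_structured_output(text: str) -> dict:
    """Parse a model's JSON response and check the fields we rely on."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    # Validate the illustrative schema before trusting the result downstream.
    if not isinstance(data.get("title"), str):
        raise ValueError("missing or non-string 'title'")
    if data.get("sentiment") not in {"positive", "neutral", "negative"}:
        raise ValueError("unexpected 'sentiment' value")
    if not isinstance(data.get("topics"), list):
        raise ValueError("'topics' must be a list")
    return data

result = parse_structured_output(raw_response)
```

Validating up front like this keeps malformed or partially truncated model output from propagating into downstream logic.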