Article Source: Academic Highlights
Recently, Google has made a new move in the field of large models by unveiling a series of lightweight, advanced open models called Gemma.
According to reports, Gemma was developed by Google DeepMind together with other teams across Google. It is built from the same research and technology used to create the Gemini models and is designed with responsible AI development in mind. The name comes from the Latin word gemma, meaning “gemstone”.
Demis Hassabis, the CEO of Google DeepMind, wrote on X, “We have long been supportive of responsible open-source and science, which can drive rapid research progress, so we are proud to release Gemma…”
The research team shared some key details about Gemma in an official blog post by Google DeepMind, such as:
- Google will release two sizes of model weights: Gemma 2B and Gemma 7B, with each size having pre-trained and instruction fine-tuned variants.
- The new Responsible Generative AI toolkit provides guidance and essential tools for creating safer AI applications using Gemma.
- Google also offers toolchains for inference and supervised fine-tuning (SFT) across all major frameworks: JAX, PyTorch, and TensorFlow through native Keras 3.0.
- Ready-to-use Colab and Kaggle notebooks, along with integrations with popular tools like Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM, make it easy for developers to get started with Gemma.
- Pre-trained and instruction fine-tuned Gemma models can run on users’ laptops, workstations, or Google Cloud, and can be easily deployed on Vertex AI and Google Kubernetes Engine (GKE).
- Optimization across multiple AI hardware platforms ensures industry-leading performance, including NVIDIA GPUs and Google Cloud TPUs.
- Under the terms of use, all organizations (regardless of size) are allowed to engage in responsible commercial use and distribution.
The blog post also states, “Starting today, Gemma will be released globally.” This means that developers in your country can also start using Gemma today. (Quick Start Guide: https://ai.google.dev/gemma?hl=zh-cn)
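As a rough illustration of the Hugging Face integration mentioned above, the sketch below maps the four released variants to Hub model IDs and loads one with the Transformers library. The repository names (`google/gemma-2b`, `google/gemma-2b-it`, and so on) are the names used on the Hugging Face Hub as far as we know and should be verified before use. The helper names `gemma_model_id` and `generate` are hypothetical and exist only for this example. Access to the weights also requires accepting Google's terms of use on the Hub.

```python
# Sizes announced at launch; the Hub naming scheme below is an assumption
# to verify against the Hugging Face Hub before relying on it.
_SIZES = ("2b", "7b")


def gemma_model_id(size: str, instruction_tuned: bool = False) -> str:
    """Return the assumed Hugging Face Hub ID for a Gemma variant."""
    size = size.lower()
    if size not in _SIZES:
        raise ValueError(f"unknown Gemma size {size!r}; expected one of {_SIZES}")
    suffix = "-it" if instruction_tuned else ""
    return f"google/gemma-{size}{suffix}"


def generate(prompt: str, size: str = "2b", max_new_tokens: int = 64) -> str:
    """Load the instruction-tuned variant and generate a completion.

    transformers is imported here so that gemma_model_id() stays usable
    on machines where the library is not installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = gemma_model_id(size, instruction_tuned=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate("Write a haiku about gemstones.")` downloads the weights on first use, so it needs network access, PyTorch, and several gigabytes of free disk space.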
Strongest in Its Size
According to the official blog, Gemma models share technical and infrastructure components with Gemini, which enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. Additionally, Gemma models can run directly on developers’ laptops or desktop computers.
It is worth mentioning that Gemma outperforms significantly larger models on key benchmarks while still meeting Google’s rigorous standards for safe and responsible outputs.
To make Gemma's pre-trained models safe and reliable, Google uses automated techniques to filter certain personal information and other sensitive data out of the training set. Extensive fine-tuning and reinforcement learning from human feedback (RLHF) are used to align the instruction-tuned models with responsible behavior. To understand and reduce the risk profile of Gemma models, Google also conducts rigorous evaluations, including manual red-teaming, automated adversarial testing, and assessments of the models' capabilities for dangerous activities.
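The claim that these models fit on a laptop can be sanity-checked with rough arithmetic: memory for the weights is approximately the parameter count times the bytes per parameter. The sketch below uses the nominal counts implied by the model names (2 billion and 7 billion), which is an assumption; the exact figures are in the technical report. It also ignores activations and the key-value cache, so real usage is higher.

```python
# Nominal parameter counts implied by the model names (an assumption;
# the technical report lists the exact figures).
NOMINAL_PARAMS = {"2B": 2_000_000_000, "7B": 7_000_000_000}

# Bytes needed to store one weight at common precisions.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gib(size: str, dtype: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return NOMINAL_PARAMS[size] * BYTES_PER_PARAM[dtype] / 2**30


for size in NOMINAL_PARAMS:
    for dtype in BYTES_PER_PARAM:
        gib = weight_memory_gib(size, dtype)
        print(f"Gemma {size} in {dtype}: ~{gib:.1f} GiB")
```

Under these assumptions the 2B model needs under 4 GiB in bfloat16, which is within reach of a typical laptop, while the 7B model at the same precision needs roughly 13 GiB.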
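Google does not describe its filtering pipeline in the blog post, so the following is only a hypothetical illustration of what automated filtering of personal information can look like in its simplest form: dropping documents that match patterns for email addresses or phone numbers. The regular expressions and the function names `contains_pii` and `filter_documents` are assumptions made for this sketch, and a production system would be far more thorough.

```python
import re
from typing import Iterable, List

# Deliberately simple patterns; illustrative only, not Google's method.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def contains_pii(text: str) -> bool:
    """Return True if the text matches an email or phone-number pattern."""
    return bool(EMAIL.search(text) or PHONE.search(text))


def filter_documents(docs: Iterable[str]) -> List[str]:
    """Keep only the documents with no detected personal information."""
    return [doc for doc in docs if not contains_pii(doc)]


sample = [
    "Contact me at jane.doe@example.com",
    "Gemma is an open model.",
    "Call +1 (555) 123-4567 today",
]
clean = filter_documents(sample)
print(clean)
```

Pattern-based filters like this one miss many cases and produce false positives on long digit sequences, which is one reason real pipelines combine several detection methods.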
Cross-Framework, Tool, and Hardware Optimization
According to Google, developers can fine-tune Gemma models on their own data to meet specific application needs, such as summarization or retrieval-augmented generation (RAG).
Currently, Gemma supports various tools and systems:
- Multi-framework tools: Reference implementations for inference and fine-tuning are available across multi-framework Keras 3.0, native PyTorch, JAX, and Hugging Face Transformers, so developers can work in the framework they prefer.
- Cross-device compatibility: Gemma models can run on popular device types like laptops, desktops, IoT, mobile, and cloud, enabling a wide range of AI capabilities.
- State-of-the-art hardware platforms: Google collaborates with NVIDIA to optimize Gemma for NVIDIA GPUs, ensuring industry-leading performance and integration with cutting-edge technology from data centers to the cloud to local RTX AI PCs.
- Optimized for Google Cloud: Vertex AI offers a broad MLOps toolset with a range of tuning options and one-click deployment with built-in inference optimizations. Developers can use fully managed Vertex AI tools or self-managed GKE for advanced customization, with cost-efficient infrastructure across GPUs, TPUs, and CPUs available from either platform.
Reference Links:
https://blog.google/technology/developers/gemma-open-models/
Technical Report Link: https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf