Source: Tech Daily
Screenshot of the Genie effect in action. Image Source: Google Official Website
Google’s subsidiary DeepMind recently unveiled Genie, a groundbreaking foundation world model trained on internet videos, as reported on Google’s official website on the 26th.
In recent years, generative artificial intelligence (AI) models have been able to create content through language, images, and even videos. Google is introducing a new paradigm of generative interactive environments with Genie, allowing for the generation of interactive and controllable environments from a single image prompt.
Genie is a foundation world model with 11 billion parameters, trained without human supervision on more than 200,000 hours of two-dimensional (2D) gameplay video. Even without action labels, Genie autonomously picks out action features and patterns from the footage, learning a range of character movements, controls, and behaviors.
What sets Genie apart is that it learns fine-grained control purely from internet videos. It can identify which parts of a scene are controllable and infer a consistent set of possible actions that carry over into the environments it generates.
The model takes a single image (whether AI-synthesized, a photograph, or a sketch) and turns it into a playable game that responds to user controls: from one image to a working interactive environment in a single step.
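The interaction pattern described above can be sketched as a simple rollout loop: a single prompt image seeds the environment, and each user action from a small discrete set produces the next frame. The class and method names below are illustrative assumptions for this sketch, not Google's actual API, and the toy dynamics model stands in for Genie's real learned model.

```python
# Hypothetical sketch of a Genie-style interaction loop.
# All names (ToyWorldModel, play, etc.) are illustrative, not Google's API.

from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    pixels: List[List[int]]  # stand-in for an image tensor


class ToyWorldModel:
    """Placeholder dynamics model: next frame = f(frame history, action)."""

    def __init__(self, num_actions: int = 8):
        # Genie-style models use a small discrete (latent) action space.
        self.num_actions = num_actions

    def step(self, history: List[Frame], action: int) -> Frame:
        assert 0 <= action < self.num_actions
        # A real model would run a learned video model here;
        # we just produce a trivially "updated" copy of the last frame.
        last = history[-1]
        return Frame(pixels=[[p + action for p in row] for row in last.pixels])


def play(model: ToyWorldModel, prompt: Frame, actions: List[int]) -> List[Frame]:
    """Roll out an interactive episode from a single prompt image."""
    history = [prompt]
    for a in actions:
        history.append(model.step(history, a))
    return history


frames = play(ToyWorldModel(), Frame(pixels=[[0, 0], [0, 0]]), actions=[1, 3, 2])
print(len(frames))  # the prompt image plus one generated frame per action
```

The design point this loop illustrates is that the user only ever supplies one image up front; every subsequent frame is generated on the fly in response to controls, which is what makes the environment interactive rather than a fixed video.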
Users can simply provide a sketch on paper, a perfect digital artwork, or even a description of a 2D world generated by AI, and Genie will take care of the rest, helping users create 2D games.
According to Google’s official website, Genie was trained mainly on videos of 2D platform games and robotics, but the method is general: it can be applied to any domain and scaled to larger internet datasets. From a single image, an entirely new interactive environment can be created, opening up new pathways for generating and entering virtual worlds.