Google DeepMind has unveiled Genie 3, a new AI model that can create interactive 3D worlds. After a user submits a text prompt describing an environment, the model generates it in real time at 24 frames per second and 720p resolution, and keeps it consistent for a few minutes.
Unlike earlier versions, Genie 3 supports continuous interaction for a few minutes, remembers where objects were placed, and allows dynamic changes like adding characters or altering weather conditions.
According to a blog post that accompanied the release, world models can understand and simulate environments, which lets agents predict how an environment will change and how their own actions will affect it.
According to the blog post, “world models are also a crucial first step on the path to AGI, since they enable AI agents to be trained in an infinite curriculum of rich simulation environments.”
The company says that whereas Genie 2 supported only 10 to 20 seconds of interaction, Genie 3 offers a “few minutes.” Furthermore, if a user leaves a location and returns later, it will still look the same, because the model keeps its visuals more consistent over time.
Genie 3 is not yet publicly available, however; instead, it is being offered as a limited research preview to a small group of academics and creators for testing.
Key features of Google Genie 3
Genie 3 belongs to a class of AI systems known as world models, which simulate dynamic environments rather than producing static content. These models can be applied to robotics, video games, training simulations, and education.
Given a prompt such as “a forest during a thunderstorm”, the model is designed to create a playable 3D environment that you can explore with simple movement controls.
The generated world runs at 24 frames per second in 720p resolution and stays consistent for a few minutes. According to The Verge, that is a significant improvement over Genie 2, where interaction lasted only 10 to 20 seconds.
Remembers what you saw: Visual memory is one of Genie 3’s biggest improvements. You can leave an object behind and return to it later, a capability that most earlier world models lacked. According to Google, this visual memory extends back approximately one minute.
Triggers world events: According to the DeepMind blog, Genie 3 has “promptable world events,” which let users add rain, introduce characters, or transform objects simply by entering new text prompts.
Limitations
Despite significant progress, Genie 3 has several limitations that Google DeepMind is addressing. The model cannot simulate real-world locations with geographic accuracy, and legible text often appears only if it was included in the original prompt. Its range of interactions is currently limited, with multi-agent interactions still under development. While more stable than previous versions, it supports only a few minutes of continuous exploration. The technology also presents new safety and responsibility challenges, which is why it is being rolled out gradually.