Genie 3 AI model: Google DeepMind's recently introduced Genie 3 is an advanced AI model that can create an interactive 3D virtual world from a text prompt. The user gives a simple text input such as “wet forests in rain”, and the model renders that environment in real time at 24 frames per second and 720p resolution.
Better interaction and visual memory than before
Genie 3 improves on its predecessor, Genie 2, in several respects. While Genie 2 could sustain an interaction for only 10 to 20 seconds, Genie 3 supports continuous interaction for a few minutes. In addition, if a user leaves a place and later returns, the objects and scenery there remain as they were. The credit goes to Genie 3's ‘visual memory’, which can remember a scene for about a minute.
Dynamic changes and event triggers
Genie 3 is not limited to static scenes. It is a “world model”, i.e. an AI system that can simulate a dynamic environment. By giving new text commands, users can change the weather, add new characters, or transform objects into something else.
Steps towards agent training and AGI
According to Google DeepMind's blog post, such world models are an important step towards future AGI (Artificial General Intelligence). These models are extremely useful for training AI agents in virtual environments across a variety of situations, especially in areas such as robotics, gaming, training and education.
Limitations
Although Genie 3 has improved a lot, some limitations remain. The model cannot yet reproduce real geographic locations accurately, and text in a scene is legible only when it is specified in the initial input. Multi-agent interaction is still under development. For these reasons, Genie 3 has not yet been released to the general public. Google is making it available for testing to a limited number of creators so that issues such as safety and responsibility can be handled properly.