I don’t know about you, but when I play video games, I actually enjoy, you know, playing them. If I wanted to delegate the gameplay to someone else, I’d watch a Let’s Play or a Twitch stream. But Google is working on an AI model that can play video games on your behalf, provided you tell it what you want: it’s called SIMA, short for Scalable Instructable Multiworld Agent, and if it works as advertised, the AI can simply take over your favorite hobby.
Google DeepMind, the company’s artificial intelligence division, announced the new model in a blog post and in a thread on X (formerly Twitter). According to Google DeepMind, SIMA is the first generalist AI agent that can follow natural-language instructions across 3D environments. In other words, it can play video games based on your commands. You say “turn left” and SIMA turns the character to the left.
Google DeepMind worked with eight video game studios to train SIMA, including Hello Games (No Man’s Sky) and Tuxedo Labs (Teardown). The team wanted to train SIMA on as many different kinds of games as possible, because each new game world added another skill to the model’s repertoire. Google DeepMind even built its own sandbox-style environment in which SIMA has to assemble structures, to test its grasp of physics and object manipulation.
What makes SIMA so effective, at least in theory, is that it doesn’t need any technical access to the game itself, such as its source code or an API. It operates purely on the game’s on-screen images and natural-language commands. Google DeepMind says SIMA can perform over 600 “basic skills,” such as turning in a specific direction, interacting with objects and using in-game menus. That said, Google DeepMind is still working on more intricate actions, as well as commands that involve multiple subtasks. Telling the AI to climb the ladder in front of it is one thing; training it to reliably carry out something like “gather resources and build a shelter” is another thing entirely. The company says this is a general limitation of large language models: bots will respond to simple commands but struggle to string together longer sequences of actions on their own.
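The appeal of that design is that the same loop works for any game that renders to a screen: pixels and a text instruction go in, keyboard and mouse actions come out. To make that concrete, here’s a minimal sketch of such an agent loop. Every name in it (SimaLikeAgent, Action, play) is a hypothetical stand-in; Google has not published code or an API for SIMA, so treat this as an illustration of the idea, not the real thing.

```python
# Hypothetical sketch of a SIMA-style agent loop: screen frames plus a
# natural-language instruction in, keyboard/mouse actions out.
# None of these names come from Google; they are illustrative stand-ins.

from dataclasses import dataclass


@dataclass
class Action:
    """A single keyboard or mouse action, e.g. press a key or move the view."""
    kind: str   # "key" or "mouse"
    value: str  # which key to press, or which direction to move the view


class SimaLikeAgent:
    """Stand-in for a vision-and-language policy that maps (frame, text) to an action."""

    def act(self, frame: bytes, instruction: str) -> Action:
        # A real agent would run the frame and instruction through a trained
        # model; this stub just maps a couple of example commands to actions.
        text = instruction.lower()
        if "turn left" in text:
            return Action(kind="mouse", value="look_left")
        if "open the menu" in text:
            return Action(kind="key", value="escape")
        return Action(kind="key", value="noop")


def play(agent: SimaLikeAgent, instruction: str, frames: list[bytes]) -> list[Action]:
    """Feed each captured screen frame plus the player's instruction to the agent."""
    return [agent.act(frame, instruction) for frame in frames]


if __name__ == "__main__":
    fake_frames = [b"frame-0", b"frame-1"]  # placeholder screen captures
    for action in play(SimaLikeAgent(), "Turn left", fake_frames):
        print(action)
```

The point of the sketch is the interface, not the logic: because the agent only sees what a human player sees, nothing about it is tied to one particular game.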
Meanwhile, Google DeepMind is touting the success of its multi-game training approach, claiming that an agent trained across many games outperforms agents trained on a single game at a time. In fact, the company says SIMA can perform better in a game it has never seen before than a model trained exclusively on that game.
While SIMA isn’t yet publicly available, it’s easy to imagine potential applications for the technology. I could see it becoming a great accessibility option down the line: for players who have trouble using traditional controllers, telling a bot how to move the character could be a game changer. Of course, Google’s ultimate goal seems to go beyond that, since it wants the AI to be able to play games on its own. That could be handy for skipping repetitive chores like grinding levels or farming in-game money, but it also raises the question: why play a game at all if you want a bot to play the whole thing for you?
This is Google’s second big move into AI-powered gaming: last month we learned the company is working on a model that can generate 2D platformers from natural-language prompts. Perhaps in the near future Google will introduce some kind of Google Gaming service: just tell the AI what kind of game you want and it will generate AND play the game for you in real time. What fun.