It’s no secret that Google’s flagship AI chatbot, Gemini, has had some issues. Its production of historically inaccurate images forced Google to temporarily suspend the product earlier this year.
But Google is trying to turn around its early artificial intelligence missteps. At the tech giant’s annual Google Cloud Next conference in Las Vegas on Tuesday, keynote speakers unveiled new features of Gemini 1.5 Pro, the latest version of its chatbot, which is now publicly available. Viewers watched as demonstrators spoke and typed prompts into the improved AI chatbot to highlight its new tools, perhaps the most significant of which is its ability to “ground” queries. “Grounded” means that answers in Gemini 1.5 Pro are tied to “verifiable sources of information,” the company announced on Tuesday.
The Gemini 1.5 Pro announcements included a number of updates to the chatbot as part of Google’s push to sell AI products to enterprise customers. Gemini now has expanded “long context understanding,” meaning it can process much more information at once. It is also multimodal: it can process not only text but also audio, video and other formats to generate responses.
“Thanks to these two advances, enterprises today can do things that weren’t previously possible with AI,” Google CEO Sundar Pichai said during the presentation.
Companies have already started piloting the product. Google said Goldman Sachs, Mercedes-Benz and Uber are among the first customers of Gemini 1.5 Pro. Goldman Sachs CEO David Solomon appeared in a Google Next video right after Pichai. Mercedes-Benz CEO Ola Källenius also spoke about the German carmaker’s partnership with Google and its use of Google’s AI products.
Google says Gemini 1.5 Pro enables customers to “process massive amounts of information in a single stream” – including 1 hour of video, 11 hours of audio, or over 700,000 words.
“For example,” the company added, “a gaming company could provide video analysis of a player’s performance along with tips for improvement. An insurance company could also combine video, image and text data to create an incident report, making the claims process easier.”
Google also made other AI announcements, a full list of which can be found on the Google Cloud Next 2024 conference website.
Google Vids
Google is launching an AI-powered video creation app, Google Vids. On Tuesday, Aparna Pappu, vice president of Google Workspace, demonstrated the application.
“Gemini suggests a narrative outline for the story that I can easily adapt and edit,” Pappu said of the outline, which is based on suggestions in Google Docs.
Generate live images from text
The latest version of Google’s AI image generator, Imagen 2.0, powered by Gemini technology, lets users create live images from text prompts. It’s still in “preview” mode, but keynote speakers in Las Vegas demonstrated the feature.
“Marketing and creative teams can generate animated images based on text prompts, including product images, ads, GIFs and storyboards,” Pappu said. Another demonstrator noted that the tool creates live images that would otherwise require “days or weeks of searching and photographing.”
Pappu also announced that images generated by Imagen can be watermarked using Google DeepMind’s SynthID.