
Gemini AI updates, new search features and more

Google CEO Sundar Pichai speaks at the Google I/O developer conference.

Andrey Sokolov | Picture Alliance | Getty Images

Google on Tuesday hosted its annual I/O developer conference and released a range of AI products, from new search and chat features to AI hardware for cloud clients. The announcements highlight the company’s focus on AI as it fends off rivals such as OpenAI.

Many of the features and tools Google introduced are only in testing or limited to developers, but they give a sense of how Google thinks about AI and where it is investing. Google makes money from AI by charging developers who use its models and from consumers who pay for Gemini Advanced, its ChatGPT competitor, which costs $19.99 a month and can help users with tasks such as summarizing PDFs and Google Docs.

Tuesday’s announcements follow similar events held by its AI competitors. Earlier this month, Amazon-backed Anthropic announced its first enterprise offering and a free iPhone app. Meanwhile, OpenAI on Monday released a new AI model and a desktop version of ChatGPT, along with a new user interface.

Here’s what Google announced.

Gemini AI updates

Google presented updates to Gemini 1.5 Pro, its AI model, which will soon be able to process even more data; for example, the tool can summarize 1,500 pages of text uploaded by a user.

There’s also a new Gemini 1.5 Flash AI model, which the company says is more cost-effective and designed for smaller tasks like quickly summarizing conversations, adding captions to images and videos, and extracting data from large documents.

Google CEO Sundar Pichai highlighted Gemini’s translation improvements, adding that it will be available to all developers worldwide in 35 languages. Within Gmail, Gemini 1.5 Pro will analyze PDF and video attachments, providing summaries and more, Pichai said. This means that if you’ve missed a long email thread on vacation, Gemini will be able to summarize it along with any attachments.

The new Gemini updates are also useful for Gmail search. One example the company gave: If you’ve compared prices from different contractors to repair your roof and are looking for a summary to help you decide which one to choose, Gemini can return three quotes along with the estimated start dates suggested in the various email threads.

Google said that Gemini will eventually replace Google Assistant on Android phones, making it a more direct competitor to Apple’s Siri on the iPhone.

Google Veo, Imagen 3 and Audio Overviews

Google announced Veo, its latest high-definition video-generation model, and Imagen 3, its highest-quality text-to-image model, which promises lifelike images and “less distracting visual artifacts than our previous models.”

The tools will be available to select creators on Monday and will come to Vertex AI, Google’s machine-learning platform that lets developers train and deploy AI applications. Until then, there will be a waiting list.

The company also demonstrated “Audio Overviews,” the ability to generate audio discussions based on text input. For example, if a user uploads a lesson plan, the chatbot can speak a summary of it. Or, if a user asks for an example of a real-life science problem, it can provide one via interactive audio.

Separately, the company also demonstrated the “AI Sandbox,” a set of generative AI tools for creating music and sounds from scratch, based on user prompts.

However, generative AI tools such as chatbots and image builders continue to have problems with accuracy.

Head of Google Search Prabhakar Raghavan told employees last month that competitors “may have a new gizmo that people like to play with, but they still come to Google to check what they see there because it’s the trusted source and it’s becoming more critical in this era of generative AI.”

Earlier this year, Google introduced an image generator powered by Gemini. Users discovered historical inaccuracies that went viral online, and the company pulled the feature, saying it would relaunch it in the coming weeks. The feature has not yet been re-released.

New search features

Google launched “AI Overviews” in Google Search on Monday in the U.S. AI Overviews show a brief summary of answers to the most complex search questions, according to Liz Reid, head of Google Search. For example, if a user searches for the best way to clean leather boots, the results page may show an “AI Overview” at the top with a multi-step cleaning process synthesized from information gathered from around the web.

The company said it plans to bring assistant-like planning capabilities directly into search, explaining that users will be able to search for something like, “Create an easy-to-prepare 3-day meal plan for a group,” and get a starting point with a wide variety of recipes from around the web.

As for its progress in “multimodality,” or integrating more images and video within generative AI tools, Google said it will begin testing the ability for users to ask questions through video, such as recording a problem with a product they own, uploading the clip and asking the search engine to diagnose the problem. In one example, Google showed someone filming a broken record player while asking why it wasn’t working. Google identified the turntable model and suggested it might be malfunctioning because it wasn’t properly balanced.

Another new feature in testing, called “AI Teammate,” will integrate into a user’s Google Workspace. It can build a searchable collection of work from messages and email threads with multiple PDFs and documents. For example, a would-be founder can ask the AI teammate, “Are we ready to launch?” and the assistant will provide analysis and a summary based on information in Gmail, Google Docs, and other Workspace apps.

Project Astra

Project Astra is Google’s latest advancement to its AI assistant, which is built by Google’s DeepMind AI unit. It’s just a prototype for now, but you can think of it as Google’s aim to develop its own version of JARVIS, Tony Stark’s omniscient AI assistant from the Marvel Universe.

In a demo video presented at Google I/O, the assistant, working via video and audio rather than a chatbot interface, was able to help a user remember where they left their glasses, review code, and answer what a certain part of a speaker is called when shown the speaker on video.

Google said that a truly useful chatbot should allow users to “talk to it naturally and without delay or lag.” The conversation in the demo video happened in real time, with no lag. The demo followed OpenAI’s Monday showcase of a similar audio back-and-forth with ChatGPT.

DeepMind CEO Demis Hassabis said on stage that “getting response time down to something conversational is a difficult engineering challenge.”

Pichai said he expects Project Astra to launch in Gemini later this year.

AI hardware

Finally, Google announced Trillium, its sixth-generation TPU, or tensor processor—a piece of hardware integral to performing complex AI operations—which will be available to cloud customers in late 2024.

TPUs aren’t designed to compete with other chips, such as Nvidia’s graphics processing units. Pichai noted during I/O, for example, that Google Cloud will start offering Nvidia’s Blackwell GPUs in early 2025.

Nvidia said in March that Google will use the Blackwell platform for “a variety of internal deployments and will be one of the first cloud providers to offer Blackwell-based instances,” and that access to Nvidia’s systems will help Google offer large-scale tools to enterprise developers building large language models.

In his speech, Pichai emphasized “Google’s long-standing partnership with Nvidia.” The companies have worked together for more than a decade, and Pichai has said in the past that he expects them to continue to do so a decade from now.

WATCH: CNBC’s full interview with Alphabet CEO Sundar Pichai

https://www.cnbc.com/2024/05/14/google-io-2024-ai-gemini.html