Google shared a video on its social media platforms on Monday, teasing the new capabilities of its Gemini artificial intelligence (AI)-based chatbot. The video was released just a day before the company’s annual developer-focused Google I/O event. It is believed that the tech giant may make several AI announcements and reveal new features and possibly new AI models. Additionally, the center stage is likely to be occupied by Android 15 and Wear OS 5, which may be unveiled during the event.

In a short video posted on X (formerly known as Twitter), Google’s official account teased the new capabilities of its in-house AI chatbot. The 50-second video highlighted significant improvements to his speech, giving Gemini a more emotional voice and modulations that give him a more human appearance. In addition, the video highlights the new capabilities of computer vision. Artificial intelligence can capture the visual images on the screen and analyze them.

Gemini can also access the smartphone’s camera, a capability it doesn’t currently have. The user moved the camera through the space and asked the AI ​​to describe what it was seeing. With almost no time lag, the chatbot can describe the setup as a stage, and when prompted, can even recognize the Google I/O logo and share information around it.

The video doesn’t share any more details about the AI ​​and instead asks people to watch the event to learn more. There are some questions that could be answered during the event, such as whether Google is using a new Large Language Model (LLM) for computer vision or an upgraded version of Gemini 1.5 Pro. Additionally, Google may also reveal what else AI can do with its computer vision. It should be noted that there are rumors that the tech giant may introduce Gems, which are considered chatbot agents that can be designed for specific tasks, similar to OpenAI’s GPT.

While Google’s event is expected to introduce new Gemini features, OpenAI held its Spring Update event on Monday and unveiled its latest GPT-4o AI model that added features to ChatGPT, similar to the video shared by Google. The new AI model allows it to have conversational speech, computer vision, real-time language translation, and more.

https://www.gadgets360.com/ai/news/google-i-o-gemini-ai-computer-vision-conversational-capabilities-teased-5662477#rss-gadgets-all