Google is counting on its own GPT-4 competitor, Gemini, so much that it staged parts of a recent demo video. In an opinion piece, Bloomberg says Google admits that its video titled “Hands-on with Gemini: Interacting with multimodal AI” was not only edited to speed up the outputs (as disclosed in the video description), but that the implied voice interaction between the human user and the AI was actually non-existent.
Instead, the actual demo was made by “using still image frames from the footage, and prompting via text,” rather than having Gemini respond to, or even predict, a drawing or a change of objects on the table in real time. That is far less impressive than the video would have us believe, and worse, the lack of a disclaimer about the actual input method calls Gemini’s readiness into question.
Interestingly, Google denies any wrongdoing here, pointing The Verge to an X post by Gemini co-lead Oriol Vinyals, which says that “all user prompts and outputs in the video are real” and that his team created the video “to inspire developers.” Given how closely the industry and regulators are scrutinizing AI, the tech giant would do well to be more careful with its presentations in this field.
So happy to see the interest surrounding our “Hands-on with Gemini” video. In our developer blog yesterday, we described how Gemini was used to build it. https://t.co/50gjMkaVc0
We gave Gemini sequences of different modalities – image and text in this case – and had it respond… pic.twitter.com/Beba5M5dHP
— Oriol Vinyals (@OriolVinyalsML) December 7, 2023