For those who needed more evidence that GenAI is prone to making stuff up, Google’s Gemini chatbot, formerly Bard, thinks that the 2024 Super Bowl already happened. It even has the (fictional) statistics to back it up.
Per a Reddit thread, Gemini, powered by Google’s GenAI models of the same name, is answering questions about Super Bowl LVIII as if the game wrapped up yesterday, or even weeks earlier. Like many bookmakers, it seems to favor the Chiefs over the 49ers (sorry, San Francisco fans).
Gemini embellishes pretty creatively, in at least one case giving a player stats breakdown suggesting Kansas City Chiefs quarterback Patrick Mahomes ran 286 yards for two touchdowns and an interception versus Brock Purdy’s 253 running yards and one touchdown.
It’s not just Gemini. Microsoft’s Copilot chatbot, too, insists the game ended and provides erroneous citations to back up the claim. But (perhaps reflecting a San Francisco bias!) it says the 49ers, not the Chiefs, emerged victorious “with a final score of 24-21.”
Copilot is powered by a GenAI model similar, if not identical, to the model underpinning OpenAI’s ChatGPT (GPT-4). But in my testing, ChatGPT was loath to make the same mistake.
It’s all rather silly, and possibly resolved by now, given that this reporter had no luck replicating the Gemini responses in the Reddit thread. (I’d be shocked if Microsoft wasn’t working on a fix as well.) But it also illustrates the major limitations of today’s GenAI, and the dangers of placing too much trust in it.
GenAI models have no real intelligence. Fed an enormous number of examples, usually sourced from the public web, AI models learn how likely data (e.g. text) is to occur based on patterns, including the context of any surrounding data.
This probability-based approach works remarkably well at scale. But while the range of words and their probabilities are likely to result in text that makes sense, it’s far from certain. LLMs can generate something that’s grammatically correct but nonsensical, for instance, like the claim about the Golden Gate. Or they can spout mistruths, propagating inaccuracies in their training data.
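To make the idea concrete, here is a minimal sketch of probability-based text generation using a toy bigram model. It is nowhere near how Gemini or GPT-4 actually work: real LLMs use neural networks trained on vastly more data. The corpus and the `build_bigram_model` and `generate` helpers are invented for illustration. The point is that the model only knows which word tends to follow which, so it will happily produce "the chiefs won the game" or "the chiefs lost the game" with no notion of whether either is true.

```python
import random
from collections import Counter, defaultdict

# Toy training text, invented for illustration (not real game results).
CORPUS = (
    "the chiefs won the game . "
    "the 49ers won the game . "
    "the chiefs lost the game ."
).split()


def build_bigram_model(tokens):
    """Estimate P(next word | previous word) from raw co-occurrence counts."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {word: n / total for word, n in followers.items()}
    return model


def generate(model, start, length, seed=None):
    """Sample up to `length` further words, each drawn by learned probability."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        options = model.get(words[-1])
        if not options:  # no known continuation for this word
            break
        choices, weights = zip(*options.items())
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


MODEL = build_bigram_model(CORPUS)

if __name__ == "__main__":
    print(MODEL["the"])               # learned probabilities after "the"
    print(generate(MODEL, "the", 5))  # fluent-looking, but never fact-checked
```

Every sentence this sketch emits is stitched together from likely word pairs. Nothing in it checks a claim against reality, which is the same gap, at a much smaller scale, that lets a chatbot report on a game that has not been played.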
Super Bowl disinformation certainly isn’t the most harmful example of GenAI going off the rails. That distinction probably lies with endorsing torture, reinforcing ethnic and racial stereotypes or writing convincingly about conspiracy theories. It is, however, a useful reminder to double-check statements from GenAI bots. There’s a decent chance they’re not true.