
When Gemini 3 Flash doesn’t know, it still answers anyway

By Eric Hal Schwartz | TechRadar

Google's latest AI model would rather bluff than admit confusion.

- Gemini 3 Flash often invents answers instead of admitting when it doesn't know something
- The problem arises with factual or high-stakes questions
- But it still tests as the most accurate and capable AI model

Gemini 3 Flash is fast and clever. But if you ask it something it doesn't actually know - something obscure or tricky or just outside its training - it will almost always try to bluff its way through, according to a recent evaluation from the independent testing group Artificial Analysis.

Gemini 3 Flash hit 91% on the "hallucination rate" portion of the AA-Omniscience benchmark. That means when it didn't have the answer, it still gave one anyway almost all the time - one that was entirely fictional. AI chatbots making things up has been an issue since they first debuted. Knowing when to stop and say "I don't know" is just as important as knowing how to answer in the first place, and right now Gemini 3 Flash doesn't do that very well. That's what the test is for: seeing whether a model can differentiate actual knowledge from a guess.

The number needs context, though: Gemini's high hallucination rate doesn't mean 91% of its total answers are false. Instead, it means that in situations where the correct answer would be "I don't know," it fabricated an answer 91% of the time (a quick numerical sketch of that distinction follows this excerpt). That's a subtle but important distinction, and one with real-world implications, especially as Gemini is integrated into more products like Google Search.

"Ok, it's not only me. Gemini 3 Flash has a 91% hallucination rate on the Artificial Analysis Omniscience Hallucination Rate benchmark!? Can you actually use this for anything serious? I wonder if the reason Anthropic models are so good at coding is that they hallucinate much..." https://t.co/b3CZbX9pHw pic.twitter.com/uZnF8KKZD4 (December 18, 2025)

This result doesn't diminish the power and utility of Gemini 3. The model remains the highest-performing in general-purpose tests and ranks alongside, or even ahead of, the latest versions of ChatGPT and Claude. It just errs on the side of confidence when it should be modest. Overconfident answering crops up with Gemini's rivals as well. What makes Gemini's number stand out is how often it happens in these uncertainty scenarios, where there's simply no correct answer in the training data or no definitive public source to point to.

Hallucination Honesty

Part of the issue is simply that generative AI models are largely word-prediction tools, and predicting the next word is not the same as evaluating truth. That means the default behavior is to produce an answer, even when saying "I don't know" would be more honest. OpenAI has started addressing this and getting its models to recognize...
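To make the accuracy-versus-hallucination-rate distinction above concrete, here is a minimal, illustrative sketch in Python. It is not Artificial Analysis's methodology or code; the scoring rule (counting only the cases where the honest response would have been an abstention) and the toy numbers are assumptions chosen to show how a 91%-style hallucination rate can coexist with strong overall accuracy.

```python
from dataclasses import dataclass

@dataclass
class Item:
    answered: bool  # the model gave a substantive answer instead of abstaining
    correct: bool   # that answer was right (always False when it abstained)

def overall_accuracy(items):
    # Fraction of all questions answered correctly.
    return sum(i.correct for i in items) / len(items)

def hallucination_rate(items):
    # Among the questions the model could not answer correctly, how often did it
    # answer anyway instead of abstaining?
    # rate = wrong-but-answered / (wrong-but-answered + abstained)
    wrong_but_answered = sum(1 for i in items if i.answered and not i.correct)
    abstained = sum(1 for i in items if not i.answered)
    denom = wrong_but_answered + abstained
    return wrong_but_answered / denom if denom else 0.0

# Toy data (hypothetical): 100 questions, 70 answered correctly,
# 27 answered wrongly, 3 honest abstentions.
items = [Item(True, True)] * 70 + [Item(True, False)] * 27 + [Item(False, False)] * 3

print(f"overall accuracy:   {overall_accuracy(items):.0%}")   # 70%
print(f"hallucination rate: {hallucination_rate(items):.0%}")  # 90%
```

In this toy run, 70% of all answers are still correct, yet the hallucination rate is 90% because the model almost never abstains on the questions it gets wrong - the same shape as the result described in the article.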


This preview is provided for discovery purposes. Read the full article at techradar.com. LibSpace is not affiliated with TechRadar.
