Why large language models rarely say “I don’t know”
On Reddit, /u/Ok-Review-3047 asks why AI assistants don’t just admit when they lack information. It’s a fair frustration: models often sound confident even when wrong.
Short answer: today’s models are trained to be fluent, helpful and decisive. That mix optimises for plausible text, not calibrated truthfulness or uncertainty. Unless we deliberately ask for caution or build systems that check facts, they tend to bluff.
What “hallucinations” are and why they happen
Hallucinations are confident but incorrect outputs from a model. They emerge from how modern systems are built:
- Next-token prediction: A large language model (LLM) is trained to predict the next word given the context. This objective rewards fluent continuations, not verifying facts.
- Instruction tuning and RLHF: Models are fine-tuned to be “helpful” using human feedback. Historically, answers that hedge or refuse can be marked as less helpful, so the model learns to answer anyway.
- Decoding choices: Sampling methods (temperature, nucleus sampling) and beam search can prefer confident-sounding phrasings. The model’s internal uncertainty is not directly exposed.
- No calibrated confidence: Most chat UIs don’t show probabilities or “I’m unsure” scores. Even when token probabilities exist under the hood, they aren’t calibrated to real-world correctness.
- Gaps in knowledge or context: Models have a fixed context window (the amount they can read at once). Outside that, they rely on memory from training data, which can be incomplete, outdated, or wrong.
Fluency is cheap; calibration is hard. We’ve optimised for the former more than the latter.
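To make the decoding point concrete, here is a minimal sketch of temperature scaling, the sampling knob mentioned above. The logit values are invented for illustration; the point is that lowering temperature makes the output distribution look more "decisive" without changing the model's underlying uncertainty at all.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw model scores (logits) into probabilities.
    Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens
logits = [2.0, 1.0, 0.5]

confident = softmax_with_temperature(logits, temperature=0.5)
hedged = softmax_with_temperature(logits, temperature=2.0)
# The top token's share grows as temperature drops, even though the
# model's real uncertainty (the logits) is unchanged.
```

This is why a chat product tuned for crisp, confident answers can sound sure of itself regardless of what the model actually "knows".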
Why models don’t simply refuse: incentives and UX
- User expectations: People want quick answers. Frequent “I don’t know” responses are often rated poorly.
- Provider incentives: Vendors aim for utility and engagement. A cautious model can feel less useful, even if it’s safer.
- Misaligned training signals: When humans label data, they may reward thorough-sounding answers over accurately saying “insufficient information”.
When to expect more hallucinations
- Ambiguous or under-specified prompts: If crucial details are missing, the model will make assumptions to continue.
- Requests for recent events: Many models have a knowledge cut-off and no live browsing unless explicitly enabled.
- Long documents: Important details can fall outside the context window, so the model fills gaps.
- Specialist domains: Medicine, law, accounting and safety-critical topics require sources and domain constraints.
How to elicit “I don’t know” and reduce wrong answers
Prompt techniques that work in practice
- Ask for uncertainty explicitly: “If you’re not at least 80% confident, say ‘I’m unsure’ and explain what’s missing.”
- Require sources: “Cite primary sources. If none are available, say you can’t verify.”
- State assumptions: “List any assumptions you made before answering.”
- Constrain the task: Multiple-choice formats, step-by-step checks, or asking for a plan before the answer all reduce free-form invention.
- Ask for alternatives: “Give two plausible answers and the conditions under which each would be correct.”
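The techniques above compose well. Here is a sketch of a reusable prompt builder that bundles them; the 80% threshold and the exact wording are illustrative, not canonical.

```python
def cautious_prompt(question, min_confidence=80):
    """Wrap a question in uncertainty-eliciting rules.
    The threshold and phrasing are examples to adapt, not a standard."""
    return (
        f"{question}\n\n"
        "Rules:\n"
        f"- If you're not at least {min_confidence}% confident, say "
        "\"I'm unsure\" and explain what's missing.\n"
        "- Cite primary sources; if none are available, say you can't verify.\n"
        "- List any assumptions you made before answering.\n"
        "- Give two plausible answers and the conditions under which each "
        "would be correct.\n"
    )

prompt = cautious_prompt("What was the outcome of the 2024 consultation?")
```

In practice you would tune the wording per task; models respond differently to different refusal phrasings.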
Product and engineering patterns
- RAG (retrieval-augmented generation): Fetch documents or database results and have the model quote them. RAG grounds responses in verifiable text. If nothing relevant is found, the model can explicitly say so.
- Tool use and function calling: Let the model call calculators, search APIs or internal systems. Factual operations shift from the model to reliable tools.
- Selective prediction: Don’t answer when confidence is low. Some APIs expose token probabilities (log-probs); these can be calibrated and thresholded.
- Citations in the UI: Showing sources invites scrutiny and discourages the model from fabricating specifics.
- Guardrails: Refusal rules for sensitive domains (medical, legal, financial) that force an “I can’t answer” response unless explicit sources back the claim.
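The RAG pattern is the most important of these, and its honest-abstention branch is easy to miss. Here is a toy sketch: the keyword-overlap retriever stands in for a real vector store or search index, but the grounding logic is the part that matters.

```python
def retrieve(query, documents, min_overlap=2):
    """Toy keyword retrieval: rank documents by word overlap with the
    query. A production system would use a vector store or search
    index; this only illustrates the grounding pattern."""
    q = set(query.lower().split())
    scored = [(len(q & set(d.lower().split())), d) for d in documents]
    return [d for score, d in sorted(scored, reverse=True)
            if score >= min_overlap]

def grounded_answer(query, documents):
    """Answer only from retrieved text; otherwise abstain explicitly."""
    hits = retrieve(query, documents)
    if not hits:
        # The honest path: no evidence found, so say so instead of guessing.
        return "I can't verify this from the available documents."
    return f'According to the retrieved text: "{hits[0]}"'
```

The key design choice is that the "no results" branch returns an explicit abstention rather than falling back to the model's unsupported memory.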
Simple system prompt you can adapt
For general Q&A, add a top-of-chat instruction like:
When the question lacks sufficient information or sources, say “I don’t know” or “I’m unsure,” and list what extra data you’d need to answer reliably.
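Wiring that instruction in is straightforward with any chat-style API that distinguishes system and user roles (the message shape below follows the common OpenAI-style convention, but the pattern is general):

```python
# The abstention instruction from above, sent on every conversation.
SYSTEM_PROMPT = (
    "When the question lacks sufficient information or sources, say "
    "\"I don't know\" or \"I'm unsure,\" and list what extra data you'd "
    "need to answer reliably."
)

def build_messages(user_question):
    """Prepend the abstention instruction as a system message."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]
```

Sending the instruction as a system message, rather than pasting it into each user turn, keeps it in force across the whole conversation.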
If you’re integrating AI with spreadsheets or internal data, make your grounding explicit. I’ve written a guide on connecting ChatGPT to Google Sheets which shows how to keep the model tied to actual rows and formulas rather than guessing.
Why this matters in the UK: risk, compliance and trust
For UK teams, overconfident models aren’t just annoying – they can be risky. In regulated sectors (health, finance, legal), a fabricated answer can breach professional standards or mislead customers. If personal data is involved, ungrounded outputs can also create data protection issues.
- Data protection: The ICO’s guidance on AI and data protection stresses transparency and accuracy. If your system can’t verify facts, design it to disclose uncertainty. See ICO AI guidance.
- Operational reliability: Build “answer-or-abstain” logic. When confidence is under a threshold, escalate to a human or ask clarifying questions.
- Customer communications: Always signal when an answer is AI-generated and provide links to sources or a human handoff.
- Cost control: RAG and tool use add complexity, but they can cut rework and reduce expensive token churn from back-and-forth clarifications.
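The "answer-or-abstain" logic above can be sketched in a few lines. This assumes your API exposes per-token log-probabilities (several do, as noted earlier); the 0.7 threshold is illustrative and should be calibrated against your own correctness data, not copied.

```python
import math

def answer_or_escalate(answer, token_logprobs, threshold=0.7):
    """Abstain-and-escalate sketch. `token_logprobs` is assumed to come
    from an API exposing per-token log-probabilities; the threshold is
    a placeholder to calibrate on real data."""
    avg = sum(token_logprobs) / len(token_logprobs)
    confidence = math.exp(avg)  # geometric mean of token probabilities
    if confidence < threshold:
        return {"status": "escalated",
                "message": "Confidence too low; routing to a human."}
    return {"status": "answered",
            "message": answer,
            "confidence": confidence}
```

Raw log-probs are not the same as real-world correctness, which is exactly why the threshold needs calibration; but even a rough cut-off gives you a principled point at which to hand over to a person or ask a clarifying question.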
How vendors are responding
Providers increasingly ship features to curb hallucinations: better refusal behaviour, tool use, retrieval, and optional probability outputs. Some UIs now nudge users to add more context. Even so, well-calibrated confidence has not been achieved and remains an open research problem.
Takeaways
- LLMs are built to be fluent and helpful by default, not to abstain.
- Ask for uncertainty and sources explicitly. Ground answers via RAG and tools.
- In UK organisations, design for “abstain when unsure” and follow ICO guidance on transparency and accuracy.
If you want fewer confident mistakes, change the incentives: prompts, product design, and pipelines that reward saying “I don’t know” when the evidence just isn’t there.