Hallucination
An LLM produces fluent, confident output that is simply false. This is a direct consequence of how the model works, so it can be reduced but never fully eliminated.
A hallucination is when a model generates something that sounds right and is wrong: an invented citation, a made-up API, a plausible but false fact. It is not a glitch you can patch out. An LLM predicts likely text, and a fluent, confident-sounding answer is statistically likely whether or not it happens to be true. The model has no reliable built-in way to say “I do not know,” so it fills the gap with something that fits the pattern.
Because it is structural, the honest framing is risk reduction, not elimination. The biggest lever is grounding: give the model the actual facts at runtime via retrieval (RAG) so it answers from real source text instead of its training-time guess. Constrain output formats so there is less room to improvise. Ask the model to cite sources you can check. And critically, evaluate: build a test set of real questions, measure how often the system is wrong, and track that number like any other quality metric.
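One way to make "cite sources you can check" concrete is to ask the model to quote its evidence and then verify each quote against the retrieved text. The sketch below is illustrative only: `normalize` and `unsupported_quotes` are hypothetical helper names, and it assumes the quotes have already been extracted from the model's answer. A verbatim match is a deliberately strict test, so paraphrased evidence would be flagged too.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unsupported_quotes(quotes: list[str], sources: list[str]) -> list[str]:
    """Return the quotes that do not appear verbatim in any retrieved source.

    Assumption: `quotes` were extracted from the model's answer upstream,
    and `sources` are the passages that were passed to the model as context.
    """
    normalized_sources = [normalize(source) for source in sources]
    return [
        quote
        for quote in quotes
        if not any(normalize(quote) in source for source in normalized_sources)
    ]
```

Any quote this returns is a candidate hallucination: the answer can be blocked, retried, or routed to a human, depending on how costly a wrong answer is.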
This is where AI meets QA, and most teams skip it. Shipping an LLM feature without an evaluation harness is shipping untested code and calling it done. You need to know your failure rate before your users find it for you. We treat that as non-negotiable under Software Quality Assurance.
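A minimal sketch of such a harness follows. Everything in it is an assumption made for illustration: `EvalCase`, `failure_rate`, and `stub_model` are hypothetical names, the stub stands in for a real model call, and substring matching is a crude grading rule that a real system would likely replace with something stricter.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    question: str
    expected: str  # text that a correct answer must contain (simplistic grading rule)


def failure_rate(answer_fn: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through the system and return the fraction answered wrongly."""
    if not cases:
        raise ValueError("test set is empty")
    failures = sum(
        1
        for case in cases
        if case.expected.lower() not in answer_fn(case.question).lower()
    )
    return failures / len(cases)


def stub_model(question: str) -> str:
    """Hypothetical stand-in for a real LLM call; one canned answer is deliberately wrong."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "Who wrote Hamlet?": "Hamlet was written by Christopher Marlowe.",
    }
    return canned.get(question, "I do not know.")


cases = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("Who wrote Hamlet?", "Shakespeare"),
]
rate = failure_rate(stub_model, cases)
print(f"failure rate: {rate:.0%}")
```

Running the same test set on every prompt, model, or retrieval change turns the failure rate into a regression metric rather than a one-off measurement.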
The trust angle is the whole game in regulated or high-stakes domains. A hallucination in a chatbot that recommends a movie is a shrug. The same hallucination in financial, legal, or medical output is a liability. Match the guardrails to the cost of being wrong, and never let a fluent answer substitute for a verified one.