
LLM Hallucination

An LLM hallucination is confident-sounding output that is not grounded in retrieved evidence.

Definition

An LLM hallucination is output from a large language model that sounds confident and plausible but is not grounded in retrieved evidence — invented citations, wrong metrics, fabricated quotes, or made-up product features. Hallucinations happen when a model generates from parametric memory rather than from a verified source. Mitigations include retrieval-augmented generation, citation-grounded pipelines, and deterministic design.


An LLM hallucination is any output from a language model that presents as factual but is not supported by evidence available to the model at generation time. The term covers a range of failures: invented citations to papers or URLs that do not exist, quoted text attributed to a real person who never said it, summary metrics that do not match the underlying data, and features or specifications described for a product that does not have them. It is a technical failure mode, not a rhetorical choice: the model is not lying in the human sense; it is generating the most plausible continuation it can, whether or not that continuation is true.

Hallucinations arise from a structural property of open-ended generation. A language model predicts plausible next tokens based on training data and prompt context. When neither contains the grounded answer, the model produces a fluent guess. The output often reads as confident precisely because fluency is what the model optimizes for.
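A minimal sketch of that failure mode, assuming the Hugging Face transformers package and GPT-2 as an illustrative stand-in model (both are assumptions for the example, not part of any product described here): the generation call returns a fluent continuation whether or not the prompt contains the information needed to answer.

```python
# Minimal sketch: open-ended generation answers fluently even when the
# prompt contains no grounded information. Assumes the `transformers`
# package is installed; GPT-2 is an illustrative stand-in model.
from transformers import pipeline, set_seed

set_seed(0)  # makes the sampled continuation repeatable in one environment
generator = pipeline("text-generation", model="gpt2")

# The prompt asks for a specific fact the model has no evidence for:
# no reviews are actually supplied.
prompt = "Q: What is the top complaint in the attached 300 reviews?\nA:"

# No retrieval, no corpus: the model still continues the text with a
# plausible-sounding answer, because predicting fluent next tokens is
# all it is trained to do.
result = generator(prompt, max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```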

Why it matters

Feedback intelligence touches money-attached decisions: listing changes, factory escalations, response playbooks, roadmap priority. An AI summary of 8,000 reviews that invents a failure mode not present in the corpus sends a QA team on a wild-goose chase. An AI response drafted for a negative review that misstates warranty terms creates a legal exposure. A weekly leadership briefing that cites a fabricated competitor metric damages the briefer's credibility the moment someone checks.

The mitigations are well understood as of Q1 2026. Retrieval-augmented generation forces the model to read from a defined corpus. Citation-grounded pipelines make every claim traceable to a source record. Deterministic design constrains generation to summarize retrieved evidence rather than invent text. Each reduces hallucination risk; together they approach what teams need for decision-grade output.
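A minimal sketch of the retrieval-plus-citation shape, with an in-memory corpus, a toy keyword scorer, and a call_llm placeholder standing in for a real retriever and model client; all of these are assumptions for the example, not Indellia's implementation. The point is the control flow: the model only ever sees retrieved records, the prompt demands a record id after every claim, and the pipeline abstains when retrieval returns nothing.

```python
# Sketch of a citation-grounded retrieval pipeline (illustrative only; the
# corpus, scorer, and call_llm placeholder are assumptions for this example).
import re
from dataclasses import dataclass

@dataclass
class Record:
    record_id: str
    text: str

CORPUS = [
    Record("rev-001", "Handle cracked after two weeks of normal use."),
    Record("rev-002", "Cord is too short to reach the counter outlet."),
    Record("rev-003", "Handle snapped at the base; cheap plastic."),
]

def tokens(s: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def retrieve(question: str, corpus: list[Record], k: int = 3) -> list[Record]:
    """Toy keyword-overlap retriever; a real system would use a proper index."""
    q = tokens(question)
    scored = sorted(corpus, key=lambda r: -len(q & tokens(r.text)))
    return [r for r in scored if q & tokens(r.text)][:k]

def call_llm(prompt: str) -> str:
    """Placeholder: any LLM client can be wired in here."""
    raise NotImplementedError("connect a model client to run end to end")

def answer(question: str) -> str:
    hits = retrieve(question, CORPUS)
    if not hits:
        # Abstain rather than let the model answer from parametric memory.
        return "No supporting records found for this question."
    evidence = "\n".join(f"[{r.record_id}] {r.text}" for r in hits)
    prompt = (
        "Answer using ONLY the records below. Cite a record id in brackets "
        "after every claim. If the records do not contain the answer, say so.\n\n"
        f"Records:\n{evidence}\n\nQuestion: {question}\nAnswer:"
    )
    return call_llm(prompt)
```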

Example

A product lead at a small-appliance brand pastes 300 Amazon reviews into a generic chat tool and asks, "What's the top complaint?" The tool answers, "Reviewers frequently cite the product's 90-day battery degradation issue." The brand's SKU is corded and has no battery. The model pattern-matched against similar products in training data and generated a plausible-sounding complaint that does not exist in the pasted corpus. A deterministic pipeline would have returned a ranked theme list drawn only from the supplied records and cited each one, producing a true top complaint and no invented ones. indelliaGPT™ is built to that shape. The distinction is not theoretical: the first answer gets escalated into a roadmap meeting, where three people spend a week investigating a battery issue that does not exist while a real complaint goes unaddressed.
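For contrast, a sketch of the deterministic shape the example describes, with hypothetical theme keywords and sample reviews standing in for real data: themes are counted only from the supplied records, and each carries the ids of the reviews that support it, so a theme with zero supporting records cannot appear in the output.

```python
# Deterministic theme ranking over supplied records only (illustrative).
# The theme keywords and sample reviews are hypothetical stand-ins.
import re
from collections import defaultdict

THEMES = {
    "handle breaks": ["handle", "cracked", "snapped"],
    "cord too short": ["cord", "short", "outlet"],
    "battery degradation": ["battery", "charge", "degrade"],
}

reviews = {
    "rev-001": "Handle cracked after two weeks of normal use.",
    "rev-002": "Cord is too short to reach the counter outlet.",
    "rev-003": "Handle snapped at the base; cheap plastic.",
}

def rank_themes(records: dict[str, str]) -> list[tuple[str, list[str]]]:
    """Return themes ranked by support, each with the record ids that cite it."""
    support = defaultdict(list)
    for record_id, text in records.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        for theme, keywords in THEMES.items():
            if words & set(keywords):
                support[theme].append(record_id)
    # Themes with no supporting records never appear: nothing is generated,
    # so nothing can be invented.
    return sorted(support.items(), key=lambda item: -len(item[1]))

for theme, cites in rank_themes(reviews):
    print(f"{theme}: supported by {len(cites)} record(s) {cites}")
```

Run against these corded-product reviews, the battery theme simply never shows up, because no record supports it.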

Ask Indellia

Have a specific question?

Indellia's AI agents answer with citations from real customer feedback across Amazon, Walmart, Best Buy, and 20+ retail channels.


Citations, not confabulations.

indelliaGPT™ answers feedback questions from your corpus and cites the source records. Built on Indellia's NEC Labs foundations. Unlimited users. Unmetered data. $495/mo SME, $1,995/mo Mid-Market.