AI · Quality · April 2026 · 7 min read

Hallucinated citations — auditing RAG source links before you trust the UI.

Faisal Al-Anqoodi · Founder & CEO

The UI shows a "source", but the cited paragraph is missing, truncated, or from the wrong page. This article gives you a practical audit path before you ship the assistant to staff or customers.

A Muscat compliance lead opened an internal report. Beside a sentence: a policy filename and page number. The paragraph was not in the file. Two hours of triage showed retrieval had pulled an old chunk whose index was never retired.

A hallucinated citation is not just fluent lying: it breaks the trust chain between product and compliance. The fix is rarely to swap the model first; it is to verify retrieval-to-document grounding before blaming the LLM [1][2].

Hallucinated citations in one sentence.

A citation is hallucinated in production terms when the UI implies a specific document supports a sentence, but literal verification fails — wrong chunk, drifted summarisation, or a superseded file still indexed [2].
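Literal verification can be surprisingly mechanical. A minimal sketch, assuming plain-text chunks (the function names are ours, not from any framework):

```python
import re

def normalize(text: str) -> str:
    """Collapse whitespace and lowercase for tolerant literal matching."""
    return re.sub(r"\s+", " ", text).strip().lower()

def citation_is_grounded(cited_sentence: str, chunk_text: str) -> bool:
    """A citation passes literal verification only if the cited sentence
    actually appears in the chunk the UI points at."""
    return normalize(cited_sentence) in normalize(chunk_text)
```

A substring check will miss paraphrases, which is the point: anything that fails it needs a human or a stricter pipeline, not a pass.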

Why Arabic documents raise the rate.

Mixed Arabic–English clauses, broken table extraction, and long headings increase the odds of retrieving a "semantically close" but wrong chunk. These language failure modes are a large part of why Arabic bots fail [3][5].
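One cheap pre-index signal is the Arabic-to-Latin letter ratio per chunk; chunks near an even mix are the ones where extraction and retrieval most often go wrong, so flag them for extra QA. A sketch, assuming plain-text chunks:

```python
import unicodedata

def script_mix_ratio(text: str) -> float:
    """Fraction of letters that are Arabic. Values near 0.5 indicate a
    heavily mixed Arabic-English chunk worth extra extraction QA."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    arabic = sum("ARABIC" in unicodedata.name(c, "") for c in letters)
    return arabic / len(letters)
```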

If you did not open the document, you do not have a citation — you have UI chrome that looks complete.

A four-layer audit path.

  • Layer one: stable chunk IDs on every answer.
  • Layer two: open the file and verify the literal text.
  • Layer three: a version policy, so expired files leave the index.
  • Layer four: monthly human sampling on high-risk prompts [1][4].
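The first three layers can be wired into one automated check per citation. A sketch under assumed names: `ACTIVE_VERSIONS` and `CHUNK_STORE` are hypothetical in-memory stand-ins for your version registry and chunk index.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Citation:
    chunk_id: str     # layer one: stable chunk ID attached to the answer
    doc_version: str  # layer three: only active versions may be cited
    quoted_text: str  # layer two: text to verify literally against the file

# hypothetical stand-ins for the version registry and chunk index
ACTIVE_VERSIONS = {"leave-policy": "v3"}
CHUNK_STORE = {"leave-policy:v3:0007": "Annual leave accrues at 2.5 days per month."}

def audit(citation: Citation, doc_id: str) -> list[str]:
    """Return the audit failures for one citation (empty list means pass)."""
    failures = []
    if ACTIVE_VERSIONS.get(doc_id) != citation.doc_version:
        failures.append("superseded version still indexed")
    chunk = CHUNK_STORE.get(citation.chunk_id, "")
    if citation.quoted_text not in chunk:
        failures.append("literal text not found in chunk")
    return failures
```

Layer four stays human by design; the automation only decides which answers never reach the sampling pool.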

Depth numbers we use at Nuqta.

Medium risk: 50–100 human-reviewed answers pre-launch. Contracts and policies: 200–300 on real operational questions. Tune to your team size — these bands come from our deployments [5].
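Drawing the review set reproducibly matters more than the exact count: a seeded sample lets two reviewers argue about the same answers. A sketch using the lower bound of each band (the band names are ours):

```python
import random

# review bands from our deployments; tune to your team size
REVIEW_BANDS = {"medium": (50, 100), "contracts_policies": (200, 300)}

def sample_for_review(answers: list[dict], risk: str, seed: int = 0) -> list[dict]:
    """Draw a reproducible random sample sized by the risk band's lower bound."""
    low, _high = REVIEW_BANDS[risk]
    rng = random.Random(seed)
    k = min(low, len(answers))
    return rng.sample(answers, k)
```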

Caveats: over-audit kills velocity unless you automate the bulk.

Do not hand-review every answer; hand-review what touches legal or financial commitments. Automate the rest with retrieval-vs-generation disagreement alerts.
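A disagreement alert does not need a second model to start with. Token overlap is a crude proxy, but it is cheap enough to run on every answer; a minimal sketch (threshold is an assumption to tune against your own sampled data):

```python
def disagreement_score(answer: str, retrieved_text: str) -> float:
    """Fraction of answer tokens absent from the retrieved text.
    High scores suggest the generation drifted from its sources."""
    answer_tokens = set(answer.lower().split())
    source_tokens = set(retrieved_text.lower().split())
    if not answer_tokens:
        return 0.0
    missing = answer_tokens - source_tokens
    return len(missing) / len(answer_tokens)

def should_alert(answer: str, retrieved_text: str, threshold: float = 0.5) -> bool:
    """Route the answer to human review when drift exceeds the threshold."""
    return disagreement_score(answer, retrieved_text) > threshold
```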

Closing.

Hallucinated citations are an operations problem before they are a model problem. Tie RAG metrics to citation QA, then launch. If you do not have a high-risk question list this week, you are still testing the interface — not the product.

Frequently asked questions.

  • Is showing the filename enough? No — chunk id and page or offset reduce arguments.
  • What about scanned PDFs? Extraction quality becomes part of risk; read the RAG guide.
  • Does summarisation void citations? It can drift; treat summaries as low-trust without review.
  • Multiple file versions? One active version in the index by policy.
  • Who signs launch? Compliance owner with product — in writing [4].
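What "filename plus chunk id and page or offset" looks like in practice: a hypothetical minimal citation payload the UI should both render and log (field names are illustrative, not a standard).

```python
# a hypothetical minimal citation payload the UI renders and the backend logs
citation_payload = {
    "file": "hr-leave-policy.pdf",        # filename alone is not enough
    "chunk_id": "hr-leave-policy:v3:0007",
    "page": 12,
    "char_offset": [1840, 2105],          # start/end offsets within the page text
}
```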

Sources.

[1] Lewis et al. — RAG (NeurIPS 2020).

[2] Ji et al. — Survey of Hallucination in NLG (ACM CSUR, 2023).

[3] OWASP — LLM Top 10 (insecure output handling).

[4] Sultanate of Oman — PDPL (6/2022) — processing documentation duties.

[5] Nuqta — internal citation QA protocols, April 2026.
