Security · Retrieval · April 2026·April 2026·8 min read

Prompt injection and corpus poisoning — the RAG gap vendors smooth over.

In a security workshop, an engineer uploaded a file titled "leave policy — draft." Hidden instructions told the model to ignore prior rules and invent an account number. Retrieval fired, generation followed — output left policy despite a "locked down" UI [1].

Prompt injection exploits the fact that LLMs do not separate "system instructions" from "document text" like a classical program. Corpus poisoning places malicious content where retrieval will surface it [2].

A simple threat map: file to response.

Attackers often do not need firewall breaches — they need upload rights or a poisoned wiki. Treat knowledge-base write permissions like database roles — not like a shared drive [1][3].

FIG. 1 — RAG THREAT PATH: UPLOAD → CHUNK → INDEX → RETRIEVE → LLM

Operational defenses we run with regulated clients.

Separate upload from publish: new files do not hit the production index without raw-text review and alert keywords. Log uploader identity. Constrain generation policy for low-trust sources [3][5].

The index is not a library — it is an attack surface if everyone can write to it.

Effort numbers: prevention vs post-incident counsel.

A upload-review gate often takes one to three engineering days to stand up — versus weeks of legal review if a wrong answer reaches an external customer. Directional from our projects [5].

Caveats: defense is not a keyword blocklist.

Models are linguistically flexible; symbolic blocks are bypassed. Combine document governance, output policy, and periodic red-team samples [1].

Closing.

Prompt injection and corpus poisoning show RAG expands surface area — it does not shrink it. Tie defenses to RAG metrics and MCP boundaries when wiring tools. If you cannot name who may upload to the corpus this month, you still run an open index.

Frequently asked questions.

Is content filtering enough? Partially; governance beats filters alone [1].
Shared vendor documents? Contract access and mutual upload review.
Insider-only threat? Often yes — permissions first [3].
How to test? Poisoned fixtures in an isolated environment pre-production.
Does private AI fix it? It reduces external leakage but not malicious internal upload; read Private AI.

Sources.

[1] OWASP — LLM Top 10.

[2] Perez & Ribeiro — Ignore Previous Prompt (NAACL 2022 workshop).

[3] NIST — AI RMF.

[4] MITRE ATLAS.

[5] Nuqta — internal KB security checklists, April 2026.

Five RAG metrics to check before you blame the LLM.
Before you raise model spend or switch vendors, measure retrieval, chunks, and escalation. Most production hallucination starts in documents and indexes — not parameter count.
Model Context Protocol at work: the bridge is not the border.
MCP explains how tools plug into an LLM — it does not replace decisions on where data is processed, who owns logs, or whether inference leaves your network.
Enterprise AI agents vs a RAG-first pipeline — when orchestration is theater.
Most "agents" in production are solid retrieval + a few tools + policies — not a self-driving orchestrator making unsupervised decisions. This article gives a blunt product decision before you multiply complexity.
Enterprise Prompt Injection: Defence Layers Beyond Word Blocklists.
A word list won’t stop instructions hidden in innocent sentences — real defence separates privileges, judges retrieval, and logs manipulation like classic intrusions.
What prompt injection actually is — before you flip on tools.
A blocklist stops neither an adversary nor a clever employee paste. Strings merge in one stream; attackers hide instructions inside email your assistant ingests quietly.

Share this article

X (Twitter)LinkedIn WhatsApp

← Back to the JournalNuqta · Journal