AI Models & Technology

Indirect Prompt Injection

📖

Definition

Indirect prompt injection is an attack vector in which malicious instructions are embedded within external content — such as a webpage, document, email, or database record — that an AI agent retrieves and processes as part of completing a task. Unlike direct prompt injection (where a user manipulates the model via their own input), indirect injection exploits the model's tendency to follow instructions embedded in any text it reads, regardless of source.

In enterprise AI deployments, this poses serious security and compliance risks. An AI agent tasked with summarizing customer support tickets or browsing the web could unknowingly execute attacker instructions hidden in that content — leaking data, performing unauthorized actions, or corrupting outputs. Mitigation requires input sanitization, privilege separation between data and instructions, and careful scoping of what actions agents are permitted to take based on retrieved content.

🔗

Meta Prompt / System PromptPrompt EngineeringAI as an Appreciating AssetAI Assistant

Last updated: May 12, 2026

Definition

Related Terms