Definition
Direct prompt injection
A prompt injection attack where the attacker places malicious instructions directly in the input sent to the AI model — for example, a user message that tells the agent to ignore its system prompt and perform a different action. Contrasted with indirect prompt injection, where the attack is embedded in external content the agent reads, such as a web page, email, or document.
How TrustSkills detects this
TrustSkills scans OpenClaw and ClawHub skills for direct prompt injection patterns before you install them. The scanner returns plain-English findings — no CVE IDs, no security jargon — with a risk level and a clear explanation of what was found.
Related terms
Indirect prompt injection
A prompt injection attack where malicious instructions are embedded in external content that the age…
Prompt injection
A class of attack where malicious input alters an AI model's behavior in ways the system designer di…
System prompt
Instructions passed to an AI model before the user's message, typically used to define the model's p…
Deep dive
Basic knowledge: What is prompt injection?
Prompt injection is not just a clever string. It is any input that changes a model's behavior in a way the system designer did not intend, especially when the model can reach tools, data, and accounts.