Definition

Direct prompt injection

A prompt injection attack where the attacker places malicious instructions directly in the input sent to the AI model — for example, a user message that tells the agent to ignore its system prompt and perform a different action. Contrasted with indirect prompt injection, where the attack is embedded in external content the agent reads, such as a web page, email, or document.

How TrustSkills detects this

TrustSkills scans OpenClaw and ClawHub skills for direct prompt injection patterns before you install them. The scanner returns plain-English findings — no CVE IDs, no security jargon — with a risk level and a clear explanation of what was found.

Related terms

Deep dive

Basic knowledge

What is prompt injection?

Prompt injection is not just a clever string. It is any input that changes a model's behavior in a way the system designer did not intend, especially when the model can reach tools, data, and accounts.