skillaudit.sh

Prompt Injection

An attack where malicious input overrides or manipulates an LLM's system instructions.

Prompt injection is a security vulnerability where an attacker crafts input that causes a large language model to ignore or override its original instructions. In the context of AI coding assistants and skill files, this can mean instructions embedded in SKILL.md or .cursorrules that attempt to make the model disregard safety constraints, reveal system prompts, or execute unintended actions.

Common patterns include phrases like "ignore previous instructions," "you are now in developer mode," or structured markers like [SYSTEM] or [OVERRIDE]. Attackers may also hide injection attempts inside HTML comments or code blocks to evade casual review.

skillaudit detects prompt injection patterns across all supported skill file formats. We scan for instruction-override phrases, persona-switching attempts, jailbreak terminology, and secret-keeping directives. Each finding is flagged with severity and linked to CWE-94 (Improper Control of Generation of Code) where applicable.

Related terms