AI Repo-Borne Instruction Risk

Detects instructions embedded in repository text (README, AGENTS.md, comments, issues) that target an AI coding agent and direct it to take risky action - skipping review/tests, exfiltrating data, or committing secrets (OWASP LLM01 indirect injection via source control).

Type
keyword_list
Confidence
low
Confidence justification
Low by design. Legitimate AGENTS.md / assistant-instruction files and security-training repos contain agent-addressed directives (the named FP traps); the AI-agent addressee plus a risky directive is necessary-not-sufficient. Confidence is raised only with injection-verb corroboration.
Jurisdictions
global
Regulations
OWASP LLM Top 10 2025, NIST AI RMF GenAI Profile
Frameworks
ISO 27001
Data categories
emerging, security
Risk rating
7

Pattern

(?i)\b(?:AI agent|copilot|coding assistant|the assistant)\b[^.\n]{0,40}\b(?:must|should|always|ignore|skip (?:the )?(?:review|tests)|exfiltrate|commit secrets)\b

Corroborative evidence keywords

repository, README, AGENTS.md, commit, ignore previous instructions, ignore all previous instructions, disregard the system prompt, disregard prior instructions, forget your instructions, override your rules, you are now, new instructions, do not follow your guidelines, bypass your restrictions

Proximity: 300 characters

Should match

Should not match

Collections