AI Agent Action Misuse

Detects agent tool-misuse / high-risk action instructions in AI/agent context: destructive or irreversible action language coupled with a high-impact target or an approval-bypass clause (OWASP LLM06 / Agentic Top 10). The action verb and target must co-occur to suppress IT-runbook false positives.

Type: keyword_list
Confidence: low
Confidence justification: Low by design. Destructive action verbs appear routinely in legitimate IT runbooks and change tickets (the named FP traps); the action verb plus a high-impact target or approval-bypass clause is necessary-not-sufficient. Confidence is raised only with AI/agent-context corroboration.
Jurisdictions: global
Regulations: OWASP LLM Top 10 2025, NIST AI RMF GenAI Profile
Frameworks: ISO 27001
Data categories: emerging, security
Risk rating: 8

Pattern

(?i)\b(?:delete|drop|truncate|wipe|exfiltrate|bulk export|grant (?:admin|access)|revoke|disable)\b[^.\n]{0,40}\b(?:without (?:approval|confirmation)|all (?:records|users|data)|production|entire database)\b

Corroborative evidence keywords

agent, tool, action, approval, AI, artificial intelligence, LLM, large language model, Copilot, chatbot, assistant, prompt, system prompt, tool call, completion, model

Proximity: 300 characters

Should match

Agent: delete all records in production without approval. — Destructive action verb + high-impact target + bypass approval

Should not match

The runbook explains how to restart the service safely. — IT runbook with no destructive action + target (FP trap)

Collections

AI Threat Classifiers