Germany Value Added Tax Number
Detects Germany Value Added Tax Number patterns. This pattern is based on a Microsoft Purview built-in sensitive information type. VAT numbers have country-specific prefixes that aid detection accuracy.
- Type
- regex
- Engine
- universal
- Confidence
- high
- Confidence justification
- High confidence: the DE prefix followed by exactly 9 digits is the distinctive format of German VAT identification numbers. Context gating and exclusion rules improve precision and reduce incidental matches from other identifiers that share the same shape.
- Detection quality
- Verified
- Jurisdictions
- de, eu
- Regulations
- GDPR
- Data categories
- financial, business-identifier
- Scope
- narrow
- Risk rating
- 5
- Platform compatibility
- Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible
Pattern
\bDE\d{9}\b
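As a rough illustration of how the pattern behaves, the following Python sketch applies it with the standard `re` module. The function name `find_de_vat_candidates` is made up for this example, and Python's regex dialect is an assumption; each DLP platform listed above uses its own engine, so edge-case behavior may differ.

```python
import re

# Pattern copied verbatim from this entry.
# Note: in Python, \d also matches non-ASCII decimal digits unless
# re.ASCII is passed; other engines may treat \d differently.
DE_VAT_PATTERN = re.compile(r"\bDE\d{9}\b")


def find_de_vat_candidates(text: str) -> list[str]:
    """Return every substring shaped like 'DE' followed by exactly 9 digits.

    Hypothetical helper name; it only checks the format, not whether the
    number is actually registered.
    """
    return DE_VAT_PATTERN.findall(text)
```

The word boundaries mean that a tenth digit or a leading letter prevents a match, so `DE1234567890` and `XDE123456789` are both skipped.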
Corroborative evidence keywords
Umsatzsteuer, USt-IdNr, VAT, Mehrwertsteuer, Steuernummer, value added tax, VAT number, belasting, BTW, TVA, VAT registration, BTW-nummer, numéro de TVA, tax number, tax registration, taxe sur la valeur ajoutée, adószám, imposta sul valore aggiunto, intracommunautaire, IVA (+40 more)
Proximity: 300 characters
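One way the keyword and proximity settings could combine with the pattern is sketched below. This is an assumption about the gating logic, not the behavior of any specific platform: it keeps a match only when a keyword appears within 300 characters before or after it. The keyword list is a short subset of the one above, and `find_corroborated_matches` is a hypothetical name.

```python
import re

DE_VAT_PATTERN = re.compile(r"\bDE\d{9}\b")

# Subset of the corroborative keywords listed in this entry (assumption:
# the full list would be loaded from configuration in practice).
KEYWORDS = ["Umsatzsteuer", "USt-IdNr", "VAT", "Mehrwertsteuer",
            "Steuernummer", "value added tax", "VAT number"]
PROXIMITY = 300  # characters, as stated in this entry

# Match keywords as whole terms, case-insensitively, so that "VAT" does
# not fire inside an unrelated word such as "private".
KEYWORD_RE = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")(?!\w)",
    re.IGNORECASE,
)


def find_corroborated_matches(text: str, proximity: int = PROXIMITY) -> list[str]:
    """Return pattern matches that have a keyword within `proximity` chars."""
    results = []
    for m in DE_VAT_PATTERN.finditer(text):
        before = text[max(0, m.start() - proximity):m.start()]
        after = text[m.end():m.end() + proximity]
        # Search the surrounding window only, excluding the match itself.
        if KEYWORD_RE.search(before + " " + after):
            results.append(m.group())
    return results
```

Measuring the window from the edges of the match, rather than from its midpoint, is a design choice of this sketch; platforms may define proximity differently.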
Should match
- DE123456789: Standard German VAT number format
- DE987654321: German VAT number with DE prefix
- DE112233445: German VAT identification number
Should not match
- DE12345678: Only 8 digits after DE prefix, too short
- AT123456789: Austrian prefix, not German VAT format
- template example placeholder record identifier: Template/sample context should be excluded even when anchor words are present
Known false positives
- Other identifier schemes that coincidentally share a similar prefix and digit structure.
  Mitigation: Validate the complete format including prefix and digit count. Layer with document context to confirm financial or tax-related content.
- Test or example VAT numbers used in documentation or training materials.
  Mitigation: Maintain an allow-list of known test/example numbers. Use document classification to distinguish production data from training content.
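The allow-list mitigation could look roughly like the sketch below. The allow-list contents are hypothetical: they simply reuse the example numbers from this entry, and a real deployment would maintain its own list. The name `filter_test_numbers` is likewise made up for illustration.

```python
import re

DE_VAT_PATTERN = re.compile(r"\bDE\d{9}\b")

# Hypothetical allow-list of known test/example numbers; here it reuses
# the sample values shown in this entry.
KNOWN_TEST_NUMBERS = frozenset({"DE123456789", "DE987654321", "DE112233445"})


def filter_test_numbers(text: str,
                        allow_list: frozenset[str] = KNOWN_TEST_NUMBERS) -> list[str]:
    """Return pattern matches that are not on the test/example allow-list."""
    return [c for c in DE_VAT_PATTERN.findall(text) if c not in allow_list]
```

This only suppresses exact, known values; distinguishing production data from training content still relies on document classification, as noted above.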