EU Social Security Number (SSN) or Equivalent ID
Detects references to an EU Social Security Number (SSN) or equivalent ID using keyword matching. This pattern is based on a Microsoft Purview built-in sensitive information type (SIT). Users already running Purview may prefer to enable the built-in SIT directly, or use this version as a starting point for customisation.
- Type: regex
- Engine: universal
- Confidence: medium
- Confidence justification: The pattern has structural constraints, but corroborative keywords are recommended to reduce false positive rates. Added context gating and exclusion rules improve precision and reduce incidental matches.
- Detection quality: Mixed
- Jurisdictions: eu
- Regulations: GDPR
- Frameworks: ISO 27001, ISO 27701
- Data categories: pii, government-id
- Scope: wide
- Risk rating: 9
- Platform compatibility: Purview (Compatible), GCP DLP (Compatible), Macie (Compatible), Zscaler (Compatible), Palo Alto (Degraded), Netskope (Unsupported)
Pattern
(?i)\b(?:eu\s+social\s+security\s+number\s+\(ssn\)\s+or\s+equivalent\s+id|social|security|ssn|equivalent)\b
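As a rough illustration of how the anchor regex behaves, the sketch below compiles it with Python's `re` module, written as a raw string with single backslashes. The helper name `find_anchors` is hypothetical and not part of any platform. The sketch covers only the anchor regex; it does not model the corroborative keyword list, so the non-English phrases are not exercised here.

```python
import re

# Anchor regex from the Pattern section, as a Python raw string.
# (?i) makes the whole expression case-insensitive.
ANCHOR = re.compile(
    r"(?i)\b(?:eu\s+social\s+security\s+number\s+\(ssn\)\s+or\s+equivalent\s+id"
    r"|social|security|ssn|equivalent)\b"
)


def find_anchors(text: str) -> list[str]:
    """Return every anchor hit in text, in order of appearance (hypothetical helper)."""
    return [m.group(0) for m in ANCHOR.finditer(text)]


print(find_anchors("Please provide your social security number"))
# ['social', 'security']
print(find_anchors("employee benefits plan"))
# []
```

Because the short alternatives (`social`, `security`, `equivalent`) are ordinary words, the anchor regex alone fires on a lot of unrelated text, which is why the rule pairs it with corroborative keywords.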
Corroborative evidence keywords
social security, SSN, sozialversicherung, sécurité sociale, previdenza sociale, ID number, identification, ID card, license, permit, registration, certificate, field, column, row, entry, record, value, form, register (+21 more)
Proximity: 300 characters
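One way to picture the proximity rule is the sketch below, which keeps an anchor hit only when a corroborative keyword lies entirely within 300 characters of it. Several details are assumptions rather than documented behaviour: the window is measured from the edges of the anchor match, keywords are matched as case-insensitive substrings, a keyword occurrence that overlaps the anchor itself does not count, and only a subset of the listed keywords is used. The function name `corroborated_matches` is hypothetical.

```python
import re

PROXIMITY = 300  # characters, as stated in the rule

# Subset of the listed keywords; the page truncates the full list.
KEYWORDS = [
    "social security", "SSN", "sozialversicherung", "sécurité sociale",
    "previdenza sociale", "ID number", "identification", "ID card",
]

ANCHOR = re.compile(
    r"(?i)\b(?:eu\s+social\s+security\s+number\s+\(ssn\)\s+or\s+equivalent\s+id"
    r"|social|security|ssn|equivalent)\b"
)


def corroborated_matches(text, keywords=KEYWORDS, window=PROXIMITY):
    """Return anchor hits that have a non-overlapping keyword within the window."""
    # Collect the (start, end) span of every keyword occurrence.
    spans = []
    for kw in keywords:
        for m in re.finditer(re.escape(kw), text, re.IGNORECASE):
            spans.append(m.span())

    kept = []
    for m in ANCHOR.finditer(text):
        lo, hi = m.start() - window, m.end() + window
        for ks, ke in spans:
            # Assumption: a keyword overlapping the anchor is not independent evidence.
            overlaps_anchor = ks < m.end() and ke > m.start()
            if not overlaps_anchor and ks >= lo and ke <= hi:
                kept.append(m.group(0))
                break
    return kept


print(corroborated_matches("Equivalent ID number on file"))
# ['Equivalent']
print(corroborated_matches("security guard on duty"))
# []
```

Ignoring overlapping keywords is a deliberate choice in this sketch: the keyword list contains `SSN` and `social security`, so without it every anchor hit on those terms would corroborate itself.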
Should match
- social security number: English keyword match
- sozialversicherungsnummer: German keyword match
- numéro de sécurité sociale: French keyword match
Should not match
- employee benefits plan: Non-SSN reference
- insurance policy number: Generic insurance reference
- template example placeholder record identifier: Template/sample context should be excluded even when anchor words are present
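The template/sample exclusion could be sketched as follows. The exclusion terms are hypothetical, taken from the wording of the example above, not from a published rule, and reusing the 300-character proximity window for exclusions is also an assumption. The helper name `matches_after_exclusion` is made up for illustration.

```python
import re

WINDOW = 300  # assumed to match the proximity window

ANCHOR = re.compile(
    r"(?i)\b(?:eu\s+social\s+security\s+number\s+\(ssn\)\s+or\s+equivalent\s+id"
    r"|social|security|ssn|equivalent)\b"
)

# Hypothetical exclusion terms signalling template or sample content.
EXCLUSION = re.compile(r"(?i)\b(?:template|example|placeholder|sample)\b")


def matches_after_exclusion(text, window=WINDOW):
    """Return anchor hits that have no exclusion term within the window."""
    kept = []
    for m in ANCHOR.finditer(text):
        # Slice the surrounding context, clamped at the start of the text.
        context = text[max(0, m.start() - window): m.end() + window]
        if EXCLUSION.search(context):
            continue  # template/sample context: drop the hit
        kept.append(m.group(0))
    return kept


print(matches_after_exclusion("template example placeholder SSN field"))
# []
print(matches_after_exclusion("Enter your SSN below"))
# ['SSN']
```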
Known false positives
- Broad keyword matching across EU languages may match general policy discussion, training materials, or compliance documentation. Mitigation: Layer with additional contextual signals such as structured identifiers or form fields to distinguish actual data from policy references.
- Similar terminology appears in multiple languages in formal or administrative contexts (education, professional documentation) that do not constitute sensitive data collection. Mitigation: Layer with additional contextual signals such as structured identifiers, form fields, or database column headers to distinguish sensitive records from general references.