Employer Identification Number
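Detects U.S. Employer Identification Numbers (EINs): nine-digit identifiers consisting of a two-digit prefix and seven further digits, written with or without a hyphen after the prefix.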
- Type
- regex
- Engine
- universal
- Confidence
- medium
- Confidence justification
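- Medium confidence: the pattern restricts the two-digit prefix and requires exactly seven following digits, but any nine-digit number with an allowed prefix still matches, so corroborative keywords are recommended to reduce the false positive rate.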
- Detection quality
- Verified
- Jurisdictions
- us
- Regulations
- CCPA/CPRA
- Frameworks
- ISO 27001, ISO 27701
- Data categories
- pii, government-id
- Scope
- narrow
- Platform compatibility
- Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible
Pattern
\b(0[1-6]|1[0-6]|2[0-7]|3[0-9]|4[0-8]|5[0-9]|6[0-8]|7[1-7]|8[1-8]|9[0-5]|9[89])-?\d{7}\b
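As a rough illustration of how the pattern behaves on free text, here is a minimal Python sketch. The regex is copied from this entry; the helper name `find_ein_candidates` and the normalization to the hyphenated form are assumptions made for the example, not part of any platform's engine.

```python
import re

# Pattern copied verbatim from this entry: an allowed two-digit prefix,
# an optional hyphen, then exactly seven digits, bounded by word boundaries.
EIN_PATTERN = re.compile(
    r"\b(0[1-6]|1[0-6]|2[0-7]|3[0-9]|4[0-8]|5[0-9]|6[0-8]|7[1-7]|8[1-8]|9[0-5]|9[89])-?\d{7}\b"
)


def find_ein_candidates(text: str) -> list[str]:
    """Return candidate EINs found in text, normalized to NN-NNNNNNN.

    Hypothetical helper for illustration only; a match is a candidate,
    not a confirmed EIN.
    """
    candidates = []
    for match in EIN_PATTERN.finditer(text):
        digits = match.group(0).replace("-", "")
        candidates.append(f"{digits[:2]}-{digits[2:]}")
    return candidates
```

For example, `find_ein_candidates("EIN 12-3456789 and 123456789, not 07-1234567")` returns the first two values in normalized form and skips the third, because 07 is not an allowed prefix.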
Corroborative evidence keywords
identifier, number, ID, ID number, identification, ID card, license, permit, registration, certificate, transaction, transfer, payment, deposit, withdrawal, debit, credit
Proximity: 300 characters
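The keyword and proximity settings above could be applied along the following lines. This is a sketch under stated assumptions: keywords are matched case-insensitively as whole words, and the 300-character window is measured from both edges of the match. Individual DLP platforms may define proximity differently, and the function name `corroborated_matches` is hypothetical.

```python
import re

EIN_PATTERN = re.compile(
    r"\b(0[1-6]|1[0-6]|2[0-7]|3[0-9]|4[0-8]|5[0-9]|6[0-8]|7[1-7]|8[1-8]|9[0-5]|9[89])-?\d{7}\b"
)

# Keyword list and proximity value copied from this entry.
KEYWORDS = [
    "identifier", "number", "ID", "ID number", "identification", "ID card",
    "license", "permit", "registration", "certificate", "transaction",
    "transfer", "payment", "deposit", "withdrawal", "debit", "credit",
]
PROXIMITY = 300

# Assumption: whole-word, case-insensitive keyword matching.
KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def corroborated_matches(text: str) -> list[str]:
    """Return pattern matches that have a keyword within PROXIMITY characters."""
    keyword_spans = [k.span() for k in KEYWORD_PATTERN.finditer(text)]
    results = []
    for match in EIN_PATTERN.finditer(text):
        # Assumption: the window extends PROXIMITY characters on each side.
        lo = match.start() - PROXIMITY
        hi = match.end() + PROXIMITY
        if any(start >= lo and end <= hi for start, end in keyword_spans):
            results.append(match.group(0))
    return results
```

Under these assumptions, a candidate with no listed keyword nearby is dropped, which trades some recall for fewer false positives on bare nine-digit numbers.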
Should match
- 12-3456789: Standard EIN with hyphen
- 123456789: EIN without hyphen
- 01-0000001: Low-range EIN
Should not match
- 00-1234567: Invalid prefix (00 not in valid EIN prefix range)
- 07-1234567: Invalid prefix (07 not in valid EIN prefix range)
- 12-123456: Only 6 digits after prefix instead of 7
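The examples listed for this entry can be checked against the pattern with a short script. This is a minimal sketch: the sample values and the regex come from this entry, while the helper `is_match` is a hypothetical name, and `re.search` is used as an assumption about how a scanning engine would apply the pattern.

```python
import re

EIN_PATTERN = re.compile(
    r"\b(0[1-6]|1[0-6]|2[0-7]|3[0-9]|4[0-8]|5[0-9]|6[0-8]|7[1-7]|8[1-8]|9[0-5]|9[89])-?\d{7}\b"
)

# Sample values copied from this entry.
SHOULD_MATCH = ["12-3456789", "123456789", "01-0000001"]
SHOULD_NOT_MATCH = ["00-1234567", "07-1234567", "12-123456"]


def is_match(value: str) -> bool:
    """True if the pattern is found anywhere in value."""
    return EIN_PATTERN.search(value) is not None


if __name__ == "__main__":
    for value in SHOULD_MATCH + SHOULD_NOT_MATCH:
        print(f"{value}: {'match' if is_match(value) else 'no match'}")
```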
Known false positives
- Common words and phrases related to employer identification numbers that appear in policy documents, training materials, HR templates, or compliance guidelines without any actual personal data. Mitigation: Require corroborative evidence keywords within the proximity window to confirm a sensitive data context rather than general discussion.
- Similar terminology used in formal or administrative American English contexts (education, professional documentation) where no sensitive data is being collected. Mitigation: Layer with additional contextual signals such as structured identifiers, form fields, or database column headers to distinguish sensitive records from general references.