Date of birth
Detects date-of-birth references in documents across multiple date formats (numeric slash, dot, hyphen, written month, ISO). Requires birth-record keywords in proximity to distinguish DOB from invoice dates, event dates, and other temporal references. No Microsoft built-in SIT exists for unstructured DOB detection.
- Type
- regex
- Engine
- boost_regex
- Confidence
- medium
- Confidence justification
- Medium confidence: date formats appear in every business document. Birth-record keywords are essential to distinguish DOB from the thousands of other dates in any corpus.
- Jurisdictions
- global
- Regulations
- GDPR, CCPA, HIPAA, PIPEDA
- Frameworks
- ISO 27001, ISO 27701, NIST CSF
- Data categories
- pii
- Scope
- wide
- Risk rating
- 7
- Platform compatibility
- Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Unsupported
Pattern
\b(?:(?:0?[1-9]|[12]\d|3[01])[\/.\-](?:0?[1-9]|1[0-2])|(?:0?[1-9]|1[0-2])[\/.\-](?:0?[1-9]|[12]\d|3[01]))[\/.\-](?:19|20)\d{2}\b|\b(?:19|20)\d{2}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])\b
Corroborative evidence keywords
DOB, date of birth, birth date, born, born on, birthday, d.o.b, d.o.b., address, age, citizenship, city, email, ethnicity, fax, first name, full name, gender, given name, last name (+14 more)
Proximity: 300 characters
Should match
Date of birth: 03/15/1990— US month-first dateDOB 15/03/1985— Day-first dateborn 1990-03-15— ISO date
Should not match
reference 12345— Not a datesee page 42— Not a datenot a date here— Plain prose
Known false positives
- Non-DOB dates such as invoice dates, event dates, document timestamps, and meeting dates. Mitigation: Require birth-record keywords (DOB, date of birth, born) in proximity. Without these keywords the pattern fires at low confidence only.
- Historical dates in news articles or academic papers that happen to match the date format. Mitigation: Corroborative evidence keywords distinguish personal records from historical references.
- Dates in financial documents (settlement dates, maturity dates, filing dates). Mitigation: Filter excludes common business date labels via TextMatchFilter on the matched context.
References
- https://learn.microsoft.com/en-us/answers/questions/2153006/microsoft-purview-custom-sit-date-of-birth
- https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/personal-information-what-is-it/what-is-personal-data/