IPv6
Detects IPv6 patterns. This pattern is based on a Microsoft Purview built-in sensitive information type. Users already running Purview may prefer to enable the built-in SIT directly, or use this version as a starting point for customisation.
- Type: regex
- Engine: universal
- Confidence: high
- Confidence justification: High confidence: structurally constrained pattern with corroborative keyword support reduces false positive rates significantly. Added context gating and exclusion rules improve precision and reduce incidental matches.
- Detection quality: Verified
- Jurisdictions: global
- Frameworks: CIS Controls, ISO 27001, NIST CSF, SOC 2
- Data categories: network, pii
- Scope: wide
- Risk rating: 4
- Platform compatibility: Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Unsupported
Pattern
\b(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}\b
Corroborative evidence keywords
IP address, IP, network, host, server, address, age, birthday, citizenship, city, date of birth, DOB, email, ethnicity, fax, first name, full name, gender, given name, last name (+43 more)
Proximity: 300 characters
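The interaction between the pattern and the corroborative keywords can be sketched as follows. This is a minimal illustration, not the behaviour of any specific DLP engine: it assumes the 300-character window is measured on each side of the match, uses only the first six keywords from the list above, and compares keywords case-insensitively on word boundaries. The function name `find_gated_matches` is hypothetical.

```python
import re

# Pattern copied from the "Pattern" section above.
IPV6_PATTERN = re.compile(r"\b(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}\b")

# Illustrative subset of the keyword list; the full list is longer.
KEYWORDS = ["IP address", "IP", "network", "host", "server", "address"]
KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

PROXIMITY = 300  # characters; assumed to apply on each side of the match


def find_gated_matches(text):
    """Return IPv6 matches that have a keyword within PROXIMITY characters."""
    hits = []
    for m in IPV6_PATTERN.finditer(text):
        # Look only at the text around the match, not the match itself.
        before = text[max(0, m.start() - PROXIMITY):m.start()]
        after = text[m.end():m.end() + PROXIMITY]
        if KEYWORD_PATTERN.search(before) or KEYWORD_PATTERN.search(after):
            hits.append(m.group(0))
    return hits


sample = "Server host address: 2001:0db8:85a3:0000:0000:8a2e:0370:7334"
print(find_gated_matches(sample))
```

Matching keywords on word boundaries keeps a short keyword such as "IP" from firing inside unrelated words; whether a given platform does the same is not stated in the source.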
Should match
- 2001:0db8:85a3:0000:0000:8a2e:0370:7334 (Full IPv6 address)
- fe80:0000:0000:0000:0000:0000:0000:0001 (Link-local IPv6)
- 2001:0db8:0000:0000:0000:0000:0000:0001 (Documentation IPv6)
Should not match
- 2001:0db8:85a3:0000:0000:8a2e:0370 (Only 7 groups instead of 8)
- ZZZZ:0db8:85a3:0000:0000:8a2e:0370:7334 (Contains invalid hex characters: Z)
- 2001:0db8:85a3:00000:0000:8a2e:0370:7334 (Group exceeds 4 hex digits: 5 digits in fourth group)
- template example placeholder record identifier (Template/sample context should be excluded even when anchor words are present)
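The examples above can be checked against the regex directly. The sketch below uses Python's standard `re` module as a stand-in for whichever engine a platform applies; the variable names are illustrative, and keyword gating and exclusion rules are not modelled here.

```python
import re

# Pattern copied from the "Pattern" section above.
IPV6_PATTERN = re.compile(r"\b(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}\b")

should_match = [
    "2001:0db8:85a3:0000:0000:8a2e:0370:7334",   # full IPv6 address
    "fe80:0000:0000:0000:0000:0000:0000:0001",   # link-local
    "2001:0db8:0000:0000:0000:0000:0000:0001",   # documentation range
]
should_not_match = [
    "2001:0db8:85a3:0000:0000:8a2e:0370",        # only 7 groups
    "ZZZZ:0db8:85a3:0000:0000:8a2e:0370:7334",   # non-hex characters
    "2001:0db8:85a3:00000:0000:8a2e:0370:7334",  # 5-digit group
    "template example placeholder record identifier",
]

match_results = [bool(IPV6_PATTERN.search(s)) for s in should_match]
no_match_results = [bool(IPV6_PATTERN.search(s)) for s in should_not_match]

for s, ok in zip(should_match, match_results):
    print("match   " if ok else "MISSED  ", s)
for s, hit in zip(should_not_match, no_match_results):
    print("UNWANTED" if hit else "no match", s)
```

As written, the regex covers only the full eight-group form; compressed notation using `::` (for example `2001:db8::1`) falls outside what it matches.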
Known false positives
- Common words and phrases related to IPv6 appearing in policy documents, training materials, HR templates, or compliance guidelines without actual personal data. Mitigation: Require corroborative evidence keywords within the proximity window to confirm sensitive data context rather than general discussion.
- Similar terminology in English, the primary international business language, used in formal or administrative contexts (education, professional documentation) that does not constitute sensitive data collection. Mitigation: Layer with additional contextual signals such as structured identifiers, form fields, or database column headers to distinguish sensitive records from general references.
- High-frequency pattern matches in large document corpora due to broad regex anchors. Expected match rate is significantly higher than specific identifier patterns. Mitigation: Tune confidence thresholds for bulk scanning. Consider using this pattern primarily as a pre-filter with secondary validation.
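One plausible form of the secondary validation mentioned in the last mitigation is sketched below. It is an assumption about how such a step could look, not part of the published pattern: it discards regex hits that sit inside a longer colon-separated hex run (such as a colon-delimited key fingerprint) and normalises the survivors with Python's standard `ipaddress` module. The function name `validate_candidates` and the sample fingerprint are made up for illustration.

```python
import ipaddress
import re

# Pattern copied from the "Pattern" section above, used here as a pre-filter.
IPV6_PATTERN = re.compile(r"\b(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}\b")


def validate_candidates(text):
    """Return IPv6Address objects for pre-filter hits that pass a second check."""
    kept = []
    for m in IPV6_PATTERN.finditer(text):
        prev_char = text[m.start() - 1] if m.start() > 0 else ""
        next_char = text[m.end()] if m.end() < len(text) else ""
        # A colon directly before or after the hit means it is a slice of a
        # longer colon-separated sequence, not a standalone 8-group address.
        if prev_char == ":" or next_char == ":":
            continue
        # Parsing also gives a canonical form that is convenient for de-duplication.
        kept.append(ipaddress.IPv6Address(m.group(0)))
    return kept


# Made-up 16-byte fingerprint: the regex alone would match slices of it.
fingerprint = "43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8"
print(validate_candidates("key " + fingerprint))
print(validate_candidates("host 2001:0db8:85a3:0000:0000:8a2e:0370:7334 up"))
```

Because every string the regex accepts is already a syntactically valid address, parsing alone would not reject anything; the neighbouring-character check is what removes the extra hits in this sketch.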