Cédula de Ciudadanía
Detects Colombian Cédula de Ciudadanía (citizenship ID) numbers using a regex pattern combined with corroborative keyword evidence.
- Type: regex
- Engine: universal
- Confidence: low
- Confidence justification: Low confidence: pattern is broad and will match many non-identity sequences. Corroborative keywords are essential to achieve acceptable false positive rates. Context label evidence plus explicit template/example exclusion improves precision for high-risk identifiers.
- Detection quality: Mixed
- Jurisdictions: co
- Regulations: PDPL (CO)
- Frameworks: ISO 27001, ISO 27701
- Data categories: pii, government-id
- Scope: narrow
- Risk rating: 9
- Platform compatibility: Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible
Pattern
\b\d{6,10}\b
Corroborative evidence keywords
cédula, ciudadanía, national ID, ID number, identification, ID card, license, permit, registration, certificate
Proximity: 300 characters
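The pattern, keyword list, and proximity window above can be combined roughly as follows. This is a minimal Python sketch, not any DLP engine's actual implementation. The function name `find_candidates` is hypothetical, and measuring proximity as the character gap between the number and the nearest keyword is an assumption, since each platform defines its proximity window differently.

```python
import re

# Pattern, keywords, and proximity are taken from this entry.
PATTERN = re.compile(r"\b\d{6,10}\b")
KEYWORDS = [
    "cédula", "ciudadanía", "national ID", "ID number", "identification",
    "ID card", "license", "permit", "registration", "certificate",
]
PROXIMITY = 300  # characters

# Assumption: keywords are matched case-insensitively as plain substrings.
KEYWORD_RE = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def find_candidates(text):
    """Return numeric matches that have a keyword within PROXIMITY characters."""
    keyword_spans = [m.span() for m in KEYWORD_RE.finditer(text)]
    hits = []
    for m in PATTERN.finditer(text):
        start, end = m.span()
        for ks, ke in keyword_spans:
            # Gap between the number and the keyword (0 if they touch).
            gap = max(ks - end, start - ke, 0)
            if gap <= PROXIMITY:
                hits.append(m.group())
                break
    return hits
```

For example, `find_candidates("Cédula de ciudadanía No. 1234567890")` returns the number, while the same number in a sentence with no keyword nearby is dropped.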
Should match
- 1234567890: ten-digit Colombian ID
- 12345678: eight-digit Colombian ID
- 123456: six-digit Colombian ID
Should not match
- 12345: too few digits (5)
- 12345678901: too many digits (11)
- sample template placeholder number 123456789: template/sample context should be excluded even when numeric-like values appear
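The match and non-match examples can be checked against the raw pattern with a short Python sketch. The exclusion terms in `EXCLUSION_RE` and the helper name `raw_matches` are hypothetical: the entry says template/sample context should be excluded but does not specify how. Rejecting the whole text when an exclusion term appears is a deliberately crude assumption. Keyword corroboration is left out here because the examples are bare numbers.

```python
import re

PATTERN = re.compile(r"\b\d{6,10}\b")

# Hypothetical exclusion terms; the entry does not list them.
EXCLUSION_RE = re.compile(r"\b(sample|template|placeholder|example)\b", re.IGNORECASE)


def raw_matches(text):
    """Return pattern matches, or nothing if the text looks like a template."""
    if EXCLUSION_RE.search(text):
        return []
    return PATTERN.findall(text)


should_match = ["1234567890", "12345678", "123456"]
should_not_match = [
    "12345",        # too few digits
    "12345678901",  # too many digits; \b prevents a partial match
    "sample template placeholder number 123456789",
]

for value in should_match:
    assert raw_matches(value) == [value]
for value in should_not_match:
    assert raw_matches(value) == []
```

The 11-digit case fails to match because the word boundaries require the whole digit run to fit within 6 to 10 digits.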
Known false positives
- Variable-length numeric sequences (6-10 digits) are extremely common and will match many non-identity numbers. Mitigation: Require corroborative evidence keywords such as "cédula" or "ciudadanía" within the proximity window. This pattern has a high false positive rate without keyword context.
- Similar terminology appears in many languages in formal or administrative contexts (education, professional documentation) that do not involve sensitive data collection. Mitigation: Layer with additional contextual signals such as structured identifiers, form fields, or database column headers to distinguish sensitive records from general references.
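One way to layer a structural signal, as the second mitigation suggests, is to accept numeric values only when they sit under an ID-like column header. The Python sketch below illustrates that for CSV input. The header names in `ID_HEADERS` and the function `flag_id_columns` are hypothetical and would need tuning to the data actually scanned.

```python
import csv
import io
import re

# Whole-cell version of the entry's pattern (6 to 10 digits).
CELL_PATTERN = re.compile(r"\d{6,10}")

# Hypothetical header names that suggest a Cédula column.
ID_HEADERS = {"cedula", "cédula", "cc", "numero_documento", "national_id"}


def flag_id_columns(csv_text):
    """Map each ID-like header to the values under it that fit the pattern."""
    flagged = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for header, value in row.items():
            # Skip overflow cells (header is None) and missing cells (value is None).
            if header is None or value is None:
                continue
            cell = value.strip()
            if header.strip().lower() in ID_HEADERS and CELL_PATTERN.fullmatch(cell):
                flagged.setdefault(header, []).append(cell)
    return flagged
```

With this approach, a seven-digit order number in an `order_no` column is ignored even though it fits the regex, because its header carries no identity signal.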