Absentee ballot tracking data
Identifies documents containing references to absentee ballot tracking data in Australian contexts. This information type is treated as personal information under the Privacy Act 1988 (Cth).
- Type
- regex
- Engine
- boost_regex
- Confidence
- medium
- Confidence justification
- A category-aware structural regex with anchor and context constraints replaces phrase-only detection. The added context gating and exclusion rules improve precision and reduce incidental matches.
- Detection quality
- Mixed
- Jurisdictions
- global
- Regulations
- AML/CTF Act (Cth), IPA 2009 (Qld), NDB Scheme (Cth), Privacy Act 1988 (Cth)
- Frameworks
- ISO 27001
- Data categories
- government-id, pii
- Scope
- wide
- Platform compatibility
- Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Degraded, Netskope: Unsupported
Pattern
(?is)\b(?:absentee\s+ballot|postal\s+vote|ballot\s+tracking|pre[\s-]+poll\s+voting|absent\s+voter|declaration\s+vote|electoral\s+commission|ballot\s+paper|vote\s+count|returning\s+officer)\b
Corroborative evidence keywords
absentee ballot tracking data, absentee, ballot, tracking, data, elections, diplomacy, statecraft, SCADA, PLC, DCS, HMI, Modbus, Modbus TCP, Modbus RTU, DNP3, OPC-UA, OPC Classic, IEC 61850, IEC 60870 (+39 more)
Proximity: 300 characters
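The proximity rule can be illustrated with a minimal sketch. It assumes that "proximity" means a corroborative keyword must appear within 300 characters before or after a primary match, which is an interpretation and not confirmed engine behaviour. Python's `re` module stands in for boost_regex, only a subset of the keyword list is used because the published list is truncated, and `corroborated_matches` is a hypothetical helper name.

```python
import re

# Detection pattern copied from this entry. The listed engine is boost_regex;
# Python's re module is used here only as a stand-in.
PATTERN = re.compile(
    r"(?is)\b(?:absentee\s+ballot|postal\s+vote|ballot\s+tracking"
    r"|pre[\s-]+poll\s+voting|absent\s+voter|declaration\s+vote"
    r"|electoral\s+commission|ballot\s+paper|vote\s+count|returning\s+officer)\b"
)

# Assumption: a subset of the corroborative keywords (the full list is truncated).
KEYWORDS = ("absentee", "ballot", "tracking", "data", "elections")
KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

PROXIMITY = 300  # characters, as stated in this entry


def corroborated_matches(text, proximity=PROXIMITY):
    """Return primary matches with a keyword within `proximity` characters.

    The matched span itself is excluded from the search window, so a phrase
    such as "ballot paper" cannot corroborate itself via the keyword "ballot".
    """
    hits = []
    for m in PATTERN.finditer(text):
        before = text[max(0, m.start() - proximity):m.start()]
        after = text[m.end():m.end() + proximity]
        if KEYWORD_RE.search(before) or KEYWORD_RE.search(after):
            hits.append(m.group(0))
    return hits
```

Excluding the matched span from the window is a design choice in this sketch: several primary phrases contain corroborative keywords, so without the exclusion every match would count as corroborated.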
Should match
- absentee ballot: Primary topic phrase match
- postal vote: Case-insensitive topic phrase match
- ballot tracking: Alternative topic phrase match
- pre-poll voting: Additional topic phrase match
Should not match
- unrelated generic text without domain phrases: No relevant topic phrases present
- placeholder value 12345: Random text should not match topic-specific regex
- voter sovereign debt: Generic word pair from old broad template should not match
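The match and non-match examples above can be checked mechanically. The following is a minimal sketch that runs the pattern against those example strings. It uses Python's `re` module as a stand-in for boost_regex, so behaviour on edge cases may differ between engines, and `check_examples` is a hypothetical helper name.

```python
import re

# Detection pattern copied from this entry (inline flags: case-insensitive, dotall).
PATTERN = re.compile(
    r"(?is)\b(?:absentee\s+ballot|postal\s+vote|ballot\s+tracking"
    r"|pre[\s-]+poll\s+voting|absent\s+voter|declaration\s+vote"
    r"|electoral\s+commission|ballot\s+paper|vote\s+count|returning\s+officer)\b"
)

SHOULD_MATCH = [
    "absentee ballot",
    "postal vote",
    "ballot tracking",
    "pre-poll voting",
]

SHOULD_NOT_MATCH = [
    "unrelated generic text without domain phrases",
    "placeholder value 12345",
    "voter sovereign debt",
]


def check_examples():
    """Return a list of examples that do not behave as documented."""
    failures = []
    for sample in SHOULD_MATCH:
        if not PATTERN.search(sample):
            failures.append(("expected match", sample))
    for sample in SHOULD_NOT_MATCH:
        if PATTERN.search(sample):
            failures.append(("unexpected match", sample))
    return failures


if __name__ == "__main__":
    print(check_examples() or "all examples behave as documented")
```

A check like this is useful as a regression test whenever alternatives are added to or removed from the pattern.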
Known false positives
- Common words and phrases related to absentee ballot tracking data appearing in policy documents, training materials, HR templates, or compliance guidelines without actual personal data. Mitigation: Require corroborative evidence keywords within the proximity window to confirm sensitive data context rather than general discussion.
- In Australian English, similar terminology used in formal or administrative contexts (education, professional documentation) that does not constitute sensitive data collection. Mitigation: Layer with additional contextual signals such as structured identifiers, form fields, or database column headers to distinguish sensitive records from general references.
- High-frequency pattern matches in large document corpora due to broad regex anchors. Expected match rate is significantly higher than specific identifier patterns. Mitigation: Tune confidence thresholds for bulk scanning. Consider using this pattern primarily as a pre-filter with secondary validation.
References
- https://www.aec.gov.au/voting/ways_to_vote/
- https://www.oaic.gov.au/privacy/australian-privacy-principles-guidelines