Unique Student Identifier

Name: Unique Student Identifier
Creator: TestPattern Community
License: https://opensource.org/licenses/MIT

Detects Unique Student Identifier (USI) patterns. A 10-character alphanumeric code (uppercase letters and digits) mandatory since January 2023 for higher education in Australia.

Type: regex
Engine: universal
Confidence: low
Confidence justification: Low confidence: a 10-character uppercase alphanumeric pattern is extremely broad and will match tracking numbers, serial codes, product IDs, and many other identifiers. Corroborative evidence keywords such as USI, student identifier, or TAFE are essential for reliable detection.
Detection quality: Partial
Jurisdictions: au
Regulations: AML/CTF Act (Cth), IPA 2009 (Qld), NDB Scheme (Cth), Privacy Act 1988 (Cth)
Frameworks: ISO 27001
Data categories: pii, government-id
Scope: wide
Risk rating: 6
Platform compatibility: Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible

Pattern

\b[A-Z0-9]{10}\b

Corroborative evidence keywords

USI, unique student identifier, student identifier, student, transcript, grade, GPA, enrollment, FERPA, FAFSA, financial aid, tuition, degree

Proximity: 300 characters

Should match

ABC1234567 — USI with mixed letters and digits
2K48TF3GHP — Standard USI format
A1B2C3D4E5 — Alternating format USI

Should not match

ABC12345 — Only 8 characters instead of 10
ABC12345678 — 11 characters instead of 10
ABC-123456 — Hyphen breaks the 10-character contiguous alphanumeric run
0412 345 678 — Australian mobile phone number (spaced) is not a 10-char contiguous USI; AU packages also gate phone-shaped literals via the regional exclusion
0000-000000 — Placeholder sentinel value (separated) is not a 10-char contiguous USI and is also excluded by the sentinel filter

Known false positives

Common words and phrases related to unique student identifier appearing in policy documents, training materials, HR templates, or compliance guidelines without actual personal data. Mitigation: Require corroborative evidence keywords within the proximity window to confirm sensitive data context rather than general discussion.
In Australian English, similar terminology used in formal or administrative contexts (education, professional documentation) that does not constitute sensitive data collection. Mitigation: Layer with additional contextual signals such as structured identifiers, form fields, or database column headers to distinguish sensitive records from general references.
High-frequency pattern matches in large document corpora due to broad regex anchors. Expected match rate is significantly higher than specific identifier patterns. Mitigation: Tune confidence thresholds for bulk scanning. Consider using this pattern primarily as a pre-filter with secondary validation.
Australian mobile and fixed-line telephone numbers can satisfy the broad 10-character alphanumeric USI anchor. Mitigation: Use the AU regional phone-number NONE-of gate for Australian packages and keep USI/education context mandatory for deployed policies.