Diagnostic imaging data
Identifies references to diagnostic imaging data in healthcare and patient records. Such data is treated as protected health information under applicable data protection regulations.
- Type: regex
- Engine: boost_regex
- Confidence: medium
- Confidence justification: Identifier/document-structure anchored regex with constrained context replaces phrase-only detection.
- Detection quality: Mixed
- Jurisdictions: au
- Regulations: HRIPA (NSW), IPA 2009 (Qld), My Health Records Act 2012 (Cth), NDB Scheme (Cth), Privacy Act 1988 (Cth)
- Frameworks: ISO 27001, ISO 27701, SOC 2
- Data categories: healthcare, phi
- Scope: wide
- Platform compatibility: Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Degraded, Netskope: Unsupported
Pattern
(?is)\b(?:diagnostic\s+imaging|medical\s+imaging|CT\s+scan|imaging\s+data|imaging\s+report)\b
Corroborative evidence keywords
diagnostic imaging data, diagnostic, imaging, data, health, biomedical, information, My Health Record, pathology result, diagnostic imaging, discharge summary, prescription record, immunisation history, immunization history, organ donor, clinical trial, medical history, allergy, blood test, X-ray (+4 more)
Proximity: 300 characters
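The proximity rule can be sketched in Python as follows. This is a minimal illustration, not the engine's actual implementation: Python's `re` module stands in for boost_regex, the window is assumed to extend 300 characters on either side of the match, and `KEYWORDS` is a hand-picked subset of the list above (generic terms such as "imaging" are left out so that a match does not corroborate itself). The function name `corroborated_matches` is hypothetical.

```python
import re

# Pattern copied from this entry; Python's re is an assumed stand-in for boost_regex.
PATTERN = re.compile(
    r"(?is)\b(?:diagnostic\s+imaging|medical\s+imaging|CT\s+scan"
    r"|imaging\s+data|imaging\s+report)\b"
)

# Illustrative subset of the corroborative keywords (the published list is truncated).
KEYWORDS = [
    "My Health Record", "pathology result", "discharge summary",
    "medical history", "blood test", "X-ray",
]
PROXIMITY = 300  # characters, as stated in this entry


def corroborated_matches(text, keywords=KEYWORDS, proximity=PROXIMITY):
    """Return (matched phrase, nearby keywords) for each corroborated match."""
    lowered = text.lower()
    results = []
    for m in PATTERN.finditer(text):
        # Assumption: the window is measured from both ends of the match.
        start = max(0, m.start() - proximity)
        end = min(len(text), m.end() + proximity)
        window = lowered[start:end]
        found = [k for k in keywords if k.lower() in window]
        if found:
            results.append((m.group(0), found))
    return results
```

A match with no listed keyword inside the window is dropped, which is one way to treat the keywords as required evidence rather than as a confidence boost.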
Should match
- diagnostic imaging: Primary topic phrase match
- medical imaging: Case-insensitive topic phrase match
- CT scan: Alternative topic phrase match
- imaging data: Additional topic phrase match
Should not match
- unrelated generic text without domain phrases: No relevant topic phrases present
- placeholder value 12345: Random text should not match topic-specific regex
- patient biometric: Generic word pair from old broad template should not match
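The positive and negative examples can be checked against the pattern with a small harness. The sketch below assumes Python's `re` module behaves like boost_regex for this pattern, which should hold for the constructs used here (`(?is)`, `\b`, `\s+`, non-capturing groups). The helper name `check_examples` is hypothetical.

```python
import re

# Pattern copied from this entry.
PATTERN = re.compile(
    r"(?is)\b(?:diagnostic\s+imaging|medical\s+imaging|CT\s+scan"
    r"|imaging\s+data|imaging\s+report)\b"
)

SHOULD_MATCH = [
    "diagnostic imaging",
    "medical imaging",
    "CT scan",
    "imaging data",
]

SHOULD_NOT_MATCH = [
    "unrelated generic text without domain phrases",
    "placeholder value 12345",
    "patient biometric",
]


def check_examples():
    """Return a list of human-readable failures; empty means all examples pass."""
    failures = []
    for sample in SHOULD_MATCH:
        if PATTERN.search(sample) is None:
            failures.append(f"expected a match: {sample!r}")
    for sample in SHOULD_NOT_MATCH:
        if PATTERN.search(sample) is not None:
            failures.append(f"unexpected match: {sample!r}")
    return failures


if __name__ == "__main__":
    for line in check_examples() or ["all examples behave as documented"]:
        print(line)
```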
Known false positives
- Medical terminology in health education materials, research publications, clinical guidelines, or public health documents without patient-specific data. Mitigation: Require corroborative evidence keywords confirming patient context. Look for co-occurrence with patient identifiers such as medical record numbers or dates of birth.
- General wellness and fitness content using medical vocabulary without constituting protected health information. Mitigation: Layer with patient identifier patterns or healthcare-specific document structure detection to distinguish clinical records from general health content.
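The layering mitigation described above can be sketched as a two-stage check: a topic match alone is reported as low-signal, and it is only escalated when a patient identifier also appears in the text. The `MRN` and `DOB` patterns below are hypothetical placeholders for illustration; real medical record number formats vary by provider, and a production deployment would use its own identifier patterns. The function name `classify` and its return labels are likewise assumptions.

```python
import re

# Topic pattern copied from this entry.
TOPIC = re.compile(
    r"(?is)\b(?:diagnostic\s+imaging|medical\s+imaging|CT\s+scan"
    r"|imaging\s+data|imaging\s+report)\b"
)

# Hypothetical identifier patterns, for illustration only.
MRN = re.compile(
    r"(?i)\b(?:MRN|medical\s+record\s+(?:number|no\.?))\s*[:#]?\s*\d{6,10}\b"
)
DOB = re.compile(
    r"(?i)\b(?:DOB|date\s+of\s+birth)\s*[:\-]?\s*\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"
)


def classify(text):
    """Label text as 'no_match', 'topic_only', or 'likely_patient_record'."""
    if TOPIC.search(text) is None:
        return "no_match"
    if MRN.search(text) or DOB.search(text):
        # Topic phrase plus a patient identifier suggests a clinical record.
        return "likely_patient_record"
    # Topic phrase alone: education, research, or wellness content is plausible.
    return "topic_only"
```

For example, `classify("Medical imaging is widely used in modern hospitals.")` yields `"topic_only"`, while the same topic phrase alongside `MRN: 12345678` yields `"likely_patient_record"`.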
References
- https://www.legislation.gov.au/C2012A00063/latest/text
- https://www.safetyandquality.gov.au/standards/nsqhs-standards