High-Confidence Starter Pack

A curated set of high-confidence, low false-positive detection patterns. The best starting point for organizations new to DLP pattern deployment. Covers major jurisdictions and credential types.

Jurisdictions
au, us, eu, global
Regulations
privacy-act-1988, hipaa, gdpr, pci-dss
Patterns
10

Patterns in this collection

Australian Medicare Number

Detects Australian Medicare card numbers, which are issued to Australian citizens and permanent residents for access to subsidised healthcare under the Medicare system. The number consists of 10 digits: an 8-digit identifier whose first digit is between 2 and 6, a single check digit, and a single-digit issue number. Numbers are commonly formatted as groups of 4-5-1 digits separated by spaces or hyphens.

Type
regex
Confidence
high
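
The check-digit rule can be sketched in Python. This is a minimal illustration, not the collection's own regex: it assumes the commonly documented layout in which the ninth digit is a weighted checksum (weights 1, 3, 7, 9, 1, 3, 7, 9) over the first eight digits and the tenth digit is the card issue number. The names MEDICARE_RE and is_valid_medicare are hypothetical.

```python
import re

# Hypothetical pattern: 4-5-1 grouping, optional space or hyphen separators,
# first digit restricted to 2-6.
MEDICARE_RE = re.compile(r"([2-6]\d{3})[ -]?(\d{5})[ -]?(\d)", re.ASCII)

# Assumed checksum weights for the first eight digits.
WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9)


def is_valid_medicare(candidate: str) -> bool:
    """Return True if the candidate has the expected shape and check digit."""
    match = MEDICARE_RE.fullmatch(candidate.strip())
    if match is None:
        return False
    digits = [int(ch) for ch in "".join(match.groups())]
    # The ninth digit must equal the weighted sum of the first eight, modulo 10.
    checksum = sum(w * d for w, d in zip(WEIGHTS, digits[:8])) % 10
    return checksum == digits[8]
```

Running a checksum after the regex match is what keeps the false-positive rate low: an arbitrary 10-digit string that starts with 2 to 6 passes the shape test but fails the check digit nine times out of ten.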

Australian Taxation Identifier

Identifies Australian Tax File Numbers (TFNs), unique nine-digit identifiers issued by the Australian Taxation Office. Uses the Func_australian_tax_file_number validator with AllDigitsSameFilter and TextMatchFilter to exclude known test values.

Type
regex
Confidence
high
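
A rough Python approximation of this validation chain is shown here. It is not the Func_australian_tax_file_number implementation itself: it assumes the commonly published nine-digit TFN checksum (weights 1, 4, 3, 7, 5, 8, 6, 9, 10, with the weighted sum divisible by 11), and KNOWN_TEST_VALUES is a hypothetical stand-in for whatever list the TextMatchFilter actually carries.

```python
# Assumed weights from the commonly published nine-digit TFN algorithm.
TFN_WEIGHTS = (1, 4, 3, 7, 5, 8, 6, 9, 10)

# Hypothetical exclusion list standing in for the TextMatchFilter.
KNOWN_TEST_VALUES = {"123456782"}


def looks_like_tfn(candidate: str) -> bool:
    """Apply shape, all-digits-same, test-value, and checksum checks in turn."""
    digits = candidate.replace(" ", "").replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    # Equivalent of an all-digits-same filter: 000000000 would otherwise
    # pass the checksum because its weighted sum is zero.
    if len(set(digits)) == 1:
        return False
    if digits in KNOWN_TEST_VALUES:
        return False
    total = sum(w * int(d) for w, d in zip(TFN_WEIGHTS, digits))
    return total % 11 == 0
```

The order of the checks matters only for speed; the cheap filters run first so the checksum is computed only for plausible candidates.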

SIN

Detects Social Insurance Number (SIN) patterns. This pattern is based on a Microsoft Purview built-in sensitive information type. Users already running Purview may prefer to enable the built-in SIT directly, or use this version as a starting point for customisation.

Type
regex
Confidence
medium

International Bank Account Number (IBAN)

Detects International Bank Account Numbers (IBANs) used across European and global banking systems. An IBAN consists of a two-letter country code, two check digits, and up to 30 alphanumeric characters representing the domestic bank account number. This pattern matches IBANs with or without space or hyphen delimiters between character groups.

Type
regex
Confidence
high
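
The structure described above pairs naturally with the standard IBAN mod-97 check. The sketch below is an illustration under assumptions, not the collection's regex: IBAN_RE and is_valid_iban are hypothetical names, and country-specific length rules are deliberately left out.

```python
import re

# Hypothetical shape check: country code, two check digits, 1-30 characters.
IBAN_RE = re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{1,30}", re.ASCII)


def is_valid_iban(candidate: str) -> bool:
    """Strip delimiters, check the shape, then apply the mod-97 test."""
    compact = re.sub(r"[ -]", "", candidate).upper()
    if IBAN_RE.fullmatch(compact) is None:
        return False
    # Move the country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # Map each character to its number: 0-9 stay as-is, A=10 ... Z=35.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

For example, is_valid_iban("GB82 WEST 1234 5698 7654 32") accepts the widely used documentation IBAN, while changing the check digits to 83 makes the remainder 2 and the candidate is rejected.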

AWS Access Key

Detects AWS Access Key patterns.

Type
regex
Confidence
high

Electronic Mail Address

Identifies electronic mail addresses in RFC-compatible local-part@domain.tld format. Requires a top-level domain of at least two characters. Uses word-boundary anchoring rather than substring matching.

Type
regex
Confidence
high
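
A simplified Python regex in the same spirit is shown here. It is an assumption-laden sketch, not the collection's actual expression: EMAIL_RE and find_emails are hypothetical names, and the character classes cover only the common ASCII subset of what the RFCs allow.

```python
import re

# Hypothetical pattern: word boundaries on both sides, one or more
# dot-separated domain labels, and a top-level domain of 2+ letters.
EMAIL_RE = re.compile(
    r"\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}\b"
)


def find_emails(text: str) -> list[str]:
    """Return every address-shaped token found in the text."""
    return EMAIL_RE.findall(text)
```

With this pattern, find_emails("Contact alice@example.com or bob@localhost today.") picks up alice@example.com only, because bob@localhost has no top-level domain.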

GitHub PAT

Detects GitHub personal access token (PAT) patterns. This pattern is based on a Microsoft Purview built-in sensitive information type. Users already running Purview may prefer to enable the built-in SIT directly, or use this version as a starting point for customisation.

Type
regex
Confidence
high

Payment Card PAN

Identifies payment card Primary Account Numbers (PANs) for Visa, Mastercard (including 2-series BINs), American Express, and Discover. Uses the Func_credit_card Luhn validator with AllDigitsSameFilter.

Type
regex
Confidence
high
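
The Luhn check and the all-digits-same filter can be sketched in Python as follows. This is an illustration only and not the Func_credit_card implementation: BRAND_RE, luhn_ok, and is_valid_pan are hypothetical names, and the brand prefixes are simplified (16-digit Visa only, Discover limited to the 6011 and 65 prefixes).

```python
import re

# Hypothetical, simplified brand prefixes.
BRAND_RE = re.compile(
    r"(?:4\d{15}"                                                # Visa, 16 digits
    r"|5[1-5]\d{14}"                                             # Mastercard 51-55
    r"|2(?:2(?:2[1-9]|[3-9]\d)|[3-6]\d\d|7(?:[01]\d|20))\d{12}"  # Mastercard 2221-2720
    r"|3[47]\d{13}"                                              # American Express
    r"|6(?:011|5\d\d)\d{12})",                                   # Discover (subset)
    re.ASCII,
)


def luhn_ok(digits: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    total = 0
    for index, ch in enumerate(reversed(digits)):
        value = int(ch)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def is_valid_pan(candidate: str) -> bool:
    """Check brand prefix and length, reject repeated digits, then run Luhn."""
    digits = re.sub(r"[ -]", "", candidate)
    if BRAND_RE.fullmatch(digits) is None:
        return False
    if len(set(digits)) == 1:  # equivalent of an all-digits-same filter
        return False
    return luhn_ok(digits)
```

The widely published test number 4111 1111 1111 1111 passes all three checks, while 4111 1111 1111 1112 fails the Luhn step.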

National Insurance Number

Detects National Insurance Number patterns. This pattern is based on a Microsoft Purview built-in sensitive information type. Users already running Purview may prefer to enable the built-in SIT directly, or use this version as a starting point for customisation.

Type
regex
Confidence
medium

US Social Security Number

Detects US Social Security Numbers (SSNs) in both formatted (XXX-XX-XXXX) and unformatted (XXXXXXXXX) representations. The pattern excludes invalid SSN ranges, including area numbers 000, 666, and 900-999, group number 00, and serial number 0000, in accordance with SSA assignment rules. SSNs are critical PII used across healthcare, financial, and government contexts.

Type
regex
Confidence
high
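
The exclusions listed above map directly onto negative lookaheads. The Python sketch below is one possible way to express them and is not the collection's actual regex; SSN_RE and is_plausible_ssn are hypothetical names, and the requirement that hyphens be either all present or all absent is an assumption of this example.

```python
import re

# Hypothetical pattern. The backreference \1 forces the second separator
# to match the first, so only XXX-XX-XXXX and XXXXXXXXX are accepted.
SSN_RE = re.compile(
    r"\b(?!000|666|9\d{2})\d{3}"  # area: not 000, 666, or 900-999
    r"(-?)(?!00)\d{2}"            # group: not 00
    r"\1(?!0000)\d{4}\b",         # serial: not 0000
    re.ASCII,
)


def is_plausible_ssn(candidate: str) -> bool:
    """Return True if the candidate passes the structural SSN rules."""
    return SSN_RE.fullmatch(candidate) is not None
```

A structurally valid value is not necessarily an issued SSN, so a match indicates only that the candidate is worth flagging.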