India GST Number
Detects India GST Number (GSTIN) patterns. This pattern is based on a Microsoft Purview built-in sensitive information type. Because the format is a short alphanumeric string that can resemble other reference codes, corroborative evidence keywords are essential for reliable detection.
- Type: regex
- Engine: universal
- Confidence: high
- Confidence justification: The GSTIN format (a 2-digit state code, a 10-character PAN made of 5 letters, 4 digits, and 1 letter, and a structured 3-character suffix) is highly distinctive and unlikely to appear in non-GST contexts. Context gating and exclusion rules further improve precision and reduce incidental matches.
- Detection quality: Verified
- Jurisdictions: in
- Regulations: DPDPA, IT Act 2000 (India)
- Frameworks: ISO 27001, ISO 27701, PCI-DSS, SOC 2
- Data categories: financial, business-identifier
- Scope: narrow
- Risk rating: 7
- Platform compatibility: Purview (Compatible), GCP DLP (Compatible), Macie (Compatible), Zscaler (Compatible), Palo Alto (Compatible), Netskope (Compatible)
Pattern
\b\d{2}[A-Z]{5}\d{4}[A-Z]\d[A-Z0-9][A-Z0-9]\b
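As a reading aid, the pattern can be split into its positional parts. The sketch below is only an illustration: the group names (`state_code`, `pan`, `entity`, `suffix`) and the helper `split_gstin` are hypothetical and not part of the catalogue entry, and the annotated expression is intended to accept the same strings as the pattern above.

```python
import re

# Hypothetical annotated equivalent of the catalogue pattern.
# Group names are illustrative assumptions, not defined by the source.
GSTIN_ANNOTATED = re.compile(
    r"""\b
    (?P<state_code>\d{2})         # 2-digit state code
    (?P<pan>[A-Z]{5}\d{4}[A-Z])   # 10-character PAN: 5 letters, 4 digits, 1 letter
    (?P<entity>\d)                # single digit following the PAN
    (?P<suffix>[A-Z0-9]{2})       # two trailing alphanumeric characters
    \b""",
    re.VERBOSE,
)


def split_gstin(candidate: str):
    """Return the named parts of a GSTIN-shaped string, or None if it does not fit."""
    match = GSTIN_ANNOTATED.fullmatch(candidate)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(split_gstin("22AAAAA0000A1Z5"))
    print(split_gstin("22AAAA0000A1Z5"))
```

Note that this checks shape only; it does not confirm that a state code or PAN is actually issued.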
Corroborative evidence keywords
GST, GSTIN, goods and services tax, tax invoice, GST registration, GST filing, GST identification, GST number, GST return, input tax credit, state code, tax identification number, field, column, row, entry, record, value, form, register (+21 more)
Proximity: 300 characters
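One way to picture the keyword requirement is as a gate around each regex hit. The following is a minimal sketch under stated assumptions: the proximity window is taken to mean 300 characters on either side of the match, keywords are compared case-insensitively as plain substrings, and only a subset of the listed keywords is included. The function name `find_gstins` is hypothetical, and real DLP engines may measure proximity differently.

```python
import re

GSTIN_PATTERN = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z]\d[A-Z0-9][A-Z0-9]\b")

# Subset of the corroborative keywords listed above (lower-cased for comparison).
KEYWORDS = ("gst", "gstin", "goods and services tax", "tax invoice", "input tax credit")

# Assumption: the window extends 300 characters before and after the match.
PROXIMITY = 300


def find_gstins(text: str) -> list[str]:
    """Return GSTIN-shaped matches that have a keyword within the proximity window."""
    hits = []
    for match in GSTIN_PATTERN.finditer(text):
        # Look at the text around the match, excluding the match itself.
        before = text[max(0, match.start() - PROXIMITY):match.start()].lower()
        after = text[match.end():match.end() + PROXIMITY].lower()
        if any(keyword in before or keyword in after for keyword in KEYWORDS):
            hits.append(match.group())
    return hits


if __name__ == "__main__":
    print(find_gstins("GSTIN: 22AAAAA0000A1Z5"))
    print(find_gstins("Ref 22AAAAA0000A1Z5"))
```

Plain substring matching keeps the sketch short, but it also means a keyword such as "gst" fires inside longer words; a stricter implementation could match keywords on word boundaries.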
Should match
- 22AAAAA0000A1Z5: Standard 15-character GSTIN format
- 07BBBBB1234B1ZA: GSTIN with state code 07
- 29CCCCC5678C1Z3: GSTIN with state code 29
Should not match
- 22AAAA0000A1Z5: Only 4 letters in PAN section, invalid GSTIN
- AAAAAAA0000A1Z5: No leading state code digits, invalid format
- template example placeholder record identifier: Template/sample context should be excluded even when anchor words are present
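The match and non-match examples above can be turned into a small regression check against the regex alone. This sketch assumes nothing beyond the pattern itself; the names `SHOULD_MATCH`, `SHOULD_NOT_MATCH`, and `regex_hits` are hypothetical. The template example contains no GSTIN-shaped string, so it passes here trivially; the context exclusion it describes sits outside the regex and is not modelled.

```python
import re

GSTIN_PATTERN = re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z]\d[A-Z0-9][A-Z0-9]\b")

# Example strings copied from the lists above.
SHOULD_MATCH = ["22AAAAA0000A1Z5", "07BBBBB1234B1ZA", "29CCCCC5678C1Z3"]
SHOULD_NOT_MATCH = [
    "22AAAA0000A1Z5",
    "AAAAAAA0000A1Z5",
    "template example placeholder record identifier",
]


def regex_hits(text: str) -> list[str]:
    """Return every substring of text that the pattern matches."""
    return GSTIN_PATTERN.findall(text)


if __name__ == "__main__":
    for sample in SHOULD_MATCH:
        assert regex_hits(sample) == [sample], sample
    for sample in SHOULD_NOT_MATCH:
        assert regex_hits(sample) == [], sample
    print("all examples behave as listed")
```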
Known false positives
- Generic numeric sequences in non-tax contexts, such as reference numbers or account identifiers. Mitigation: Require corroborative evidence keywords within the proximity window to distinguish tax identifiers from general numeric data.
- Numbers from other identification schemes with similar digit patterns. Mitigation: Layer with jurisdiction-specific detection to prioritise matches in tax-related documents and cross-reference with other identifier types.