Github Pat
Detects Github Pat patterns. This pattern is based on a Microsoft Purview built-in sensitive information type. Users already running Purview may prefer to enable the built-in SIT directly, or use this version as a starting point for customisation.
- Type
- regex
- Engine
- universal
- Confidence
- high
- Confidence justification
- High confidence: structurally constrained pattern with corroborative keyword support reduces false positive rates significantly. Added context gating and exclusion rules improve precision and reduce incidental matches.
- Detection quality
- Verified
- Jurisdictions
- global
- Regulations
- Criminal Code Act 1995 (Cth)
- Frameworks
- CIS Controls, ISO 27001, NIST CSF, PCI-DSS, SOC 2
- Data categories
- credentials, security
- Scope
- specific
- Risk rating
- 10
- Platform compatibility
- Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible
Pattern
\b(ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9]{36}\b
Corroborative evidence keywords
api key, api_key, apikey, access key, secret key, private key, auth token, authorization, access token, bearer, conn str, connection string, connectionstring, cookie, credential, database, host, JWT, oauth, passphrase (+37 more)
Proximity: 300 characters
Should match
ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef0123— GitHub PAT with ghp prefixgho_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef0123— GitHub OAuth tokenghs_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef0123— GitHub server-to-server token
Should not match
ghx_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef01— Invalid prefix (ghx instead of ghp/gho/ghu/ghs/ghr)ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef0— Too few characters after prefix (35 instead of 36)ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef012— Too many characters after prefix (37 instead of 36)template example placeholder record identifier— Template/sample context should be excluded even when anchor words are present
Known false positives
- Authentication-related terminology in software documentation, security training materials, or system architecture descriptions without actual credentials. Mitigation: Require proximity to credential-specific patterns (API keys, connection strings, tokens) rather than general security terminology.
- Code snippets and configuration examples containing credential-related keywords or placeholder values in developer documentation. Mitigation: Check for common placeholder patterns (example.com, localhost, 0000) and documentation file types to reduce false positives from technical writing.