S3 URI in Source Code with AWS Credential Context
Detects S3 and S3A URI references (s3:// and s3a://) in source code or configuration files when they appear alongside AWS credential context. An S3 URI on its own is an enumeration signal with a high false-positive (FP) rate, so this pattern fires at confidence 75 or higher only when AWS credential evidence (an access key ID, a secret access key, an AKIA prefix, or bucket context) is present within 300 characters. Mirrors Snaffler rule KeepS3UriPrefixInCode.
- Type: regex
- Engine: boost_regex
- Confidence: low
- Confidence justification: Low confidence alone: s3:// URIs appear in logging config, documentation, and infrastructure code without any credential exposure. Detection is only meaningful when combined with AWS credential evidence (AKIA key IDs, aws_secret_access_key assignments, or bucket policy context) within 300 characters.
- Jurisdictions: global
- Regulations: Criminal Code Act 1995 (Cth)
- Frameworks: CIS Controls, ISO 27001, NIST CSF, PCI-DSS
- Data categories: credentials, security, cloud
- Scope: specific
- Platform compatibility: Purview: Compatible, GCP DLP: Compatible, Macie: Compatible, Zscaler: Compatible, Palo Alto: Compatible, Netskope: Compatible
Pattern
s3a?://[A-Za-z0-9\-+/]{2,40}
Corroborative evidence keywords
aws_access_key_id, aws_secret_access_key, AKIA, bucket, AWS_SECRET
Proximity: 300 characters
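The gating described above can be sketched in Python. This is a minimal illustration, not the engine's actual implementation: the function name find_gated_matches is hypothetical, and it assumes the proximity window extends 300 characters on each side of the match, excludes the matched URI itself, and compares keywords case-sensitively.

```python
import re

# Pattern, keywords, and proximity copied from the definition above.
S3_URI = re.compile(r"s3a?://[A-Za-z0-9\-+/]{2,40}")
KEYWORDS = ("aws_access_key_id", "aws_secret_access_key", "AKIA", "bucket", "AWS_SECRET")
PROXIMITY = 300


def find_gated_matches(text: str) -> list[str]:
    """Return S3 URIs with at least one evidence keyword within PROXIMITY characters."""
    hits = []
    for match in S3_URI.finditer(text):
        # Assumption: the window extends PROXIMITY characters on each side of the
        # match and excludes the matched URI itself, so a bucket name containing
        # "bucket" cannot satisfy its own evidence requirement.
        before = text[max(0, match.start() - PROXIMITY):match.start()]
        after = text[match.end():match.end() + PROXIMITY]
        # Assumption: keywords are compared case-sensitively.
        if any(keyword in before or keyword in after for keyword in KEYWORDS):
            hits.append(match.group(0))
    return hits
```

Under these assumptions, a line that contains only an S3 URI (for example, a Spark job reading a public dataset) yields no hits, while the same URI next to an aws_access_key_id assignment is reported.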
Should match
- aws_access_key_id=AKIAIOSFODNN7EXAMPLE bucket=s3://my-prod-bucket/data (S3 URI alongside an AWS access key ID; genuine credential context)
- aws_secret_access_key=wJalrXUtnFEMI/K7MDENG s3://logs-bucket/2026 (S3 URI adjacent to an AWS secret access key assignment)
- AKIAexampleKeyId1234 s3a://hdfs-bucket/warehouse/db.parquet (S3A URI used with a Hadoop S3A connector alongside an AKIA key prefix)
Should not match
- bucket_name=my-prod-logs storage_class=STANDARD region=us-east-1 (AWS bucket configuration without any S3 URI scheme present)
- aws_access_key_id=AKIAIOSFODNN7EXAMPLE region=us-east-1 endpoint=https://s3.example.com/bucket (AWS credential context, but the URL uses the https scheme rather than s3, so the regex does not match)
Known false positives
- High FP rate alone; gated on AWS credential proximity. Any file containing S3 URIs without credential terms (logging config, Spark jobs reading public datasets, infrastructure docs) generates false positives if the gating is bypassed. Mitigation: Enforce 75-tier minimum (evidence required); never fire on pattern alone.
- CI/CD pipeline logs that print s3:// URIs alongside masked credential variables such as AWS_SECRET_ACCESS_KEY=***. Mitigation: Masked values (asterisks, REDACTED) should not satisfy the evidence keyword requirement; require the evidence match to include a literal AKIA key ID or an unmasked secret value, not just the variable name.
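The masked-value mitigation can be sketched as a post-filter on the evidence window. This is an illustrative assumption rather than part of the pattern definition: the helper name has_unmasked_credential, the assignment regex, and the set of mask tokens (runs of asterisks, REDACTED with optional brackets) are all hypothetical choices, and the sketch covers only key=value assignments, not bare AKIA key IDs.

```python
import re

# Hypothetical: credential variable followed by "=" or ":" and a value token.
ASSIGNMENT = re.compile(
    r"\b(aws_access_key_id|aws_secret_access_key)\s*[=:]\s*[\"']?([^\s\"']+)",
    re.IGNORECASE,
)
# Hypothetical mask tokens: runs of asterisks, or REDACTED with optional brackets.
MASKED = re.compile(r"\*+|\[?REDACTED\]?", re.IGNORECASE)


def has_unmasked_credential(window: str) -> bool:
    """Return True if any credential assignment in the window carries a non-masked value."""
    for match in ASSIGNMENT.finditer(window):
        value = match.group(2)
        # A value made up entirely of a mask token does not count as evidence.
        if not MASKED.fullmatch(value):
            return True
    return False
```

A CI log line such as AWS_SECRET_ACCESS_KEY=*** would then fail the evidence check, while an assignment with a literal secret value would pass it.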