AWS S3 Deep Dive Footguns¶
Mistakes that cause data exposure, data loss, unexpected costs, or production outages with S3.
1. Public bucket (Block Public Access not enabled)¶
You create a bucket, attach a policy granting s3:GetObject to "Principal": "*", and deploy. Every object in the bucket is now readable by anyone on the internet. Automated scanners find open S3 buckets within minutes. Customer data, credentials, database backups -- all exposed.
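The dangerous policy in question looks like this (bucket name is a placeholder) -- any Allow statement with "Principal": "*" on s3:GetObject exposes every object the Resource matches:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadEverything",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}
```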
# Check if public access is blocked
aws s3api get-public-access-block --bucket my-bucket
# If this returns "NoSuchPublicAccessBlockConfiguration" -- you have a problem
# Fix: enable all four block public access settings
aws s3api put-public-access-block --bucket my-bucket \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
# Better: enable at account level so new buckets are protected by default
aws s3control put-public-access-block --account-id 123456789012 \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
Fix: Enable Block Public Access at the account level. Audit existing buckets with aws s3api get-public-access-block across all accounts. Use AWS Config rule s3-bucket-public-read-prohibited.
2. No versioning (accidental delete = permanent)¶
You run aws s3 rm s3://prod-bucket/config/app.yaml on the wrong bucket. Without versioning, the object is gone. There is no recycle bin, no undo. If this was your application configuration or a database backup, you are in trouble.
# Check versioning status
aws s3api get-bucket-versioning --bucket prod-bucket
# Empty output or Status=Suspended means no protection
# Enable versioning
aws s3api put-bucket-versioning --bucket prod-bucket \
--versioning-configuration Status=Enabled
# Now if you delete, it creates a delete marker -- the object is recoverable
aws s3api list-object-versions --bucket prod-bucket --prefix config/app.yaml
Fix: Enable versioning on every production bucket. Set lifecycle rules to expire old versions after a retention period so costs do not spiral.
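On a versioned bucket, recovery from the accidental delete above is a single call: deleting the delete marker makes the object visible again. A sketch with placeholder values -- the version ID comes from the DeleteMarkers section of list-object-versions:

```shell
# Undelete: removing the delete marker restores the previous version.
# "<delete-marker-version-id>" is a placeholder, not a real version ID.
aws s3api delete-object --bucket prod-bucket \
  --key config/app.yaml --version-id "<delete-marker-version-id>"
```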
3. Lifecycle rule deleting unexpectedly¶
You set a lifecycle rule to expire objects after 30 days in a bucket that stores configuration files. 31 days later, your application config is gone and the service crashes on the next deploy. You forgot the rule was there, or the prefix filter was broader than intended.
# Audit all lifecycle rules
aws s3api get-bucket-lifecycle-configuration --bucket my-bucket
# Test with S3 Inventory before applying destructive rules
aws s3api put-bucket-inventory-configuration --bucket my-bucket \
--id pre-lifecycle-audit \
--inventory-configuration '{
"Id": "pre-lifecycle-audit",
"IsEnabled": true,
"Destination": { "S3BucketDestination": {
"Bucket": "arn:aws:s3:::inventory-bucket", "Format": "CSV"
}},
"Schedule": { "Frequency": "Daily" },
"IncludedObjectVersions": "Current",
"OptionalFields": ["LastModifiedDate", "StorageClass"]
}'
Fix: Audit lifecycle rules before deployment. Use S3 Inventory to preview which objects would be affected. Separate long-lived data (config, backups) from ephemeral data (logs, temp files) into different buckets or prefixes.
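One way to implement the prefix separation suggested above, sketched with assumed names (a "logs/" prefix and 30-day retention): scoping the rule with a Filter keeps expiration away from long-lived data.

```shell
# Expire only the ephemeral prefix; config/ and backups/ are untouched
aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-ephemeral-logs-only",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Expiration": { "Days": 30 }
    }]
  }'
```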
4. S3 costs from API calls (LIST is expensive at scale)¶
Your application lists a bucket prefix on every request to check for new files. The bucket has 10 million objects. Each LIST call returns at most 1,000 keys and costs $0.005 per 1,000 requests, so listing the full prefix takes 10,000 API calls = $0.05 per polling cycle. At 100 polls/minute, that is $7,200 per day -- over $200,000 per month -- just for LIST operations, and the actual data transfer costs are separate.
# Check request-level costs via S3 Storage Lens or server access logs
aws s3api get-bucket-logging --bucket my-bucket
# Count objects to estimate LIST costs
# (note: this recursive listing itself performs the full set of LIST calls)
aws s3 ls s3://my-bucket/data/ --recursive --summarize | tail -2
# Total Objects: 10,432,591
# Total Size: 2.3 TiB
Fix: Use S3 Event Notifications (SQS, Lambda, EventBridge) instead of polling with LIST. For inventory purposes, use S3 Inventory (daily/weekly CSV of all objects) instead of LIST. If you must list, cache the results.
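The polling cost can be computed from the scenario's numbers (scenario values, not universal prices -- check current S3 request pricing for your region):

```shell
# Back-of-envelope LIST cost for a polling loop over a large prefix.
objects=10000000                    # objects under the polled prefix
per_page=1000                       # max keys per LIST response
price_per_1k=0.005                  # USD per 1,000 LIST requests
polls_per_day=$((100 * 60 * 24))    # 100 polls per minute, all day
calls_per_poll=$(( (objects + per_page - 1) / per_page ))
cost_per_day=$(awk -v c="$calls_per_poll" -v p="$price_per_1k" -v n="$polls_per_day" \
  'BEGIN { printf "%.2f", c * (p / 1000) * n }')
echo "LIST calls per poll: $calls_per_poll"
echo "cost per day: \$$cost_per_day"
```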
5. Not using multipart for large files¶
You upload a 5GB file with a single PUT -- the maximum size S3 accepts in one request. At 4.8GB, the network blips. The entire upload fails and you start over from zero. Or the upload takes so long that your presigned URL or STS token expires mid-transfer.
# Configure aws CLI multipart thresholds
aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 16MB
# aws s3 cp uses multipart automatically above the threshold
aws s3 cp large-file.tar.gz s3://my-bucket/backups/
Fix: Use multipart uploads for anything over 100MB. The AWS CLI does this automatically with aws s3 cp. For SDK uploads, configure the multipart threshold. Multipart allows parallel part uploads and resumability.
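A quick sketch of why chunk size matters: each part retries independently on failure, and S3 caps an upload at 10,000 parts, so the chunk size also bounds the largest object you can upload (16MB x 10,000 is roughly 156GiB here).

```shell
# Part count for a 5GB file at the 16MB chunk size configured above.
size=$((5 * 1024 * 1024 * 1024))    # 5 GiB
chunk=$((16 * 1024 * 1024))         # 16 MiB
parts=$(( (size + chunk - 1) / chunk ))
echo "parts: $parts"                # a failed part redoes 16MB, not 5GB
```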
6. Bucket name globally unique (squatting)¶
You want mycompany-prod-data but someone else already has it. Bucket names are globally unique across all AWS accounts. Worse: if you delete a bucket, someone else can claim the name. If your application or CloudFormation template hardcodes the bucket name, it now points to someone else's bucket.
# Check whether a name is taken without creating anything
aws s3api head-bucket --bucket my-desired-name
# 404 Not Found = available; 403 Forbidden = exists in another account
# Or attempt creation (outside us-east-1, add
# --create-bucket-configuration LocationConstraint=<region>)
aws s3api create-bucket --bucket my-desired-name --region us-east-1
# "BucketAlreadyExists" = taken by another account
# "BucketAlreadyOwnedByYou" = you already have it
Fix: Use account ID or random suffix in bucket names (mycompany-prod-data-123456789012). Never delete production buckets without considering name reclamation risk. Use CloudFormation/Terraform to manage bucket lifecycle.
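One way to build the suggested random suffix, sketched in shell -- the "mycompany-prod-data" prefix is a placeholder:

```shell
# Derive a collision-resistant bucket name with a random suffix.
suffix=$(od -An -tx1 -N4 /dev/urandom | tr -d ' \n')   # 8 random hex chars
bucket="mycompany-prod-data-${suffix}"
echo "$bucket"
```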
7. Incomplete multipart uploads accumulating¶
Every aborted or timed-out multipart upload leaves parts in S3. These parts are invisible to aws s3 ls but they consume storage and you are billed for them. A CI/CD pipeline that uploads large artifacts and occasionally fails can accumulate gigabytes of phantom storage.
# List incomplete multipart uploads
aws s3api list-multipart-uploads --bucket my-bucket
# If this returns results, you are paying for invisible storage
# Abort all incomplete uploads older than 7 days
# (GNU date syntax; on macOS use: date -u -v-7d +%Y-%m-%d)
cutoff=$(date -u -d '7 days ago' +%Y-%m-%d)
aws s3api list-multipart-uploads --bucket my-bucket \
--query "Uploads[?Initiated<'$cutoff'].[Key,UploadId]" --output text | \
while read key upload_id; do
aws s3api abort-multipart-upload --bucket my-bucket \
--key "$key" --upload-id "$upload_id"
done
Fix: Add a lifecycle rule to every bucket that aborts incomplete multipart uploads after a few days. This is free insurance.
aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
--lifecycle-configuration '{
"Rules": [{
"ID": "abort-incomplete-multipart",
"Status": "Enabled",
"Filter": { "Prefix": "" },
"AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
}]
}'
8. SSE-KMS throttling from KMS request limits¶
You enable SSE-KMS encryption on a high-throughput bucket. Every GET and PUT request now calls KMS to decrypt/encrypt the data key. KMS has a per-region request quota for cryptographic operations (5,500 requests/sec by default in most regions, higher in a few, and raisable via Service Quotas). At scale, your S3 operations start failing with ThrottlingException from KMS.
# Check KMS request volume against the quota via the AWS/Usage namespace
# (throttles themselves surface as ThrottlingException errors in CloudTrail)
aws cloudwatch get-metric-statistics \
--namespace AWS/Usage --metric-name CallCount \
--dimensions Name=Service,Value=KMS Name=Type,Value=API \
Name=Resource,Value=CryptographicOperationsSymmetric Name=Class,Value=None \
--start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
--period 300 --statistics Sum
Fix: Enable S3 Bucket Keys (BucketKeyEnabled: true). This creates a bucket-level data key that is reused for a time window, reducing KMS calls by up to 99%. Or use SSE-S3 (AES256) if you do not need KMS audit trails.
aws s3api put-bucket-encryption --bucket my-bucket \
--server-side-encryption-configuration '{
"Rules": [{
"ApplyServerSideEncryptionByDefault": {
"SSEAlgorithm": "aws:kms",
"KMSMasterKeyID": "alias/my-key"
},
"BucketKeyEnabled": true
}]
}'
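To see why Bucket Keys resolve the throttling, a back-of-envelope sketch -- the 20,000 req/s workload is a hypothetical number, 5,500 is the common default KMS quota, and 99% is AWS's stated upper bound for the reduction:

```shell
# Hypothetical traffic: KMS calls/sec with and without S3 Bucket Keys.
s3_rps=20000       # assumed S3 request rate against SSE-KMS objects
kms_quota=5500     # default KMS cryptographic-operations quota (most regions)
reduction=99       # Bucket Keys cut KMS calls by up to 99%
with_bucket_key=$(( s3_rps * (100 - reduction) / 100 ))
echo "KMS calls/sec without Bucket Keys: $s3_rps (quota: $kms_quota)"
echo "KMS calls/sec with Bucket Keys: ~$with_bucket_key"
```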
9. DELETE without versioning vs with versioning (delete marker confusion)¶
Without versioning, DELETE permanently removes the object. With versioning, DELETE creates a delete marker -- the object appears deleted but all versions remain, consuming storage. You think you cleaned up a bucket but storage usage does not decrease. Or worse: you think you deleted sensitive data but it is still there as a noncurrent version.
# Check for delete markers
aws s3api list-object-versions --bucket my-bucket --prefix sensitive-data/ \
--query 'DeleteMarkers[].{Key:Key,VersionId:VersionId,IsLatest:IsLatest}'
# Permanently delete requires specifying the version ID
aws s3api delete-object --bucket my-bucket \
--key sensitive-data/secret.txt --version-id "abc123"
# Delete ALL versions of an object (permanent removal)
aws s3api list-object-versions --bucket my-bucket --prefix sensitive-data/secret.txt \
--query '[Versions,DeleteMarkers][].[Key,VersionId]' --output text | \
while read key vid; do
aws s3api delete-object --bucket my-bucket --key "$key" --version-id "$vid"
done
Fix: Understand the difference. Use lifecycle rules with NoncurrentVersionExpiration to automatically clean up old versions. For compliance deletion, you must delete every version explicitly. For cost control, set ExpiredObjectDeleteMarker: true in lifecycle rules to clean up orphaned delete markers.
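The two lifecycle settings named in the fix can be combined in one rule; a sketch with a placeholder bucket name and an assumed 30-day retention:

```shell
# Clean up noncurrent versions and orphaned delete markers in one rule
aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "clean-noncurrent-and-markers",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 },
      "Expiration": { "ExpiredObjectDeleteMarker": true }
    }]
  }'
```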