S3-Compatible Object Storage — Street-Level Ops

Quick Diagnosis Commands

# MinIO health check
curl -s http://minio:9000/minio/health/live    # 200 = OK
curl -s http://minio:9000/minio/health/ready   # 200 = ready, 503 = not ready

# MinIO cluster info
mc admin info local
# Shows: drives (total/active/offline), objects, version, uptime, network

# Drive status
mc admin info local --json | jq '.info.servers[].drives'

# Disk usage
mc du local/my-bucket

# List all buckets and sizes
mc du local/

# Check MinIO logs
mc admin logs local    # tail live logs
journalctl -u minio -n 100

# Check free space
df -h /data

Gotcha: Incomplete Multipart Uploads Consuming All Disk Space

MinIO disk is full but you can't see enough objects to account for the space. The culprit is incomplete multipart uploads — they reserve space but don't appear as regular objects in mc ls.

Rule: Incomplete MPUs accumulate silently. A failed 100 GB upload is invisible to normal listing but consumes 100 GB of disk.

# List incomplete multipart uploads
mc ls --incomplete local/my-bucket
aws --endpoint-url=http://minio:9000 s3api list-multipart-uploads --bucket my-bucket

# Abort all incomplete uploads for a bucket
aws --endpoint-url=http://minio:9000 s3api list-multipart-uploads \
  --bucket my-bucket --query 'Uploads[*].[Key,UploadId]' --output text | \
  while IFS=$'\t' read -r key uid; do    # text output is tab-delimited; preserves keys with spaces
    [ "$key" = "None" ] && continue      # aws prints "None" when there are no uploads
    aws --endpoint-url=http://minio:9000 s3api abort-multipart-upload \
      --bucket my-bucket --key "$key" --upload-id "$uid"
    echo "Aborted: $key ($uid)"
  done

# Prevent future accumulation: apply lifecycle rule
cat > /tmp/lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "abort-incomplete-mpu",
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 3}
  }]
}
EOF
mc ilm import local/my-bucket < /tmp/lifecycle.json

Gotcha: ListObjects Killing MinIO Performance

A job runs aws s3 ls s3://my-bucket/ --recursive on a bucket with 50 million objects. MinIO CPU spikes to 100%, all other operations slow down. ListObjects is O(n) over all objects — it scans the entire namespace.

Rule: ListObjects is expensive at scale. Avoid full recursive listings in hot paths.

# Instead of recursive listing, use prefixes to narrow scope
aws --endpoint-url=http://minio:9000 s3 ls s3://my-bucket/logs/2024/01/15/ --recursive

# For counting/size without listing every object
mc du local/my-bucket/logs/2024/01/15/

# If you must list millions of objects, paginate and off-peak
aws --endpoint-url=http://minio:9000 s3api list-objects-v2 \
  --bucket my-bucket --prefix logs/2024/ \
  --max-items 1000 --page-size 1000

# Structure buckets to keep prefix namespaces small (< 1M objects per prefix)
# Use date-based prefixes: logs/YYYY/MM/DD/ instead of logs/

Gotcha: Object Storage Used for Random Small-IO Workloads

Under the hood: S3-compatible APIs are HTTP at the wire level. Every GET is a full HTTP request-response cycle: DNS lookup, TCP handshake, TLS negotiation (if HTTPS), HTTP headers, then payload. A 4 KB random read that takes microseconds on a local SSD costs 10-100ms on object storage.

A developer stores a SQLite database in MinIO and opens it directly over a FUSE mount. Queries take 30 seconds. At 10-100ms per GET/PUT, SQLite's random 4 KB page reads translate into hundreds of individual HTTP requests.

Rule: Object storage is optimized for large sequential objects, not random access. Anything that needs sub-millisecond latency or random I/O (databases, OS disks) belongs on block storage.

Good uses: log archives, backups, container images, binaries, static assets, ML training data, metric blocks (Thanos), log chunks (Loki). Bad uses: databases (SQLite, Postgres, MySQL data directory), application working data, anything with frequent small writes.
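To make the rule concrete, a back-of-envelope sketch (illustrative numbers only, taking a mid-range value from the 10-100ms per-GET figure above and a nominal SSD latency):

```shell
#!/bin/sh
# Illustrative arithmetic: 100,000 random 4 KB reads at assumed latencies
reads=100000
object_ms=20        # assumed ~20 ms per GET over HTTP object storage
ssd_us=100          # assumed ~100 us per random 4 KB read on a local SSD

object_total_s=$(( reads * object_ms / 1000 ))
ssd_total_s=$(( reads * ssd_us / 1000000 ))
echo "object storage: ${object_total_s} s (~$(( object_total_s / 60 )) min)"
echo "local SSD:      ${ssd_total_s} s"
```

Three orders of magnitude: the same workload that finishes in seconds on an SSD takes over half an hour against object storage, which is why the FUSE-mounted SQLite case crawls.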


Pattern: Syncing a Directory to S3 Efficiently

# Sync with mc mirror (the MinIO client; works against any S3-compatible target)
mc mirror /local/backups/ local/backups/

# Only upload new/changed files (check by ETag/size)
mc mirror --overwrite /local/backups/ local/backups/

# Delete destination objects not in source (dangerous!)
mc mirror --remove --overwrite /local/backups/ local/backups/

# AWS CLI sync (works with any S3-compatible endpoint)
aws --endpoint-url=http://minio:9000 s3 sync /local/backups/ s3://backups/ \
  --exclude "*.tmp" \
  --include "*.gz"

# Sync with delete
aws --endpoint-url=http://minio:9000 s3 sync /local/backups/ s3://backups/ --delete

# Tune timeouts for large syncs over slow or flaky links
aws --endpoint-url=http://minio:9000 s3 sync /local/ s3://bucket/ \
  --no-progress \
  --cli-read-timeout 300 \
  --cli-connect-timeout 60

# AWS CLI multipart threshold and concurrency (s3 config)
aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 16MB
aws configure set default.s3.max_concurrent_requests 20
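The three aws configure set calls above persist to ~/.aws/config; the equivalent file form (under the default profile) looks like:

```
[default]
s3 =
    multipart_threshold = 64MB
    multipart_chunksize = 16MB
    max_concurrent_requests = 20
```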

Pattern: Rotating Access Keys Safely

Gotcha: If you delete the old key before every consumer has switched to the new one, those consumers get 403 Access Denied and may drop data. Verify the new key works in production before removing the old one: check your application logs, not just mc ls.

MinIO service accounts can have multiple access key pairs. Rotate without downtime by creating the new key before retiring the old.

# 1. Create a new service account for the user
mc admin user svcacct add local appuser
# Returns: Access Key: NEWKEY123, Secret Key: NEWSECRET456

# 2. Update application config to use new key (deploy/restart)

# 3. Verify new key is working
mc alias set testkey http://localhost:9000 NEWKEY123 NEWSECRET456
mc ls testkey/my-bucket

# 4. Remove old key
mc admin user svcacct list local appuser    # find the old access key
mc admin user svcacct rm local OLDKEY123

Scenario: Loki or Thanos Can't Write to MinIO

Loki logs: object storage put error: RequestError: send request failed or Access Denied.

# Step 1: Can we reach MinIO at all?
curl -v http://minio:9000/minio/health/live

# Step 2: Does the bucket exist?
mc ls local/ | grep loki   # or thanos, tempo

# If missing:
mc mb local/loki

# Step 3: Do the credentials work?
mc alias set lokitest http://minio:9000 LOKI_ACCESS_KEY LOKI_SECRET_KEY
mc ls lokitest/loki

# Step 4: Does the user have write permission to the bucket?
mc admin policy attach local readwrite --user lokiuser
# or check existing policies
mc admin user info local lokiuser

# Step 5: Verify Loki/Thanos config points to the right endpoint
# Common mistake: using minio service name instead of external LB
# or using HTTPS endpoint but minio has no TLS configured
# Loki config: insecure: true  when using HTTP

# Step 6: Test a manual upload to the bucket
echo "test" | mc pipe lokitest/loki/test-$(date +%s).txt
mc ls lokitest/loki/ | tail -3

Scenario: MinIO Drive Offline — Degraded Cluster

mc admin info local shows one or more drives as offline. Data may be served via erasure coding but redundancy is reduced.
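For intuition about how much redundancy remains, a sketch with assumed example numbers (an 8-drive erasure set with EC:4 parity; read your actual drive and parity counts from mc admin info):

```shell
#!/bin/sh
# Assumed example: 8-drive erasure set, EC:4 (4 parity shards per object)
drives=8
parity=4
data=$(( drives - parity ))
offline=1
echo "data shards per object:  $data"
echo "drive losses tolerated:  $parity (reads keep working)"
echo "margin left with $offline offline: $(( parity - offline ))"
```

With one drive offline in this example, three more failures are survivable for reads; the point is that an offline drive eats directly into that margin, so heal promptly.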

# Step 1: Check which drives are offline
mc admin info local --json | jq '.info.servers[].drives[] | select(.state != "ok")'

# Step 2: Identify the physical drive
# Drive path in mc output corresponds to the path configured in minio startup
# Check if the disk is mounted
lsblk
df -h /data1 /data2 /data3 /data4

# Step 3: If the drive is unmounted (not failed, just unmounted)
mount /dev/sdX /data2
systemctl restart minio   # or, for a distributed setup: mc admin service restart local

# Step 4: If drive failed, replace hardware
# MinIO heals automatically after drive replacement
mc admin heal local --recursive

# Step 5: Monitor heal progress (re-running attaches to the ongoing heal and shows status)
mc admin heal local

# Step 6: Check remaining redundancy
mc admin info local
# Healthy cluster: all drives online, none offline

Default trap: MinIO buckets do NOT have versioning enabled by default. If you delete an object without versioning, it is gone permanently. Enable versioning on any bucket that stores data you cannot regenerate: mc version enable local/my-bucket.

Emergency: Accidental Bucket Delete

# If versioning was enabled and only objects were deleted (not the bucket
# itself, which removes versions too): data is still there behind delete markers
mc version info local/my-bucket   # check whether versioning was enabled

# List delete markers
aws --endpoint-url=http://minio:9000 s3api list-object-versions \
  --bucket my-bucket \
  --query 'DeleteMarkers[*].[Key,VersionId]' --output text | head -20

# Remove the delete markers to restore objects (deleting a marker makes the
# latest version visible again)
aws --endpoint-url=http://minio:9000 s3api list-object-versions \
  --bucket my-bucket \
  --query 'DeleteMarkers[*].[Key,VersionId]' --output text | \
  while IFS=$'\t' read -r key vid; do   # text output is tab-delimited; preserves keys with spaces
    [ "$key" = "None" ] && continue     # aws prints "None" when there are no delete markers
    aws --endpoint-url=http://minio:9000 s3api delete-object \
      --bucket my-bucket --key "$key" --version-id "$vid"
  done

# If versioning was NOT enabled: objects are gone
# Restore from backup (Velero, restic, etc.)
# MinIO Enterprise: may have snapshot-based recovery

Useful One-Liners

# MinIO health and info
mc admin info local
mc admin prometheus generate local   # Prometheus scrape config

# Find large objects
mc find local/my-bucket --larger-than 1GiB

# Count objects in a prefix
mc ls --recursive local/my-bucket/logs/ | wc -l

# Delete all objects matching a prefix (careful!)
mc rm --recursive --force local/my-bucket/tmp/

# Copy between two MinIO instances (data streams through the mc client)
mc cp local/source-bucket/file.bin remote/dest-bucket/file.bin

# Mirror between two MinIO clusters
mc mirror local/my-bucket remote/my-bucket

# Check bandwidth usage
mc admin bandwidth local

# Watch bucket events (requires bucket notifications to be configured)
mc watch local/my-bucket

# Get object metadata without downloading
aws --endpoint-url=http://minio:9000 s3api head-object \
  --bucket my-bucket --key large-file.bin

# Inspect object attributes (note: the ETag is not a plain MD5 for multipart uploads)
aws --endpoint-url=http://minio:9000 s3api get-object-attributes \
  --bucket my-bucket --key file.bin \
  --object-attributes ETag Checksum ObjectSize

# Presign a download URL (valid 1 hour)
aws --endpoint-url=http://minio:9000 s3 presign s3://my-bucket/file.bin --expires-in 3600

# Check total bucket disk usage
mc du --recursive local/my-bucket | tail -1

# List all service accounts for a user
mc admin user svcacct list local appuser