
Audit Logging Primer

Why This Matters

Audit logs answer the question: who did what, when, and from where. Without them, you cannot investigate security incidents, prove compliance, or detect unauthorized changes. Every production system — Linux hosts, Kubernetes clusters, cloud accounts — must produce audit logs that are tamper-evident, centrally collected, and retained per policy.

Linux Audit Framework (auditd)

Fun fact: The Linux audit framework was added to the kernel during the 2.6 series (2003-2004) to meet the requirements of Common Criteria (CC) and CAPP (Controlled Access Protection Profile) certification. It operates at the kernel level, so not even root can evade audit logging without first disabling the audit subsystem, and disabling it generates an audit event of its own.

Architecture

Audit records originate in the kernel and flow to the auditd daemon, which writes them to disk and can forward them to remote collectors:

Kernel → auditd daemon → /var/log/audit/audit.log
                       → log dispatcher → remote syslog / SIEM

auditd captures syscalls, file access, authentication events, and command execution.

Core Components

Component                 Purpose
auditd                    Daemon that writes audit records
auditctl                  Runtime rule management
ausearch                  Search audit logs
aureport                  Generate audit reports
/etc/audit/auditd.conf    Daemon configuration
/etc/audit/audit.rules    Persistent rules (loaded at boot)

Managing Rules

# List current rules
auditctl -l

# Add a file watch
auditctl -w /etc/passwd -p wa -k identity

# Watch a directory for writes and attribute changes
auditctl -w /etc/sudoers.d/ -p wa -k sudo_changes

# Monitor a specific syscall
auditctl -a always,exit -F arch=b64 -S execve -k commands

# Delete all rules (for testing)
auditctl -D

Rule Syntax in /etc/audit/audit.rules

# File watch rules
-w /etc/ssh/sshd_config -p wa -k sshd_config
-w /etc/crontab -p wa -k cron_changes
-w /var/log/audit/ -p wa -k audit_log_access

# Syscall rules
-a always,exit -F arch=b64 -S mount -S umount2 -k mounts
-a always,exit -F arch=b64 -S unlink -S rename -k file_deletion

# Track privileged commands
-a always,exit -F path=/usr/bin/sudo -F perm=x -k privileged_sudo
-a always,exit -F path=/usr/bin/passwd -F perm=x -k privileged_passwd

# Track process UID/GID changes (setuid/setgid syscalls)
-a always,exit -F arch=b64 -S setuid -S setgid -k privilege_escalation

Permission flags: r (read), w (write), x (execute), a (attribute change).

Remember: The audit rule permission flags spell "RWXA" — same order as filesystem permissions but with A for attribute changes. The most common production rules use -p wa (write + attribute change) on sensitive files. The -k (key) flag is your search tag — always set it. Without keys, searching audit logs is like searching logs without structured fields.
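
Because a rule without a key is hard to find later, a quick lint pass over a rules file can catch omissions before they are loaded. The sketch below is one way to do that with awk. The helper name rules_without_key and the sample rules are hypothetical, and it assumes keys are set with either -k or -F key=.

```shell
# Print "<line number>: <rule>" for every watch (-w) or syscall (-a) rule
# in file $1 that sets no key via -k or -F key=.
rules_without_key() {
    awk '/^-[wa] / && !/ -k [^ ]+/ && !/ -F key=/ { print FNR ": " $0 }' "$1"
}

# Illustrative rules file; lines 3 and 5 deliberately omit a key.
rules_file=$(mktemp)
cat > "$rules_file" <<'EOF'
# sample rules (illustrative)
-w /etc/passwd -p wa -k identity
-w /etc/shadow -p wa
-a always,exit -F arch=b64 -S execve -F key=commands
-a always,exit -F arch=b64 -S mount
EOF

missing=$(rules_without_key "$rules_file")
echo "$missing"
rm -f "$rules_file"
```

Running a check like this against the persistent rules file before a reload keeps untagged rules out of production.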

Under the hood: auditctl -a always,exit -S execve -k commands hooks into the kernel's syscall exit path. Every time any process calls execve() (which runs a new program), the kernel generates an audit event made up of several records (SYSCALL, EXECVE, CWD, PATH) that together capture the UID, GID, command, arguments, and working directory. This is how you get a complete command history for every user on the system, and it is far more reliable than bash history, which users can edit or disable.
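
As a rough illustration of what such a record looks like on disk, the sketch below rebuilds a command line from a single EXECVE record. The sample record is illustrative and execve_cmdline is a hypothetical helper. It assumes plain quoted arguments: auditd hex-encodes arguments that contain spaces or special characters, and ausearch -i decodes those for you.

```shell
# Sample EXECVE record in the raw audit.log format (values are illustrative).
record='type=EXECVE msg=audit(1705312800.123:456): argc=3 a0="ls" a1="-l" a2="/etc"'

# Rebuild the command line from the quoted a0..aN fields of one record.
execve_cmdline() {
    printf '%s\n' "$1" \
        | grep -o 'a[0-9][0-9]*="[^"]*"' \
        | sed 's/^a[0-9]*="//; s/"$//' \
        | paste -sd ' ' -
}

cmdline=$(execve_cmdline "$record")
echo "$cmdline"
```

For real investigations, ausearch -k commands -i remains the safer route, since it handles encoded arguments and correlates the records that belong to one event.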

Searching and Reporting

# Search by key
ausearch -k identity

# Search by timestamp range (date and time format follow the system locale)
ausearch --start "01/15/2024" "09:00:00" --end "01/15/2024" "17:00:00"

# Search for failed logins
ausearch -m USER_LOGIN --success no

# Interpret results (translate UIDs, syscall numbers)
ausearch -k identity -i

# Generate reports
aureport --summary        # Overall summary
aureport --auth           # Authentication events
aureport --failed         # Failed events
aureport --login          # User logins
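
When ausearch is not available, for example on a collector that only receives the raw files, events can still be tallied per key straight from the records. A minimal sketch, assuming SYSCALL records carry a key="..." field as auditd writes them. The sample records are trimmed and illustrative, and count_by_key is a hypothetical helper.

```shell
# Three trimmed SYSCALL records (illustrative values, most fields omitted).
sample_records() {
    cat <<'EOF'
type=SYSCALL msg=audit(1705312800.123:456): arch=c000003e syscall=257 success=yes exit=3 uid=0 auid=1000 comm="vim" exe="/usr/bin/vim" key="identity"
type=SYSCALL msg=audit(1705312805.500:457): arch=c000003e syscall=59 success=yes exit=0 uid=1000 auid=1000 comm="ls" exe="/usr/bin/ls" key="commands"
type=SYSCALL msg=audit(1705312809.750:458): arch=c000003e syscall=257 success=no exit=-13 uid=1000 auid=1000 comm="cat" exe="/usr/bin/cat" key="identity"
EOF
}

# Read raw records on stdin and print "<key>=<count>" per key, sorted by key.
# Records without a quoted key (for example key=(null)) are skipped.
count_by_key() {
    sed -n 's/.* key="\([^"]*\)".*/\1/p' | sort | uniq -c | awk '{ print $2 "=" $1 }'
}

summary=$(sample_records | count_by_key)
echo "$summary"
```

On a host with the audit tools installed, aureport and ausearch -k give the same kind of overview with proper decoding.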

auditd Configuration

Key /etc/audit/auditd.conf settings include log_file, max_log_file, max_log_file_action = ROTATE, and num_logs. The critical one is disk_full_action = HALT, which makes the system stop rather than run without audit coverage (a common compliance requirement).

Gotcha: disk_full_action = HALT means auditd will shut the system down if the partition holding /var/log/audit/ fills up. This is intentional in PCI DSS and HIPAA environments, where running without audit coverage is considered worse than downtime. But it also means a full audit partition takes down production. Monitor audit disk usage aggressively and set space_left_action = SYSLOG with space_left = 25% for early warning (older auditd releases accept space_left only as a value in megabytes).
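
Independent monitoring of the audit partition can be sketched with df. The 75% threshold below is an assumption chosen to mirror space_left = 25%, and used_pct and classify_usage are hypothetical helpers. A real deployment would feed the result into whatever alerting system is already in place.

```shell
# Hypothetical threshold: warn once the audit partition is 75% full.
WARN_AT=75

# Print the used percentage (digits only) of the filesystem holding directory $1.
used_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Classify a usage percentage against the threshold.
classify_usage() {
    if [ "$1" -ge "$WARN_AT" ]; then
        echo "WARN"
    else
        echo "OK"
    fi
}

# Fall back to / when the audit directory does not exist (e.g. on a test box).
dir=/var/log/audit
[ -d "$dir" ] || dir=/
echo "$dir: $(classify_usage "$(used_pct "$dir")")"
```

Run from cron or a monitoring agent, a check like this gives warning well before disk_full_action is ever triggered.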

Compliance Frameworks

Different standards require specific audit rules:

Framework        Key Requirements
PCI DSS          Log all access to cardholder data, privileged actions, auth events
HIPAA            Log all access to PHI, track user sessions
SOC 2            Log system changes, access events, security incidents
CIS Benchmarks   Predefined audit rule sets per OS version

Kubernetes Audit Logging

Audit Policy

K8s API server audit logging is controlled by an audit policy file:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log all requests to secrets at Metadata level
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # Log pod exec/attach at RequestResponse level
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach"]

  # Log all changes at Request level
  - level: Request
    verbs: ["create", "update", "patch", "delete"]

  # Default: log at Metadata level
  - level: Metadata
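
Rule order matters in this policy: the API server compares each request against the rules from top to bottom, and the first matching rule sets the audit level. The sketch below models that behaviour for the four rules above. audit_level is a hypothetical helper, and the model is deliberately simplified: it matches only on resource and verb and ignores API groups, namespaces, and users.

```shell
# Simplified first-match model of the policy above.
# $1 = resource (e.g. secrets, pods/exec), $2 = verb (e.g. get, delete)
audit_level() {
    # Rules 1 and 2: matched by resource, checked first.
    case "$1" in
        secrets)               echo "Metadata";        return ;;
        pods/exec|pods/attach) echo "RequestResponse"; return ;;
    esac
    # Rule 3: any mutating verb on any other resource.
    case "$2" in
        create|update|patch|delete) echo "Request"; return ;;
    esac
    # Rule 4: catch-all default.
    echo "Metadata"
}

audit_level secrets delete      # secrets rule wins over the mutation rule
audit_level configmaps update   # falls through to the mutation rule
audit_level configmaps get      # falls through to the default
```

Placing the secrets rule first is what keeps secret payloads out of the log: if the mutation rule came first, a secret update would be recorded at Request level, request body included.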

Audit Levels

Remember: K8s audit levels increase in verbosity: "N-M-Rq-RR" — None, Metadata, Request, RequestResponse. Each level includes everything from the previous level plus more data. Use Metadata for most resources (low overhead), Request for mutation tracking, and RequestResponse only for sensitive operations like pods/exec where you need the full response. RequestResponse on high-traffic resources will overwhelm your storage.

Level             What is Logged
None              Nothing
Metadata          Request metadata only (user, timestamp, resource, verb)
Request           Metadata + request body
RequestResponse   Metadata + request body + response body

API Server Flags

kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-log-path=/var/log/kubernetes/audit.log \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=10 \
  --audit-log-maxsize=100

Centralization and Retention

War story: Post-incident investigations repeatedly find that one of the attacker's first moves after gaining access was to delete local logs. If your audit logs only exist on the compromised host, you have no evidence trail. The cardinal rule: audit logs must leave the host in near-real-time, to a destination the attacker cannot reach with the same credentials.

Audit logs must be shipped off-host to prevent tampering:

auditd → audisp-remote → central syslog / SIEM
         or
auditd → file → Filebeat/Fluentd → Elasticsearch/Splunk

Retention periods vary by compliance framework — 1 year is a common minimum. Cloud providers offer audit log services (AWS CloudTrail, GCP Audit Logs, Azure Activity Log) with built-in retention and search.

Default trap: AWS CloudTrail is enabled by default for management events, but it only retains 90 days of history in the Event History console. For long-term retention (required by most compliance frameworks), you must create a Trail that delivers to an S3 bucket. Many teams discover this gap only when they need logs older than 90 days during an investigation.

Common Pitfalls

  • Rules too broad: Auditing every read floods logs — focus on writes and privileged operations
  • No off-host shipping: Attacker deletes local logs; always centralize
  • Ignoring audit log rotation: Disk fills up and, with disk_full_action = HALT, auditd shuts the system down (by design)
  • Missing Kubernetes audit policy: API server does not log audit events by default
  • Not testing rules: Use auditctl to test rules before adding to persistent config
