Environment Variables - Primer

Why This Matters

Environment variables are the primary mechanism for configuring processes on Unix systems. Every process inherits a set of key-value pairs from its parent, and this inheritance chain is how configuration flows from the init system to your application. When env vars break -- wrong PATH, missing credentials, locale misconfiguration -- the symptoms range from "command not found" to silent data corruption. Understanding how they propagate through fork/exec, shells, containers, and service managers is foundational to debugging almost anything in a Linux environment.

What Environment Variables Are

Name origin: The term "environment" in Unix dates back to Version 7 Unix (1979), where the environ variable was introduced as an array of strings passed to each process. The word reflects the idea that these variables form the "surroundings" in which a process executes -- its configuration context.

An environment variable is a named string value attached to a process. Each process carries an environment block -- an array of null-terminated KEY=VALUE strings that the kernel copies into the new program's memory at execve() time. When a process calls fork(), the child inherits an exact copy of the parent's environment. execve() always takes an explicit environment argument: passing the current environ is what makes the new program "inherit" it, and passing a different array replaces it.

# View all environment variables for the current shell
env

# View them sorted (easier to scan)
env | sort

# View a specific variable
echo $PATH
printenv PATH

# The difference between env, printenv, and set:
env          # shows exported environment variables
printenv     # same as env, but can query a single var (printenv HOME)
set          # shows ALL shell variables (exported + unexported + functions)
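
The inheritance rules above can be checked from the shell. The following is a minimal sketch, assuming a POSIX shell and /usr/bin/env; the variable names GREETING and ONLY_VAR are made up for the demonstration. It shows that a child works on a copy of the environment, and that env -i starts a program with an explicit, otherwise empty environment.

```shell
# A child gets a copy: changes made in the child never reach the parent.
export GREETING="parent"
child_view=$(sh -c 'GREETING="changed in child"; echo "$GREETING"')
echo "child saw:  $child_view"     # child saw:  changed in child
echo "parent has: $GREETING"       # parent has: parent

# env -i discards the inherited environment and passes only what is listed,
# the shell-level counterpart of calling execve() with an explicit envp.
explicit_env=$(env -i ONLY_VAR=1 /usr/bin/env)
echo "$explicit_env"               # ONLY_VAR=1
```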

Exported vs Non-Exported Variables

This distinction is critical. A shell variable exists only in the current shell. An exported variable is pushed into the environment and inherited by child processes.

# Shell variable -- NOT inherited by children
MY_VAR="hello"
bash -c 'echo "child sees: $MY_VAR"'
# Output: child sees:

# Exported variable -- inherited by children
export MY_VAR="hello"
bash -c 'echo "child sees: $MY_VAR"'
# Output: child sees: hello

# Export existing variable
MY_VAR="hello"
export MY_VAR

# Set and export in one line
export DB_HOST="db.prod.internal"

# Remove a variable from the environment
unset MY_VAR

# Remove the export flag but keep the shell variable (bash-specific)
export -n MY_VAR
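
When it is unclear whether a variable is merely set or actually exported, asking a child process is the most direct test. A small sketch, assuming printenv is available; is_exported and DEMO_VAR are hypothetical names used only for illustration.

```shell
# Hypothetical helper: succeeds only if the named variable is in the
# environment that child processes receive (printenv is itself a child).
is_exported() {
    printenv "$1" >/dev/null 2>&1
}

unset DEMO_VAR
DEMO_VAR="hello"                       # plain shell variable
if is_exported DEMO_VAR; then before=yes; else before=no; fi

export DEMO_VAR                        # now part of the environment
if is_exported DEMO_VAR; then after=yes; else after=no; fi

echo "exported before: $before, after: $after"   # exported before: no, after: yes
```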

Shell Initialization Files

Understanding which file runs when is essential for knowing where to set variables.

Login vs Non-Login Shells

A login shell is your first shell when you log in (SSH, console login, su -, bash --login). A non-login shell is everything else (opening a terminal in a GUI, running bash, subshells).

Login shell startup order (bash):
  1. /etc/profile
  2. /etc/profile.d/*.sh    (sourced by /etc/profile)
  3. First found of: ~/.bash_profile, ~/.bash_login, ~/.profile

Non-login interactive shell:
  1. /etc/bash.bashrc        (on some distros)
  2. ~/.bashrc
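
To find out which of these cases applies to a running shell, the shell itself can be asked. A minimal sketch: the $- check works in any POSIX shell, while shopt -q login_shell is bash-only, so it is guarded by a BASH_VERSION test. The variable names shell_mode and login_mode are illustrative.

```shell
# $- holds the current shell's option flags; an "i" means interactive.
case $- in
    *i*) shell_mode="interactive" ;;
    *)   shell_mode="non-interactive" ;;
esac

# login_shell is a read-only bash option, so only query it under bash.
if [ -n "${BASH_VERSION:-}" ] && shopt -q login_shell; then
    login_mode="login"
else
    login_mode="non-login (or not bash)"
fi

echo "This shell is $shell_mode and $login_mode"
```

Pasting this into a terminal, a script, and an ssh host '...' command is a quick way to see which startup files could have been read in each case.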

Where to Put What

| File | When It Runs | Best For |
|---|---|---|
| /etc/environment | PAM reads it at login (not a script, just KEY=VALUE lines) | System-wide vars that every user needs |
| /etc/profile | Every login shell | System-wide login setup |
| /etc/profile.d/*.sh | Sourced by /etc/profile | Modular system-wide config (one file per tool) |
| ~/.bash_profile | Login shells only | User PATH, login-specific setup |
| ~/.bashrc | Non-login interactive shells | Aliases, functions, prompt, per-terminal config |
| ~/.profile | Login shells (if .bash_profile absent) | Portable user env (works with sh, dash) |

Gotcha: you add export PATH=... to ~/.bashrc, but a non-interactive SSH command (like ssh host 'some-command') doesn't pick it up. That command runs in a non-login, non-interactive shell, so ~/.bash_profile is never read. Whether ~/.bashrc is read depends on how bash was built, and many distro-default ~/.bashrc files return early when the shell is not interactive, so anything below that guard is skipped. To set variables for such commands, use ~/.ssh/environment (honored only when the SSH server enables PermitUserEnvironment) or set them in the command itself.

Remember: Mnemonic for which file runs when: "Login = Profile, Interactive = RC." .bash_profile is for login shells, .bashrc is for interactive non-login shells. Most people source .bashrc from .bash_profile to avoid maintaining two files.

The most common pattern for ~/.bash_profile:

# ~/.bash_profile
# Source .bashrc so login shells also get aliases/functions
if [ -f ~/.bashrc ]; then
    source ~/.bashrc
fi

# Login-specific env vars
export PATH="$HOME/bin:$HOME/.local/bin:$PATH"
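
Because ~/.bashrc is re-read by every nested interactive shell, a plain PATH="...:$PATH" line placed there can add the same directory over and over. A minimal sketch of an idempotent alternative, assuming a POSIX shell; path_prepend is a hypothetical helper name, not a builtin.

```shell
# Prepend a directory to PATH only if it is not already there.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                       # already on PATH: do nothing
        *) PATH="$1${PATH:+:$PATH}" ;;     # otherwise put it first
    esac
}

path_prepend "$HOME/bin"
path_prepend "$HOME/bin"    # second call is a no-op
export PATH
```

Wrapping PATH in colons before matching avoids false hits on directories that merely share a prefix, such as /opt/app and /opt/app2.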

Common Important Variables

System Variables

PATH=/usr/local/bin:/usr/bin:/bin       # Command search path (colon-separated)
HOME=/home/deploy                       # Current user's home directory
USER=deploy                             # Current username
SHELL=/bin/bash                         # User's default shell
TERM=xterm-256color                     # Terminal type (affects ncurses, colors)
LANG=en_US.UTF-8                        # Locale (affects sorting, date format, encoding)
LC_ALL=C                                # Override all LC_* categories
HOSTNAME=web-prod-01                    # Machine hostname
PWD=/var/log                            # Current working directory
OLDPWD=/home/deploy                     # Previous working directory
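
The effect of the locale variables on sorting is easy to observe. A short sketch, assuming the standard sort and paste utilities; c_sorted is an illustrative variable name. Prefixing a single command with LC_ALL=C changes the locale for that command only.

```shell
# Under the C locale, sort compares raw bytes, so uppercase ASCII letters
# come before lowercase ones.
c_sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -sd ' ' -)
echo "$c_sorted"    # A B a b
```

Under a locale such as en_US.UTF-8 the same input typically sorts with upper and lower case interleaved, which is why scripts that depend on a stable order often pin LC_ALL=C for the sort step.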

Developer and Operations Variables

EDITOR=vim                              # Default editor (used by git commit, crontab -e, etc.)
VISUAL=vim                              # Preferred visual editor (takes precedence over EDITOR)
PAGER=less                              # Default pager (used by man, git log, etc.)
LD_LIBRARY_PATH=/opt/custom/lib         # Extra shared library search paths
PYTHONPATH=/opt/myapp/lib               # Extra Python module search paths
DISPLAY=:0                              # X11 display (for GUI apps)
WAYLAND_DISPLAY=wayland-0               # Wayland display socket
SSH_AUTH_SOCK=/tmp/ssh-xxx/agent.123    # SSH agent socket
GPG_TTY=$(tty)                          # Terminal for GPG passphrase prompts

XDG Base Directories

The XDG spec standardizes where applications store data:

XDG_CONFIG_HOME=~/.config               # User configuration files
XDG_DATA_HOME=~/.local/share            # User data files
XDG_STATE_HOME=~/.local/state           # User state files (logs, history)
XDG_CACHE_HOME=~/.cache                 # Non-essential cached data
XDG_RUNTIME_DIR=/run/user/1000          # Runtime files (sockets, PIDs)
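
These variables are often unset, in which case applications are expected to fall back to the defaults shown above. A minimal sketch of that fallback in shell; xdg_dir is a hypothetical helper and "myapp" is a placeholder application name. XDG_RUNTIME_DIR is left out because it has no fixed default.

```shell
# Hypothetical helper: print an XDG base directory, using the default
# when the variable is unset or empty.
xdg_dir() {
    case "$1" in
        config) printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}" ;;
        data)   printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}" ;;
        state)  printf '%s\n' "${XDG_STATE_HOME:-$HOME/.local/state}" ;;
        cache)  printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}" ;;
        *)      echo "unknown XDG kind: $1" >&2; return 1 ;;
    esac
}

config_file="$(xdg_dir config)/myapp/config.toml"
echo "would read config from: $config_file"
```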

The .env File and Dotenv Pattern

The dotenv pattern grew out of the 12-factor app methodology, which recommends keeping config in the environment. A .env file sits in the project root and contains configuration:

# .env (NOT a shell script -- no export, no variable expansion in most tools)
DATABASE_URL=postgresql://user:pass@db:5432/myapp
REDIS_URL=redis://cache:6379/0
LOG_LEVEL=info
SECRET_KEY=a1b2c3d4e5f6

Loading in bash:

# Source it (works if file uses KEY=VALUE format with no spaces around =)
set -a          # mark all new variables for export
source .env
set +a          # stop auto-exporting

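# Or export every line in one command (fragile: breaks on values that
# contain spaces, quotes, or glob characters -- prefer set -a above)
export $(grep -v '^#' .env | xargs)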
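
Both approaches above let the shell interpret the file, so a value containing spaces or shell syntax can misbehave. A minimal sketch of a loader that treats each line as literal text, assuming simple KEY=VALUE lines with no quoting rules; load_dotenv is a hypothetical helper, not part of any dotenv tool.

```shell
# Hypothetical helper: export KEY=VALUE lines from a file without
# evaluating them as shell code. Values are taken literally (quotes are
# kept as-is). Uses the global scratch variables line, key, and value.
load_dotenv() {
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*) continue ;;          # blank line or comment
            *=*) ;;                       # looks like KEY=VALUE
            *) continue ;;                # anything else is ignored
        esac
        key=${line%%=*}                   # text before the first =
        value=${line#*=}                  # text after the first =
        case "$key" in
            ''|[0-9]*|*[!A-Za-z0-9_]*) continue ;;   # not a valid name
        esac
        export "$key=$value"
    done < "$1"
}

if [ -f .env ]; then
    load_dotenv .env
fi
```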

The envsubst utility substitutes environment variable references in text:

# Template file: config.template
# server:
#   host: ${APP_HOST}
#   port: ${APP_PORT}

export APP_HOST=0.0.0.0
export APP_PORT=8080
envsubst < config.template > config.yaml

# Substitute only specific variables (leave others as-is)
envsubst '$APP_HOST $APP_PORT' < config.template > config.yaml
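
envsubst only sees exported variables and replaces references to unset ones with an empty string, so a missing variable yields a silently broken config file. A small guard can run first; this is a sketch, and require_env is a hypothetical helper name.

```shell
# Hypothetical helper: fail if any named variable is missing from the
# environment that a child such as envsubst would receive.
require_env() {
    missing=""
    for name in "$@"; do
        printenv "$name" >/dev/null 2>&1 || missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "missing environment variables:$missing" >&2
        return 1
    fi
}

export APP_HOST=0.0.0.0
export APP_PORT=8080
if require_env APP_HOST APP_PORT; then
    echo "all template variables are set"    # safe to run envsubst now
fi
```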

Environment Variables in Docker

Docker has multiple mechanisms for injecting env vars, each with different behavior:

Dockerfile ENV

# Baked into the image -- present in every container from this image
ENV APP_HOME=/opt/app
ENV NODE_ENV=production

# Build-time only (NOT in the final image environment)
ARG BUILD_VERSION=1.0.0
RUN echo "Building version $BUILD_VERSION"

Runtime Injection

# Single variable
docker run --env LOG_LEVEL=debug myapp

# From host environment (pass through)
export API_KEY=secret123
docker run --env API_KEY myapp

# From a file
docker run --env-file .env myapp

# docker-compose.yml
services:
  web:
    image: myapp
    environment:
      - LOG_LEVEL=debug
      - DATABASE_URL=postgresql://db:5432/app
    env_file:
      - .env
      - .env.local    # later files override earlier ones

Precedence in Docker Compose

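From highest to lowest priority:

1. docker compose run --env (CLI override)
2. environment: in compose file
3. env_file: in compose file (later files win)
4. ENV in Dockerfile

Host environment variables are not a separate level. A ${VAR} reference in the compose file is substituted from the host shell (or the project .env file) when the file is parsed, so it supplies a value for one of the entries above rather than competing as a lowest-priority source.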
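
When it is unclear which source won, it is usually quicker to ask Compose than to reason about the rules. A sketch of two wrappers around existing Compose commands; the function names are hypothetical, and the second one assumes the image contains an env binary.

```shell
# Print the compose file after variable interpolation and env_file merging.
compose_resolved_config() {
    docker compose config
}

# Print the environment that a one-off container of the given service
# actually receives, sorted for easy comparison.
compose_effective_env() {
    docker compose run --rm "$1" env | sort
}
```

With the compose file shown earlier, compose_effective_env web would list LOG_LEVEL and DATABASE_URL as the container sees them.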

Environment Variables in systemd

systemd units have their own environment, isolated from the user's shell:

# /etc/systemd/system/myapp.service
[Service]
# Inline variables
Environment="LOG_LEVEL=info"
Environment="DB_HOST=db.prod.internal" "DB_PORT=5432"

# From a file (one KEY=VALUE per line, like .env)
EnvironmentFile=/etc/myapp/env
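# A leading dash means "don't fail if the file is missing".
# Keep comments on their own line: unit files do not support trailing comments.
EnvironmentFile=-/etc/myapp/env.local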

# The process environment is MINIMAL by default
# No PATH, no HOME, no USER from any shell profile
# systemd sets a basic PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Inspect a running service's environment:

# Show variables set with Environment= (values loaded via EnvironmentFile= are not listed)
systemctl show myapp.service --property=Environment

# See the actual process environment (needs root unless you own the process)
PID=$(systemctl show myapp.service --property=MainPID --value)
sudo cat /proc/$PID/environ | tr '\0' '\n' | sort

Environment Variables in Cron

Cron jobs run with a minimal environment. This is the single most common source of "it works in my shell but not in cron" bugs.

Default trap: The PATH in cron is typically just /usr/bin:/bin -- far shorter than your interactive shell's PATH. Always use absolute paths in cron jobs, or explicitly set PATH at the top of your crontab.

# Default cron environment (varies by distro, but typically):
# SHELL=/bin/sh
# PATH=/usr/bin:/bin
# HOME=/home/username
# LOGNAME=username

# Set variables at the top of crontab
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin
MAILTO=ops@example.com

# Or source your profile in the job
* * * * * . $HOME/.profile && /opt/myapp/bin/run-job.sh
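
A job can be tested under cron-like conditions without waiting for the scheduler by starting it from an emptied environment. A minimal sketch, assuming the typical defaults listed above (the real values vary by distro and cron implementation); run_like_cron is a hypothetical helper.

```shell
# Hypothetical helper: run a command line with roughly the environment
# cron would provide, ignoring everything exported in the current shell.
run_like_cron() {
    env -i \
        SHELL=/bin/sh \
        PATH=/usr/bin:/bin \
        HOME="$HOME" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        /bin/sh -c "$1"
}

cron_path=$(run_like_cron 'echo "$PATH"')
echo "PATH seen by the job: $cron_path"   # PATH seen by the job: /usr/bin:/bin
```

Running run_like_cron '/opt/myapp/bin/run-job.sh' this way tends to surface missing PATH entries and unset variables before the job is installed in a crontab.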

Environment Variables in CI/CD

CI systems inject variables for build context and accept user-defined secrets:

# Common CI variables (GitLab CI example)
CI=true
CI_COMMIT_SHA=abc123
CI_COMMIT_BRANCH=main
CI_PIPELINE_ID=12345
CI_JOB_NAME=deploy

# GitHub Actions
GITHUB_SHA=abc123
GITHUB_REF=refs/heads/main
GITHUB_REPOSITORY=org/repo
GITHUB_ACTIONS=true
RUNNER_OS=Linux
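
Scripts can use these variables to adapt their behavior to the system they run on. A minimal sketch, assuming the CI and GITHUB_ACTIONS variables listed above plus GITLAB_CI, which GitLab sets in its jobs; detect_ci and its return strings are hypothetical.

```shell
# Hypothetical helper: report which CI system, if any, is running the script.
detect_ci() {
    if [ "${GITHUB_ACTIONS:-}" = "true" ]; then
        echo "github-actions"
    elif [ -n "${GITLAB_CI:-}" ]; then
        echo "gitlab-ci"
    elif [ -n "${CI:-}" ]; then
        echo "generic-ci"      # some other system that sets CI
    else
        echo "local"
    fi
}

echo "Running in: $(detect_ci)"
```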

Secrets are injected as env vars at runtime, never stored in the repo:

# GitHub Actions
steps:
  - name: Deploy
    env:
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    run: ./deploy.sh

Secrets as Environment Variables (12-Factor)

The 12-factor app methodology recommends storing config in environment variables. This means secrets like database passwords, API keys, and TLS certificates are injected as env vars rather than config files.

Advantages:
- Language and framework agnostic
- Clear separation of config from code
- Easy to change between deploys without code changes

Risks (covered in footguns.md):
- Env vars are visible in /proc/PID/environ
- They appear in crash dumps and debug logs
- Child processes inherit all of them (over-sharing)
- docker inspect shows them in cleartext

Debug clue: If a process has the wrong environment, inspect it directly: cat /proc/$PID/environ | tr '\0' '\n' | sort. This shows the environment the process was started with, bypassing any shell-level confusion about what was exported. Changes the process makes to its own environment after startup are not reflected there.
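
One caveat when reading /proc: the file is a snapshot taken when the program was executed. The sketch below is Linux-only and uses the hypothetical names proc_env and LATE_DEMO_VAR. A variable exported after the shell started is absent from the shell's own entry but present in the entry of a child started afterwards.

```shell
# Hypothetical helper: print the startup environment of a PID, one entry per line.
proc_env() {
    tr '\0' '\n' < "/proc/$1/environ" | sort
}

# Exported after this shell started, so it is not in the shell's own
# /proc entry, but any child launched from now on receives it.
export LATE_DEMO_VAR=1
in_self=$(proc_env "$$" | grep -c '^LATE_DEMO_VAR=' || true)
in_child=$(sh -c 'tr "\0" "\n" < /proc/$$/environ' | grep -c '^LATE_DEMO_VAR=' || true)
echo "in own /proc entry: $in_self, in child's: $in_child"   # 0 and 1 on Linux
```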

The production pattern is to use a secrets manager (Vault, AWS Secrets Manager, K8s Secrets) and inject secrets as env vars at runtime with tight scoping.
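
Tight scoping also applies inside a single script: a helper tool rarely needs the credentials its parent holds. A minimal sketch of launching a child without them, assuming the AWS variable names from the CI example; run_without_secrets is a hypothetical helper and the key value is a placeholder.

```shell
# Hypothetical helper: run a command with the listed secrets removed from
# its environment. The subshell keeps the parent's variables untouched.
run_without_secrets() {
    (
        unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
        exec "$@"
    )
}

export AWS_SECRET_ACCESS_KEY="example-only"
child_sees=$(run_without_secrets sh -c 'echo "${AWS_SECRET_ACCESS_KEY:-<not set>}"')
echo "child sees: $child_sees"                              # child sees: <not set>
echo "parent still has it: ${AWS_SECRET_ACCESS_KEY:+yes}"   # parent still has it: yes
```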

