
Answer Key: The Container That Exits Immediately

The System

A PDF rendering microservice built with Node.js. It consumes jobs from a queue (1,847 pending), renders PDFs, and serves them to upstream consumers. The service is containerized and deployed via Docker on an x86_64 host.

[Upstream Services] --> [Job Queue (1847 pending)]
                              |
                        [pdf-renderer container]
                              |
                         /app/render (Node.js)
                              |
                        [Rendered PDFs]

Build pipeline: GitLab CI builds the Docker image, pushes to a private registry (registry.corp.io), and the container is started on the target host.

What's Broken

Root cause: Architecture mismatch. The CI runner (gitlab-runner-arm64-02) is an ARM64 machine. When it builds the Docker image from FROM node:20-alpine without an explicit target platform, Docker pulls the ARM64 variant of the base image and produces an ARM64 image. That image was pushed to the registry as pdf-renderer:v3. The target host is x86_64 (uname -m shows x86_64). When Docker tries to execute /app/render, the kernel cannot load the ARM64 binary and fails with "exec format error", so the container exits immediately.

Key clue: docker inspect showing Architecture: arm64 on a host where uname -m returns x86_64. The CI runner log confirms the build happened on an ARM64 runner.
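The comparison behind that clue can be scripted. The sketch below is one possible way to do it, not part of the original incident tooling: the helper names normalize_arch and arch_matches are hypothetical, and the mapping covers only the two architectures involved here. It exists because uname -m and Docker name the same architecture differently (x86_64 vs amd64, aarch64 vs arm64), so a plain string comparison would report a false mismatch.

```shell
#!/bin/sh
# Hypothetical helpers: map a `uname -m` machine name to the
# architecture name Docker reports for images.
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "$1" ;;   # pass unknown names through unchanged
  esac
}

# Succeed only when the image architecture ($1) matches the host machine ($2).
arch_matches() {
  [ "$(normalize_arch "$1")" = "$(normalize_arch "$2")" ]
}

# Usage on the target host (image name taken from this incident):
#   image_arch=$(docker image inspect --format '{{.Architecture}}' registry.corp.io/pdf-renderer:v3)
#   arch_matches "$image_arch" "$(uname -m)" \
#     || echo "mismatch: image=$image_arch host=$(uname -m)"
```

In this incident the check would fail, since the image reports arm64 while the host reports x86_64.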

The Fix

Immediate (get the service running)

Option A — Roll back to the previous working image:

docker stop pdf-renderer
docker rm pdf-renderer
docker run -d --name pdf-renderer --restart unless-stopped \
  registry.corp.io/pdf-renderer:v2 /app/render

Option B — Rebuild on an x86_64 runner or use cross-compilation:

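# On an x86_64 machine, or on any machine where buildx can target linux/amd64:
docker buildx build --platform linux/amd64 \
  -t registry.corp.io/pdf-renderer:v3-fixed \
  --push .
# Then redeploy from the fixed image on the target host:
docker rm -f pdf-renderer
docker run -d --name pdf-renderer --restart unless-stopped \
  registry.corp.io/pdf-renderer:v3-fixed /app/render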

Permanent (fix the CI pipeline)

Ensure the CI pipeline does one of the following:

1. Runs on an x86_64 runner (pin the runner tag):

# .gitlab-ci.yml
build:
  tags:
    - amd64
  script:
    - docker build -t registry.corp.io/pdf-renderer:$CI_COMMIT_TAG .
    - docker push registry.corp.io/pdf-renderer:$CI_COMMIT_TAG

2. Or uses docker buildx for multi-arch builds:

# .gitlab-ci.yml
build:
  script:
    # Folded scalar: YAML joins these lines with spaces, so no shell backslash is needed.
    - >-
      docker buildx build --platform linux/amd64,linux/arm64
      -t registry.corp.io/pdf-renderer:$CI_COMMIT_TAG --push .
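Either pipeline variant can be backed by a gate that fails the job when the pushed tag lacks a linux/amd64 image. The sketch below is an illustration under assumptions: has_platform is a hypothetical helper, and it assumes the text output of docker buildx imagetools inspect lists each entry on a line of the form "Platform: linux/amd64". Check the output of your buildx version before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: read `docker buildx imagetools inspect` text on stdin
# and succeed only if some "Platform:" line names exactly the wanted platform.
# Assumes lines shaped like "  Platform:    linux/amd64".
has_platform() {
  awk -v want="$1" '
    $1 == "Platform:" && $2 == want { found = 1 }
    END { exit !found }
  '
}

# Usage as an extra CI script step after the push:
#   docker buildx imagetools inspect "registry.corp.io/pdf-renderer:$CI_COMMIT_TAG" \
#     | has_platform linux/amd64 \
#     || { echo "no linux/amd64 image in this tag" >&2; exit 1; }
```

Checking the pushed manifest rather than the build log avoids trusting a log line that, as in this incident, can claim the wrong architecture.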
    

Verification

# Check the container is running
docker ps --filter name=pdf-renderer

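# Verify the image architecture matches the host.
# Architecture is an image field, so resolve the container's image ID first.
# Docker reports amd64 for a host where uname -m prints x86_64.
docker image inspect --format '{{.Architecture}}' \
  "$(docker inspect --format '{{.Image}}' pdf-renderer)"
uname -m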

# Check queue is draining
curl -s http://localhost:9090/metrics | grep pdf_queue_pending_jobs

# Verify logs show normal operation
docker logs -f pdf-renderer
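A single reading of pdf_queue_pending_jobs does not show that the queue is draining; two readings taken some time apart do. The following is a minimal sketch of that comparison. The helper names pending_jobs and is_draining are hypothetical, and it assumes the metric is exposed once, in Prometheus text format, with the value as the last field on the line (no trailing timestamp).

```shell
#!/bin/sh
# Hypothetical helper: print the value of pdf_queue_pending_jobs from
# Prometheus text-format metrics on stdin. Assumes no trailing timestamp.
pending_jobs() {
  awk -v name="pdf_queue_pending_jobs" '
    index($0, name) == 1 {
      # The character after the name must be a space or "{" so that a
      # longer metric name sharing this prefix is not matched.
      c = substr($0, length(name) + 1, 1)
      if (c == " " || c == "{") { print $NF; exit }
    }
  '
}

# Succeed when the later sample ($2) is strictly lower than the earlier one ($1).
is_draining() {
  awk -v before="$1" -v after="$2" 'BEGIN { exit !(after + 0 < before + 0) }'
}

# Usage (endpoint taken from the verification steps above):
#   before=$(curl -s http://localhost:9090/metrics | pending_jobs)
#   sleep 60
#   after=$(curl -s http://localhost:9090/metrics | pending_jobs)
#   is_draining "$before" "$after" && echo "queue draining: $before -> $after"
```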

Artifact Decoder

| Artifact | What It Revealed | What Was Misleading |
|---|---|---|
| CLI Output | Architecture mismatch: arm64 image on x86_64 host; "exec format error" confirms it | Exit code 1 is generic; it could mean many things without the log message |
| Metrics | Queue depth of 1,847 and oldest job at 2h show impact severity | No Prometheus metrics from the app itself; absence of data is the signal |
| IaC Snippet | Standard multi-stage Node.js Dockerfile; nothing wrong with the Dockerfile itself | The Dockerfile is a distraction; the bug is in the build environment, not the build instructions |
| Log Lines | Registry log shows arch=arm64, confirming the wrong architecture was pushed | CI log says "built amd64/linux", which is misleading; the runner name reveals it is ARM64 |

Skills Demonstrated

  • Recognizing "exec format error" as an architecture mismatch
  • Using docker inspect and uname to compare image and host architectures
  • Understanding multi-arch container builds and CI runner selection
  • Interpreting queue depth metrics to assess incident severity
  • Reading CI logs critically (the log message contradicts the runner name)

Prerequisite Topic Packs