Portal | Level: L2: Operations | Topics: CI/CD, GitHub Actions | Domain: DevOps & Tooling

CI/CD Pipeline Architecture

Scope

This document explains CI/CD pipeline architecture from the systems point of view rather than from vendor slogans. It covers:

  • pipeline stages and execution model
  • runners/agents
  • artifacts and dependencies
  • secrets handling
  • environment promotion
  • reproducibility
  • failure modes
  • GitLab and Jenkins mental models
  • what separates a toy pipeline from a production one

Big picture

A CI/CD pipeline is an automated control system that turns source changes into validated artifacts and, optionally, deployed runtime changes.

Generic flow

code change
  -> pipeline trigger
  -> checkout / fetch context
  -> build
  -> test
  -> package artifact
  -> scan / verify / sign
  -> publish artifact
  -> deploy to environment(s)
  -> verify / monitor / possibly roll back

If your pipeline skips artifact discipline and environment gating, it is often just a shell script with delusions of grandeur.

```mermaid
flowchart TD
    A[Code Commit] --> B[Build]
    B --> C[Unit Tests]
    C --> D[Package]
    D --> E[Integration Tests]
    E --> F[Deploy Staging]
    F --> G[Acceptance Tests]
    G --> H[Deploy Production]
```
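The generic flow maps naturally onto a minimal GitLab CI definition. The following is a sketch only; the `./scripts/*.sh` paths are hypothetical placeholders for your real build, test, package, and deploy logic.

```yaml
stages: [build, test, package, deploy]

build:
  stage: build
  script:
    - ./scripts/build.sh          # hypothetical build script
  artifacts:
    paths: [dist/]

unit-tests:
  stage: test
  script:
    - ./scripts/test.sh

package:
  stage: package
  script:
    - ./scripts/package.sh        # produces the deployable artifact
  artifacts:
    paths: [out/app.tar.gz]

deploy-staging:
  stage: deploy
  script:
    - ./scripts/deploy.sh staging
  environment: staging
```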

Core goals of CI/CD

A serious pipeline should provide:

  • repeatability
  • traceability
  • isolation
  • policy enforcement
  • fast feedback
  • safe promotion
  • controlled deployment
  • observable outcomes

The pipeline is part of the product's reliability story, not just a convenience mechanism.


Key architectural components

1. Source control event model

Common triggers:

  • push to branch
  • merge request / pull request
  • tag creation
  • schedule
  • manual approval
  • external webhook
  • parent/child pipeline or downstream trigger

Why trigger design matters

If everything triggers everything, you create expensive noise. If nothing is gated, you create unsafe automation.
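In GitLab, trigger gating can be expressed at the pipeline level with `workflow:rules`, which decides whether a pipeline is created at all. A sketch that allows merge requests, the default branch, tags, and schedules, and drops everything else:

```yaml
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    - if: $CI_COMMIT_TAG
    - if: $CI_PIPELINE_SOURCE == "schedule"
    - when: never          # everything else: no pipeline, no noise
```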


2. Pipeline definition

Examples:

  • .gitlab-ci.yml
  • Jenkinsfile

This defines:

  • stages or DAG relationships
  • jobs
  • scripts/steps
  • variables
  • rules/conditions
  • artifacts
  • dependencies
  • environments
  • approvals

Important difference

Some systems are more stage-oriented by default; some can run true DAGs. The design question is always:

  • what can run in parallel,
  • what must wait,
  • what evidence is required before promotion?

3. Runners / agents / executors

Jobs run somewhere:

  • VM
  • container
  • Kubernetes pod
  • bare-metal agent
  • ephemeral cloud worker

What matters

  • isolation
  • reproducibility
  • available tools
  • performance
  • security of secrets
  • cleanup between jobs

A polluted long-lived agent is a liar. It will make broken builds "work on CI" because yesterday's leftovers are still there.


4. Artifact strategy

The build output should become an explicit artifact:

  • package
  • container image
  • binary
  • archive
  • manifest bundle
  • SBOM
  • signature/provenance metadata

Why artifacts matter

You want to deploy the thing you tested, not rebuild something slightly different later.

This is a core dividing line between mature and amateur pipelines.
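In GitLab terms, artifact discipline means a packaging job declares its output explicitly and the deploy job consumes that exact output rather than rebuilding. A sketch, with a hypothetical `deploy.sh`:

```yaml
package:
  stage: package
  script:
    - tar czf app.tar.gz dist/
  artifacts:
    paths: [app.tar.gz]
    expire_in: 1 week

deploy-staging:
  stage: deploy
  needs: [package]          # downloads the exact artifact package produced
  script:
    - ./scripts/deploy.sh app.tar.gz staging
```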


5. Secret handling

Pipelines often need:

  • registry credentials
  • cloud credentials
  • signing keys
  • deploy tokens
  • SSH keys
  • API tokens

Good patterns

  • short-lived credentials where possible
  • scoped secrets
  • environment-specific exposure
  • avoid printing secrets
  • avoid baking secrets into images/artifacts
  • masked/protected variables are not magic; job scripts can still exfiltrate them if you let the wrong code run
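A sketch of environment-specific secret exposure in GitLab: `DEPLOY_TOKEN` is assumed to be a masked, protected CI/CD variable, and the job only runs on the default branch, so unreviewed fork/MR code never sees it.

```yaml
deploy-production:
  stage: deploy
  script:
    # DEPLOY_TOKEN is a masked, protected variable; never echo it
    # and never write it into artifacts or images.
    - ./scripts/deploy.sh --token "$DEPLOY_TOKEN" production
  environment: production
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```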

6. Environments and promotion

Typical environment path:

build
  -> unit/integration tests
  -> dev deploy
  -> staging
  -> production

Promotion models

  • rebuild per environment
  • single immutable artifact promoted across environments

The second model is generally stronger because it preserves tested identity.
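One way to implement the single-artifact model in GitLab is to build and push an image once, export its digest via a dotenv report, and have every deploy job reference that same digest. The `build_and_push.sh` and `deploy.sh` scripts here are hypothetical:

```yaml
build:
  stage: build
  script:
    - DIGEST=$(./scripts/build_and_push.sh)    # hypothetical; prints pushed image digest
    - echo "IMAGE_DIGEST=$DIGEST" > build.env
  artifacts:
    reports:
      dotenv: build.env      # exposes IMAGE_DIGEST to downstream jobs

deploy-staging:
  stage: deploy
  needs: [build]
  script:
    - ./scripts/deploy.sh "$IMAGE_DIGEST" staging
  environment: staging

deploy-production:
  stage: deploy
  needs: [build, deploy-staging]
  script:
    - ./scripts/deploy.sh "$IMAGE_DIGEST" production   # same digest, tested identity preserved
  environment: production
  when: manual
```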


GitLab mental model

GitLab CI/CD pipelines are defined in .gitlab-ci.yml and consist of jobs that may be organized by stages, rules, includes, and reusable components.

Useful primitives:

  • stages
  • jobs
  • rules
  • needs
  • artifacts
  • caches
  • environments
  • components/includes

Strong GitLab pattern

Use needs: to build a DAG rather than serializing the whole universe into artificial stages.
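A minimal sketch of that pattern: `lint` starts immediately with `needs: []`, `unit-tests` waits only for `build`, and `package` waits for both, instead of every stage waiting for the entire previous stage.

```yaml
stages: [build, test, package]

build:
  stage: build
  script: [./scripts/build.sh]

lint:
  stage: test
  needs: []                  # no dependency on build; starts right away
  script: [./scripts/lint.sh]

unit-tests:
  stage: test
  needs: [build]
  script: [./scripts/test.sh]

package:
  stage: package
  needs: [build, unit-tests]  # lint is not on the packaging critical path
  script: [./scripts/package.sh]
```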


Jenkins mental model

Jenkins Pipeline uses a Jenkinsfile and can be:

  • Declarative
  • Scripted

Jenkins is extremely flexible and therefore extremely capable of becoming an archaeological site of plugin-related suffering.

Strong Jenkins pattern

  • keep pipelines as code
  • minimize plugin sprawl
  • use agents intentionally
  • externalize heavy logic into versioned scripts/tools
  • keep credentials and environment boundaries explicit
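Those patterns combine into a Declarative Jenkinsfile roughly like the sketch below. The agent label, credentials ID, and script paths are placeholder assumptions, not a prescription:

```groovy
// Sketch of a Declarative Jenkinsfile; labels, IDs, and scripts are placeholders.
pipeline {
    agent { label 'linux-build' }          // pick agents intentionally
    options {
        disableConcurrentBuilds()          // avoid racing builds/deploys
    }
    environment {
        DEPLOY_TOKEN = credentials('deploy-token-id')  // hypothetical credentials ID
    }
    stages {
        stage('Build')   { steps { sh './scripts/build.sh' } }
        stage('Test')    { steps { sh './scripts/test.sh' } }
        stage('Package') { steps { sh './scripts/package.sh' } }
        stage('Deploy staging') {
            when { branch 'main' }         // explicit environment boundary
            steps { sh './scripts/deploy.sh staging' }
        }
    }
}
```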

Pipeline execution model

Serial stage model

Simple but often slow:

build -> test -> package -> deploy

DAG model

Better when tasks can run in parallel:

build
  -> unit tests      (parallel)
  -> lint            (parallel)
  -> scan            (parallel)
  -> package
       -> deploy staging

The architecture should reflect dependency truth, not human laziness.


Build reproducibility

A pipeline should answer:

  • what exact source revision was built?
  • what dependency versions were used?
  • what base image/toolchain was used?
  • can I rebuild or verify it later?

Enablers

  • pinned dependencies where appropriate
  • locked base images / digests
  • deterministic build scripts
  • explicit toolchain versions
  • isolated runners
  • immutable artifacts
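A concrete enabler for "no random unpinned downloads": verify any tool fetched during the build against a pinned checksum before using it. A minimal shell sketch; the URL and checksum variables in the usage comment are placeholders:

```shell
#!/usr/bin/env sh
# Sketch: refuse to use a downloaded tool unless it matches a pinned checksum.
set -eu

verify_checksum() {
  # $1 = file path, $2 = expected sha256 hex digest
  file="$1"
  expected="$2"
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for $file" >&2
    return 1
  fi
  echo "checksum ok: $file"
}

# Usage in a build step (placeholder values):
#   curl -fsSLo tool.tar.gz "$TOOL_URL"
#   verify_checksum tool.tar.gz "$PINNED_SHA256"
```

The same idea applies to base images: pull by digest (`image@sha256:...`) rather than by mutable tag.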

Things that ruin reproducibility

  • latest everywhere
  • mutable shared runners with state leakage
  • downloading random tools during build without pinning
  • rebuilding separately for each environment
  • hidden manual steps

Testing layers

Different test types belong at different points:

  • lint/static analysis
  • unit tests
  • integration tests
  • contract tests
  • security scans
  • image scans
  • smoke tests
  • end-to-end tests

Do not put every test in the critical path for every tiny commit if it destroys developer throughput. Build a rational pyramid, not a punishment machine.
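One way to express the pyramid in GitLab is `rules`: cheap checks run on every pipeline, expensive suites only on the default branch or a schedule. The script paths are placeholders:

```yaml
unit-tests:
  stage: test
  script: [./scripts/unit_tests.sh]
  rules:
    - when: always            # cheap and fast; run on every pipeline

e2e-tests:
  stage: test
  script: [./scripts/e2e_tests.sh]
  rules:
    # expensive; keep it off the per-commit critical path
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    - if: $CI_PIPELINE_SOURCE == "schedule"
```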


Deployment patterns

Common models:

  • rolling update
  • blue/green
  • canary
  • recreate
  • immutable replacement
  • progressive delivery

The pipeline should know whether deploy is:

  • pushing bits,
  • updating desired state,
  • or triggering another orchestrator.

Rollback and recovery

A production pipeline without rollback thinking is negligence disguised as optimism.

Options:

  • revert to prior artifact version
  • redeploy previous release
  • use orchestrator-native rollout undo
  • traffic-shift back in canary/blue-green model
  • feature flag disablement

Rollback is easier when artifacts are immutable and deploy metadata is explicit.
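With immutable artifacts, rollback can be a first-class manual job that redeploys a prior digest rather than rebuilding anything. A sketch; `PREVIOUS_IMAGE_DIGEST` is assumed to come from your deploy metadata or be supplied when the job is run:

```yaml
rollback-production:
  stage: deploy
  when: manual
  script:
    # Redeploy the previous known-good artifact by digest; no rebuild,
    # so the rolled-back state is exactly what was tested before.
    - ./scripts/deploy.sh "$PREVIOUS_IMAGE_DIGEST" production
  environment: production
```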


Observability of the pipeline itself

You need visibility into:

  • job duration
  • queue times
  • flaky tests
  • failure distribution
  • deployment frequency
  • MTTR after failed deploy
  • artifact provenance
  • who approved what

Otherwise your delivery system becomes a black box that occasionally screams.


Common production failure patterns

1. Non-reproducible builds

Symptoms:

  • works one day, fails another
  • prod artifact differs from tested artifact
  • runners disagree

Causes:

  • mutable dependencies
  • leaked state on runners
  • implicit environment assumptions

2. Secret leakage

Causes:

  • verbose shell tracing
  • artifact inclusion
  • logs
  • unsafe fork/MR job execution
  • baking credentials into images

3. Slow pipeline nobody trusts

Causes:

  • serialized jobs that could be parallel
  • giant monolithic test stages
  • cache misuse
  • slow runner startup
  • no pipeline architecture discipline

4. Deployment race conditions

Causes:

  • multiple applies to same environment
  • no locking/serialization
  • stale environment assumptions
  • parallel pipelines fighting
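GitLab's answer to "multiple applies to the same environment" is `resource_group`, which serializes jobs that share the group name even across concurrent pipelines:

```yaml
deploy-production:
  stage: deploy
  script:
    - ./scripts/deploy.sh production     # hypothetical deploy script
  environment: production
  resource_group: production   # only one job in this group runs at a time
```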

5. Pipeline as snowflake

Causes:

  • all logic trapped in UI config
  • plugin maze
  • undocumented manual steps
  • one wizard-admin knows how it works and has now left the company

Good design principles

  • pipeline as code
  • immutable artifacts
  • explicit environment promotion
  • isolated runners
  • least-privilege secrets
  • concurrency control for deploys
  • reusable shared pipeline components
  • separate build from deploy concerns
  • surface evidence for approvals, not vibes
  • make rollback a first-class design concern

Practical debugging workflow

Step 1 - classify failure domain

  • source/build failure
  • test failure
  • environment problem
  • runner issue
  • secret/credential issue
  • deploy orchestration issue
  • artifact issue

Step 2 - verify artifact chain

  • what commit?
  • what artifact ID?
  • what image digest?
  • what was actually deployed?

Step 3 - compare successful vs failed run context

  • runner image/version
  • variables
  • dependency versions
  • cache hit/miss
  • branch/rules path
  • target environment

Step 4 - look for non-determinism

  • network downloads
  • mutable tags
  • timing/order assumptions
  • race conditions

Interview angles

Good questions hidden here:

  • what makes a pipeline production-grade
  • why immutable artifacts matter
  • difference between CI and CD
  • how to secure secrets in pipelines
  • why reproducibility is important
  • how GitLab/Jenkins express pipelines
  • why deployment jobs often need serialization
  • how to design rollback

Strong answers treat the pipeline as an engineered system, not just "a bunch of jobs."


Mental model to keep

A CI/CD pipeline is a controlled evidence chain:

  1. identify source revision
  2. build in a known environment
  3. validate with the right tests
  4. produce an immutable artifact
  5. promote that artifact through gated environments
  6. deploy with traceability and rollback options
  7. observe outcomes

If the pipeline cannot tell you exactly what was built, tested, approved, and deployed, it is not trustworthy.

