Kubernetes Operators Cheat Sheet

Name origin: The "Operator pattern" was coined by CoreOS in 2016 to describe software that encodes human operational knowledge into Kubernetes controllers. The idea: if a skilled DBA knows how to set up, scale, back up, and recover a database, an Operator automates that expertise as code. The reconciliation loop is the core concept: continuously compare desired state (the CR) with actual state (the cluster) and take corrective action.

Operator Pattern

User creates/updates CR (Custom Resource)
Controller watches for CR changes
Reconcile loop compares desired vs actual
    ┌────┴────┐
    │ Match?  │
    │ Yes: ✓  │──► Requeue after interval
    │ No:     │──► Create/update/delete resources
    └─────────┘
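
The loop above can be sketched as a plain-Go toy (no Kubernetes dependencies; `State` and `reconcile` are illustrative names, not controller-runtime APIs):

```go
package main

import "fmt"

// State holds the one field this toy reconciles: a replica count.
// Desired comes from the CR spec; actual is what exists in the cluster.
type State struct{ Replicas int }

// reconcile compares desired vs actual and returns the corrective action,
// mirroring the three branches in the diagram above.
func reconcile(desired, actual State) string {
	switch {
	case desired.Replicas == actual.Replicas:
		return "match: requeue after interval"
	case desired.Replicas > actual.Replicas:
		return fmt.Sprintf("create %d replicas", desired.Replicas-actual.Replicas)
	default:
		return fmt.Sprintf("delete %d replicas", actual.Replicas-desired.Replicas)
	}
}

func main() {
	fmt.Println(reconcile(State{Replicas: 3}, State{Replicas: 1})) // create 2 replicas
	fmt.Println(reconcile(State{Replicas: 3}, State{Replicas: 3})) // match: requeue after interval
}
```

A real Reconcile function does the same comparison but returns a ctrl.Result and issues API calls instead of strings.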

CRD Quick Reference

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com    # plural.group
spec:
  group: example.com
  names:
    kind: Database               # CamelCase
    plural: databases            # lowercase
    singular: database
    shortNames: ["db"]
  scope: Namespaced              # or Cluster
  versions:
  - name: v1
    served: true
    storage: true
    subresources:
      status: {}                 # Enable status subresource
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
                minimum: 1
          status:
            type: object
            properties:
              phase:
                type: string
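
A CR instance that satisfies the schema above might look like this (the name and replica count are illustrative):

```yaml
apiVersion: example.com/v1
kind: Database
metadata:
  name: my-db
spec:
  replicas: 3
```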

Kubebuilder Commands

# Init project
kubebuilder init --domain example.com --repo github.com/org/operator

# Create API (CRD + Controller)
kubebuilder create api --group app --version v1 --kind Database

# Create webhook
kubebuilder create webhook --group app --version v1 --kind Database \
  --defaulting --programmatic-validation

# Regenerate after type changes
make manifests    # CRD YAML, RBAC, webhook config
make generate     # DeepCopy methods

# Run locally
make run

# Build & deploy
make docker-build docker-push IMG=registry/operator:v1
make deploy IMG=registry/operator:v1

Owner References

// Set owner reference before creating the child (enables garbage collection)
if err := ctrl.SetControllerReference(database, pod, r.Scheme); err != nil {
    return ctrl.Result{}, err
}

// Result on the child resource:
ownerReferences:
- apiVersion: example.com/v1
  kind: Database
  name: my-db
  uid: abc-123
  controller: true
  blockOwnerDeletion: true

Delete parent → children auto-deleted.

Gotcha: Owner references enable Kubernetes garbage collection — when you delete a parent CR, all child resources (Pods, Services, ConfigMaps) are automatically cleaned up. But if you forget to set ctrl.SetControllerReference(), deleting the CR leaves orphaned resources running in the cluster. Always set owner references on every resource your operator creates.

Finalizers

const finalizerName = "databases.example.com/cleanup"

// Add finalizer if missing (during normal reconcile)
if db.DeletionTimestamp.IsZero() {
    if !controllerutil.ContainsFinalizer(db, finalizerName) {
        controllerutil.AddFinalizer(db, finalizerName)
        return ctrl.Result{}, r.Update(ctx, db)
    }
} else if controllerutil.ContainsFinalizer(db, finalizerName) {
    // Being deleted: clean up external resources first
    if err := cleanupExternalDB(db); err != nil {
        return ctrl.Result{}, err
    }
    // Remove finalizer so Kubernetes can complete the deletion
    controllerutil.RemoveFinalizer(db, finalizerName)
    return ctrl.Result{}, r.Update(ctx, db)
}

Debug clue: If a Custom Resource is stuck in "Deleting" state, it almost certainly has a finalizer that is not being removed. Check with kubectl get <resource> -o jsonpath='{.metadata.finalizers}'. If the operator that manages the finalizer is down, you can manually remove it (as a last resort): kubectl patch <resource> <name> --type=merge -p '{"metadata":{"finalizers":[]}}', or kubectl edit <resource> and delete the finalizer entry.

Reconcile Return Values

// Success, don't requeue
return ctrl.Result{}, nil

// Requeue after 30 seconds
return ctrl.Result{RequeueAfter: 30 * time.Second}, nil

// Requeue immediately (rate-limited)
return ctrl.Result{Requeue: true}, nil

// Error (auto-requeue with backoff)
return ctrl.Result{}, err

Status Conditions Pattern

meta.SetStatusCondition(&db.Status.Conditions, metav1.Condition{
    Type:    "Ready",
    Status:  metav1.ConditionTrue,
    Reason:  "AllReplicasReady",
    Message: "All 3 replicas are running",
})
// LastTransitionTime is set automatically when the status changes
if err := r.Status().Update(ctx, db); err != nil {
    return ctrl.Result{}, err
}

Common Operators

Operator             For           Install
CloudNativePG        PostgreSQL    kubectl apply -f cnpg-manifest.yaml
Strimzi              Kafka         Helm or OLM
Prometheus Operator  Monitoring    kube-prometheus-stack Helm chart
cert-manager         TLS certs     kubectl apply -f cert-manager.yaml
ArgoCD               GitOps        kubectl apply -n argocd -f install.yaml
Kyverno              Policy        Helm chart