Crossplane - Street-Level Ops

What experienced Crossplane operators know that tutorials don't teach.

Quick Diagnosis Commands

# Crossplane pod health
kubectl get pods -n crossplane-system
kubectl logs -n crossplane-system deploy/crossplane --follow
kubectl logs -n crossplane-system deploy/crossplane-rbac-manager --follow

# All managed resources (across all providers)
kubectl get managed
kubectl get managed -o wide
kubectl get managed --all-namespaces

# Managed resources by provider (filter on the API group; managed resources
# do not carry a provider label by default)
kubectl get managed -o name | grep aws.upbound.io

# Composite resources (XR) and claims
kubectl get composite                          # all XRs
kubectl get xpostgresqlinstances               # specific XR type
kubectl get claim -A                           # all claims across namespaces
kubectl get postgresqlinstances -A             # specific claim type

# Resource status and conditions
kubectl describe instance.rds.aws.upbound.io prod-db
kubectl get managed -o json | jq '.items[] | select(.status.conditions[]?.reason == "ReconcileError") | .metadata.name'

# Provider status
kubectl get providers
kubectl describe provider upbound-provider-aws
kubectl get providerrevisions

# Package status (providers, configurations)
kubectl get pkg
kubectl get configurations
kubectl get configurationrevisions

# Events — most useful for debugging
kubectl get events -n crossplane-system --sort-by='.lastTimestamp'
kubectl describe xpostgresqlinstance prod-db  # shows conditions and events
kubectl get events --field-selector involvedObject.name=prod-db -A

# Function status (for composition functions, Crossplane 1.14+)
kubectl get functions
kubectl get functionrevisions
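
When jq is not installed, the plain table output of kubectl get managed can be filtered instead. The sketch below is one way to do that; the function name unhealthy_managed is made up for this example, and it assumes each table has a header row containing SYNCED and READY columns (their order differs between providers, so the header is parsed rather than hard-coded).

```shell
# Reads `kubectl get managed` table output on stdin and prints the name of
# every resource whose SYNCED or READY column is not "True".
unhealthy_managed() {
  awk '
    NF == 0 { next }                 # blank line between per-kind tables
    $1 == "NAME" {                   # header row: locate the columns
      synced = 0; ready = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "SYNCED") synced = i
        if ($i == "READY")  ready = i
      }
      next
    }
    (synced && $synced != "True") || (ready && $ready != "True") { print $1 }
  '
}
```

Usage: kubectl get managed | unhealthy_managed. A resource with an empty condition column shifts the fields and is reported as unhealthy, which is usually the desired outcome but is worth knowing when reading the list.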

Common Scenarios

Scenario 1: Managed Resource Stuck in "Synced: False"

A managed resource (e.g., an RDS instance, S3 bucket) shows SYNCED: False and never becomes Ready.

Diagnosis:

# Get the full status
kubectl describe instance.rds.aws.upbound.io prod-db

# Look for conditions
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status.conditions'
# Common conditions:
# - Ready: True/False
# - Synced: True/False (reconcile loop ran without error)
# - reason: ReconcileError, Creating, Deleting

# Check the actual error
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status.conditions[] | select(.type=="Synced") | .message'

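# Check provider logs for more detail (the Deployment is usually named after the
# provider revision, so confirm the name with: kubectl get deploy -n crossplane-system)
kubectl logs -n crossplane-system deploy/upbound-provider-aws -f | grep "prod-db"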

# Check ProviderConfig auth is working
kubectl describe providerconfig aws-provider-config

# Check if the external resource exists in AWS (may need aws CLI)
aws rds describe-db-instances --db-instance-identifier prod-db --region us-east-1

Common causes and fixes:

1. Authentication failure (ProviderConfig not working):
   - "AccessDenied" in the error message
   - Check IRSA annotation on provider's ServiceAccount
   - kubectl get sa -n crossplane-system | grep provider
   - kubectl describe sa upbound-provider-aws -n crossplane-system
   - Verify role ARN and trust policy allows the OIDC provider

2. Resource already exists in cloud but not in Crossplane state:
   - Import the external resource by setting `crossplane.io/external-name` annotation
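   - kubectl annotate instance.rds.aws.upbound.io prod-db crossplane.io/external-name=actual-rds-id --overwrite
   - Or delete and recreate the managed resource (check spec.deletionPolicy first: the default, Delete, also deletes the external resource)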

3. Invalid spec field:
   - "InvalidParameterCombination" or validation errors in message
   - Fix the spec in the managed resource YAML and re-apply

4. Region mismatch:
   - Resource created in wrong region, or ProviderConfig and resource spec disagree
   - Check spec.forProvider.region matches ProviderConfig's region setting
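
The triage in the list above can be sketched as a small lookup on the Synced message. This is only an illustration: classify_sync_error is a made-up helper, and the substrings it matches are assumptions about typical AWS error text, not a complete list. Region mismatches have no single error signature, so they fall through to the default branch.

```shell
# Maps a Synced condition message to the likely cause from the list above.
# The matched substrings are illustrative assumptions, not an exhaustive list
# of what the AWS API or the provider can return.
classify_sync_error() {
  case "$1" in
    *AccessDenied*|*UnauthorizedOperation*)
      echo "auth: check ProviderConfig and the IRSA role" ;;
    *AlreadyExists*|*"already exists"*)
      echo "exists: adopt it with crossplane.io/external-name" ;;
    *InvalidParameter*|*ValidationError*)
      echo "spec: fix spec.forProvider and re-apply" ;;
    *)
      echo "unknown: read the provider logs" ;;
  esac
}
```

Usage: classify_sync_error "$(kubectl get instance.rds.aws.upbound.io prod-db -o jsonpath='{.status.conditions[?(@.type=="Synced")].message}')"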

Scenario 2: Claim Not Propagating to Composite Resource

A user creates a Claim (e.g., PostgreSQLInstance) but no Composite Resource (XR) or managed resources appear.

Diagnosis:

# Check claim status
kubectl describe postgresqlinstance my-db -n my-namespace

# Is there a matching XRD?
kubectl get xrds
kubectl describe xrd xpostgresqlinstances.database.example.com

# Is the XRD in "Established" state?
kubectl get xrd xpostgresqlinstances.database.example.com -o json | jq '.status.conditions'

# Is there a Composition selected?
kubectl get compositions
kubectl describe composition xpostgresqlinstance-aws
# Check: spec.compositeTypeRef must match the XRD

# Check if the claim created an XR at all
kubectl get xpostgresqlinstances   # composite resources (cluster-scoped)
kubectl get xpostgresqlinstances -o json | jq '.items[].metadata.name'

# Look at Crossplane controller logs
kubectl logs -n crossplane-system deploy/crossplane | grep "my-db\|postgresqlinstance"

Fix:

# If XRD is not Established, check for validation errors
kubectl describe xrd xpostgresqlinstances.database.example.com
# Common issue: spec.versions schema validation errors

# If Composition is not selected, check the compositionSelector labels match
kubectl get claim my-db -n my-namespace -o json | jq '.spec.compositionSelector'
kubectl get compositions -l environment=production   # must match

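# Nudge a reconcile if stuck: a metadata change queues one, so bump a throwaway
# annotation with a new value (the key below is arbitrary, not a Crossplane API)
kubectl annotate postgresqlinstance my-db -n my-namespace \
  example.com/reconcile-nudge="$(date +%s)" --overwrite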
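
The diagnosis steps above amount to following references from the claim to the XR to the Composition and stopping at the first one that is empty. A minimal sketch, assuming the claim records its XR in spec.resourceRef.name and the XR records its Composition in spec.compositionRef.name; trace_claim is a hypothetical helper name.

```shell
# Follows claim -> XR -> Composition and reports the first missing link.
# Arguments: claim resource type, claim name, namespace, XR resource type.
trace_claim() {
  xr=$(kubectl get "$1" "$2" -n "$3" -o jsonpath='{.spec.resourceRef.name}')
  if [ -z "$xr" ]; then
    echo "no XR bound: check that the XRD is Established and offers claimNames"
    return 1
  fi
  comp=$(kubectl get "$4" "$xr" -o jsonpath='{.spec.compositionRef.name}')
  if [ -z "$comp" ]; then
    echo "XR $xr has no Composition: check compositeTypeRef and compositionSelector"
    return 1
  fi
  echo "claim $2 -> XR $xr -> Composition $comp"
}
```

Usage: trace_claim postgresqlinstance my-db my-namespace xpostgresqlinstance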

Scenario 3: Composition Revision Rollback

A composition update breaks all managed resources. You need to roll back to the previous revision.

# List composition revisions
kubectl get compositionrevisions --sort-by='.metadata.creationTimestamp'

# Check which revision an XR is using (the ref lives on the XR, not the Composition)
kubectl get xpostgresqlinstance prod-db -o json | jq '.spec.compositionRevisionRef'

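# See what changed between two revisions (bash; the second name is a placeholder
# for the newer revision)
diff \
  <(kubectl get compositionrevision xpostgresqlinstance-aws-abc123 -o json | jq '.spec') \
  <(kubectl get compositionrevision xpostgresqlinstance-aws-def456 -o json | jq '.spec')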

# Stop the XR from auto-upgrading to new revisions before pinning it.
# compositionUpdatePolicy is set on the XR: Automatic (default) or Manual.
# With Automatic, Crossplane moves the XR back to the latest revision.
kubectl patch xpostgresqlinstance prod-db --type merge \
  -p '{"spec":{"compositionUpdatePolicy":"Manual"}}'

# Pin the composite resource to a specific revision
kubectl patch xpostgresqlinstance prod-db --type merge \
  -p '{"spec":{"compositionRevisionRef":{"name":"xpostgresqlinstance-aws-abc123"}}}'
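
When a bad revision has hit every XR, pinning them one by one is slow. The sketch below loops over all XRs of one type that use a given Composition and, in a single patch each, sets the update policy to Manual and pins the revision. rollback_xrs is a hypothetical name, and the approach assumes the XRs are not managed through claims; where they are, the same fields may need to be set on the claim instead.

```shell
# Pins every XR of one type that uses the given Composition to a known-good
# CompositionRevision.
# Arguments: XR resource type, Composition name, CompositionRevision name.
rollback_xrs() {
  kubectl get "$1" -o jsonpath='{range .items[?(@.spec.compositionRef.name=="'"$2"'")]}{.metadata.name}{"\n"}{end}' |
  while read -r xr; do
    [ -n "$xr" ] || continue
    # Manual policy and the pinned revision go in one merge patch
    kubectl patch "$1" "$xr" --type merge \
      -p "{\"spec\":{\"compositionUpdatePolicy\":\"Manual\",\"compositionRevisionRef\":{\"name\":\"$3\"}}}"
  done
}
```

Usage: rollback_xrs xpostgresqlinstance xpostgresqlinstance-aws xpostgresqlinstance-aws-abc123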

Scenario 4: Debugging a Composition (Which Pipeline Step Is Failing)

A composite resource exists but managed resources are not being created. The composition pipeline is failing somewhere.

# Check composite resource status in detail
kubectl describe xpostgresqlinstance prod-db

# Look for pipeline errors (Crossplane 1.14+ with Functions)
kubectl get xpostgresqlinstance prod-db -o json | jq '.status'
# Look for: .status.conditions, .status.connectionDetails

# If using classic Composition (patches/resources), check each managed resource
kubectl get managed -l crossplane.io/composite=prod-db

# Check if the composed resources exist at all
kubectl get managed | grep prod-db

# Check Crossplane logs for composition errors
kubectl logs -n crossplane-system deploy/crossplane -f | grep -E "error|Error|prod-db"

# Validate the Composition spec (is the API version correct?)
kubectl apply -f composition.yaml --dry-run=server

# Test with a minimal claim to isolate the failing resource

Key Patterns

ProviderConfig with IRSA (AWS, EKS)

# ServiceAccount for the provider pod
apiVersion: v1
kind: ServiceAccount
metadata:
  name: upbound-provider-aws
  namespace: crossplane-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/crossplane-provider-role
---
# ProviderConfig that references the ServiceAccount credentials
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: default
spec:
  credentials:
    source: IRSA
# Corresponding IAM trust policy (must allow the OIDC provider)
# Principal: arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE
# Condition: StringEquals
#   oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:sub: system:serviceaccount:crossplane-system:upbound-provider-aws

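# Verify IRSA is wired up. Provider images generally do not ship the aws CLI, so
# check that the EKS webhook injected the role settings into the provider pod
kubectl get pods -n crossplane-system -o yaml | \
  grep -A1 -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE'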
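
The trust policy described in the comments above can be written out in full. The sketch below renders it from four inputs; irsa_trust_policy is a made-up helper, and the policy contains only the sub condition named in the text (many setups also add an aud condition, which is not shown here).

```shell
# Prints an IAM trust policy for IRSA.
# Arguments: AWS account ID, OIDC provider host/path, namespace, ServiceAccount name.
irsa_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "arn:aws:iam::$1:oidc-provider/$2" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "$2:sub": "system:serviceaccount:$3:$4"
        }
      }
    }
  ]
}
EOF
}
```

Usage: irsa_trust_policy 123456789012 oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE crossplane-system upbound-provider-aws > trust.json. The ServiceAccount name in the sub condition must match the name the provider pod actually runs under, so check it with kubectl get sa -n crossplane-system first.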

Composition with Patches

apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xpostgresqlinstance-aws
spec:
  compositeTypeRef:
    apiVersion: database.example.com/v1alpha1
    kind: XPostgreSQLInstance
  resources:
    - name: rds-instance
      base:
        apiVersion: rds.aws.upbound.io/v1beta1
        kind: Instance
        spec:
          forProvider:
            region: us-east-1
            instanceClass: db.t3.micro
            engine: postgres
            engineVersion: "15"
            skipFinalSnapshot: true
            autoGeneratePassword: true
            masterUsername: adminuser
            dbName: mydb
            publiclyAccessible: false
      patches:
        # Claim field → Managed resource field
        - type: FromCompositeFieldPath
          fromFieldPath: spec.parameters.storageGB
          toFieldPath: spec.forProvider.allocatedStorage
        - type: FromCompositeFieldPath
          fromFieldPath: spec.parameters.size
          toFieldPath: spec.forProvider.instanceClass
          transforms:
            - type: map
              map:
                small: db.t3.micro
                medium: db.t3.medium
                large: db.t3.large
        # Tag with composite name
        - type: FromCompositeFieldPath
          fromFieldPath: metadata.name
          toFieldPath: spec.forProvider.tags.Name
        # Status back from managed resource to composite
        # (declare status.endpoint in the XRD schema too, or the value will not persist)
        - type: ToCompositeFieldPath
          fromFieldPath: status.atProvider.address
          toFieldPath: status.endpoint

XRD (Composite Resource Definition)

apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xpostgresqlinstances.database.example.com
spec:
  group: database.example.com
  names:
    kind: XPostgreSQLInstance
    plural: xpostgresqlinstances
  claimNames:
    kind: PostgreSQLInstance          # what users create
    plural: postgresqlinstances
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                parameters:
                  type: object
                  required: [storageGB, size]
                  properties:
                    storageGB:
                      type: integer
                      minimum: 20
                      maximum: 1000
                    size:
                      type: string
                      enum: [small, medium, large]
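
To show what users actually submit against this XRD, here is a sketch that renders a matching claim and rejects values the schema would refuse (the size enum and the 20 to 1000 storageGB range). render_claim is a hypothetical helper; the API server still performs the authoritative validation.

```shell
# Prints a PostgreSQLInstance claim after checking the values against the XRD
# schema above. Arguments: claim name, namespace, size, storageGB.
render_claim() {
  case "$3" in
    small|medium|large) ;;
    *) echo "size must be small, medium, or large" >&2; return 1 ;;
  esac
  case "$4" in
    ''|*[!0-9]*) echo "storageGB must be an integer" >&2; return 1 ;;
  esac
  if [ "$4" -lt 20 ] || [ "$4" -gt 1000 ]; then
    echo "storageGB must be between 20 and 1000" >&2
    return 1
  fi
  cat <<EOF
apiVersion: database.example.com/v1alpha1
kind: PostgreSQLInstance
metadata:
  name: $1
  namespace: $2
spec:
  parameters:
    size: $3
    storageGB: $4
EOF
}
```

Usage: render_claim my-db my-namespace small 20 | kubectl apply -f -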

external-name Annotation (Adopting Existing Resources)

# If an AWS resource already exists and you want Crossplane to manage it
# Set external-name to the cloud provider's identifier
kubectl annotate instance.rds.aws.upbound.io prod-db \
  crossplane.io/external-name=prod-db-actual-identifier

# For S3 buckets (external-name = bucket name)
kubectl annotate bucket.s3.aws.upbound.io my-bucket \
  crossplane.io/external-name=my-actual-bucket-name-in-aws

# After annotating, Crossplane will observe (not create) the resource
# Check status to verify it connected to the existing resource
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status'

Package Management

# Install a provider
kubectl apply -f - <<EOF
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: upbound-provider-aws
spec:
  package: xpkg.upbound.io/upbound/provider-aws-rds:v1.2.0
  runtimeConfigRef:
    name: upbound-provider-aws   # a DeploymentRuntimeConfig that sets the ServiceAccount name
EOF

# Check package install progress
kubectl get providers -w
kubectl describe provider upbound-provider-aws
# Look for: Installed: True, Healthy: True

# List installed CRDs from a provider
kubectl get crds | grep rds.aws.upbound.io

# Update a provider (change the package version)
kubectl patch provider upbound-provider-aws --type merge \
  -p '{"spec":{"package":"xpkg.upbound.io/upbound/provider-aws-rds:v1.3.0"}}'

# Install a Configuration package (bundles XRDs + Compositions)
kubectl apply -f - <<EOF
apiVersion: pkg.crossplane.io/v1
kind: Configuration
metadata:
  name: platform-ref-aws
spec:
  package: xpkg.upbound.io/upbound/platform-ref-aws:v0.9.0
EOF
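
In Crossplane 1.14 and later, a provider's ServiceAccount is usually customised through a DeploymentRuntimeConfig that the Provider references with spec.runtimeConfigRef. The sketch below renders one that fixes the ServiceAccount name and adds the IRSA role annotation; render_runtime_config is a made-up helper, and using the same name for the config and the ServiceAccount is just a convention chosen for this example.

```shell
# Prints a DeploymentRuntimeConfig that gives the provider pod a fixed
# ServiceAccount name and an IRSA role annotation.
# Arguments: name (used for the config and the ServiceAccount), IAM role ARN.
render_runtime_config() {
  cat <<EOF
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: $1
spec:
  serviceAccountTemplate:
    metadata:
      name: $1
      annotations:
        eks.amazonaws.com/role-arn: $2
EOF
}
```

Usage: render_runtime_config upbound-provider-aws arn:aws:iam::123456789012:role/crossplane-provider-role | kubectl apply -f -. Apply it before the Provider that references it.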

Pausing and Resuming Reconciliation

# Pause a managed resource (stop Crossplane from touching the cloud resource)
kubectl annotate instance.rds.aws.upbound.io prod-db \
  crossplane.io/paused=true

# Resume
kubectl annotate instance.rds.aws.upbound.io prod-db \
  crossplane.io/paused-

# Pause all resources in a composite (for maintenance)
kubectl get managed -l crossplane.io/composite=prod-db -o name | \
  xargs -I{} kubectl annotate {} crossplane.io/paused=true
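
The bulk pause above has no matching bulk resume, which is easy to forget after maintenance. A sketch that handles both directions for every managed resource composed by one XR; set_paused is a hypothetical helper name.

```shell
# Sets or clears crossplane.io/paused on every managed resource composed by an XR.
# Arguments: XR name, then "pause" or "resume".
set_paused() {
  case "$2" in
    pause)  change='crossplane.io/paused=true' ;;
    resume) change='crossplane.io/paused-' ;;
    *) echo "usage: set_paused <xr-name> pause|resume" >&2; return 1 ;;
  esac
  kubectl get managed -l "crossplane.io/composite=$1" -o name |
  while read -r res; do
    kubectl annotate "$res" "$change" --overwrite
  done
}
```

Usage: set_paused prod-db pause before the maintenance window, then set_paused prod-db resume afterwards.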

Observing Drift

War story: An engineer manually resized an RDS instance in the AWS console during an incident. Crossplane reverted it to the original size on its next reconcile, causing a second outage. If you must make emergency changes to a Crossplane-managed resource, pause it first with crossplane.io/paused=true, then update the manifest in Git after the incident and unpause.

# Crossplane continuously reconciles: if someone changes a resource in AWS,
# Crossplane reverts it on the next poll. The interval is the provider's --poll
# setting (Upbound providers default to 10m), so a revert can lag the change

# Check last observed state vs desired
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status.atProvider'

# If you want to allow drift (not recommended), pause the resource
# Otherwise, manage all changes through the Crossplane manifest

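# Force immediate reconcile by changing an annotation. The key is arbitrary (not a
# Crossplane API); the value must differ each time or nothing changes
kubectl annotate instance.rds.aws.upbound.io prod-db \
  example.com/reconcile-nudge="$(date +%s)" --overwrite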
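
To check a single field for drift without reading the whole status, desired and observed values can be compared directly. A minimal sketch, assuming the provider reports the field under status.atProvider with the same name it has under spec.forProvider; check_drift is a made-up helper, and the observed value is only as fresh as the last poll.

```shell
# Compares one field between spec.forProvider (desired) and status.atProvider
# (last observed). Arguments: resource type, resource name, field name.
check_drift() {
  want=$(kubectl get "$1" "$2" -o jsonpath="{.spec.forProvider.$3}")
  have=$(kubectl get "$1" "$2" -o jsonpath="{.status.atProvider.$3}")
  if [ "$want" = "$have" ]; then
    echo "$3: in sync ($want)"
  else
    echo "$3: drift (desired=$want observed=$have)"
  fi
}
```

Usage: check_drift instance.rds.aws.upbound.io prod-db instanceClass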