Crossplane - Street-Level Ops¶
What experienced Crossplane operators know that tutorials don't teach.
Quick Diagnosis Commands¶
# Crossplane pod health
kubectl get pods -n crossplane-system
kubectl logs -n crossplane-system deploy/crossplane --follow
kubectl logs -n crossplane-system deploy/crossplane-rbac-manager --follow
# All managed resources (across all providers)
kubectl get managed
kubectl get managed -o wide
kubectl get managed --all-namespaces
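# Managed resources by provider (matches only resources that carry this label;
# it is not set by default, so add it in your Compositions if you rely on it)
kubectl get managed -l crossplane.io/provider=upbound-provider-aws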
# Composite resources (XR) and claims
kubectl get composite # all XRs
kubectl get xpostgresqlinstances # specific XR type
kubectl get claim -A # all claims across namespaces
kubectl get postgresqlinstances -A # specific claim type
# Resource status and conditions
kubectl describe instance.rds.aws.upbound.io prod-db
kubectl get managed -o json | jq '.items[] | select(.status.conditions[]?.reason == "ReconcileError") | .metadata.name'
# Provider status
kubectl get providers
kubectl describe provider upbound-provider-aws
kubectl get providerrevisions
# Package status (providers, configurations)
kubectl get pkg
kubectl get configurations
kubectl get configurationrevisions
# Events — most useful for debugging
kubectl get events -n crossplane-system --sort-by='.lastTimestamp'
kubectl describe xpostgresqlinstance prod-db # shows conditions and events
kubectl get events --field-selector involvedObject.name=prod-db -A
# Function status (for composition functions, Crossplane 1.14+)
kubectl get functions
kubectl get functionrevisions
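On a busy cluster, `kubectl get managed` can return hundreds of rows. The following is a minimal sketch of a filter that keeps only the unsynced ones. `unsynced_managed` is a hypothetical helper name, and the sketch assumes the table output has a header row with a `SYNCED` column. Column order differs between providers, so the column is located by name. Empty cells shift the columns, so treat the result as a heuristic.

```shell
# Print the names of rows from `kubectl get managed` whose SYNCED column is not "True".
# Reads the table on stdin. kubectl prints one header row per resource kind;
# each header is used to locate the SYNCED column for the rows that follow it.
unsynced_managed() {
  awk '
    $1 == "NAME" {                      # header row: find the SYNCED column
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "SYNCED") col = i
      next
    }
    NF == 0 { next }                    # blank separator between kinds
    col && $col != "True" { print $1 }  # print the resource name only
  '
}

# Usage (requires cluster access):
#   kubectl get managed | unsynced_managed
```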
Common Scenarios¶
Scenario 1: Managed Resource Stuck in "Synced: False"¶
A managed resource (e.g., an RDS instance, S3 bucket) shows SYNCED: False and never becomes Ready.
Diagnosis:
# Get the full status
kubectl describe instance.rds.aws.upbound.io prod-db
# Look for conditions
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status.conditions'
# Common conditions:
# - Ready: True/False
# - Synced: True/False (reconcile loop ran without error)
# - reason: ReconcileError, Creating, Deleting
# Check the actual error
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status.conditions[] | select(.type=="Synced") | .message'
# Check provider logs for more detail
kubectl logs -n crossplane-system deploy/upbound-provider-aws -f | grep "prod-db"
# Check ProviderConfig auth is working
kubectl describe providerconfig aws-provider-config
# Check if the external resource exists in AWS (may need aws CLI)
aws rds describe-db-instances --db-instance-identifier prod-db --region us-east-1
Common causes and fixes:
1. Authentication failure (ProviderConfig not working):
- "AccessDenied" in the error message
- Check IRSA annotation on provider's ServiceAccount
- kubectl get sa -n crossplane-system | grep provider
- kubectl describe sa upbound-provider-aws -n crossplane-system
- Verify role ARN and trust policy allows the OIDC provider
2. Resource already exists in cloud but not in Crossplane state:
- Import the external resource by setting `crossplane.io/external-name` annotation
- kubectl annotate instance.rds.aws.upbound.io prod-db crossplane.io/external-name=actual-rds-id
- Or delete and recreate the managed resource
3. Invalid spec field:
- "InvalidParameterCombination" or validation errors in message
- Fix the spec in the managed resource YAML and re-apply
4. Region mismatch:
- Resource created in wrong region, or ProviderConfig and resource spec disagree
- Check spec.forProvider.region matches ProviderConfig's region setting
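The triage in the list above can be sketched as a small shell function that maps a Synced-condition message to the likely cause. `classify_sync_error` is a hypothetical helper. `AccessDenied` and `InvalidParameter` come from the list, while the `AlreadyExists` and "already exists" substrings are assumptions about how a provider might report a name collision. Region mismatches have no single recognizable message, so they fall into the catch-all branch.

```shell
# Map a Synced-condition message to the likely cause.
# The substrings are heuristics, not an exhaustive list of provider errors.
classify_sync_error() {
  case "$1" in
    *AccessDenied*)
      echo "auth: check ProviderConfig and IRSA role" ;;
    *AlreadyExists*|*"already exists"*)
      echo "exists: set crossplane.io/external-name to adopt" ;;
    *InvalidParameter*|*validation*)
      echo "spec: fix spec.forProvider and re-apply" ;;
    *)
      echo "unknown: check provider logs and region settings" ;;
  esac
}

# Usage: pass the message extracted with the jq query shown in the diagnosis steps
#   classify_sync_error "$msg"
```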
Scenario 2: Claim Not Propagating to Composite Resource¶
A user creates a Claim (e.g., PostgreSQLInstance) but no Composite Resource (XR) or managed resources appear.
Diagnosis:
# Check claim status
kubectl describe postgresqlinstance my-db -n my-namespace
# Is there a matching XRD?
kubectl get xrds
kubectl describe xrd xpostgresqlinstances.database.example.com
# Is the XRD in "Established" state?
kubectl get xrd xpostgresqlinstances.database.example.com -o json | jq '.status.conditions'
# Is there a Composition selected?
kubectl get compositions
kubectl describe composition xpostgresqlinstance-aws
# Check: spec.compositeTypeRef must match the XRD
# Check if the claim created an XR at all
kubectl get xpostgresqlinstances # composite resources (cluster-scoped)
kubectl get xpostgresqlinstances -o json | jq '.items[].metadata.name'
# Look at Crossplane controller logs
kubectl logs -n crossplane-system deploy/crossplane | grep "my-db\|postgresqlinstance"
Fix:
# If XRD is not Established, check for validation errors
kubectl describe xrd xpostgresqlinstances.database.example.com
# Common issue: spec.versions schema validation errors
# If Composition is not selected, check the compositionSelector labels match
kubectl get claim my-db -n my-namespace -o json | jq '.spec.compositionSelector'
kubectl get compositions -l environment=production # must match
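# Force reconcile if stuck: an update to the object triggers a reconcile, so set
# an arbitrary annotation to a fresh value (the key below is only an example)
kubectl annotate postgresqlinstance my-db -n my-namespace \
  example.com/reconcile-requested-at="$(date +%s)" --overwrite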
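A label mismatch between the claim's selector and the Composition is easier to spot with both objects side by side. This is a minimal sketch, assuming the `environment: production` label used in the commands above and the parameter names from the XRD later in this page. The Composition is shown as a metadata excerpt only.

```yaml
# Claim: selects a Composition by label
apiVersion: database.example.com/v1alpha1
kind: PostgreSQLInstance
metadata:
  name: my-db
  namespace: my-namespace
spec:
  compositionSelector:
    matchLabels:
      environment: production
  parameters:
    storageGB: 20
    size: small
---
# Composition (metadata excerpt): must carry the same label to be selected
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xpostgresqlinstance-aws
  labels:
    environment: production
```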
Scenario 3: Composition Revision Rollback¶
A composition update breaks all managed resources. You need to roll back to the previous revision.
# List composition revisions
kubectl get compositionrevisions --sort-by='.metadata.creationTimestamp'
# Check which revision an XR currently uses (the ref lives on the XR, not the Composition)
kubectl get xpostgresqlinstance prod-db -o json | jq '.spec.compositionRevisionRef'
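# See what changed between two revisions (bash; the second revision name is a placeholder)
diff <(kubectl get compositionrevision xpostgresqlinstance-aws-abc123 -o json | jq '.spec') \
     <(kubectl get compositionrevision xpostgresqlinstance-aws-def456 -o json | jq '.spec')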
# Stop the XR from auto-upgrading to new revisions. The policy is set on the XR
# (or on the claim, if the XR is claim-managed): Automatic (default) or Manual
kubectl patch xpostgresqlinstance prod-db --type merge \
  -p '{"spec":{"compositionUpdatePolicy":"Manual"}}'
# Then pin the composite resource to a specific revision
kubectl patch xpostgresqlinstance prod-db --type merge \
  -p '{"spec":{"compositionRevisionRef":{"name":"xpostgresqlinstance-aws-abc123"}}}'
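If XRs are managed in git, the pin can be kept declaratively instead of patched by hand. This is a minimal sketch, assuming the Crossplane 1.x field layout, where `compositionUpdatePolicy` and `compositionRevisionRef` sit directly under the XR's `spec`. The revision name is the placeholder used above, and the parameter values are illustrative.

```yaml
apiVersion: database.example.com/v1alpha1
kind: XPostgreSQLInstance
metadata:
  name: prod-db
spec:
  compositionUpdatePolicy: Manual        # do not follow new revisions automatically
  compositionRevisionRef:
    name: xpostgresqlinstance-aws-abc123 # last known-good revision
  parameters:
    storageGB: 20
    size: small
```

With the policy left at `Automatic`, a manually set `compositionRevisionRef` is replaced by the latest revision on the next reconcile, so the two fields belong together.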
Scenario 4: Debugging a Composition (Which Pipeline Step Is Failing)¶
A composite resource exists but managed resources are not being created. The composition pipeline is failing somewhere.
# Check composite resource status in detail
kubectl describe xpostgresqlinstance prod-db
# Look for pipeline errors (Crossplane 1.14+ with Functions)
kubectl get xpostgresqlinstance prod-db -o json | jq '.status'
# Look for: .status.conditions, .status.connectionDetails
# If using classic Composition (patches/resources), check each managed resource
kubectl get managed -l crossplane.io/composite=prod-db
# Check if the composed resources exist at all
kubectl get managed | grep prod-db
# Check Crossplane logs for composition errors
kubectl logs -n crossplane-system deploy/crossplane -f | grep -E "error|Error|prod-db"
# Validate the Composition spec (is the API version correct?)
kubectl apply -f composition.yaml --dry-run=server
# Test with a minimal claim to isolate the failing resource
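For function-based Compositions, giving each pipeline step a distinct name makes it easier to tell which step an error refers to. This is a minimal sketch of a Pipeline-mode Composition, assuming `function-patch-and-transform` is installed as a Function under that name. The step name `render-rds` is hypothetical, and the base resource is trimmed to one field.

```yaml
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xpostgresqlinstance-aws
spec:
  compositeTypeRef:
    apiVersion: database.example.com/v1alpha1
    kind: XPostgreSQLInstance
  mode: Pipeline
  pipeline:
  - step: render-rds                        # hypothetical step name
    functionRef:
      name: function-patch-and-transform    # must match an installed Function
    input:
      apiVersion: pt.fn.crossplane.io/v1beta1
      kind: Resources
      resources:
      - name: rds-instance
        base:
          apiVersion: rds.aws.upbound.io/v1beta1
          kind: Instance
          spec:
            forProvider:
              region: us-east-1
        patches:
        - type: FromCompositeFieldPath
          fromFieldPath: spec.parameters.storageGB
          toFieldPath: spec.forProvider.allocatedStorage
```

If `kubectl get functions` shows the referenced Function as missing or unhealthy, the pipeline cannot run and no composed resources are created.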
Key Patterns¶
ProviderConfig with IRSA (AWS, EKS)¶
# ServiceAccount for the provider pod
apiVersion: v1
kind: ServiceAccount
metadata:
  name: upbound-provider-aws
  namespace: crossplane-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/crossplane-provider-role
---
# ProviderConfig that references the ServiceAccount credentials
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: default
spec:
  credentials:
    source: IRSA
# Corresponding IAM trust policy (must allow the OIDC provider)
# Principal: arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE
# Condition: StringEquals
# oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:sub: system:serviceaccount:crossplane-system:upbound-provider-aws
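# Verify IRSA is wired up: the EKS webhook should have injected these env vars
# into the provider pod (provider images generally do not ship the aws CLI,
# so exec-ing `aws sts get-caller-identity` usually fails)
kubectl get pods -n crossplane-system -o yaml | \
  grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE'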
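The trust policy described in the comments above can be written out in full. This is a minimal sketch that reuses the placeholder account ID, OIDC provider ID, and ServiceAccount name from this section. Substitute your own values before applying it.

```shell
# Write an IAM trust policy for the provider's IRSA role.
# Account ID, OIDC ID, and ServiceAccount name are the placeholders used above.
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE:sub": "system:serviceaccount:crossplane-system:upbound-provider-aws"
        }
      }
    }
  ]
}
EOF

# Apply it to the existing role (requires AWS credentials):
#   aws iam update-assume-role-policy --role-name crossplane-provider-role \
#     --policy-document file://trust-policy.json
```

The `sub` condition must match the ServiceAccount the provider pod actually runs as. If the pod uses a different ServiceAccount name, the role assumption is denied.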
Composition with Patches¶
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xpostgresqlinstance-aws
spec:
  compositeTypeRef:
    apiVersion: database.example.com/v1alpha1
    kind: XPostgreSQLInstance
  resources:
  - name: rds-instance
    base:
      apiVersion: rds.aws.upbound.io/v1beta1
      kind: Instance
      spec:
        forProvider:
          region: us-east-1
          instanceClass: db.t3.micro
          engine: postgres
          engineVersion: "15"
          skipFinalSnapshot: true
          # autoGeneratePassword typically also needs a passwordSecretRef (omitted here)
          autoGeneratePassword: true
          # The Upbound provider's field is `username` (not `masterUsername`)
          username: adminuser
          dbName: mydb
          publiclyAccessible: false
    patches:
    # Claim field → Managed resource field
    - type: FromCompositeFieldPath
      fromFieldPath: spec.parameters.storageGB
      toFieldPath: spec.forProvider.allocatedStorage
    - type: FromCompositeFieldPath
      fromFieldPath: spec.parameters.size
      toFieldPath: spec.forProvider.instanceClass
      transforms:
      - type: map
        map:
          small: db.t3.micro
          medium: db.t3.medium
          large: db.t3.large
    # Tag with composite name
    - type: FromCompositeFieldPath
      fromFieldPath: metadata.name
      toFieldPath: spec.forProvider.tags.Name
    # Status back from managed resource to composite
    # (in the Upbound provider, status.atProvider.endpoint is a "host:port" string)
    - type: ToCompositeFieldPath
      fromFieldPath: status.atProvider.endpoint
      toFieldPath: status.endpoint
XRD (Composite Resource Definition)¶
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xpostgresqlinstances.database.example.com
spec:
  group: database.example.com
  names:
    kind: XPostgreSQLInstance
    plural: xpostgresqlinstances
  claimNames:
    kind: PostgreSQLInstance # what users create
    plural: postgresqlinstances
  versions:
  - name: v1alpha1
    served: true
    referenceable: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              parameters:
                type: object
                required: [storageGB, size]
                properties:
                  storageGB:
                    type: integer
                    minimum: 20
                    maximum: 1000
                  size:
                    type: string
                    enum: [small, medium, large]
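A Composition that patches a value back to the composite with `ToCompositeFieldPath` (such as `status.endpoint`) needs that field declared in the XRD schema. Fields missing from the schema are pruned by the API server. This is a minimal sketch of the extra fragment, assuming a single string field named `endpoint`. It sits under `openAPIV3Schema.properties`, next to `spec`.

```yaml
# Fragment: add under openAPIV3Schema.properties, as a sibling of `spec`
status:
  type: object
  properties:
    endpoint:
      type: string
```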
external-name Annotation (Adopting Existing Resources)¶
# If an AWS resource already exists and you want Crossplane to manage it
# Set external-name to the cloud provider's identifier
kubectl annotate instance.rds.aws.upbound.io prod-db \
  crossplane.io/external-name=prod-db-actual-identifier
# For S3 buckets (external-name = bucket name)
kubectl annotate bucket.s3.aws.upbound.io my-bucket \
  crossplane.io/external-name=my-actual-bucket-name-in-aws
# After annotating, Crossplane adopts the existing resource instead of creating a
# new one, then reconciles it toward spec.forProvider, so make the spec match
# the real resource before adopting
# Check status to verify it connected to the existing resource
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status'
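A more cautious way to adopt a resource is to import it in observe-only mode first, so Crossplane reads the external state without changing it. This is a minimal sketch, assuming a Crossplane and provider version that supports `managementPolicies`. The names reuse the placeholders above, and the ProviderConfig is assumed to be called `default`.

```yaml
apiVersion: rds.aws.upbound.io/v1beta1
kind: Instance
metadata:
  name: prod-db
  annotations:
    crossplane.io/external-name: prod-db-actual-identifier
spec:
  managementPolicies: ["Observe"]   # read-only: no create, update, or delete
  forProvider:
    region: us-east-1
  providerConfigRef:
    name: default
```

Once `status.atProvider` is populated, copy the relevant values into `spec.forProvider` and widen the management policy to hand over full control.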
Package Management¶
# Install a provider
kubectl apply -f - <<EOF
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: upbound-provider-aws
spec:
  package: xpkg.upbound.io/upbound/provider-aws-rds:v1.2.0
  # References a DeploymentRuntimeConfig (Crossplane 1.14+), created
  # separately, that sets the provider's ServiceAccount name
  runtimeConfigRef:
    name: upbound-provider-aws
EOF
# Check package install progress
kubectl get providers -w
kubectl describe provider upbound-provider-aws
# Look for: Installed: True, Healthy: True
# List installed CRDs from a provider
kubectl get crds | grep rds.aws.upbound.io
# Update a provider (change the package version)
kubectl patch provider upbound-provider-aws --type merge \
-p '{"spec":{"package":"xpkg.upbound.io/upbound/provider-aws-rds:v1.3.0"}}'
# Install a Configuration package (bundles XRDs + Compositions)
kubectl apply -f - <<EOF
apiVersion: pkg.crossplane.io/v1
kind: Configuration
metadata:
  name: platform-ref-aws
spec:
  package: xpkg.upbound.io/upbound/platform-ref-aws:v0.9.0
EOF
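To give the provider pod a stable ServiceAccount name (needed for the IRSA trust policy), a Provider can reference a DeploymentRuntimeConfig. This is a minimal sketch, assuming Crossplane 1.14 or later. The object name and role ARN reuse the placeholders from this page.

```yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: upbound-provider-aws
spec:
  serviceAccountTemplate:
    metadata:
      name: upbound-provider-aws   # fixed name instead of a per-revision one
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/crossplane-provider-role
```

The Provider picks this up through `spec.runtimeConfigRef.name`. Without it, the ServiceAccount name is derived from the provider revision and changes on upgrade.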
Pausing and Resuming Reconciliation¶
# Pause a managed resource (stop Crossplane from touching the cloud resource)
kubectl annotate instance.rds.aws.upbound.io prod-db \
  crossplane.io/paused=true
# Resume
kubectl annotate instance.rds.aws.upbound.io prod-db \
  crossplane.io/paused-
# Pause all resources in a composite (for maintenance)
kubectl get managed -l crossplane.io/composite=prod-db -o name | \
xargs -I{} kubectl annotate {} crossplane.io/paused=true
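Before pausing or resuming many resources at once, it can be safer to preview the commands. This is a minimal sketch of a generator that prints one `kubectl annotate` command per resource name read from stdin without running anything. `paused_cmds` is a hypothetical helper name.

```shell
# Print (do not run) the annotate command for each resource name on stdin.
# Pass "pause" or "resume"; review the output, then pipe it to `sh` to apply.
paused_cmds() {
  case "$1" in
    pause)  arg="crossplane.io/paused=true --overwrite" ;;
    resume) arg="crossplane.io/paused-" ;;
    *) echo "usage: paused_cmds pause|resume" >&2; return 1 ;;
  esac
  while IFS= read -r name; do
    [ -n "$name" ] && echo "kubectl annotate $name $arg"
  done
  return 0
}

# Usage (requires cluster access):
#   kubectl get managed -l crossplane.io/composite=prod-db -o name | paused_cmds pause
#   kubectl get managed -l crossplane.io/composite=prod-db -o name | paused_cmds pause | sh
```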
Observing Drift¶
War story: An engineer manually resized an RDS instance in the AWS console during an incident. Crossplane reverted it to the original size on the next reconcile (10 seconds later), causing a second outage. If you must make emergency changes to Crossplane-managed resources, pause the resource first with crossplane.io/paused=true, then update the manifest in git after the incident.
# Crossplane continuously reconciles: if someone changes a resource in AWS,
# Crossplane will revert it on the next reconcile loop. The interval depends on
# the provider's poll setting and is often minutes rather than seconds
# Check last observed state vs desired
kubectl get instance.rds.aws.upbound.io prod-db -o json | jq '.status.atProvider'
# If you want to allow drift (not recommended), pause the resource
# Otherwise, manage all changes through the Crossplane manifest
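# Force immediate reconcile by changing an annotation (the key is arbitrary;
# the value must differ from the current one to produce an update)
kubectl annotate instance.rds.aws.upbound.io prod-db \
  example.com/reconcile-requested-at="$(date +%s)" --overwrite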