Feature Flags — Primer¶
Why This Matters¶
Feature flags decouple deployment from release. Without flags, deploying new code means releasing it to all users simultaneously — the deploy IS the release. With flags, you deploy the code (dark launch), then control who sees it independently of deployment. This lets you: ship to 1% of users, roll back with a config change (no redeploy), A/B test, give early access to beta users, and kill a feature in production without deploying.
The mental model: deployment moves bits. Release is a business decision. Flags give you that separation.
Flag Types¶
Not all flags are the same. Use the wrong type and you get the wrong lifecycle.
| Type | Purpose | Lifetime | Example |
|---|---|---|---|
| Release flag | Control rollout of new feature | Temporary (weeks) | New checkout flow for 10% of users |
| Experiment flag | A/B test to measure impact | Temporary (days–weeks) | Button color variant |
| Ops flag | Kill switch for a feature under load | Temporary or permanent | Disable recommendations when DB is slow |
| Permission flag | Gate features by user tier | Permanent | Premium users see advanced analytics |
Treat release flags as tech debt. Once a flag is at 100% and the old code path has been removed, delete the flag as well. Stale flags in code are time bombs.
War story: Knight Capital Group lost about $440 million in 45 minutes on August 1, 2012, partly because a deployment reused an old flag and a server that had not received the new code reactivated a long-dead code path. Dead code paths behind stale flags are not just clutter; they are latent incidents.
OpenFeature SDK¶
Who made it: OpenFeature was accepted as a CNCF sandbox project in 2022 and moved to incubating in late 2023. It was created to prevent vendor lock-in: before OpenFeature, switching from LaunchDarkly to Flagsmith meant rewriting every flag evaluation call in your codebase.
OpenFeature is a vendor-neutral, CNCF-hosted specification for feature flag evaluation. It defines a provider interface so you can swap backends (LaunchDarkly, Flagsmith, Unleash, custom) without changing application code.
Core Concepts¶
- Provider: Backend system (LaunchDarkly, Flagsmith, etc.)
- Client: Your application's interface to the flag system
- Evaluation context: Who is making the request (user ID, region, plan, etc.)
- Hook: Middleware for logging, metrics, error handling
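To make the four roles concrete, here is a dependency-free sketch of how they fit together. It is not the OpenFeature SDK: `Context`, `InMemoryProvider`, `FlagClient`, and `record_hook` are hypothetical names chosen for illustration, and the real interfaces carry more detail (reasons, variants, error codes).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class Context:
    """Evaluation context: who is making the request."""
    targeting_key: str
    attributes: Dict[str, Any] = field(default_factory=dict)


class InMemoryProvider:
    """Toy backend: maps a flag key to a function of the context."""

    def __init__(self, rules: Dict[str, Callable[[Context], Any]]):
        self._rules = rules

    def resolve(self, flag_key: str, default: Any, ctx: Context) -> Any:
        rule = self._rules.get(flag_key)
        return default if rule is None else rule(ctx)


# A hook here is just a callable invoked after every evaluation.
Hook = Callable[[str, Any, Context], None]


class FlagClient:
    """Application-facing interface; it knows nothing about the backend."""

    def __init__(self, provider: Any, hooks: Optional[List[Hook]] = None):
        self._provider = provider
        self._hooks = hooks or []

    def get_boolean_value(self, flag_key: str, default: bool, ctx: Context) -> bool:
        try:
            value = bool(self._provider.resolve(flag_key, default, ctx))
        except Exception:
            value = default  # a failing backend must not break the caller
        for hook in self._hooks:
            hook(flag_key, value, ctx)
        return value


evaluations = []


def record_hook(flag_key: str, value: Any, ctx: Context) -> None:
    evaluations.append((flag_key, value, ctx.targeting_key))


provider = InMemoryProvider({
    "new-checkout-flow": lambda ctx: ctx.attributes.get("plan") == "premium",
})
client = FlagClient(provider, hooks=[record_hook])

premium = Context("user-1", {"plan": "premium"})
free = Context("user-2", {"plan": "free"})
print(client.get_boolean_value("new-checkout-flow", False, premium))  # True
print(client.get_boolean_value("new-checkout-flow", False, free))     # False
print(client.get_boolean_value("unknown-flag", False, free))          # False (default)
```

Swapping backends means passing a different object with a `resolve` method to `FlagClient`; the call sites do not change, which is the property the provider interface is meant to give you.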
Python SDK¶
from openfeature import api
from openfeature.evaluation_context import EvaluationContext

# Set up provider (LaunchDarkly example).
# NOTE: the import path and constructor arguments depend on the provider
# package you install; treat the next two lines as illustrative.
from openfeature.provider.launchdarkly import LaunchDarklyProvider
api.set_provider(LaunchDarklyProvider(sdk_key="sdk-xxx"))

# Get a client
client = api.get_client()

# Evaluate flags
def handle_checkout(user_id: str, user_plan: str):
    ctx = EvaluationContext(
        targeting_key=user_id,
        attributes={
            "plan": user_plan,
            "region": "us-east-1",
        },
    )

    # Boolean flag; the second argument is the default used if evaluation fails
    if client.get_boolean_value("new-checkout-flow", False, ctx):
        return new_checkout(user_id)
    else:
        return legacy_checkout(user_id)
Go SDK¶
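import (
    "context"
    "log"

    "github.com/open-feature/go-sdk/pkg/openfeature"
    // Provider import path is illustrative; check your provider's docs.
    ldflag "github.com/open-feature/go-sdk-contrib/providers/launchdarkly/pkg"
)

func main() {
    // Initialize with LaunchDarkly provider
    provider, err := ldflag.NewProvider("sdk-xxx")
    if err != nil {
        log.Fatalf("flag provider init failed: %v", err)
    }
    openfeature.SetProvider(provider)
    // ... start the service ...
}

// search picks a code path per user. Result, newSearch, and legacySearch
// are application code defined elsewhere.
func search(userID, query string) []Result {
    client := openfeature.NewClient("my-service")
    ctx := openfeature.NewEvaluationContext(
        userID,
        map[string]interface{}{
            "plan":   "premium",
            "region": "us-east-1",
        },
    )
    // On error the SDK returns the default value, so the error is ignored here.
    enabled, _ := client.BooleanValue(
        context.Background(),
        "new-search-algorithm",
        false, // default value
        ctx,
    )
    if enabled {
        return newSearch(query)
    }
    return legacySearch(query)
}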
Hooks¶
from openfeature.hook import Hook

class MetricsHook(Hook):
    def after(self, hook_context, flag_evaluation_details, hints):
        # Track every flag evaluation
        metrics.increment(
            "feature_flag.evaluated",
            tags={
                "flag": hook_context.flag_key,
                "value": str(flag_evaluation_details.value),
                "reason": flag_evaluation_details.reason,
            }
        )

    def error(self, hook_context, exception, hints):
        logger.error(f"Flag evaluation failed: {hook_context.flag_key}", exc_info=exception)

api.add_hooks([MetricsHook()])
LaunchDarkly¶
LaunchDarkly is a widely used commercial feature flag platform, with SDKs available for more than 20 languages.
SDK Initialization¶
import ldclient
from ldclient.config import Config
# Initialize once at application startup — expensive operation
ldclient.set_config(Config("sdk-xxx"))
client = ldclient.get()
# Always check if initialized
if not client.is_initialized():
    logger.warning("LaunchDarkly client failed to initialize; using defaults")
Evaluation Context (User/Context Object)¶
from ldclient import Context
# Single context (most common)
user_context = Context.builder(user_id) \
    .kind("user") \
    .set("plan", "premium") \
    .set("region", "us-east-1") \
    .set("email", user_email) \
    .build()

# Multi-context (user + organization)
org_context = Context.builder(org_id).kind("organization") \
    .set("tier", "enterprise") \
    .build()

multi_context = Context.create_multi(user_context, org_context)
# Evaluate
variation = client.variation("new-dashboard", multi_context, False)
Targeting Rules¶
In the LaunchDarkly UI, you configure rules. These are evaluated top-to-bottom:
- Individual targeting: user IDs get explicit flag values
- Rule-based targeting: if plan = "premium" then return true
- Percentage rollout: 10% of users get true, 90% get false
- Default variation: what everyone else gets
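The top-to-bottom order above can be sketched as a first-match-wins function. This is a simplified illustration, not LaunchDarkly's evaluation engine: the `FLAG` structure, the `evaluate` and `bucket` helpers, and the CRC32-based bucketing are all assumptions made for the example.

```python
import zlib
from typing import Any, Dict

FLAG: Dict[str, Any] = {
    "individual_targets": {"user-42": True},   # explicit per-user values
    "rules": [{"attribute": "plan", "equals": "premium", "value": True}],
    "rollout": None,                            # e.g. {"percent": 10, "on": True, "off": False}
    "default": False,
}


def bucket(flag_key: str, targeting_key: str) -> int:
    """Stable bucket in 0..99 for a (flag, user) pair."""
    return zlib.crc32(f"{flag_key}:{targeting_key}".encode("utf-8")) % 100


def evaluate(flag_key: str, flag: Dict[str, Any], targeting_key: str,
             attributes: Dict[str, Any]) -> Any:
    # 1. Individual targeting wins outright.
    targets = flag.get("individual_targets", {})
    if targeting_key in targets:
        return targets[targeting_key]

    # 2. Attribute rules, first match wins.
    for rule in flag.get("rules", []):
        if attributes.get(rule["attribute"]) == rule["equals"]:
            return rule["value"]

    # 3. A percentage rollout, if configured, decides for everyone still left.
    rollout = flag.get("rollout")
    if rollout is not None:
        in_rollout = bucket(flag_key, targeting_key) < rollout["percent"]
        return rollout["on"] if in_rollout else rollout["off"]

    # 4. Default variation.
    return flag["default"]


print(evaluate("new-dashboard", FLAG, "user-42", {"plan": "free"}))     # True (individual target)
print(evaluate("new-dashboard", FLAG, "user-7", {"plan": "premium"}))   # True (rule match)
print(evaluate("new-dashboard", FLAG, "user-7", {"plan": "free"}))      # False (default)
```

Because earlier steps return immediately, a user listed under individual targeting is never affected by a rule or rollout further down.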
Percentage Rollouts¶
Under the hood: Percentage rollouts hash the targeting key (by default the user ID), combined with the flag key and a salt, into a bucket in a fixed range, then compare that bucket with the rollout percentage. The same key always lands in the same bucket, so each user gets a consistent variation without server-side session state. This is also why changing the targeting key changes which users are in the rollout.
By default the bucket attribute is the targeting key. You can change it to roll out by organization, session, or device instead.
# Example: 10% rollout
# In LD UI: add rollout rule with 10% → true, 90% → false
# In code: just evaluate the flag normally
is_enabled = client.variation("new-search", user_context, False)
# 10% of users will consistently get True
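The deterministic bucketing behind a percentage rollout can be sketched with a standard hash. This is an illustration of the idea only and not LaunchDarkly's actual algorithm; the `bucket` and `in_rollout` helpers and the choice of SHA-256 are assumptions.

```python
import hashlib


def bucket(flag_key: str, targeting_key: str, salt: str = "") -> float:
    """Map a (flag, user) pair to a stable number in [0, 100)."""
    raw = f"{flag_key}.{salt}.{targeting_key}".encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    # 10,000 buckets give a resolution of 0.01 percentage points.
    return (int.from_bytes(digest[:4], "big") % 10_000) / 100.0


def in_rollout(flag_key: str, targeting_key: str, percent: float) -> bool:
    return bucket(flag_key, targeting_key) < percent


users = [f"user-{i}" for i in range(10_000)]
enabled_10 = {u for u in users if in_rollout("new-search", u, 10)}
enabled_25 = {u for u in users if in_rollout("new-search", u, 25)}

# Roughly 10% of users; the exact share depends on the hash values.
print(len(enabled_10) / len(users))

# Raising the percentage only adds users; nobody already enabled is removed.
print(enabled_10 <= enabled_25)  # True
```

Including the flag key in the hash input means a user who lands in the first 10% for one flag is not automatically in the first 10% for every other flag.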
Flag Events and Analytics¶
# Track a conversion event tied to an experiment
client.track("purchase-completed", user_context, metric_value=order_total)
# Custom event with data
client.track("search-used", user_context, data={"query_length": len(query)})
# Flush events before shutdown (important in serverless/short-lived processes)
client.flush()
client.close()
Flagsmith (Self-Hosted Option)¶
Flagsmith is open source with a self-hosted option. Useful when data residency or cost is a concern.
from flagsmith import Flagsmith

# Connect to self-hosted instance
flagsmith = Flagsmith(
    environment_key="env-key",
    api_url="https://flagsmith.internal.example.com/api/v1/",
)

# Get flags for an identity
flags = flagsmith.get_identity_flags(
    identifier=user_id,
    traits={"plan": "premium", "region": "us-east-1"},
)

# Evaluate
if flags.is_feature_enabled("new-checkout"):
    new_checkout()

# Get string/number variation value
theme = flags.get_feature_value("ui-theme")  # returns "dark" or "light"
Docker Compose for self-hosted Flagsmith:
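# Variable names follow Flagsmith's published compose file; verify them
# against the deployment docs for the version you run.
version: '3'
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: flagsmith
      POSTGRES_USER: flagsmith
      POSTGRES_PASSWORD: password  # placeholder, change before real use
  flagsmith:
    image: flagsmith/flagsmith:latest
    environment:
      DATABASE_URL: postgresql://flagsmith:password@postgres:5432/flagsmith
      DJANGO_SECRET_KEY: "your-secret-key"  # placeholder, change before real use
    ports:
      - "8000:8000"
    depends_on:
      - postgres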
Flag Lifecycle Management¶
The Stale Flag Problem¶
A flag introduced for a rollout should be removed after the rollout is complete. Stale flags accumulate technical debt:
- Dead code paths that can never be tested
- Confusion about what the "normal" code path is
- Performance overhead of evaluating flags that are always true
- Security risk: a flag stuck at false could disable a security feature
Lifecycle Policy¶
- Create: Define flag with an expiry date in the description
- Roll out: Gradual percentage increase (1% → 10% → 50% → 100%)
- Monitor: Watch error rate and latency metrics for each variation
- Retire: Once at 100% and stable for 1 week, mark for cleanup
- Archive: Remove code paths and delete flag from system
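The policy above can be enforced with a small periodic check. The sketch below is one possible shape, assuming you keep flag metadata in your own registry: `FlagRecord`, `cleanup_candidates`, and the field names are hypothetical, and the one-week soak period mirrors the "stable for 1 week" rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class FlagRecord:
    key: str
    kind: str                       # "release", "experiment", "ops", or "permission"
    expires: Optional[date]         # expiry date recorded at creation, if any
    rollout_percent: int
    at_full_rollout_since: Optional[date] = None


def cleanup_candidates(flags: List[FlagRecord], today: date,
                       soak: timedelta = timedelta(days=7)) -> List[str]:
    """Return keys of temporary flags that are expired, or at 100% and stable."""
    candidates = []
    for flag in flags:
        if flag.kind not in ("release", "experiment"):
            continue  # ops and permission flags may be long-lived by design
        expired = flag.expires is not None and flag.expires < today
        soaked = (
            flag.rollout_percent == 100
            and flag.at_full_rollout_since is not None
            and today - flag.at_full_rollout_since >= soak
        )
        if expired or soaked:
            candidates.append(flag.key)
    return candidates


today = date(2024, 6, 15)
flags = [
    FlagRecord("checkout-new-flow-enabled", "release", date(2024, 7, 1), 100, date(2024, 6, 1)),
    FlagRecord("search-v2-enabled", "release", date(2024, 5, 1), 50),
    FlagRecord("button-color-test", "experiment", date(2024, 7, 1), 50),
    FlagRecord("recommendations-enabled", "ops", None, 100, date(2024, 1, 1)),
]
print(cleanup_candidates(flags, today))
# ['checkout-new-flow-enabled', 'search-v2-enabled']
```

Running a check like this in CI or on a schedule turns the expiry date from a note in the description into something that produces a ticket.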
LaunchDarkly Flag Cleanup¶
# List non-archived flags via API
curl -s -H "Authorization: api-key-xxx" \
  "https://app.launchdarkly.com/api/v2/flags/my-project" \
  | jq '.items[] | select(.archived == false) | {key: .key, creationDate: .creationDate}'

# Find where flags are still referenced in code using LaunchDarkly's
# Code References tool (ld-find-code-refs); install it per LaunchDarkly's docs.
# Flags with no remaining references are candidates for archiving.
ld-find-code-refs \
  --accessToken=api-key-xxx \
  --projKey=my-project \
  --repoName=my-repo \
  --dir=/path/to/repo
Trunk-Based Development + Feature Flags¶
Feature flags enable trunk-based development: all engineers commit to main, no long-lived feature branches.
Without flags:              With flags:

feature-branch ─┐           main (always deployable)
feature-branch ─┤             │
feature-branch ─┤             │ new-search-v2 flag
                ↓             │   → false: old code path
              main            │   → true: new code path
        (big merge hell)      │ new-checkout flag
                              │   → false: old code path
                              │   → true: new code path
This means engineers can deploy incomplete features — the flag is false in production. Integration testing happens in staging with the flag on. When the feature is ready, flip the flag. No deployment required.
Testing with Feature Flags¶
Always test both code paths. Don't assume the old path still works.
# pytest with feature flags
# checkout() is the application function under test, imported from your app.
import pytest

@pytest.fixture
def flag_on(monkeypatch):
    # Force only the flag under test to True; other flags keep their defaults
    monkeypatch.setattr(
        "myapp.flags.client.variation",
        lambda flag_key, context, default: True if flag_key == "new-checkout" else default
    )

@pytest.fixture
def flag_off(monkeypatch):
    # Force every flag to False
    monkeypatch.setattr(
        "myapp.flags.client.variation",
        lambda flag_key, context, default: False
    )

def test_checkout_new_flow(flag_on):
    result = checkout(user_id="test-user")
    assert result["flow"] == "new"

def test_checkout_legacy_flow(flag_off):
    result = checkout(user_id="test-user")
    assert result["flow"] == "legacy"
Gradual Rollouts with Kill Switches¶
A kill switch is an ops flag that starts at 100% on and can be instantly set to 0% to disable a feature. Use this for features that interact with external services or have high blast radius.
def get_recommendations(user_id: str) -> list:
    ctx = EvaluationContext(targeting_key=user_id)

    # Kill switch: if False, skip expensive recommendation engine
    if not client.get_boolean_value("recommendations-enabled", True, ctx):
        return get_fallback_recommendations(user_id)

    try:
        return recommendation_engine.get(user_id)
    except Exception as e:
        # Also automatically degrade on error
        logger.error("Recommendation engine failed", exc_info=e)
        metrics.increment("recommendations.fallback")
        return get_fallback_recommendations(user_id)
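The gradual half of this pattern can be sketched as a small decision function that a rollout script or operator runs between stages. The stage list, the `next_rollout_percent` name, and the error-rate tolerance are assumptions for illustration; real thresholds depend on your service.

```python
from typing import List

STAGES: List[int] = [1, 10, 50, 100]  # rollout percentages, in order


def next_rollout_percent(current: int, error_rate: float,
                         baseline_error_rate: float,
                         tolerance: float = 0.01) -> int:
    """Advance one stage while healthy; drop to 0% (kill switch) otherwise."""
    if error_rate > baseline_error_rate + tolerance:
        return 0
    later_stages = [stage for stage in STAGES if stage > current]
    return later_stages[0] if later_stages else current


print(next_rollout_percent(1, error_rate=0.020, baseline_error_rate=0.020))    # 10
print(next_rollout_percent(50, error_rate=0.100, baseline_error_rate=0.020))   # 0
print(next_rollout_percent(100, error_rate=0.020, baseline_error_rate=0.020))  # 100
```

The returned percentage would then be written to the flag's rollout rule through your flag platform's UI or API.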
Flag-Driven Canary vs Argo Rollouts¶
Both achieve gradual rollouts, but at different layers:
| Approach | Layer | Rollback speed | Complexity |
|---|---|---|---|
| Feature flag | Application | Instant (no deploy) | Low (SDK call) |
| Argo Rollouts canary | Kubernetes/traffic | Minutes (update rollout) | Higher (CRD, analysis) |
Feature flags are better for: business logic, per-user targeting, experiments, ops kill switches.
Argo canary is better for: infrastructure changes (DB schema), new service deployments, changes that affect all code paths.
You can combine them: deploy new pods with Argo (infrastructure canary) AND gate the feature with a flag (application canary) for two independent control planes.
Operational Flags for Circuit Breakers¶
# Combine feature flags with circuit breaker pattern
from circuitbreaker import CircuitBreakerError, circuit

@circuit(failure_threshold=5, recovery_timeout=30)
def _call_payment_service(payload: dict) -> dict:
    return payment_client.charge(payload)

def process_payment(user_id: str, payload: dict) -> dict:
    ctx = EvaluationContext(targeting_key=user_id)

    # Ops flag: manual override independent of circuit state
    if not client.get_boolean_value("payment-service-enabled", True, ctx):
        return {"status": "deferred", "reason": "payment_disabled"}

    try:
        return _call_payment_service(payload)
    except CircuitBreakerError:
        # Circuit is open: degrade gracefully
        metrics.increment("payment.circuit_open")
        return {"status": "deferred", "reason": "circuit_open"}
Quick Reference¶
OpenFeature Flag Types¶
# Boolean
enabled = client.get_boolean_value("flag-key", False, ctx)
# String variation
theme = client.get_string_value("ui-theme", "default", ctx)
# Integer
max_results = client.get_integer_value("search-limit", 10, ctx)
# Float
discount = client.get_float_value("discount-rate", 0.0, ctx)
# Object (JSON)
config = client.get_object_value("search-config", {}, ctx)
Flag Naming Conventions¶
<component>-<feature>-<action> — recommended
checkout-new-flow-enabled
search-v2-enabled
recommendations-enabled — kill switch (no feature prefix)
dark-mode-enabled
premium-analytics-enabled — permission flag
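A convention like this is easier to keep when it is checked automatically, for example in CI or a pre-commit hook. The sketch below assumes lowercase kebab-case names whose action suffix is always `enabled`, as in the examples above; `FLAG_NAME` and `is_valid_flag_name` are hypothetical names.

```python
import re

# One or more lowercase kebab-case segments, followed by the "-enabled" suffix.
FLAG_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*-enabled")


def is_valid_flag_name(name: str) -> bool:
    return FLAG_NAME.fullmatch(name) is not None


for name in ["checkout-new-flow-enabled", "recommendations-enabled",
             "NewCheckout", "new_checkout_enabled", "enabled"]:
    print(name, is_valid_flag_name(name))
# Only the first two names pass.
```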
LaunchDarkly API Quick Reference¶
# List flags
GET /api/v2/flags/{projectKey}
# Get specific flag
GET /api/v2/flags/{projectKey}/{featureFlagKey}
# Toggle flag on/off
PATCH /api/v2/flags/{projectKey}/{featureFlagKey}
Body: [{"op": "replace", "path": "/environments/{env}/on", "value": true}]
# Archive flag
PATCH /api/v2/flags/{projectKey}/{featureFlagKey}
Body: [{"op": "replace", "path": "/archived", "value": true}]
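As an illustration of the toggle call above, the following sketch builds the PATCH request with the standard library and prints it without sending anything. The `build_toggle_request` helper and the example keys are hypothetical; the URL, the `Authorization` header, and the JSON Patch body follow the reference entries and the curl example earlier on this page.

```python
import json
import urllib.request

BASE_URL = "https://app.launchdarkly.com"


def build_toggle_request(api_key: str, project_key: str, flag_key: str,
                         env_key: str, on: bool) -> urllib.request.Request:
    """Build (but do not send) a request that turns a flag on or off."""
    patch = [{"op": "replace", "path": f"/environments/{env_key}/on", "value": on}]
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v2/flags/{project_key}/{flag_key}",
        data=json.dumps(patch).encode("utf-8"),
        method="PATCH",
        headers={"Authorization": api_key, "Content-Type": "application/json"},
    )


req = build_toggle_request("api-key-xxx", "my-project", "new-search", "production", False)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the kill-switch path easy to unit test without network access.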