
Portal | Level: L1: Foundations | Topics: Feature Flags | Domain: DevOps & Tooling

Feature Flags — Primer

Why This Matters

Feature flags decouple deployment from release. Without flags, deploying new code means releasing it to all users simultaneously — the deploy IS the release. With flags, you deploy the code (dark launch), then control who sees it independently of deployment. This lets you: ship to 1% of users, roll back with a config change (no redeploy), A/B test, give early access to beta users, and kill a feature in production without deploying.

The mental model: deployment moves bits. Release is a business decision. Flags give you that separation.
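That separation can be sketched with a toy in-memory flag store (everything here is hypothetical; production systems use an SDK, shown later on this page):

```python
# Minimal sketch of the deploy/release split: the new code path ships
# dark, and a config value -- not a deploy -- controls who sees it.
# FLAGS and the checkout functions are hypothetical placeholders.

FLAGS = {"new-checkout-flow": False}  # deployed dark: nobody sees it yet

def legacy_checkout(user_id: str) -> str:
    return f"legacy checkout for {user_id}"

def new_checkout(user_id: str) -> str:
    return f"new checkout for {user_id}"

def handle_checkout(user_id: str) -> str:
    # The release decision is a config read, not a redeploy
    if FLAGS.get("new-checkout-flow", False):
        return new_checkout(user_id)
    return legacy_checkout(user_id)

print(handle_checkout("u-1"))        # legacy path: flag is off
FLAGS["new-checkout-flow"] = True    # "release": flip the flag
print(handle_checkout("u-1"))        # new path, same deployed code
```

Rolling back is the same move in reverse: set the flag to False and the old path is live again, no deploy involved.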


Flag Types

Not all flags are the same. Use the wrong type and you get the wrong lifecycle.

Type | Purpose | Lifetime | Example
Release flag | Control rollout of new feature | Temporary (weeks) | New checkout flow for 10% of users
Experiment flag | A/B test to measure impact | Temporary (days–weeks) | Button color variant
Ops flag | Kill switch for a feature under load | Temporary or permanent | Disable recommendations when DB is slow
Permission flag | Gate features by user tier | Permanent | Premium users see advanced analytics

Treat release flags as tech debt. The day you flip it to 100% and remove the old code path, delete the flag. Stale flags in code are time bombs.

War story: Knight Capital Group lost $440 million in 45 minutes on August 1, 2012, partly because old feature flag code was accidentally reactivated during a deployment. Dead code paths behind stale flags are not just clutter — they are latent incidents.


OpenFeature SDK

Who made it: OpenFeature became a CNCF sandbox project in 2022 and moved to incubation in late 2023. It was created to prevent vendor lock-in — before OpenFeature, switching from LaunchDarkly to Flagsmith meant rewriting every flag evaluation call in your codebase.

OpenFeature is a CNCF standard for feature flag evaluation. It defines a provider interface so you can swap backends (LaunchDarkly, Flagsmith, Unleash, custom) without changing application code.

Core Concepts

  • Provider: Backend system (LaunchDarkly, Flagsmith, etc.)
  • Client: Your application's interface to the flag system
  • Evaluation context: Who is making the request (user ID, region, plan, etc.)
  • Hook: Middleware for logging, metrics, error handling

Python SDK

from openfeature import api
from openfeature.evaluation_context import EvaluationContext

# Set up a provider (LaunchDarkly shown; the provider class's import path
# depends on which provider package you install)
from openfeature.provider.launchdarkly import LaunchDarklyProvider
api.set_provider(LaunchDarklyProvider(sdk_key="sdk-xxx"))

# Get a client
client = api.get_client()

# Evaluate flags
def handle_checkout(user_id: str, user_plan: str):
    ctx = EvaluationContext(
        targeting_key=user_id,
        attributes={
            "plan": user_plan,
            "region": "us-east-1",
        }
    )

    # Boolean flag
    if client.get_boolean_value("new-checkout-flow", False, ctx):
        return new_checkout(user_id)
    else:
        return legacy_checkout(user_id)

Go SDK

import (
    "context"

    "github.com/open-feature/go-sdk/pkg/openfeature"
    ldflag "github.com/open-feature/go-sdk-contrib/providers/launchdarkly/pkg"
)

func main() {
    // Initialize once at startup with the LaunchDarkly provider
    // (handle the error in real code instead of discarding it)
    provider, _ := ldflag.NewProvider("sdk-xxx")
    openfeature.SetProvider(provider)

    client := openfeature.NewClient("my-service")
    _ = search(client, "user-123", "some query")
}

// search evaluates the flag per request; newSearch and legacySearch
// are application functions, not shown here.
func search(client *openfeature.Client, userID, query string) []string {
    ctx := openfeature.NewEvaluationContext(
        userID,
        map[string]interface{}{
            "plan":   "premium",
            "region": "us-east-1",
        },
    )

    enabled, _ := client.BooleanValue(
        context.Background(),
        "new-search-algorithm",
        false, // default value
        ctx,
    )

    if enabled {
        return newSearch(query)
    }
    return legacySearch(query)
}

Hooks

from openfeature import api
from openfeature.hook import Hook

# `metrics` and `logger` are assumed to be your app's instrumentation helpers

class MetricsHook(Hook):
    def after(self, hook_context, flag_evaluation_details, hints):
        # Track every flag evaluation
        metrics.increment(
            "feature_flag.evaluated",
            tags={
                "flag": hook_context.flag_key,
                "value": str(flag_evaluation_details.value),
                "reason": flag_evaluation_details.reason,
            }
        )

    def error(self, hook_context, exception, hints):
        logger.error(f"Flag evaluation failed: {hook_context.flag_key}", exc_info=exception)

api.add_hooks([MetricsHook()])

LaunchDarkly

LaunchDarkly is the dominant commercial feature flag platform, with SDKs available in 20+ languages.

SDK Initialization

import ldclient
from ldclient.config import Config

# Initialize once at application startup — expensive operation
ldclient.set_config(Config("sdk-xxx"))
client = ldclient.get()

# Always check if initialized
if not client.is_initialized():
    logger.warning("LaunchDarkly client failed to initialize — using defaults")

Evaluation Context (User/Context Object)

from ldclient import Context

# Single context (most common)
user_context = Context.builder(user_id) \
    .kind("user") \
    .set("plan", "premium") \
    .set("region", "us-east-1") \
    .set("email", user_email) \
    .build()

# Multi-context (user + organization)
org_context = Context.builder(org_id).kind("organization") \
    .set("tier", "enterprise") \
    .build()

multi_context = Context.create_multi(user_context, org_context)

# Evaluate
variation = client.variation("new-dashboard", multi_context, False)

Targeting Rules

In the LaunchDarkly UI, you configure rules. These are evaluated top-to-bottom:

  1. Individual targeting: user IDs get explicit flag values
  2. Rule-based targeting: if plan = "premium" then return true
  3. Percentage rollout: 10% of users get true, 90% get false
  4. Default variation: what everyone else gets
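The precedence above can be sketched as a small evaluator (hypothetical data shapes; this is not LaunchDarkly's real evaluation engine):

```python
# Simplified model of top-to-bottom rule evaluation. Flag and user
# shapes here are made up for illustration.

def evaluate(flag: dict, user: dict) -> bool:
    # 1. Individual targeting wins outright
    targets = flag.get("individual_targets", {})
    if user["key"] in targets:
        return targets[user["key"]]
    # 2. Rule-based targeting: first matching rule wins
    for rule in flag.get("rules", []):
        if user.get(rule["attribute"]) == rule["value"]:
            return rule["serve"]
    # 3. Percentage rollout (crude byte-sum stand-in for the real hash)
    pct = flag.get("rollout_pct")
    if pct is not None:
        return sum(user["key"].encode()) % 100 < pct
    # 4. Default variation
    return flag["default"]

flag = {
    "individual_targets": {"qa-user": True},
    "rules": [{"attribute": "plan", "value": "premium", "serve": True}],
    "rollout_pct": 10,
    "default": False,
}
print(evaluate(flag, {"key": "qa-user"}))                 # True: individual target
print(evaluate(flag, {"key": "u-7", "plan": "premium"}))  # True: rule match
```

The key property to notice: once an earlier rule matches, later rules never run, which is why rule order in the UI matters.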

Percentage Rollouts

Under the hood: percentage rollouts hash the targeting key (the user ID by default) to a value between 0 and 100. The same key always produces the same hash, so each user consistently gets the same variation with no server-side session state: the rollout is deterministic per user. This is also why changing the targeting key changes which users fall into the rollout. To roll out by organization, session, or device instead, change the bucket attribute.

# Example: 10% rollout
# In LD UI: add rollout rule with 10% → true, 90% → false
# In code: just evaluate the flag normally
is_enabled = client.variation("new-search", user_context, False)
# 10% of users will consistently get True
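The bucketing itself can be sketched like this (illustrative only; LaunchDarkly's actual hash algorithm and salting scheme differ):

```python
# Sketch of deterministic bucketing: hash the targeting key to 0-100.
import hashlib

def rollout_bucket(targeting_key: str, flag_key: str) -> float:
    # Salting with the flag key keeps different flags' rollouts independent
    raw = hashlib.sha256(f"{flag_key}.{targeting_key}".encode()).hexdigest()
    return int(raw[:8], 16) / 0xFFFFFFFF * 100

def in_rollout(targeting_key: str, flag_key: str, pct: float) -> bool:
    return rollout_bucket(targeting_key, flag_key) < pct

# Same key, same flag -> same answer every time, no session state needed
print(in_rollout("user-42", "new-search", 10) ==
      in_rollout("user-42", "new-search", 10))  # True
```

Because the bucket is a pure function of (flag key, targeting key), raising the percentage from 10 to 50 keeps the original 10% in the rollout and only adds users, rather than reshuffling everyone.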

Flag Events and Analytics

# Track a conversion event tied to an experiment
client.track("purchase-completed", user_context, metric_value=order_total)

# Custom event with data
client.track("search-used", user_context, data={"query_length": len(query)})

# Flush events before shutdown (important in serverless/short-lived processes)
client.flush()
client.close()

Flagsmith (Self-Hosted Option)

Flagsmith is open source with a self-hosted option. Useful when data residency or cost is a concern.

from flagsmith import Flagsmith

# Connect to self-hosted instance
flagsmith = Flagsmith(
    environment_key="env-key",
    api_url="https://flagsmith.internal.example.com/api/v1/",
)

# Get flags for an identity
flags = flagsmith.get_identity_flags(
    identifier=user_id,
    traits={"plan": "premium", "region": "us-east-1"},
)

# Evaluate
if flags.is_feature_enabled("new-checkout"):
    return new_checkout()

# Get string/number variation value
theme = flags.get_feature_value("ui-theme")  # returns "dark" or "light"

Docker Compose for self-hosted Flagsmith:

version: '3'
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: flagsmith
      POSTGRES_USER: flagsmith
      POSTGRES_PASSWORD: password

  flagsmith:
    image: flagsmith/flagsmith:latest
    environment:
      DATABASE_URL: postgresql://flagsmith:password@postgres:5432/flagsmith
      SECRET_KEY: "your-secret-key"
    ports:
      - "8000:8000"
    depends_on:
      - postgres

Flag Lifecycle Management

The Stale Flag Problem

A flag introduced for a rollout should be removed after the rollout is complete. Stale flags accumulate technical debt:

  • Dead code paths that can never be tested
  • Confusion about what the "normal" code path is
  • Performance overhead of evaluating flags that are always true
  • Security risk: a flag stuck at false could disable a security feature

Lifecycle Policy

  1. Create: Define flag with an expiry date in the description
  2. Roll out: Gradual percentage increase (1% → 10% → 50% → 100%)
  3. Monitor: Watch error rate and latency metrics for each variation
  4. Retire: Once at 100% and stable for 1 week, mark for cleanup
  5. Archive: Remove code paths and delete flag from system
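Step 1's expiry-date convention can be enforced with a small audit script (a hypothetical sketch; the flag metadata shape is made up and would come from your provider's list-flags API):

```python
# Hypothetical audit: report flags whose expiry date (recorded per the
# lifecycle policy's step 1) has passed. Adapt the metadata shape to
# your provider's API response.
from datetime import date

def find_expired(flags: list[dict], today: date) -> list[str]:
    expired = []
    for flag in flags:
        expiry = flag.get("expiry")  # e.g. parsed from the description
        if expiry is not None and date.fromisoformat(expiry) < today:
            expired.append(flag["key"])
    return expired

flags = [
    {"key": "new-checkout-flow", "expiry": "2024-03-01"},
    {"key": "premium-analytics-enabled", "expiry": None},  # permanent flag
]
print(find_expired(flags, date(2024, 6, 1)))  # ['new-checkout-flow']
```

Run something like this in CI or a weekly job so stale flags surface automatically instead of relying on memory.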

LaunchDarkly Flag Cleanup

# List flags via API
curl -H "Authorization: api-key-xxx" \
  "https://app.launchdarkly.com/api/v2/flags/my-project" \
  | jq '.items[] | select(.archived == false) | {key: .key, creationDate: .creationDate}'

# Find flags not evaluated in 30 days (stale)
# LaunchDarkly Code References integration:
# npm install -g @launchdarkly/find-code-refs
ld-find-code-refs \
  --accessToken=api-key-xxx \
  --projKey=my-project \
  --dir=/path/to/repo

Trunk-Based Development + Feature Flags

Feature flags enable trunk-based development: all engineers commit to main, no long-lived feature branches.

Without flags:          With flags:
feature-branch ─┐       main (always deployable)
feature-branch ─┤       │
feature-branch ─┤       │  new-search-v2 flag
                ↓       │    → false: old code path
               main     │    → true: new code path
(big merge hell)        │  new-checkout flag
                        │    → false: old code path
                        │    → true: new code path

This means engineers can deploy incomplete features — the flag is false in production. Integration testing happens in staging with the flag on. When the feature is ready, flip the flag. No deployment required.


Testing with Feature Flags

Always test both code paths. Don't assume the old path still works.

# pytest with feature flags; assumes myapp exposes a checkout() function
# and a module-level flags.client wrapping the flag SDK
import pytest
from myapp.checkout import checkout

@pytest.fixture
def flag_on(monkeypatch):
    monkeypatch.setattr(
        "myapp.flags.client.variation",
        lambda flag_key, context, default: True if flag_key == "new-checkout" else default
    )

@pytest.fixture
def flag_off(monkeypatch):
    monkeypatch.setattr(
        "myapp.flags.client.variation",
        lambda flag_key, context, default: False
    )

def test_checkout_new_flow(flag_on):
    result = checkout(user_id="test-user")
    assert result["flow"] == "new"

def test_checkout_legacy_flow(flag_off):
    result = checkout(user_id="test-user")
    assert result["flow"] == "legacy"

Gradual Rollouts with Kill Switches

A kill switch is an ops flag that starts at 100% on and can be instantly set to 0% to disable a feature. Use this for features that interact with external services or have high blast radius.

def get_recommendations(user_id: str) -> list:
    ctx = EvaluationContext(targeting_key=user_id)

    # Kill switch: if False, skip expensive recommendation engine
    if not client.get_boolean_value("recommendations-enabled", True, ctx):
        return get_fallback_recommendations(user_id)

    try:
        return recommendation_engine.get(user_id)
    except Exception as e:
        # Also automatically degrade on error
        logger.error("Recommendation engine failed", exc_info=e)
        metrics.increment("recommendations.fallback")
        return get_fallback_recommendations(user_id)

Flag-Driven Canary vs Argo Rollouts

Both achieve gradual rollouts, but at different layers:

Approach | Layer | Rollback speed | Complexity
Feature flag | Application | Instant (no deploy) | Low (SDK call)
Argo Rollouts canary | Kubernetes/traffic | Minutes (update rollout) | Higher (CRD, analysis)

Feature flags are better for: business logic, per-user targeting, experiments, ops kill switches.

Argo canary is better for: infrastructure changes (DB schema), new service deployments, changes that affect all code paths.

You can combine them: deploy new pods with Argo (infrastructure canary) AND gate the feature with a flag (application canary) for two independent control planes.
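On the Argo side, the infrastructure canary is declared as a Rollout resource with traffic-shifting steps. A minimal sketch (service name, image, weights, and pause durations are all illustrative):

```yaml
# Minimal Argo Rollouts canary sketch -- names and values are illustrative.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-service
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        - setWeight: 10          # shift 10% of traffic to new pods
        - pause: {duration: 1h}  # watch metrics before continuing
        - setWeight: 50
        - pause: {duration: 1h}
        # remaining traffic shifts on full promotion
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: registry.example.com/my-service:v2
```

The Rollout controls which pods receive traffic; the feature flag controls which code path runs inside those pods. Keeping both gives you two independent rollback levers.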


Operational Flags for Circuit Breakers

# Combine feature flags with circuit breaker pattern
from circuitbreaker import circuit, CircuitBreakerError

@circuit(failure_threshold=5, recovery_timeout=30)
def _call_payment_service(payload: dict) -> dict:
    return payment_client.charge(payload)

def process_payment(user_id: str, payload: dict) -> dict:
    ctx = EvaluationContext(targeting_key=user_id)

    # Ops flag: manual override independent of circuit state
    if not client.get_boolean_value("payment-service-enabled", True, ctx):
        return {"status": "deferred", "reason": "payment_disabled"}

    try:
        return _call_payment_service(payload)
    except CircuitBreakerError:
        # Circuit is open — degrade gracefully
        metrics.increment("payment.circuit_open")
        return {"status": "deferred", "reason": "circuit_open"}

Quick Reference

OpenFeature Flag Types

# Boolean
enabled = client.get_boolean_value("flag-key", False, ctx)

# String variation
theme = client.get_string_value("ui-theme", "default", ctx)

# Integer
max_results = client.get_integer_value("search-limit", 10, ctx)

# Float
discount = client.get_float_value("discount-rate", 0.0, ctx)

# Object (JSON)
config = client.get_object_value("search-config", {}, ctx)

Flag Naming Conventions

<component>-<feature>-<action>    recommended
checkout-new-flow-enabled
search-v2-enabled
recommendations-enabled           kill switch (no feature prefix)
dark-mode-enabled
premium-analytics-enabled         permission flag
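The convention can be enforced in CI with a simple pattern check (a sketch; the exact regex is an assumption about how strictly you want to enforce kebab-case):

```python
# Sketch of a CI lint for the <component>-<feature>-<action> convention:
# lowercase kebab-case with at least two segments. The pattern is an
# assumption, not a LaunchDarkly-imposed rule.
import re

FLAG_NAME = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)+$")

def valid_flag_name(name: str) -> bool:
    return FLAG_NAME.fullmatch(name) is not None

print(valid_flag_name("checkout-new-flow-enabled"))  # True
print(valid_flag_name("NewCheckoutFlow"))            # False
```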

LaunchDarkly API Quick Reference

# List flags
GET /api/v2/flags/{projectKey}

# Get specific flag
GET /api/v2/flags/{projectKey}/{featureFlagKey}

# Toggle flag on/off
PATCH /api/v2/flags/{projectKey}/{featureFlagKey}
Body: [{"op": "replace", "path": "/environments/{env}/on", "value": true}]

# Archive flag
PATCH /api/v2/flags/{projectKey}/{featureFlagKey}
Body: [{"op": "replace", "path": "/archived", "value": true}]
