Portal | Level: L2: Operations | Topics: OpenTelemetry, Tracing, Prometheus | Domain: Observability

OpenTelemetry - Primer

Why This Matters

You have logs in Elasticsearch, metrics in Prometheus, and traces in Jaeger. Three vendors, three agents, three config languages, three ways things break at 3 AM. OpenTelemetry (OTel) exists to end this fragmentation. It is the CNCF project that gives you a single, vendor-neutral standard for generating, collecting, and exporting telemetry data — traces, metrics, and logs — from your services.

If you touch infrastructure, OTel changes how you think about observability. Instead of bolting on monitoring after the fact, you instrument once and route signals anywhere. Switch backends without rewriting code. Correlate a slow API response to a database query to a container running hot — across services, languages, and teams.

This is not theoretical. OTel is the second-most-active CNCF project after Kubernetes. If you are not using it yet, you will be.

Timeline: OpenTelemetry was formed in 2019 by merging two competing projects: OpenTracing (a tracing API standard, 2016) and OpenCensus (Google's metrics-and-tracing library, 2018). The merge ended a confusing period in which library authors had to choose between two incompatible instrumentation APIs. The OTel tracing specification reached 1.0 (stable) in 2021; metrics and then logs stabilized in the years after, with exact timing varying by language SDK.


The Three Signals

OTel unifies three core telemetry signals under one umbrella:

┌──────────────────────────────────────────────────┐
│                 YOUR APPLICATION                 │
│                                                  │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐     │
│  │  Traces   │  │  Metrics  │  │   Logs    │     │
│  │           │  │           │  │           │     │
│  │ Spans     │  │ Counters  │  │ Structured│     │
│  │ Context   │  │ Gauges    │  │ Events    │     │
│  │ Timing    │  │ Histos    │  │ Severity  │     │
│  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘     │
│        │              │              │           │
│        └──────────────┼──────────────┘           │
│                       │                          │
│                 OTel SDK / API                   │
└───────────────────────┼──────────────────────────┘
                        │
                 OTel Collector

Traces

A trace is the full journey of a request through your system. It is composed of spans — each span represents a unit of work (an HTTP handler, a DB query, a cache lookup). Spans carry:

  • Trace ID: Shared across all spans in one request
  • Span ID: Unique to this span
  • Parent Span ID: Links child to parent
  • Attributes: Key-value metadata (http.method, db.system, etc.)
  • Events: Timestamped annotations within a span
  • Status: OK, Error, or Unset

Trace ID: abc123
├── Span: API Gateway (120ms)
│   ├── Span: Auth Service (15ms)
│   ├── Span: Order Service (95ms)
│   │   ├── Span: DB Query (40ms)
│   │   └── Span: Cache Lookup (3ms)
│   └── Span: Response Serialization (5ms)
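Spans from different services join into one trace because the IDs above travel between processes in the W3C Trace Context `traceparent` header, which OTel's propagators read and write. A minimal parsing sketch (the helper name is ours, not an OTel API):

```python
# Minimal sketch of parsing a W3C Trace Context "traceparent" header,
# the format OTel propagators use to carry trace/span IDs across
# service boundaries. Layout per the W3C spec:
#   version "00" - 32-hex trace ID - 16-hex span ID - 2-hex flags.

def parse_traceparent(header: str) -> dict:
    """Split a traceparent header into its four fields."""
    version, trace_id, span_id, flags = header.split("-")
    return {
        "version": version,
        "trace_id": trace_id,       # shared by every span in the trace
        "parent_span_id": span_id,  # the caller's span
        "sampled": int(flags, 16) & 0x01 == 1,  # sampled flag bit
    }

ctx = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
```

The receiving service creates its next span as a child of `parent_span_id`, which is how the tree shown above is stitched together across processes.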

Metrics

OTel defines several metric instruments; three cover most use cases:

  Instrument   What It Measures                 Example
  Counter      Monotonically increasing value   http.server.request.count
  Gauge        Point-in-time value              system.memory.usage
  Histogram    Distribution of values           http.server.request.duration

Metrics in OTel use a push model by default (unlike Prometheus pull), but the collector can bridge both worlds.
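To make the histogram instrument concrete, here is a plain-Python sketch of the bucketed aggregation a histogram performs on recorded values (boundaries are illustrative, not the SDK defaults):

```python
# Illustrative sketch of how a histogram instrument aggregates request
# durations into per-bucket counts. Plain Python, not the OTel SDK;
# bucket boundaries chosen for illustration only.
BOUNDARIES = [0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0]  # seconds

def record(buckets: list, value: float) -> None:
    """Increment the first bucket whose upper bound holds the value."""
    for i, bound in enumerate(BOUNDARIES):
        if value <= bound:
            buckets[i] += 1
            return
    buckets[-1] += 1  # overflow bucket (> last boundary)

buckets = [0] * (len(BOUNDARIES) + 1)
for duration in [0.003, 0.02, 0.02, 0.7, 12.0]:
    record(buckets, duration)
```

Only the bucket counts (plus sum and count) are exported, which is why a histogram is cheap to ship no matter how many values it records.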

Logs

OTel logs are the newest signal and bridge existing log frameworks (log4j, slog, zerolog) into the OTel ecosystem. The key addition: logs gain trace context. A log line is no longer orphaned text — it links back to the span that produced it.

{
  "timestamp": "2026-03-15T14:30:00Z",
  "severity": "ERROR",
  "body": "connection refused to payments-db",
  "trace_id": "abc123",
  "span_id": "def456",
  "resource": {
    "service.name": "order-service",
    "service.version": "2.4.1"
  }
}
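That `trace_id` field is the payoff: fetching every log line a trace produced, across services, becomes a simple filter. A toy illustration with hypothetical records:

```python
# Toy illustration: once log records carry trace_id, correlating logs
# with a trace is a simple filter. Records below are hypothetical.
logs = [
    {"body": "order received", "trace_id": "abc123", "service.name": "order-service"},
    {"body": "connection refused to payments-db", "trace_id": "abc123", "service.name": "order-service"},
    {"body": "unrelated request", "trace_id": "zzz999", "service.name": "auth-service"},
]

def logs_for_trace(records, trace_id):
    """Return every log record emitted anywhere within one trace."""
    return [r for r in records if r["trace_id"] == trace_id]

matched = logs_for_trace(logs, "abc123")
```

In practice your log backend runs this query for you; the point is that without the trace context there is nothing to filter on.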

Collector Architecture

The OTel Collector is the central nervous system. It receives, processes, and exports telemetry. You can run it as an agent (sidecar/daemonset) or as a gateway (centralized).

┌──────────────────────────────────────────────────────────────────────┐
│                            OTel Collector                            │
│                                                                      │
│  ┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐  │
│  │ Receivers        │──▶│ Processors       │──▶│ Exporters        │  │
│  │                  │   │                  │   │                  │  │
│  │ - otlp           │   │ - batch          │   │ - otlp           │  │
│  │ - jaeger         │   │ - filter         │   │ - prometheus     │  │
│  │ - prometheus     │   │ - transform      │   │ - jaeger         │  │
│  │ - filelog        │   │ - tail_sampling  │   │ - loki           │  │
│  │ - hostmetrics    │   │ - memory_limiter │   │ - debug          │  │
│  └──────────────────┘   └──────────────────┘   └──────────────────┘  │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                           Extensions                           │  │
│  │  - health_check   - pprof   - zpages   - bearertokenauth       │  │
│  └────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘

Receivers

Receivers ingest data. They listen on ports or scrape endpoints:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: 'node-exporter'
          scrape_interval: 15s
          static_configs:
            - targets: ['localhost:9100']
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu:
      memory:
      disk:
      network:

Processors

Processors transform data in flight. Order matters — they execute sequentially:

processors:
  batch:
    send_batch_size: 1024
    timeout: 5s
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  filter:
    error_mode: ignore
    traces:
      span:
        - 'attributes["http.target"] == "/healthz"'
  resource:
    attributes:
      - key: environment
        value: production
        action: upsert

Exporters

Exporters send data to backends. You can fan out to multiple:

exporters:
  otlp:
    endpoint: tempo.monitoring:4317
    tls:
      insecure: false
  prometheus:
    endpoint: 0.0.0.0:8889
  debug:
    verbosity: detailed

Pipelines — Wiring It Together

Pipelines connect receivers to processors to exporters, per signal:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp, debug]
    metrics:
      receivers: [otlp, prometheus, hostmetrics]
      processors: [memory_limiter, batch]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, filter, batch]
      exporters: [otlp]

SDK Instrumentation

The OTel SDK is what your application code uses to produce telemetry. There are two layers:

  • API: Stable interfaces. Safe to depend on in libraries.
  • SDK: The implementation. Configured in your application entrypoint.
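The split matters because API calls are harmless no-ops until an application installs an SDK, so a library can instrument itself without forcing an observability stack on its users. The pattern, sketched in plain Python (these are not the actual OTel classes):

```python
# Plain-Python sketch of the API/SDK split: the API exposes a tracer
# interface whose default implementation does nothing, so library code
# can call it unconditionally. Not the real OTel classes.
class NoopSpan:
    def set_attribute(self, key, value):
        pass  # discards everything

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False

class NoopTracer:
    """What a library gets until an application installs a real SDK."""
    def start_as_current_span(self, name):
        return NoopSpan()

_provider = NoopTracer()  # an application would swap in a real provider

def library_function():
    # The library instruments unconditionally; without an SDK the
    # cost is near zero and no telemetry leaves the process.
    with _provider.start_as_current_span("library_function") as span:
        span.set_attribute("lib.version", "1.0")
        return "work done"

result = library_function()
```

This is why the takeaway below says "libraries use API, apps configure SDK": the library never imports the SDK at all.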

Auto-Instrumentation

Most languages have auto-instrumentation that patches common libraries:

# Python — zero-code instrumentation:
#   pip install opentelemetry-distro opentelemetry-exporter-otlp
#   opentelemetry-bootstrap -a install
#   opentelemetry-instrument python app.py
#
# Or configure the SDK explicitly in your entrypoint:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource

resource = Resource.create({
    "service.name": "order-service",
    "service.version": "2.4.1",
    "deployment.environment": "production",
})

provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://collector:4317"))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

Manual Instrumentation

For custom business logic:

from opentelemetry import trace
from opentelemetry.trace import StatusCode

tracer = trace.get_tracer("order-service")

with tracer.start_as_current_span("process_order") as span:
    span.set_attribute("order.id", order_id)
    span.set_attribute("order.total", total_amount)

    try:
        result = charge_payment(order_id)
        span.set_attribute("payment.status", "success")
    except PaymentError as e:
        span.set_status(StatusCode.ERROR, str(e))
        span.record_exception(e)
        raise

Semantic Conventions

Semantic conventions are standardized attribute names. They ensure that http.request.method means the same thing whether it comes from Go, Python, or Java.

Key convention namespaces:

  Namespace    Example Attributes
  http.        http.request.method, http.response.status_code
  db.          db.system, db.statement, db.operation
  rpc.         rpc.system, rpc.method, rpc.service
  messaging.   messaging.system, messaging.operation
  server.      server.address, server.port
  service.     service.name, service.version, service.namespace
  deployment.  deployment.environment
  container.   container.id, container.image.name
  k8s.         k8s.pod.name, k8s.namespace.name

Use them. If you invent httpMethod instead of http.request.method, every dashboard and alert that depends on the standard name breaks.

Gotcha: Semantic conventions change between OTel versions. The HTTP conventions underwent a major rename in 2023 (e.g., http.method became http.request.method, http.status_code became http.response.status_code). If you upgrade your SDK and your dashboards break, check the semantic convention migration guides. Pin your convention version in your instrumentation code.
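If you control the pipeline, a key-rewrite shim can bridge the rename during migration. A sketch covering only the two renames named above (`upgrade_attributes` is a hypothetical helper, not an OTel API):

```python
# Hypothetical helper (not an OTel API): rewrite pre-2023 HTTP semantic
# convention keys to their post-rename names. The mapping is limited to
# the renames discussed in the text above.
HTTP_RENAMES = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
}

def upgrade_attributes(attrs: dict) -> dict:
    """Return a copy of attrs with old convention keys renamed."""
    return {HTTP_RENAMES.get(k, k): v for k, v in attrs.items()}

new_attrs = upgrade_attributes({"http.method": "GET", "http.status_code": 200})
```

The collector's transform processor can do the same rewrite centrally, which avoids touching every service at once.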


Sampling

At scale, you cannot export every span. Sampling reduces volume while preserving signal.

Head Sampling

Decide at trace creation whether to sample:

from opentelemetry.sdk.trace.sampling import TraceIdRatioBased

# Sample 10% of traces
sampler = TraceIdRatioBased(0.1)
provider = TracerProvider(sampler=sampler, resource=resource)

Pros: Simple, low overhead. Cons: You might drop the one interesting trace.
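Under the hood, ratio-based head sampling is roughly a deterministic threshold test on the trace ID, which is why every SDK in the request path reaches the same decision for the same trace. A simplified sketch (the real TraceIdRatioBased sampler differs in detail):

```python
# Simplified sketch of trace-ID ratio sampling: compare the low 64 bits
# of the trace ID against a threshold derived from the ratio. The real
# TraceIdRatioBased sampler differs in detail but follows this idea.
RATIO = 0.1
BOUND = int(RATIO * (1 << 64))

def should_sample(trace_id: int) -> bool:
    """Deterministic: the same trace ID always yields the same decision."""
    return (trace_id & ((1 << 64) - 1)) < BOUND

low_id = 0x0000000000000001   # far below the threshold
high_id = 0xFFFFFFFFFFFFFFFF  # at the top of the range
```

Because trace IDs are random, about 10% of IDs fall below the bound; because the test is a pure function of the ID, no coordination between services is needed.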

Tail Sampling (Collector-Side)

Decide after the trace completes, based on its content:

processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 100000
    policies:
      - name: error-traces
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow-traces
        type: latency
        latency:
          threshold_ms: 2000
      - name: baseline
        type: probabilistic
        probabilistic:
          sampling_percentage: 5

Tail sampling keeps all errors and slow requests, plus 5% of everything else. It requires the collector to buffer complete traces, which costs memory.

Remember: "Head sampling is cheap but blind. Tail sampling is smart but hungry." Head sampling decides before seeing the trace, so it is fast but may drop interesting traces. Tail sampling decides after the trace completes, so it can keep errors and outliers, but it must buffer all spans in memory until the decision is made.

Under the hood: Tail sampling in the collector requires that all spans for a single trace arrive at the same collector instance. In a multi-instance gateway deployment, you need a load-balancing exporter that routes by trace ID. Without this, the collector sees incomplete traces and makes bad sampling decisions.
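The routing requirement boils down to a deterministic function of the trace ID. A simplified sketch using modulo routing (the actual loadbalancing exporter uses a consistent-hash ring so instances can be added without reshuffling everything):

```python
# Simplified sketch of trace-ID-based routing for tail sampling: every
# span of a trace must reach the same collector instance, no matter
# which agent forwarded it. Instance names are hypothetical.
COLLECTORS = ["gateway-0", "gateway-1", "gateway-2"]

def route(trace_id: int) -> str:
    """Pick a collector deterministically from the trace ID."""
    return COLLECTORS[trace_id % len(COLLECTORS)]

# Spans from the same trace, possibly arriving via different agents:
spans = [{"trace_id": 0xABC123, "span": "gateway"},
         {"trace_id": 0xABC123, "span": "db-query"}]
targets = {route(s["trace_id"]) for s in spans}  # all land on one instance
```

Round-robin load balancing in front of the gateway breaks this property, which is exactly the failure mode described above.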


Deployment Models

Model 1: Agent (DaemonSet)               Model 2: Gateway
┌──────────┐  ┌──────────┐               ┌──────────┐  ┌──────────┐
│ App Pod  │  │ App Pod  │               │ App Pod  │  │ App Pod  │
│ ┌──────┐ │  │ ┌──────┐ │               └────┬─────┘  └────┬─────┘
│ │ OTel │ │  │ │ OTel │ │                    │             │
│ │Agent │ │  │ │Agent │ │                    └──────┬──────┘
│ └──┬───┘ │  │ └──┬───┘ │                           │
└────┼─────┘  └────┼─────┘                   ┌───────▼───────┐
     │             │                         │ OTel Gateway  │
     └──────┬──────┘                         │  (Collector)  │
            │                                └───────┬───────┘
    ┌───────▼───────┐                                │
    │    Backend    │                        ┌───────▼───────┐
    └───────────────┘                        │    Backend    │
                                             └───────────────┘

Model 3: Agent + Gateway (recommended for production):

  • DaemonSet agents handle local collection and basic processing
  • The gateway handles tail sampling, enrichment, and fan-out to multiple backends
  • If the gateway goes down, agents can buffer briefly


Resource Detection

Resources describe the entity producing telemetry. OTel can auto-detect:

processors:
  resourcedetection:
    detectors: [env, system, docker, ec2, gcp, azure, k8snode]
    timeout: 5s
    override: false

This automatically populates attributes like host.name, cloud.provider, k8s.pod.name without manual configuration.


Key Takeaways

  1. OTel gives you one SDK, one collector, one wire format for all three signals
  2. The collector is a pipeline: receivers -> processors -> exporters
  3. Instrument with the API, configure with the SDK — libraries use API, apps configure SDK
  4. Semantic conventions are not optional — they are what make cross-service correlation work
  5. Tail sampling at the collector keeps errors and outliers while controlling volume
  6. Start with auto-instrumentation, add manual spans for business logic
  7. Resource attributes are the glue — they tell you where telemetry came from
