Elasticsearch: Operations for People Who Didn't Choose It

Topics: Elasticsearch ops, shard allocation, disk watermarks, mapping explosions, cluster health
Level: L2 (Operations)
Time: 45–60 minutes
Prerequisites: None (you inherited an ES cluster and need to keep it alive)


The Mission

You didn't choose Elasticsearch. Someone before you deployed it for logging, or search, or analytics. Now the cluster is yellow, disk is filling, and queries are timing out. The documentation is 3,000 pages. You need the 30 pages that matter.


Cluster Health: Green, Yellow, Red

curl -s http://localhost:9200/_cluster/health | jq .
# → {
#     "status": "yellow",        ← NOT green
#     "number_of_nodes": 3,
#     "unassigned_shards": 5     ← shards without a home
#   }
| Status | Meaning | Action |
|--------|---------|--------|
| Green  | All primary and replica shards allocated | None (this is good) |
| Yellow | All primaries allocated, some replicas missing | Find why replicas aren't allocated |
| Red    | Some primary shards missing — DATA LOSS RISK | Emergency: find and fix missing primaries |

Yellow is the most common non-green state. It usually means: you have number_of_replicas: 1 but only 1 node (replicas can't be on the same node as the primary).

# Why are shards unassigned?
curl -s 'http://localhost:9200/_cluster/allocation/explain' | jq .
# → "allocate_explanation": "cannot allocate because all found copies are stale or corrupt"
# → Or: "the node is above the high watermark"
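If the cluster is a single node and the missing replicas are expected, you can clear the yellow state by dropping replicas to zero. A minimal sketch, using a hypothetical logs index (note: with zero replicas the data has no redundancy):

PUT logs/_settings
{
  "number_of_replicas": 0
}

Only do this where a replica genuinely has nowhere to go; on a multi-node cluster, fix the allocation problem instead.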

Disk Watermarks: Why Elasticsearch Stops Writing

ES has three disk thresholds:

| Watermark   | Default       | What happens |
|-------------|---------------|--------------|
| Low         | 85% disk used | No new shards allocated to this node |
| High        | 90% disk used | Shards start relocating AWAY from this node |
| Flood stage | 95% disk used | Indices become READ-ONLY; writes rejected |
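The thresholds are live cluster settings. This sketch sets them explicitly to their documented defaults (they also accept absolute values like "10gb" of free space):

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}

Raising the flood stage buys minutes, not a fix; the disk still needs space freed.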
# Check disk usage per node
curl -s 'http://localhost:9200/_cat/allocation?v'
# → node   shards disk.used disk.avail disk.percent
# → node-1    142     85gb      15gb          85     ← at low watermark!
# → node-2    138     90gb      10gb          90     ← at HIGH watermark!

When flood stage triggers, your indices go read-only. Logs stop being indexed. Searches work but nothing new gets written.
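A write against a flood-stage-blocked index fails with an error along these lines (exact wording and status code vary by version):

POST logs/_doc
{ "msg": "hello" }

# → 429, "blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded
#    flood-stage watermark, index has read-only-allow-delete block]"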

# Unlock read-only indices (after freeing disk space!)
curl -X PUT 'http://localhost:9200/_all/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'

Gotcha: Unlocking indices without freeing disk space just triggers flood stage again within minutes. Free space first (delete old indices, add disk), then unlock. On Elasticsearch 7.4 and later the block is also removed automatically once disk usage drops back below the high watermark, but verify rather than assume.


Index Lifecycle: Stop Storing Everything Forever

# Check index sizes
curl -s 'http://localhost:9200/_cat/indices?v&s=store.size:desc' | head -10
# → index                     status store.size
# → logs-2025-01              open   120gb     ← 14 months old!
# → logs-2025-02              open   115gb
# → logs-2025-03              open   118gb

If you're storing logs for 14 months and only query the last 7 days, you're wasting 90% of your disk.

ILM (Index Lifecycle Management)

PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "1d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "shrink": { "number_of_shards": 1 },
          "forcemerge": { "max_num_segments": 1 }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

Hot (current, fast SSD) → Warm (7 days, cheaper disk, merged) → Delete (30 days).
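A policy does nothing on its own: new indices have to reference it, typically via an index template. A minimal sketch using the composable template API (7.8+); the template and alias names here are illustrative:

PUT _index_template/logs-template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-policy",
      "index.lifecycle.rollover_alias": "logs"
    }
  }
}

Indices created under logs-* then move through hot → warm → delete automatically.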


The Mapping Explosion

# Rough field count per index: size of the largest "properties" object
# in the mapping (a heuristic, not an exact total)
curl -s 'http://localhost:9200/logs/_mapping' | jq '.. | objects | keys | length' | sort -rn | head -1
# → 5000 ← this index has ~5,000 unique fields!

Every unique field name creates a mapping entry. Logging systems that ingest arbitrary JSON (Kubernetes labels, user-defined fields) can create thousands of unique fields. This consumes memory, slows queries, and eventually crashes the node.

Fix: Set a field limit and use strict mappings:

PUT logs/_settings
{
  "index.mapping.total_fields.limit": 1000
}

Gotcha: Setting dynamic: "strict" on mappings rejects documents with unknown fields. This breaks log ingestion if your applications add new fields without updating the mapping. Use dynamic: "false" instead — it accepts the document but doesn't index unknown fields (they're stored but not searchable).
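The dynamic parameter can be toggled on a live index's mapping; a sketch for a hypothetical logs index:

PUT logs/_mapping
{
  "dynamic": "false"
}

Existing fields keep working; new, unmapped fields in incoming documents are kept in _source but no longer create mapping entries.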


Essential Operations

# Cluster health
curl -s 'localhost:9200/_cluster/health?pretty'

# Node stats (CPU, memory, disk per node)
curl -s 'localhost:9200/_cat/nodes?v&h=name,cpu,heap.percent,disk.used_percent'

# Index sizes (sorted by size) — quote URLs with '&' or the shell splits them
curl -s 'localhost:9200/_cat/indices?v&s=store.size:desc' | head

# Shard allocation
curl -s 'localhost:9200/_cat/shards?v&s=state' | head -20

# Unassigned shard explanation
curl -s 'localhost:9200/_cluster/allocation/explain?pretty'

# Delete old indices
curl -X DELETE 'localhost:9200/logs-2025-01'

# Force merge (reduce segments, free disk)
curl -X POST 'localhost:9200/logs-2025-06/_forcemerge?max_num_segments=1'

Flashcard Check

Q1: Cluster status is yellow. What does this mean?

All primary shards are allocated but some replicas are missing. Data is safe but not redundant. Common cause: single-node cluster with replicas configured.

Q2: Disk at 95%. What happens?

Flood stage watermark triggers. All indices become read-only. Writes are rejected. Free disk space, then unlock indices.

Q3: Index has 5,000 unique fields. Why is this a problem?

Each field consumes memory for mapping metadata. Thousands of fields slow queries, waste memory, and can crash nodes. Set total_fields.limit.


Cheat Sheet

| Task | Command |
|------|---------|
| Cluster health | `curl 'localhost:9200/_cluster/health?pretty'` |
| Node stats | `curl 'localhost:9200/_cat/nodes?v'` |
| Index sizes | `curl 'localhost:9200/_cat/indices?v&s=store.size:desc'` |
| Shard status | `curl 'localhost:9200/_cat/shards?v'` |
| Unassigned why | `curl 'localhost:9200/_cluster/allocation/explain?pretty'` |
| Delete index | `curl -X DELETE 'localhost:9200/INDEX_NAME'` |
| Unlock read-only | `curl -X PUT 'localhost:9200/_all/_settings' -H 'Content-Type: application/json' -d '{"index.blocks.read_only_allow_delete":null}'` |

Disk Watermarks

| Level | Threshold | Effect |
|-------|-----------|--------|
| Low   | 85%       | No new shards allocated |
| High  | 90%       | Shards relocate away |
| Flood | 95%       | Indices go READ-ONLY |

Takeaways

  1. Yellow = replicas missing, not data loss. Green = everything allocated. Red = data at risk. Yellow is usually a single-node configuration issue.

  2. Disk watermarks are non-negotiable. At 95%, writes stop. Monitor disk and set up ILM to delete old indices automatically.

  3. Mapping explosions kill clusters. Set total_fields.limit. Use dynamic: false for log indices to accept but not index unknown fields.

  4. ILM is essential for log clusters. Hot → Warm → Delete. Without it, you store everything forever and the disk fills.


Exercises

  1. Run a single-node cluster and check health. Start Elasticsearch in a container: docker run -d --name es -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" elasticsearch:8.12.0. Run curl -s localhost:9200/_cluster/health | jq . and note the status. It should be green or yellow. If yellow, use curl -s localhost:9200/_cluster/allocation/explain | jq . to find out why replicas are unassigned on a single-node cluster.

  2. Index documents and inspect shard allocation. Create an index with curl -X PUT localhost:9200/test-index. Index a few documents: curl -X POST localhost:9200/test-index/_doc -H 'Content-Type: application/json' -d '{"msg":"hello"}'. Check shard allocation with curl -s localhost:9200/_cat/shards/test-index?v. Note the shard count, state, and which node they live on. Try setting replicas to 0: curl -X PUT localhost:9200/test-index/_settings -H 'Content-Type: application/json' -d '{"number_of_replicas":0}' and check cluster health again.

  3. Simulate the flood-stage watermark. On your test cluster, check disk usage with curl -s localhost:9200/_cat/allocation?v. Lower the watermark thresholds to trigger them on your current disk usage (e.g., curl -X PUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{"transient":{"cluster.routing.allocation.disk.watermark.flood_stage":"1b"}}'). Try to index a document and observe the read-only block. Unlock with curl -X PUT localhost:9200/test-index/_settings -H 'Content-Type: application/json' -d '{"index.blocks.read_only_allow_delete":null}'. Reset watermarks to defaults afterward.

  4. Find big indices and calculate retention savings. Run curl -s 'localhost:9200/_cat/indices?v&s=store.size:desc' (quote the URL so the shell doesn't split it at '&') and identify the largest indices. For a log-based index pattern, calculate how much disk you would save by reducing retention from 90 days to 30 days. Write the ILM policy JSON (hot/delete only) that would implement this retention. Clean up your test container with docker rm -f es.


  • The Disk That Filled Up — when ES disk watermarks cause write failures
  • When the Queue Backs Up — when log ingestion backs up because ES is slow
  • The Monitoring That Lied — when ES metrics don't reflect user experience