
Elasticsearch Footguns

Mistakes that cause cluster outages, data loss, or silent performance degradation.


1. Setting heap above 30.5GB

You set -Xmx32g because more heap is better, right? Wrong. Above ~30.5GB the JVM disables compressed ordinary object pointers (compressed oops). Your effective heap shrinks. GC pauses spike. The cluster becomes unresponsive under load.

Fix: Never exceed -Xmx30g. Set heap to 50% of available RAM, capped at 30g. Always set -Xms equal to -Xmx.

Under the hood: Below ~30.5GB, the JVM uses "compressed ordinary object pointers" (compressed oops), storing 64-bit pointers as 32-bit offsets. Above this threshold, every object reference doubles in size, and the effective heap shrinks despite more RAM being allocated. A 32GB heap can hold fewer objects than a 30GB heap. This is a JVM behavior, not an Elasticsearch bug.
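In practice the fix is two lines of JVM options. A minimal sketch, assuming a 64GB machine and the 7.x+ convention of dropping overrides into config/jvm.options.d/:

```
# config/jvm.options.d/heap.options
# 50% of 64GB would be 32g, but cap below the compressed-oops
# threshold so object pointers stay 32-bit.
-Xms30g
-Xmx30g
```

On startup, Elasticsearch logs whether compressed ordinary object pointers are in use, so you can confirm the setting took effect.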


2. Too many shards on a small cluster

You create daily indices with 5 primary shards and 1 replica. After a year you have 3,650 shards on a 3-node cluster. Each shard consumes heap, file descriptors, and thread pool capacity. Cluster master becomes unstable, allocation takes minutes, and queries time out.

Fix: Target 20-50GB per shard. Use ILM rollover instead of daily indices. Shrink old indices to 1 shard. Monitor total shard count per node (stay under 1000).

Remember the shard-sizing rules of thumb: 20-50GB per shard, at most 1,000 shards per node, and at most 20 shards per GB of heap. A 3-node cluster with 30GB heap per node can hold ~600 shards per node comfortably (about 1,800 cluster-wide). Daily indices with 5 shards + 1 replica = 10 shards/day = 3,650/year — you blow through the comfortable limit in about six months and the hard 1,000/node limit in about ten.
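Rollover driven by shard size, rather than the calendar, is what keeps shards in the 20-50GB band. A minimal ILM policy sketch — the policy name and retention are placeholders, and max_primary_shard_size needs 7.13+ (older versions can use max_size instead):

```
PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": { "delete": {} }
      }
    }
  }
}
```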


3. Force-merging active write indices

You run _forcemerge?max_num_segments=1 on an index still receiving writes. The merge competes with indexing for I/O and memory. Indexing throughput drops to near zero. Bulk queues fill up and start rejecting.

Fix: Only force-merge indices that are read-only (no longer receiving writes). Set index.blocks.write: true before merging; the older index _freeze API is deprecated as of 7.14.
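The safe sequence, sketched against a hypothetical index named logs-2024.01 — block writes first, then merge:

```
PUT logs-2024.01/_settings
{ "index.blocks.write": true }

POST logs-2024.01/_forcemerge?max_num_segments=1
```

Force-merge is synchronous and can run for a long time on large indices, so schedule it off-peak.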


4. Ignoring disk watermarks

Elasticsearch has three disk watermarks: low (85%, the node stops accepting new shards), high (90%, ES relocates shards away from the node), and flood stage (95%). When flood stage is hit, ES puts a read-only block on every index with a shard on that node. Your logging pipeline starts throwing errors. On versions before 7.4, clearing the block requires manual intervention even after freeing space.

Fix: Monitor disk usage and alert at 75%. Set up ILM to delete old indices automatically. After hitting flood stage, free space, then clear the block: PUT _all/_settings {"index.blocks.read_only_allow_delete": null}.

Gotcha: After hitting flood stage (95%), Elasticsearch sets index.blocks.read_only_allow_delete: true on affected indices. On versions before 7.4 this block persists even after freeing disk space — you must clear it manually. (From 7.4 on, ES removes it automatically once usage drops back below the high watermark.) Until the block is gone, your logging pipeline keeps throwing write errors and piling up data in agent buffers.
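A recovery sketch — run the second request only after actually freeing disk space:

```
# See which nodes are over the watermarks
GET _cat/allocation?v

# Clear the flood-stage block on every index
PUT _all/_settings
{ "index.blocks.read_only_allow_delete": null }
```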


5. Deleting an index to "fix" a mapping problem

You need to change a field type from text to keyword. You delete the index, create it with the new mapping, and realize the data is gone. Elasticsearch does not support changing field types on existing indices.

Fix: Create a new index with the correct mapping and _reindex data into it. Use index aliases to swap them atomically. Never delete an index without confirming a backup exists.
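A sketch of the reindex-and-swap pattern, with hypothetical names logs-v1 (old), logs-v2 (new), and a logs alias that clients query:

```
PUT logs-v2
{
  "mappings": {
    "properties": {
      "status": { "type": "keyword" }
    }
  }
}

POST _reindex
{
  "source": { "index": "logs-v1" },
  "dest":   { "index": "logs-v2" }
}

POST _aliases
{
  "actions": [
    { "remove": { "index": "logs-v1", "alias": "logs" } },
    { "add":    { "index": "logs-v2", "alias": "logs" } }
  ]
}
```

Because both alias actions run in a single request, the swap is atomic — clients querying logs never see an empty window.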


6. No index templates

You create indices ad-hoc without templates. Different indices end up with different shard counts, different mappings, and different settings. Queries across them are slow or return inconsistent results because field types conflict.

Fix: Always define index templates for your index patterns. Templates enforce consistent shards, replicas, mappings, and ILM policies across all matching indices.
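A minimal composable template sketch (the _index_template API is 7.8+; index pattern, mappings, and the ILM policy name are placeholders):

```
PUT _index_template/logs-template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logs-policy"
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message":    { "type": "text" }
      }
    }
  }
}
```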


7. Running queries without size limits

A developer runs _search without specifying size and gets 10 results (the default). They increase size to 100000. The coordinating node now has to fetch and merge-sort up to 100k hits from every shard, exhausts its heap, and trips the circuit breaker. Other queries start failing.

Fix: Use search_after (or the scroll API on older clients) for deep pagination. Keep index.max_result_window at a reasonable cap (the default 10,000 is fine). Never page through results by cranking up size.
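A search_after sketch: sort on a stable key (here @timestamp plus a hypothetical unique event.id field as tiebreaker), then feed the last hit's sort values into the next request:

```
GET logs/_search
{
  "size": 1000,
  "sort": [
    { "@timestamp": "asc" },
    { "event.id": "asc" }
  ]
}

# Next page: pass the sort values of the previous page's last hit
GET logs/_search
{
  "size": 1000,
  "sort": [
    { "@timestamp": "asc" },
    { "event.id": "asc" }
  ],
  "search_after": ["2024-01-01T00:00:00Z", "evt-10432"]
}
```

For a consistent view across pages while the index changes, wrap this in a point-in-time (PIT) on 7.10+.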


8. Snapshot repo on the same disk as data

Your snapshot repository points to a local directory on the data node. When you need to restore after a disk failure, the snapshots are gone too. This is not a backup — it is a copy on the same failure domain.

Fix: Use S3, GCS, or Azure Blob for snapshot repositories. Verify snapshots are restorable by running test restores regularly.
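A sketch of registering and verifying an S3 repository — bucket, base path, and repo names are placeholders, and pre-8.x clusters need the repository-s3 plugin installed first:

```
PUT _snapshot/nightly
{
  "type": "s3",
  "settings": {
    "bucket": "my-es-snapshots",
    "base_path": "prod-cluster"
  }
}

# Confirm every node can reach the bucket
POST _snapshot/nightly/_verify

# Take a snapshot in the background
PUT _snapshot/nightly/snap-2024.01.01?wait_for_completion=false
```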


9. Bulk indexing with default refresh interval

You bulk-load 50 million documents with the default 1-second refresh interval. Every second ES creates a new segment, consuming I/O and triggering merges. The bulk load that should take 20 minutes takes 4 hours.

Fix: Set refresh_interval to 30s or -1 during bulk loads. Reset to 1s when done. Also increase index.translog.flush_threshold_size for large bulk operations.

Under the hood: Each refresh creates a new Lucene segment. With 1-second refresh during bulk load, you create 1 segment per second. Each segment needs memory for its data structures. The merger struggles to keep up, I/O bandwidth is consumed by merging rather than indexing, and throughput drops dramatically. Setting -1 defers all refreshes until you explicitly call _refresh.
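A bulk-load wrapper sketch (the index name is a placeholder; 512mb is the translog flush threshold default being restored):

```
# Before the load: disable refresh, raise the translog flush threshold
PUT bulk-target/_settings
{
  "index.refresh_interval": "-1",
  "index.translog.flush_threshold_size": "1gb"
}

# ... run the bulk load ...

# After the load: restore defaults and make documents searchable
PUT bulk-target/_settings
{
  "index.refresh_interval": "1s",
  "index.translog.flush_threshold_size": "512mb"
}

POST bulk-target/_refresh
```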


10. Not securing Elasticsearch

Elasticsearch ships with no authentication on port 9200 by default (pre-8.x). Anyone on the network can read, modify, or delete all data. Ransomware bots actively scan for open ES instances and wipe indices.

Fix: Enable X-Pack security (basic security has been free since 6.8/7.1 and is on by default in 8.x). Require TLS. Bind to private interfaces only. Never expose port 9200 to the public internet.
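The core elasticsearch.yml settings, sketched with placeholder certificate paths and a placeholder private IP — certificates are assumed to already exist under config/certs/:

```yaml
# elasticsearch.yml
xpack.security.enabled: true

# TLS for node-to-node and client traffic
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: certs/transport.p12
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/http.p12

# Bind to a private interface only -- never 0.0.0.0
network.host: 10.0.0.5
```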

War story: In 2020, the "Meow" bot wiped ~4,000 unsecured databases (1,395 Elasticsearch, 383 MongoDB) by scanning for open port 9200, calling /_cat/indices, and deleting every index. No ransom demand — just destruction. Earlier campaigns held data for ransom (1,200+ ES instances in 2017). Elasticsearch pre-8.x ships with zero authentication on port 9200 by default.