Quiz: Elasticsearch

15 questions

L1 (8 questions)

1. What is the difference between shards and replicas in Elasticsearch?

Show answer

Shards: horizontal partitions of an index — data is distributed across shards for parallelism. Replicas: copies of primary shards on different nodes for redundancy and read throughput. An index with 5 primary shards and 1 replica has 10 total shards (5 primary + 5 replica).
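
The arithmetic in the example above can be checked with a small sketch. The helper name total_shards is hypothetical and not part of any Elasticsearch client.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Total shard copies for one index: every primary plus its replicas."""
    if primaries < 1 or replicas < 0:
        raise ValueError("need at least 1 primary and 0 or more replicas")
    return primaries * (1 + replicas)


# 5 primaries with 1 replica each, as in the answer above.
print(total_shards(5, 1))
```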

2. Elasticsearch cluster health shows yellow. What does it mean?

Show answer

Yellow = all primary shards are allocated but some replica shards are not. Common cause: single-node cluster (no other node for replicas) or node failure/maintenance. Data is safe (primaries exist) but there is no redundancy. Green = all shards allocated. Red = some primaries are unassigned.
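
As a rough sketch, the three colors reduce to a check on unassigned shard counts. The function name and arguments below are illustrative assumptions; a real cluster reports the status itself through the cluster health API.

```python
def health_color(unassigned_primaries: int, unassigned_replicas: int) -> str:
    """Map unassigned shard counts to the status described above."""
    if unassigned_primaries > 0:
        return "red"      # some data is unavailable
    if unassigned_replicas > 0:
        return "yellow"   # data is available but not fully redundant
    return "green"        # every shard copy is allocated


# A single-node cluster with 1 replica configured: primaries fine, replicas not.
print(health_color(unassigned_primaries=0, unassigned_replicas=5))
```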

3. What are Elasticsearch disk watermarks?

Show answer

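Low watermark (85%): ES stops allocating new shards to that node. High watermark (90%): ES tries to relocate shards away from that node. Flood stage (95%): ES puts a read-only block (index.blocks.read_only_allow_delete) on every index that has a shard on that node. These percentages are the defaults and can be changed. Monitor disk usage proactively, and delete old indices or add nodes before hitting the watermarks.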
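
A minimal sketch of how a monitoring script might classify a node against these thresholds. It assumes the default percentages quoted above; the function name and return labels are hypothetical.

```python
# Default thresholds from the answer above (percent of disk used).
LOW, HIGH, FLOOD = 85.0, 90.0, 95.0


def watermark_state(used_bytes: int, total_bytes: int) -> str:
    """Return which watermark, if any, a node's disk usage has crossed."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    pct = 100.0 * used_bytes / total_bytes
    if pct > FLOOD:
        return "flood_stage"   # indices on this node get a read-only block
    if pct > HIGH:
        return "high"          # shards are moved away from this node
    if pct > LOW:
        return "low"           # no new shards are allocated here
    return "ok"


print(watermark_state(used_bytes=91, total_bytes=100))
```

Alerting somewhat below the low watermark leaves time to delete old indices or add nodes.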

4. What are common Elasticsearch mapping mistakes?

Show answer

1. Leaving dynamic mapping on: arbitrary incoming fields are added to the mapping automatically (mapping explosion).
2. Using text type for fields you only filter on (use keyword).
3. Not setting explicit mappings upfront.
4. Changing field types on an existing index (requires reindex).
5. Too many fields per index (default limit: 1000, set by index.mapping.total_fields.limit).
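
One way to avoid several of the mistakes above is to create the index with an explicit, strict mapping. The sketch below only builds the request body for a create-index call; the field names (status, message, created_at) and the helper name are made up for illustration.

```python
import json


def strict_index_body(properties: dict, field_limit: int = 1000) -> dict:
    """Build a create-index body that rejects documents with unmapped fields."""
    return {
        "settings": {"index.mapping.total_fields.limit": field_limit},
        "mappings": {
            "dynamic": "strict",       # unknown fields cause an error, not a new mapping
            "properties": properties,
        },
    }


body = strict_index_body({
    "status": {"type": "keyword"},     # only filtered on, so keyword rather than text
    "message": {"type": "text"},       # searched as full text
    "created_at": {"type": "date"},
})
print(json.dumps(body, indent=2))
```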

5. What is the difference between text and keyword field types?

Show answer

text: analyzed (tokenized and lowercased by the default analyzer); use for full-text search (match queries). keyword: stored as an exact value, not analyzed; use for filtering, sorting, and aggregations. A common pattern is a multi-field mapping that indexes the same value both ways (a text field with a keyword sub-field such as field.raw). The wrong choice gives poor search results or wastes resources.
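
A sketch of the multi-field pattern as request bodies. The field name title and the sub-field name raw are illustrative assumptions.

```python
# Mapping: "title" is analyzed text, "title.raw" is the same value as a keyword.
title_mapping = {
    "properties": {
        "title": {
            "type": "text",
            "fields": {"raw": {"type": "keyword"}},
        }
    }
}

# Full-text search goes against the analyzed field.
full_text_query = {"query": {"match": {"title": "quick fox"}}}

# Exact filtering and sorting go against the keyword sub-field.
exact_filter = {"query": {"term": {"title.raw": "The Quick Fox"}}}
sort_by_title = {"sort": [{"title.raw": "asc"}]}
```

The term query must match the stored value exactly, including case, because keyword values are not analyzed.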

6. How do you manage index lifecycle in Elasticsearch?

Show answer

Use Index Lifecycle Management (ILM). Define phases: hot (actively written/queried), warm (read-only, less frequent), cold (infrequent access, cheaper storage), delete. ILM automates rollover (size/age triggers), force-merge, shrink, and deletion. Essential for log/metrics use cases.
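
A hedged sketch of what such a policy body might look like, written as a Python dict that would be sent to PUT _ilm/policy/<name>. All ages and sizes are placeholder assumptions, not recommendations.

```python
import json

ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # Roll over to a new index on size or age, whichever comes first.
                    "rollover": {"max_primary_shard_size": "50gb", "max_age": "7d"}
                }
            },
            "warm": {
                "min_age": "7d",
                "actions": {
                    "shrink": {"number_of_shards": 1},
                    "forcemerge": {"max_num_segments": 1},
                },
            },
            "cold": {
                "min_age": "30d",
                "actions": {"set_priority": {"priority": 0}},
            },
            "delete": {
                "min_age": "90d",
                "actions": {"delete": {}},
            },
        }
    }
}

print(json.dumps(ilm_policy, indent=2))
```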

7. What is the difference between an index and an alias in Elasticsearch?

Show answer

Index: the actual data store. Alias: a pointer (name) that maps to one or more indices. Aliases enable zero-downtime reindexing, time-based index management, and query routing. Applications should always use aliases, not raw index names, so you can swap the underlying index transparently.

8. What is the Elasticsearch refresh interval and when should you change it?

Show answer

refresh_interval (default 1s) controls how often newly indexed documents become searchable. A lower value brings results closer to real time but adds refresh overhead. For bulk ingestion, set it to 30s or -1 (disabled) to improve throughput, then restore the previous value afterwards. For search-heavy workloads, 1s is appropriate.
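
The "relax, load, restore" pattern can be sketched as a context manager. To keep the example runnable without a cluster, put_settings is a caller-supplied function (an assumption of this sketch) that would normally send the body to PUT <index>/_settings.

```python
from contextlib import contextmanager


@contextmanager
def relaxed_refresh(put_settings, index, bulk_value="30s", restore_value="1s"):
    """Temporarily raise refresh_interval for a bulk load, then restore it."""
    put_settings(index, {"index": {"refresh_interval": bulk_value}})
    try:
        yield
    finally:
        # Runs even if the bulk load raises, so the index is not left relaxed.
        put_settings(index, {"index": {"refresh_interval": restore_value}})


# Demo with a stand-in that records the calls instead of hitting a cluster.
calls = []
with relaxed_refresh(lambda idx, body: calls.append((idx, body)), "logs"):
    pass  # bulk indexing would happen here
print(calls)
```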

L2 (7 questions)

1. Cluster health is red. What do you do?

Show answer

1. Check which indices are red: GET _cluster/health?level=indices.
2. Find unassigned shards: GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason.
3. Common causes: disk watermark exceeded, node lost, corrupted shard.
4. For disk: clear space or raise watermark temporarily.
5. For lost nodes: bring them back or reroute shards.
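
The triage in steps 1 to 3 can be sketched over the rows that _cat/shards returns when called with format=json. The sample rows below are invented for illustration; prirep is "p" for a primary and "r" for a replica.

```python
from collections import Counter


def unassigned_summary(rows):
    """Return (indices with an unassigned primary, count of unassigned shards per reason)."""
    unassigned = [r for r in rows if r["state"] == "UNASSIGNED"]
    red_indices = sorted({r["index"] for r in unassigned if r["prirep"] == "p"})
    reasons = Counter(r.get("unassigned.reason") or "unknown" for r in unassigned)
    return red_indices, dict(reasons)


# Made-up sample data in the shape of a _cat/shards JSON response.
sample_rows = [
    {"index": "logs-1", "shard": "0", "prirep": "p", "state": "STARTED", "unassigned.reason": None},
    {"index": "logs-1", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "logs-2", "shard": "1", "prirep": "p", "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
]

red, reasons = unassigned_summary(sample_rows)
print(red, reasons)
```

Only indices with an unassigned primary make the cluster red; an unassigned replica alone would leave it yellow.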

2. Indexing throughput has degraded significantly. What do you investigate?

Show answer

1. Bulk queue rejections (thread pool stats).
2. Merge throttling — too many segments being merged.
3. GC pressure: heap too small or too much fielddata held on heap.
4. Disk I/O saturation.
5. Mapping explosion (too many fields).
6. refresh_interval too low (default 1s). For bulk loads, set refresh_interval to 30s or -1 temporarily.

3. Elasticsearch heap usage is consistently above 75%. What do you do?

Show answer

1. Check fielddata usage — use doc_values instead of fielddata where possible.
2. Reduce shard count per node (each shard uses heap for metadata).
3. Avoid large aggregations that load data into heap.
4. Set indices.fielddata.cache.size limit.
5. Don't set heap above 50% of RAM or 31GB (compressed OOPs threshold).
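
The heap rule in step 5 as a small calculation. The function name is hypothetical, and the 31 GB cap is the approximate threshold quoted above.

```python
def recommended_heap_gb(ram_gb: float, cap_gb: float = 31.0) -> float:
    """Half of RAM, capped so the JVM can keep using compressed object pointers."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return min(ram_gb * 0.5, cap_gb)


for ram in (16, 32, 64, 128):
    print(ram, "GB RAM ->", recommended_heap_gb(ram), "GB heap")
```

The remaining RAM is not wasted: the operating system uses it for the filesystem cache, which Lucene relies on heavily.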

4. How do you reindex data in Elasticsearch without downtime?

Show answer

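1. Make sure clients reach the old index through an alias, not its raw name.
2. Create the new index with updated mappings.
3. Use the _reindex API to copy data. Documents written to the old index after the copy starts are not carried over automatically, so pause writes or copy the delta afterwards.
4. When reindex completes, switch the alias to the new index with a single _aliases request (remove and add together).
5. Delete the old index. The alias swap is atomic, so clients see no downtime.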
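
The two request bodies involved, sketched as Python dicts: one for POST _reindex and one for POST _aliases. The names products, products_v1 and products_v2 are placeholders.

```python
def reindex_body(old_index: str, new_index: str) -> dict:
    """Body for POST _reindex: copy every document from old to new."""
    return {"source": {"index": old_index}, "dest": {"index": new_index}}


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST _aliases: both actions are applied as one atomic change."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


print(reindex_body("products_v1", "products_v2"))
print(alias_swap_body("products", "products_v1", "products_v2"))
```

Sending remove and add in the same request is what makes the switch atomic; two separate requests would leave a window where the alias points at nothing.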

5. Search queries are slow on a large index. What do you check?

Show answer

1. Number of shards — too many shards = overhead per query.
2. Shard size — ideal is 10-50GB per shard.
3. Query complexity — avoid wildcards on large text fields.
4. Slow log (the index.search.slowlog.threshold.query.* and index.search.slowlog.threshold.fetch.* settings).
5. Node resources (CPU, RAM, disk I/O).
6. Use the Profile API ("profile": true in the search body) to find bottlenecks.
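
A sketch of the two diagnostic tools from the list: an index settings body that enables the search slow log, and a helper that turns profiling on for an existing search body. The threshold values are arbitrary examples.

```python
# Body for PUT <index>/_settings; thresholds here are illustrative only.
slowlog_settings = {
    "index.search.slowlog.threshold.query.warn": "5s",
    "index.search.slowlog.threshold.query.info": "1s",
    "index.search.slowlog.threshold.fetch.warn": "1s",
}


def profiled(search_body: dict) -> dict:
    """Return a copy of a search body with per-shard profiling enabled."""
    return {**search_body, "profile": True}


print(profiled({"query": {"match": {"message": "timeout"}}}))
```

Profiling adds overhead to the request, so it is better suited to investigating a single slow query than to leaving on permanently.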

6. How do you size an Elasticsearch cluster?

Show answer

1. Estimate daily data volume and retention period.
2. Target 10-50GB per shard.
3. Replica count based on availability needs.
4. Keep each node at or below roughly 20 shards per GB of heap (a rule of thumb, not a hard limit).
5. Set heap to 50% of RAM (max 31GB).
6. Separate master-eligible, data, and coordinating nodes for large clusters.
7. Plan for burst indexing capacity.
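
A back-of-the-envelope sketch of steps 1 to 5. It ignores indexing overhead, disk capacity per node, and burst headroom, and every default below (30 GB target shard size, 31 GB heap, 20 shards per GB of heap) is an assumption taken from the rules of thumb in this quiz.

```python
import math


def size_cluster(daily_gb, retention_days, replicas=1, target_shard_gb=30.0,
                 heap_gb_per_node=31.0, shards_per_gb_heap=20):
    """Rough shard and node counts from data volume and retention."""
    primary_data_gb = daily_gb * retention_days
    total_data_gb = primary_data_gb * (1 + replicas)
    primary_shards = math.ceil(primary_data_gb / target_shard_gb)
    total_shards = primary_shards * (1 + replicas)
    max_shards_per_node = int(heap_gb_per_node * shards_per_gb_heap)
    # At least replicas + 1 nodes so every copy can live on a different node.
    min_data_nodes = max(replicas + 1, math.ceil(total_shards / max_shards_per_node))
    return {
        "total_data_gb": total_data_gb,
        "primary_shards": primary_shards,
        "total_shards": total_shards,
        "min_data_nodes": min_data_nodes,
    }


print(size_cluster(daily_gb=100, retention_days=30))
```

In practice disk space per node is often the tighter constraint, so the node count from this sketch is a lower bound.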

7. How do you troubleshoot unassigned shards?

Show answer

1. GET _cluster/allocation/explain to see why a shard is unassigned.
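2. Common reasons: disk watermark, too few nodes for the replica count (a replica is never placed on the same node as its primary, so you need at least replicas + 1 data nodes), shard allocation filtering, awareness attributes.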
3. Fixes: clear disk space, add nodes, reduce replica count, remove allocation filters.
4. Manual reroute as last resort.
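
Two small sketches for the steps above: the request body for the allocation explain call, and a check for the "too few nodes" case. The helper names and the index name logs-1 are hypothetical.

```python
def explain_body(index: str, shard: int, primary: bool) -> dict:
    """Body for GET _cluster/allocation/explain targeting one shard copy."""
    return {"index": index, "shard": shard, "primary": primary}


def assignable_replicas(data_nodes: int, replicas: int) -> int:
    """How many replicas per shard can be placed, given one copy per node."""
    return max(0, min(replicas, data_nodes - 1))


print(explain_body("logs-1", 0, primary=False))
# Single-node cluster with 1 replica configured: the replica can never be assigned.
print(assignable_replicas(data_nodes=1, replicas=1))
```

When assignable_replicas is lower than the configured replica count, the fix is to add nodes or reduce number_of_replicas, as listed in step 3.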