Core Concepts

Documents

A document is the smallest unit of data you store in Elasticsearch. Think of it as a JSON object, like:

{
  "id": "123",
  "title": "Lord of the Rings",
  "author": "J.R.R. Tolkien",
  "genre": "Fantasy"
}

Index

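An index is a named collection of documents, roughly what a table is in SQL terms (older guides compare it to a database). Each index contains many documents and maintains an inverted index over their text fields for super-fast searching. An inverted index is essentially a mapping from each term to the documents that contain it. For example, with the book above as document 1 and a second document that mentions "hobbit" as document 2: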

Term	Docs
"lord"	[1]
"rings"	[1]
"fantasy"	[1]
"hobbit"	[2]

Now when you search “lord rings,” Elasticsearch doesn’t scan every document. It looks up each term in the inverted index and jumps straight to the documents that contain it. ⚡
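To make the term-to-documents lookup concrete, here is a toy inverted index in plain Python. It is a sketch, not how Lucene stores things: the two documents, the whitespace tokenizer, and the helper names (tokenize, build_inverted_index, search) are all made up for illustration, and it only indexes titles. It also requires every query term to match, whereas a real Elasticsearch match query defaults to OR and ranks results by relevance.

```python
from collections import defaultdict

# Toy corpus: doc IDs and titles are illustrative, not from a real index.
docs = {
    1: "Lord of the Rings",
    2: "The Hobbit",
}

def tokenize(text):
    # Simplified analysis: lowercase, then split on whitespace.
    return text.lower().split()

def build_inverted_index(docs):
    # Map each term to the set of document IDs that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    # Return the IDs of documents containing every query term (AND semantics).
    postings = [index.get(term, set()) for term in tokenize(query)]
    if not postings:
        return set()
    return set.intersection(*postings)

index = build_inverted_index(docs)
print(search(index, "lord rings"))  # {1}
print(search(index, "hobbit"))      # {2}
```

Notice that search never loops over docs. It only touches the entries for the terms in the query, which is the whole point of the structure.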

Shards

To scale horizontally, an index is split into shards. Each shard is an independent Lucene index that handles part of the data. You can imagine shards like slices of pizza 🍕 — each stores a portion of the documents, and together they make a full pie (index). You can configure:

Primary shards (the main data; the count is set when the index is created)
Replica shards (copies of the primaries for redundancy and load balancing; the count can be changed later)
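Here is a rough sketch of how those two settings look and how a document might end up on a particular shard. number_of_shards and number_of_replicas are real index settings, but the values are just examples. The pick_shard helper is hypothetical: it hashes the document ID and takes the remainder, which captures the idea, while Elasticsearch uses its own hash function and a slightly more involved formula.

```python
import zlib

# Index settings as they might appear in a create-index request body.
# The setting names are real; the values are illustrative.
settings = {
    "settings": {
        "number_of_shards": 3,    # primary shards
        "number_of_replicas": 1,  # copies of each primary
    }
}

def pick_shard(doc_id, num_primary_shards):
    # Stand-in hash (CRC32), NOT the hash Elasticsearch actually uses.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

def total_shard_copies(num_primaries, num_replicas):
    # Every primary exists once, plus one copy per replica.
    return num_primaries * (1 + num_replicas)

num_primaries = settings["settings"]["number_of_shards"]
num_replicas = settings["settings"]["number_of_replicas"]

for doc_id in ["123", "124", "125"]:
    print(doc_id, "-> shard", pick_shard(doc_id, num_primaries))

print("shard copies in the cluster:", total_shard_copies(num_primaries, num_replicas))
```

Because the shard is derived from the ID and the number of primaries, changing the primary count would send existing documents to the wrong slice of the pizza, which is one way to see why that number is fixed up front.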

Cluster & Nodes

Cluster: The entire Elasticsearch setup, made up of one or more nodes
Node: A single running instance of Elasticsearch, typically one per server (physical or virtual)
Each data node holds one or more shards

So the cluster is like a beehive, nodes are bees, and shards are the honeycombs storing nectar (data). 🍯🐝
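To picture how shards spread over nodes, here is a toy placement function. It is only a sketch under simple assumptions: the node names are invented, placement is plain round-robin, and the allocate helper is hypothetical. The one rule it borrows from Elasticsearch is that a replica never lives on the same node as its primary. Where this sketch raises an error for too few nodes, a real cluster would instead leave the extra replicas unassigned.

```python
def allocate(nodes, num_primaries, num_replicas):
    """Round-robin placement; copies of the same shard never share a node."""
    if num_replicas + 1 > len(nodes):
        raise ValueError("not enough nodes to put every copy on a distinct node")
    placement = {node: [] for node in nodes}
    for shard in range(num_primaries):
        # copy 0 is the primary (P), the rest are replicas (R)
        for copy in range(num_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "P" if copy == 0 else "R"
            placement[node].append(f"{role}{shard}")
    return placement

placement = allocate(["node-1", "node-2", "node-3"], num_primaries=3, num_replicas=1)
for node, shards in placement.items():
    print(node, shards)
# node-1 ['P0', 'R2']
# node-2 ['R0', 'P1']
# node-3 ['R1', 'P2']
```

If any single node disappears, every shard still has a copy somewhere else in the hive.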

Analyzer

Before text is indexed or searched, Elasticsearch runs it through an analyzer, which breaks it into the terms that end up in the inverted index.
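A toy analyzer gives a feel for what that processing does. This is a sketch, not Elasticsearch's implementation: the analyze function is hypothetical, it only handles ASCII letters and digits, and real analyzers are configurable pipelines with proper Unicode handling.

```python
import re

def analyze(text):
    # Tokenize on runs of ASCII letters/digits, then lowercase each token.
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

print(analyze("Lord of the Rings"))  # ['lord', 'of', 'the', 'rings']
print(analyze("The Hobbit!"))        # ['the', 'hobbit']

# Running the same analysis on the query is what lets "LORD Rings"
# match terms that were indexed as "lord" and "rings".
print(analyze("LORD Rings") == analyze("lord rings"))  # True
```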