1. Routing
How does Elasticsearch know which shard to store a document on?
How is a document found again once it has been indexed?
=> through routing
-
Routing : the process of resolving which shard a document belongs to
-
Elasticsearch hashes a routing value to select the shard
- the default routing value is the document _id; a custom routing value can be supplied instead
-
shard_num = hash(_routing) % num_primary_shards
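A small Python sketch of the equation above. Elasticsearch uses its own hash function internally; zlib.crc32 is only a stand-in here so the example is deterministic, and pick_shard is a hypothetical name, not an Elasticsearch API.

```python
import zlib


def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Resolve a shard number from a routing value (by default the doc _id)."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash: NOT the function Elasticsearch really uses.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % num_primary_shards


if __name__ == "__main__":
    for doc_id in ("1", "2", "order-42"):
        print(doc_id, "->", pick_shard(doc_id, 5))
```

Because the result depends on num_primary_shards, the same routing value can resolve to a different shard number if that count changes.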
2. How Elasticsearch Reads Data
- coordinating node receives request
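- routing determines which replication group holds the document
- a shard copy in that group is chosen using adaptive replica selection (ARS)
- the chosen shard returns the document to the coordinating node, which responds to the client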
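A heavily simplified sketch of the "pick the best shard copy" step. The real adaptive replica selection weighs more signals than one average; here each copy only tracks a moving average of response times and the fastest-looking copy wins. ShardCopy and select_copy are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShardCopy:
    node: str
    avg_response_ms: float

    def record(self, response_ms: float, alpha: float = 0.5) -> None:
        # Blend the newest measurement into the running average.
        self.avg_response_ms = (
            alpha * response_ms + (1 - alpha) * self.avg_response_ms
        )


def select_copy(copies: List[ShardCopy]) -> ShardCopy:
    # Pick the copy that currently looks fastest.
    return min(copies, key=lambda c: c.avg_response_ms)
```

After a slow response is recorded for a copy, its average rises and later reads drift toward the other copies in the group.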
3. How Elasticsearch Writes Data
- coordinating node receives request
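- routing determines which replication group to write to
- the request is forwarded to the primary shard of that group
- the primary shard validates the request and applies the write locally
- the primary then forwards the operation to the replica shards in its replication group, in parallel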
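The write steps above can be sketched with plain dicts standing in for shards. This is a toy model under stated assumptions: ReplicationGroup is a hypothetical class, validation is reduced to a type check, and replicas are updated one after another.

```python
class ReplicationGroup:
    """One primary plus its replicas, each modelled as a dict of docs."""

    def __init__(self, num_replicas: int) -> None:
        self.primary = {}
        self.replicas = [{} for _ in range(num_replicas)]

    def write(self, doc_id: str, doc: dict) -> None:
        # 1. The primary validates the request first.
        if not isinstance(doc, dict):
            raise TypeError("document must be a JSON-like object")
        # 2. The primary applies the write locally.
        self.primary[doc_id] = dict(doc)
        # 3. The operation is forwarded to every replica
        #    (sequentially here, to keep the sketch short).
        for replica in self.replicas:
            replica[doc_id] = dict(doc)
```

If step 3 is interrupted, some replicas hold the doc and others do not, which is the situation the recovery process has to repair.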
-
When replication fails partway, Elasticsearch goes through a recovery process
- ex) the primary shard fails while replicating a write
- a replica is promoted to be the new primary, but the shard copies may not hold the same data
- primary terms and sequence numbers are used to resolve the differences
-
Primary terms : a way to distinguish between old and new primary shards
- a counter of how many times the primary shard of a replication group has changed
- the primary term is attached to every write operation
- a replica can tell whether the primary has changed since the operation was sent
-
Sequence number : a way to track the order of operations
- a counter incremented for each write operation
- attached to every write operation, together with the primary term
- the primary shard is responsible for incrementing the sequence number
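A toy sketch of how the two counters work together: the primary stamps each operation with (primary_term, seq_no), and a replica refuses operations stamped by a primary that has since been replaced. Operation, Primary and Replica are hypothetical names, and the rejection rule is a simplification of what recovery really does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    primary_term: int
    seq_no: int
    doc_id: str


class Primary:
    def __init__(self, primary_term: int, next_seq_no: int = 0) -> None:
        self.primary_term = primary_term
        self.next_seq_no = next_seq_no

    def stamp(self, doc_id: str) -> Operation:
        # Only the primary hands out sequence numbers.
        op = Operation(self.primary_term, self.next_seq_no, doc_id)
        self.next_seq_no += 1
        return op


class Replica:
    def __init__(self) -> None:
        self.known_term = 0
        self.applied = []

    def apply(self, op: Operation) -> bool:
        if op.primary_term < self.known_term:
            return False  # stamped by a primary that has been replaced
        self.known_term = op.primary_term
        self.applied.append(op)
        return True
```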
-
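To avoid comparing every operation on large indexes, Elasticsearch uses checkpoints
- each shard copy keeps a local checkpoint: the sequence number up to which it has processed every operation
- each replication group has a global checkpoint: the sequence number that all active shard copies have reached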
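A minimal sketch of the two checkpoints, assuming each shard copy is described by the set of sequence numbers it has processed and that -1 means "nothing processed yet". The function names are hypothetical.

```python
def local_checkpoint(processed_seq_nos: set) -> int:
    """Highest seq no such that it and everything below it was processed."""
    checkpoint = -1
    while checkpoint + 1 in processed_seq_nos:
        checkpoint += 1
    return checkpoint


def global_checkpoint(copies: dict) -> int:
    """Lowest local checkpoint across the copies of a replication group."""
    return min(local_checkpoint(seq_nos) for seq_nos in copies.values())


if __name__ == "__main__":
    group = {
        "primary": {0, 1, 2, 3, 4},
        "replica-1": {0, 1, 2, 4},  # operation 3 has not arrived yet
        "replica-2": {0, 1, 2, 3},
    }
    print(global_checkpoint(group))
```

During recovery only the operations above the global checkpoint need to be compared, rather than the whole history.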
4. Document Versioning
-
Elasticsearch stores only the latest version of a document; it is not a revision history
-
a version number (_version) is stored with each document and incremented on every change
-
internal vs external versioning
- internal : the default, maintained by Elasticsearch
- external : for when versions are maintained outside of Elasticsearch, ex. in an RDBMS
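A rough sketch of the rule behind external versioning: the caller supplies the version, and a write is accepted only when that version is greater than the one already stored. The in-memory store, index_external and VersionConflict are hypothetical stand-ins, not an Elasticsearch client.

```python
class VersionConflict(Exception):
    pass


def index_external(store: dict, doc_id: str, doc: dict, version: int) -> None:
    """Write doc only if the supplied version is newer than the stored one."""
    current = store.get(doc_id)
    if current is not None and version <= current[0]:
        raise VersionConflict(
            f"{doc_id}: supplied version {version} is not above {current[0]}"
        )
    # The store keeps (version, document) pairs.
    store[doc_id] = (version, dict(doc))
```

This lets the external system, ex. a row version column in the RDBMS, stay the source of truth for which write is newer.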
-
the version-based approach above has since been replaced by a better one: primary terms and sequence numbers (see 4-1)
4-1. Optimistic Concurrency Control
Prevents an old version of a doc from overwriting a newer one
- an update should fail if the doc changed since it was read; the client then re-reads and tries again
- used to be handled with the doc version, but is now handled with the primary term and sequence number
POST /{index name}/_update/{doc id}?if_primary_term={X}&if_seq_no={Y}
- if the doc's current primary term and sequence number do not match X and Y, the request fails with a version conflict (HTTP 409)
- the error should be handled at the application level and the update retried
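A sketch of the read, conditional update, retry loop at the application level. FakeIndex is an in-memory stand-in written for this note, not an Elasticsearch client; its update method only mimics the if_seq_no / if_primary_term check. decrement_stock and the in_stock field are made-up examples.

```python
class VersionConflict(Exception):
    pass


class FakeIndex:
    """In-memory stand-in for an index (assumption, for illustration only)."""

    def __init__(self) -> None:
        self.primary_term = 1
        self._next_seq_no = 0
        self._docs = {}

    def index(self, doc_id: str, source: dict) -> None:
        self._docs[doc_id] = {
            "_seq_no": self._next_seq_no,
            "_primary_term": self.primary_term,
            "_source": dict(source),
        }
        self._next_seq_no += 1

    def get(self, doc_id: str) -> dict:
        doc = self._docs[doc_id]
        return {**doc, "_source": dict(doc["_source"])}

    def update(self, doc_id, source, if_seq_no, if_primary_term) -> None:
        current = self._docs[doc_id]
        expected = (if_seq_no, if_primary_term)
        if (current["_seq_no"], current["_primary_term"]) != expected:
            raise VersionConflict(doc_id)  # the doc changed since it was read
        self.index(doc_id, source)


def decrement_stock(index: FakeIndex, doc_id: str, max_attempts: int = 3) -> int:
    for _ in range(max_attempts):
        doc = index.get(doc_id)  # re-read on every attempt
        source = doc["_source"]
        new_source = {**source, "in_stock": source["in_stock"] - 1}
        try:
            index.update(doc_id, new_source, doc["_seq_no"], doc["_primary_term"])
            return new_source["in_stock"]
        except VersionConflict:
            continue  # someone else wrote first; try again with fresh data
    raise VersionConflict(f"{doc_id}: gave up after {max_attempts} attempts")
```

The important part is that every retry starts from a fresh read, so the new value is always computed from the latest doc.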