sangmin7648.log

sangmin7648.log

Sharding & Replication

이상민·2021년 4월 24일

0

Complete Guide to ElasticSearch

목록 보기

4/17

1. Sharding

way to divide indexes into smaller pieces

each piece is called a shard
sharding is done at index level
main purpose is to horizontally scale data volume

2. How Sharding Works

Index is divided into shards
Node can have multiple shards
Each shard is an Apache Lucene index
One shard can store two billion docs

sharding also improves performance by parallelizing queries
default number of shards in a index is 1
increase shards with Split API
decrease shards with Shrink API

3. Replication

fault tolerance machanism for ElasticSearch

ensures data availability on a different node if a node goes down

configured at index level
create copies of shards called replica shards
a shard that has been replicated is called primary shard
a primary shard and its replica shards are referred to as replication group
number of replicas can be configured at index creation

4. How Replication Works

replica shards are never stored on the same node as primary shards
multiple nodes must exist for replication to work
- if only one node exist, replication is disable even if configured until additional node is added
number of replicas depends on availability, criticality of data, etc.

Replicas can also improve performance
- replica and primary can be used simultaneously
- ElasticSearch auto routes request to best shard
- if multiple replica shards are stored on the same node, CPU can parallelize
default number of replica per shard is 1

5. Replication vs Snapshot

Replication saves "live data" and ensure that there is no data loss
Snapshot save a specific state
Replication cannot be used to revert changes that are already applied

6. Creating Index

create index from Kibana Console
```
PUT /{index_name}
```
- cluster health is yellow(warning) if replica shard not allocated
- add node to allocate replica shard
check shards info from Kibana Console
```
GET /_cat/shards
```
Kibana by default is set to auto_expand_replicas which dynamically change number of replicas depending on the number of nodes in cluster

편하게 읽기 좋은 단위의 포스트를 추구하는 개발자입니다

이전 포스트

Basic Architecture of ElasticSearch

다음 포스트

Adding Nodes and Node Roles

0개의 댓글

관련 채용 정보