Elasticsearch, Part 2: How Sharding Works
2022-06-23 19:37:00 【InfoQ】
So what is a primary shard? What is a replica shard?
- shard (primary shard): The shards mentioned above are in fact primary shards. A primary shard is a container for data: documents are stored in primary shards, and the primary shards are distributed across the nodes of the cluster. Every shard is a Lucene index.
- replica (replica shard): A replica is a copy of a primary shard and is kept in sync with the primary's data. To achieve high availability, the master node avoids placing a primary shard and its replicas on the same node, so the maximum number of replicas per shard is N-1 (where N is the number of nodes).
PUT /my_index
{
  "settings" : {
    "number_of_shards" : 5,
    "number_of_replicas" : 1
  }
}
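
As a rough illustration of the settings above, the following Python sketch counts the shard copies the index asks for and checks how many replicas can actually be placed, given the rule that a primary and its replicas never share a node. The function names and node counts are hypothetical and only serve the arithmetic; this is not Elasticsearch's allocation code.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Primaries plus all replica copies requested by the index settings."""
    return number_of_shards * (1 + number_of_replicas)


def assignable_replicas(number_of_replicas: int, node_count: int) -> int:
    """Replicas per primary that can be placed when no two copies of the
    same shard may live on one node: at most node_count - 1."""
    return min(number_of_replicas, max(node_count - 1, 0))


if __name__ == "__main__":
    # Settings from the request above: 5 primaries, 1 replica each.
    print(total_shard_copies(5, 1))    # 5 primaries + 5 replicas
    # Hypothetical cluster sizes:
    print(assignable_replicas(1, 1))   # single node: the replica has nowhere to go
    print(assignable_replicas(1, 3))   # three nodes: the replica can be placed
```

With a single node, the replica stays unassigned, which is exactly the situation the N-1 limit describes.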
Why shard at all? Why distinguish between primary and replica shards?

- green: The cluster is healthy. All functions work normally, and all primary and replica shards are available.
- yellow: Warning state. All primary shards work normally, but at least one replica is unavailable. The cluster can still operate normally, but high availability is reduced to some extent.
- red: The cluster is not working properly. One or more shards, together with their replicas, are unavailable. Queries can still be executed, but the results will be incomplete. Write requests routed to the affected shards will fail, which eventually leads to data loss.
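
The three states above can be sketched as a small classification function. This is a simplified model under stated assumptions, not the logic Elasticsearch runs: the `ShardGroup` type and its fields are hypothetical, and "red" is modelled as "some primary shard is not active".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShardGroup:
    """One primary shard and its replicas (hypothetical model)."""
    primary_active: bool
    replicas_active: List[bool]


def cluster_health(groups: List[ShardGroup]) -> str:
    """Classify a cluster as green, yellow, or red from its shard states."""
    # red: at least one primary is unavailable, so part of the data is missing
    if any(not g.primary_active for g in groups):
        return "red"
    # yellow: every primary works, but at least one replica does not
    if any(not all(g.replicas_active) for g in groups):
        return "yellow"
    # green: every primary and every replica works
    return "green"


if __name__ == "__main__":
    healthy = ShardGroup(primary_active=True, replicas_active=[True])
    degraded = ShardGroup(primary_active=True, replicas_active=[False])
    print(cluster_health([healthy, healthy]))
    print(cluster_health([healthy, degraded]))
```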
How a document is indexed: how primary and replica shards are kept in sync

shard = hash(routing) % number_of_primary_shards
- The client sends a write request to node ES1 (the coordinating node). The routing formula yields 0, so the document should be written to primary shard S0.
- ES1 forwards the request to ES3, the node that holds primary shard S0. ES3 accepts the request and writes the document to disk.
- ES3 then replicates the data in parallel to the two replica shards R0, using optimistic concurrency control to handle data conflicts. Once all replica shards report success, ES3 reports success to the coordinating node, and the coordinating node (ES1) reports success to the client.
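
The routing formula used in the first step can be sketched in a few lines of Python. This is a minimal illustration, not Elasticsearch's implementation: `route_to_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in hash chosen only because it gives the same value on every run, so the shard numbers it prints will not match those of a real cluster.

```python
import zlib


def route_to_shard(routing: str, number_of_primary_shards: int) -> int:
    """shard = hash(routing) % number_of_primary_shards

    zlib.crc32 is an assumed stand-in for the real hash function.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


if __name__ == "__main__":
    # Hypothetical routing values; 5 primaries as in the index settings.
    for routing_value in ("1", "2", "user-42"):
        shard = route_to_shard(routing_value, 5)
        print(f"routing={routing_value!r} -> primary shard S{shard}")
```

Because the result depends only on the routing value and the number of primary shards, any node can act as coordinator and compute the same target shard for a given document.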
Are more shards and replicas always better?
- (1) More replicas can improve the throughput and performance of search operations. However, adding more replica shards on the same number of nodes does not improve performance, because each shard then gets a smaller share of the node's resources. To raise throughput, you also need to add hardware.
- (2) More replica shards increase data redundancy and help protect data integrity. However, as the primary/replica interaction described above shows, synchronizing data between shards consumes network bandwidth and reduces efficiency, so more shards and replicas per index is not always better.
- The purpose of sharding is to increase the volume of data that can be handled and to allow horizontal scaling. The purpose of replicating shards is to improve cluster stability and concurrency.
- Replicas are multiplication: the more replicas, the more resources are consumed, but the safer the data. Shards are division: the more shards, the smaller and more scattered the data in each shard.
- More replicas mean higher cluster availability, but because each shard is equivalent to a Lucene index, every shard takes up file handles, memory, and CPU.
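
The "replicas are multiplication, shards are division" rule of thumb above can be made concrete with a little arithmetic. The helper names and the 100 GB figure below are hypothetical, and the sketch ignores index overhead and uneven document distribution.

```python
def storage_footprint_gb(primary_data_gb: float, number_of_replicas: int) -> float:
    """Replicas multiply: every replica stores a full copy of the primary data."""
    return primary_data_gb * (1 + number_of_replicas)


def data_per_primary_gb(primary_data_gb: float, number_of_shards: int) -> float:
    """Shards divide: the primary data is spread over the primary shards."""
    return primary_data_gb / number_of_shards


if __name__ == "__main__":
    data_gb = 100.0  # hypothetical index size before replication
    print(storage_footprint_gb(data_gb, number_of_replicas=1))  # total stored
    print(data_per_primary_gb(data_gb, number_of_shards=5))     # per primary shard
```

Raising `number_of_replicas` grows the first number linearly, while raising `number_of_shards` shrinks the second, at the cost of more Lucene indexes competing for file handles, memory, and CPU.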