
A general approach to improving the read and write performance of an online ES cluster

2022-06-24 03:17:00 house. zhang

Problem background:

The business reads data from the ES cluster. When write tasks run against the cluster at the same time, response time (RT) rises and shows some jitter, especially when the computing framework sharply increases the concurrency of its writes to the ES cluster. For now, the big data computing cluster has reduced its write concurrency. In the future we still want to raise concurrency to speed up writes, which is expected to put pressure on the ES cluster's read performance.
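The current workaround, capping how many writes run at once, can be illustrated with a minimal Python sketch. It assumes a hypothetical `send_bulk` callable that writes one batch to ES; neither that name nor `write_batches` and `InFlightCounter` come from the original setup or from any ES client library. `InFlightCounter` is only a stand-in writer that records the peak number of concurrent calls.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def write_batches(batches, send_bulk, max_concurrency=4):
    """Send every batch with at most `max_concurrency` writes in flight.

    `send_bulk` is a hypothetical callable that writes one batch to ES
    and returns a result; it is an assumption, not a real client API.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() keeps input order and blocks until all batches finish.
        return list(pool.map(send_bulk, batches))


class InFlightCounter:
    """Stand-in writer that records the peak number of concurrent calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __call__(self, batch):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        time.sleep(0.01)  # simulate the latency of one write request
        with self._lock:
            self.current -= 1
        return len(batch)
```

Raising `max_concurrency` speeds up the write job but increases the load that competes with reads, which is the trade-off described above.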

Current situation:

The online cluster currently runs on 5 machines, each with 64 cores, 128 GB of memory and a 1 TB HDD. The machines are fairly high-spec and the cluster is relatively stable in use. Some jitter occurs when the cluster handles heavy reads and writes at the same time, but no full GC (FGC) or similar problems have occurred, and average latency is on the order of milliseconds. The cluster's indices occupy about 300-500 GB. The cluster was built with the default configuration and ES node roles are not separated, so all 5 nodes take on the Master / Data / Ingest / Coordinating / Machine Learning responsibilities. As data and business volume grow, this is expected to become a challenge.

According to the current monitoring data, ES is relatively stable overall. Whether to expand capacity and adjust the deployment architecture should be decided through a comprehensive evaluation of business usage and cluster performance monitoring data.

The monitoring data is shown in the figure below:

Figure: data cloud query response time over the past 7 days

In the data access monitoring, P99 response time is mostly within 30 ms and exceeds 100 ms only in rare cases. Combined with the ES cluster monitoring data (see the reference links and data at the end for details), the existing cluster architecture can be kept for now. Going forward, we will keep watching the monitoring data to decide when to split roles and expand the cluster.
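As an illustration of how such a check could be automated, here is a minimal Python sketch that computes a nearest-rank P99 from latency samples and counts spikes. The 30 ms and 100 ms values come from the text above; the function names and the sample data in the usage note are assumptions made for the example, not part of the original monitoring setup.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def latency_summary(samples_ms, p99_target_ms=30.0, spike_ms=100.0):
    """Summarise latencies against a P99 target and a spike threshold."""
    p99 = percentile(samples_ms, 99)
    return {
        "p99_ms": p99,
        "p99_within_target": p99 <= p99_target_ms,
        "spikes": sum(1 for s in samples_ms if s > spike_ms),
    }
```

For example, with 98 made-up samples of 10 ms plus one of 25 ms and one of 120 ms, `latency_summary` reports a P99 of 25 ms, within the target, and one spike above 100 ms.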

ES cluster deployment:

Basic knowledge

An ES cluster usually has nodes with the following roles: Master / Data / Ingest / Coordinating / Machine Learning

Each role does the following:

  1. Master node: responsible for shard management, cluster management and cluster state management. If Master nodes are deployed separately, production clusters generally configure three of them for high availability and to avoid split brain, and the cluster automatically elects one as the active master.
  2. Data node: mainly responsible for storing data, and crucial when scaling out data capacity. Reads and writes are routed to the corresponding Data node.
  3. Coordinating node: mainly responsible for coordinating client requests. It distributes each received request to the appropriate nodes and merges the results. For example, when a client queries an index, the coordinating node forwards the request to the Data nodes that hold the relevant data, the matching shards are searched, and the results are collected and returned to the client. By default, every node also acts as a Coordinating node.
  4. Ingest node: preprocesses documents before they are actually indexed, acting as a data processing step.
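The recommendation of three master nodes follows from simple majority arithmetic. The Python sketch below shows it under the assumption that a master can only be elected by a strict majority of the master-eligible nodes; the function names are illustrative and not an ES API.

```python
def quorum(master_eligible):
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can fail while a majority remains."""
    return master_eligible - quorum(master_eligible)
```

With two master-eligible nodes the majority is 2, so losing either one blocks an election. With three the majority is still 2, so one node can fail. Because two disjoint groups can never both hold a majority, a network partition cannot produce two masters.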

By default, a single node plays all of these roles. In a development environment the data volume is usually small, so an ES cluster is often deployed as a single node. In production, the deployment should be chosen according to the data volume and the write and query throughput. If there are enough resources, the usual best practice is to give each node a single role, as shown in the figure below:
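A single-role layout comes down to one setting per node. The Python sketch below generates the corresponding `elasticsearch.yml` line for each kind of dedicated node. It assumes a version of Elasticsearch that supports the `node.roles` setting (7.9 or later); older versions use the boolean settings `node.master`, `node.data` and `node.ingest` instead, and the article does not state which version is in use. The helper names are made up for the example.

```python
# Assumption: Elasticsearch 7.9+ style `node.roles` setting.
DEDICATED_ROLES = {
    "master": ["master"],
    "data": ["data"],
    "ingest": ["ingest"],
    "coordinating": [],  # an empty list makes a coordinating-only node
}


def node_roles_line(kind):
    """Return the elasticsearch.yml line for one dedicated node kind."""
    if kind not in DEDICATED_ROLES:
        raise ValueError(f"unknown node kind: {kind!r}")
    return "node.roles: [" + ", ".join(DEDICATED_ROLES[kind]) + "]"


if __name__ == "__main__":
    for kind in DEDICATED_ROLES:
        print(f"{kind:<13}{node_roles_line(kind)}")
```

A node left without an explicit `node.roles` line keeps the default behaviour described above and takes on all roles.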

Node parameter configuration

Role configuration suggestions

Role          CPU            Memory         Disk
Master        Low            Low            Low
Data Node     High           High           High
Ingest        High           Medium         Low
Coordinating  Medium / High  Medium / High  Low

Future plans:

Set aside several nodes deployed in the ingest role with a load balancer (LB) in front of them, mainly to handle data ingestion. Set aside several coordinating nodes, also with an LB in front, mainly to handle queries, aggregations and reads. When the system has many complex queries and aggregations, adding Coordinating nodes is an easy way to increase query performance.
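To see why complex queries load the coordinating role in particular, consider the gather step it performs. The Python sketch below is a simplified model of that step, not ES's actual implementation: each shard is assumed to return its hits already sorted by score, highest first, and the coordinating node merges them into one global top list. The function name and the hit format are assumptions.

```python
import heapq
from itertools import islice


def merge_top_hits(shard_hits, size):
    """Merge per-shard hit lists into the global top `size` hits.

    `shard_hits` is a list of lists of (score, doc_id) tuples; each
    inner list must already be sorted by score, highest first.
    """
    merged = heapq.merge(*shard_hits, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, size))
```

For example, merging `[(9.1, "a1"), (4.0, "a2")]` with `[(7.5, "b1"), (7.0, "b2"), (1.0, "b3")]` and `size=3` yields the hits `a1`, `b1` and `b2`. The merge work grows with the number of shards and the requested size, and it runs on whichever node coordinates the request, so dedicated coordinating nodes keep it off the Data nodes.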

Summary

As business and data volume grow, the ES cluster, which currently uses the default configuration with no separation of node roles, is expected to come under some pressure. For now, according to the business monitoring data and the ES cluster monitoring, it meets business needs. Later the cluster architecture will need to be adjusted by splitting roles and responsibilities, with some roles sharing nodes or each role fully independent. When the data volume grows too large for the disk capacity, data nodes can be added. When the system has many complex queries and aggregations, Coordinating nodes can be added to increase query performance. Coordinating, ingest and Data nodes can also be split apart to further reduce the pressure on each node.

Reference data :

Index creation and data read latency over the past six hours (in milliseconds)

Search and write latency over the past six hours (in milliseconds)

Index creation and write latency over the past 24 hours (in milliseconds)

Original site

Copyright notice
This article was written by [house. zhang]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2021/10/20211012150451897h.html