Do you know how the Redis cluster modes evolved and how they work?

2022-06-25 07:13:00 【Python's path to becoming a God】

This is weihubeats. If you find this useful, follow the official account "Minor technique", where articles are published first. No marketing accounts, no clickbait.

Background

Whether in an interview or on the job, we all run into Redis cluster problems, so this article takes a closer look at the clustering methods Redis currently supports.

Master-slave mode

This is the simplest way to cluster. The master and slave libraries use read-write separation:

Read operations: can be served by the master library or by a slave library
Write operations: executed on the master library first, then synchronized to the slave libraries via RDB/log-based replication
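The routing rule above can be sketched in a few lines of Python. This is only an illustration: the ReadWriteRouter class, the small WRITE_COMMANDS set and the round-robin read policy are assumptions made for the example, not part of Redis or of any client library.

```python
import itertools

# Illustrative subset of write commands (assumption, not a complete list).
WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE"}


class ReadWriteRouter:
    """Hypothetical router: writes go to the master, reads rotate over all nodes."""

    def __init__(self, master, slaves):
        self.master = master
        # Reads may be served by the master or by any slave library.
        self._readers = itertools.cycle([master, *slaves])

    def route(self, command):
        if command.upper() in WRITE_COMMANDS:
            return self.master
        return next(self._readers)


router = ReadWriteRouter("master:6379", ["slave1:6379", "slave2:6379"])
print(router.route("SET"))  # always the master
print(router.route("GET"))  # the master or one of the slaves
```

Because only one node accepts writes, the router never has to reconcile conflicting writes from several nodes.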

Why use read-write separation? Mainly because it keeps the implementation simple: with read-write separation there is no need to handle concurrent writes arriving at several nodes at the same time.

A simple example: without read-write separation, three libraries could accept writes at the same time. How would we then guarantee that a query against any one of them returns consistent data? Should consistency be strong or weak? The implementation cost is high and there is extra performance overhead. With read-write separation none of these problems has to be considered.

Master-slave data synchronization

Any distributed system has to deal with data synchronization.

Synchronous or asynchronous replication?

Redis uses asynchronous replication.

The synchronization process in detail

Establish the connection between the master and slave libraries. The slave library connects to the master library and sends the psync command to indicate that it wants to synchronize data, and the master library starts replication according to the command's parameters. The psync command takes two parameters: the master library's runID and the replication progress offset.

runID: every Redis instance automatically generates a random ID at startup to identify itself. The first time the slave library sends psync it does not know the master library's runID, so it sends ? instead
offset: the replication progress. It starts at -1, which means this is the first replication

After receiving the psync command, the master library replies with FULLRESYNC, which indicates that the first replication is a full copy: the master library copies all of its current data to the slave library.
The master library runs bgsave locally to generate an RDB file and sends the file to the slave library. On receiving the RDB file, the slave library first empties its current database and then loads the data from the file. The master library can still handle client requests during synchronization; new writes that arrive in the meantime are recorded in a dedicated in-memory replication buffer.
Once the initial RDB synchronization is complete, the master library sends the writes accumulated in the replication buffer to the slave library. At that point the data synchronization is complete.
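The full synchronization steps above can be modeled as a small in-memory simulation. This is a sketch under stated assumptions, not how Redis is implemented: the Master and Slave classes are hypothetical, a dict copy stands in for the RDB file produced by bgsave, and the offset counts writes rather than bytes.

```python
import uuid


class Master:
    def __init__(self):
        self.run_id = uuid.uuid4().hex  # stand-in for the random runID
        self.data = {}
        self.offset = 0                 # simplified: counts writes, not bytes
        self.replication_buffer = []
        self.syncing = False

    def write(self, key, value):
        self.data[key] = value
        self.offset += 1
        if self.syncing:
            # Writes that arrive during a sync are kept for later replay.
            self.replication_buffer.append((key, value))

    def psync(self, run_id, offset):
        # "?" and -1 mean a first replication, so answer with a full resync.
        if run_id == "?" and offset == -1:
            self.syncing = True
            snapshot = dict(self.data)  # stand-in for the bgsave RDB file
            return ("FULLRESYNC", self.run_id, self.offset), snapshot
        raise NotImplementedError("partial resync is not covered by this sketch")

    def drain_buffer(self):
        pending, self.replication_buffer = self.replication_buffer, []
        self.syncing = False
        return pending


class Slave:
    def __init__(self):
        self.master_run_id = "?"
        self.offset = -1
        self.data = {}

    def sync_from(self, master, writes_during_sync=()):
        (reply, run_id, _), snapshot = master.psync(self.master_run_id, self.offset)
        for key, value in writes_during_sync:
            master.write(key, value)    # the master keeps serving clients
        self.data = dict(snapshot)      # empty the local data, then load the snapshot
        for key, value in master.drain_buffer():
            self.data[key] = value      # replay the replication buffer
        self.master_run_id = run_id
        self.offset = master.offset
        return reply


master = Master()
master.write("a", 1)
slave = Slave()
print(slave.sync_from(master, writes_during_sync=[("b", 2)]))
print(slave.data)
```

The point of the sketch is the ordering: snapshot first, buffer the concurrent writes, then replay them so the slave library ends up with the same data as the master library.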

Master-slave-slave mode

An obvious disadvantage of plain master-slave is that every full synchronization requires the master library to generate a complete RDB file and then transmit it. Both operations are time-consuming. With many slave libraries they consume a lot of the master library's resources: the master library keeps forking child processes to generate RDB files, which slows down normal request processing on the main thread. This is how the master-slave-slave mode came about.

In this mode, at deployment time we configure some slave libraries to synchronize from another slave library, so not every slave library has to synchronize from the master library. The architecture becomes a cascade: master library, then a slave library, then further slave libraries behind it.

The main problem with this setup is what happens when the master library goes down: writes to the whole cluster become unavailable and we can only intervene manually. This led to the next cluster mode: the sentinel cluster.

Sentinel cluster

The sentinel cluster mode exists to solve the lack of failover in master-slave mode, that is, the absence of automatic master-slave switching. All write operations in master-slave mode go through the master, so once the master is down no writes are possible and the situation can only be handled manually. That is unacceptable for high availability. Master-slave mode is therefore not truly highly available: it has no automatic failover and only relieves the master library of some read pressure. Let's take a brief look at how a sentinel cluster handles a master failure.

Failover brings us back to two classic problems of distributed systems:

Fault detection
Split-brain problem

Both were covered in detail in my previous article, so they are not explained again here. Let's start with how the sentinel works.

How the sentinel works

You can think of a sentinel as an agent. It has two main jobs: master election and monitoring (heartbeat detection, fault detection).

Let's first look at how fault detection is implemented

Fault detection

While a sentinel process is running, it periodically sends a PING command to all master and slave libraries (the whole Redis cluster) to check whether they are still online. If a slave library does not respond to the sentinel's PING within the configured time, the sentinel marks it as offline. The same applies to the master library, except that the master library going offline also triggers the process of switching to a new master.
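A minimal sketch of this timeout check is shown below. The PingMonitor class and its method names are hypothetical, and timestamps are passed in explicitly so the logic is easy to follow; in a real Sentinel deployment the timeout corresponds to the down-after-milliseconds setting.

```python
class PingMonitor:
    """Hypothetical heartbeat tracker for one sentinel."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_reply = {}  # instance name -> time of the last PING reply

    def record_reply(self, instance, now):
        self.last_reply[instance] = now

    def offline_instances(self, now):
        # An instance is considered offline when its last reply is too old.
        return sorted(
            name for name, seen in self.last_reply.items()
            if now - seen > self.timeout
        )


monitor = PingMonitor(timeout_seconds=30)
monitor.record_reply("master", now=0)
monitor.record_reply("slave1", now=0)
monitor.record_reply("slave1", now=40)  # slave1 keeps answering
print(monitor.offline_instances(now=45))
```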

Master election

The master election process has three steps:

Fault detection: decide whether the master library is offline (the split-brain problem has to be considered)

There are two levels of judgment on whether the master library is offline

1. Subjective offline: a sentinel uses PING to test the network between itself and the master library and slave libraries and to decide whether each instance is alive. If the PING command times out, the sentinel marks the instance as subjectively offline

2. Objective offline: multiple sentinels judge that the master library is down, and the master node is marked as objectively offline

The criterion for objective offline: with N sentinels, typically N/2+1 of them must judge the master library to be subjectively offline before it is finally judged objectively offline. The main purpose is to reduce misjudgment. Because the sentinels themselves must also be highly available, a sentinel deployment generally starts with three instances. Sentinels communicate with each other through the publish/subscribe function of Redis (pub/sub), and each sentinel learns the IP addresses and ports of the other sentinels through the master library.
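The majority rule can be written as a one-line check. The function name is an assumption made for this sketch, and it hard-codes the N/2+1 majority described above; in real Sentinel the required number of agreeing sentinels is the configurable quorum given in the sentinel monitor directive.

```python
def is_objectively_down(subjective_votes, sentinel_count):
    """Return True when a majority (N/2 + 1) of sentinels see the master as down."""
    required = sentinel_count // 2 + 1
    return subjective_votes >= required


# With three sentinels, two subjective-offline votes are enough.
print(is_objectively_down(2, 3))
print(is_objectively_down(1, 3))
```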

Master election: decide which slave library to promote to master

To choose the new master library, the sentinel scores the slave libraries, and the one with the highest score becomes the new master. The scoring rounds are as follows

Round one: the user sets different priorities for the slave libraries with the slave-priority configuration item. If one slave library has the highest priority, it is selected as the new master library. If the priorities are tied, the second round begins

Round two: the slave library whose data is closest to the old master library's becomes the new master library. The judgment is based on slave_repl_offset, which records the replication progress

Round three: if the second round is also tied, the slave library with the smallest ID becomes the new master library. This ID is the randomly generated runID mentioned earlier
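The three rounds amount to sorting on a three-part key. The sketch below assumes a hypothetical SlaveInfo record; it also assumes that a lower priority number means a higher priority, which is how the Redis priority setting is interpreted.

```python
from dataclasses import dataclass


@dataclass
class SlaveInfo:
    run_id: str        # random ID generated at startup
    priority: int      # assumption: lower number = higher priority
    repl_offset: int   # replication progress (slave_repl_offset)


def pick_new_master(slaves):
    # Round 1: best priority, round 2: largest offset, round 3: smallest run ID.
    return min(slaves, key=lambda s: (s.priority, -s.repl_offset, s.run_id))


candidates = [
    SlaveInfo(run_id="c3", priority=100, repl_offset=800),
    SlaveInfo(run_id="a1", priority=100, repl_offset=950),
    SlaveInfo(run_id="b2", priority=100, repl_offset=950),
]
print(pick_new_master(candidates).run_id)  # a1: tied on priority and offset, smallest ID
```

Using a tuple as the sort key makes each later round act only as a tie-breaker for the rounds before it.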

Notification: announce the new master library to the slave libraries

After a sentinel has successfully failed over a master, it informs the other sentinels of the master's latest configuration, and they update their own configuration for that master accordingly. The sentinel also sends the information about the new master library to the other slave libraries so that they connect to the new master library and replicate its data. Finally, the sentinel broadcasts the information about the new master library to the clients so that they send their requests to the new master library.

Sharded cluster

None of the previous clustering methods increases the storage capacity of Redis; they only improve availability and the ability to handle concurrent requests. If we need to store a very large number of keys that take up a lot of memory, the modes above can never break through the memory limit of a single Redis machine. All we can do is keep upgrading the resources of a single Redis instance (vertical scaling), and hardware is always limited. Sharded clusters were introduced to support horizontal scaling.

How Redis Cluster works

Redis Cluster is based on hash slots. The whole cluster is divided into 16384 hash slots, and both the keys to be stored and the instances are mapped onto these slots. The mapping works as follows.

First, a 16-bit value is computed from the key with the CRC16 algorithm, and this value is taken modulo 16384. The key therefore falls into one of the hash slots 0 to 16383.
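The key-to-slot calculation can be sketched as follows. The CRC16 variant used here is the XMODEM one (polynomial 0x1021, initial value 0), for which the well-known check value of "123456789" is 0x31C3. The function names are assumptions for this example, and the sketch leaves out hash tags (the {...} syntax), so it is an illustration rather than a drop-in replacement for a client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 (polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    # 16-bit CRC of the key, reduced modulo the 16384 hash slots.
    return crc16_xmodem(key.encode("utf-8")) % 16384


print(hex(crc16_xmodem(b"123456789")))  # 0x31c3
print(key_slot("123456789"))            # 12739
```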

How do instances correspond to hash slots? With N instances, each instance is responsible for roughly 16384/N hash slots.
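As a worked example of the 16384/N split, the sketch below divides the slots into contiguous ranges and gives the remainder to the first instances. The slot_ranges function is hypothetical, and the exact boundaries chosen by real tooling may differ slightly.

```python
def slot_ranges(instance_count, total_slots=16384):
    """Split the slots into contiguous (first, last) ranges, one per instance."""
    base, extra = divmod(total_slots, instance_count)
    ranges = []
    start = 0
    for index in range(instance_count):
        size = base + (1 if index < extra else 0)  # spread the remainder
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(slot_ranges(3))  # [(0, 5461), (5462, 10922), (10923, 16383)]
```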

How does the client locate the Redis instance that holds the data?

The client can compute which hash slot a key falls into. It also needs to know which instance is responsible for which hash slots. When the client connects to any instance, it obtains the hash slot information of all instances and caches it locally. For each request it computes the hash slot, looks up the instance that owns that slot, and fetches the data from the right instance.
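A client-side slot cache can be sketched as a sorted list of ranges with a binary search. The SlotCache class and the addresses are made up for this illustration; a real client also has to refresh the cache when the slot layout changes, which the sketch leaves out.

```python
import bisect


class SlotCache:
    """Hypothetical local cache: slot range -> instance address."""

    def __init__(self, ranges):
        # ranges: iterable of (first_slot, last_slot, address)
        self._ranges = sorted(ranges)
        self._starts = [first for first, _, _ in self._ranges]

    def node_for_slot(self, slot):
        # Find the last range whose first slot is <= the requested slot.
        index = bisect.bisect_right(self._starts, slot) - 1
        if index < 0:
            raise KeyError(slot)
        first, last, address = self._ranges[index]
        if not first <= slot <= last:
            raise KeyError(slot)
        return address


cache = SlotCache([
    (0, 5461, "10.0.0.1:6379"),
    (5462, 10922, "10.0.0.2:6379"),
    (10923, 16383, "10.0.0.3:6379"),
])
print(cache.node_for_slot(12739))  # 10.0.0.3:6379
```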

Summary

This article discussed the ways Redis can be clustered. These clustering methods did not all appear at once; they evolved gradually: first plain master-slave, then master-slave-slave, then sentinel, and finally sharding, each solving a different problem. The same progression also shows how an architecture evolves. This article only briefly covers the cluster modes and their principles, and many details are left out. We can discuss them together another time.

Copyright notice
This article was created by [Python's path to becoming a God]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/02/202202201234082298.html