[Knowledge] Redis: From Application to Internals in One Article
2022-06-21 19:58:00 [Linux server development]
1. Basic Types and Their Underlying Implementations

1.1 String
Uses:
Simple key-value storage; `setnx key value` to implement distributed locks; atomic counters; distributed globally unique IDs.
Under the hood: C represents strings as char[] arrays. The Redis source wraps char[] in SDS (simple dynamic string), the smallest unit of storage in Redis; one SDS can hold up to 512 MB.
struct sdshdr {
    unsigned int len;   // length of buf currently in use
    unsigned int free;  // number of unused bytes in buf
    char buf[];         // the bytes themselves
};
Redis wraps SDS once more into a RedisObject, which does two core things:
It records which of the 5 types the object is.
It holds an internal pointer to the SDS.
When you execute `set name sowhat`, Redis actually creates two RedisObject objects, one for the key and one for the value. Both have type = REDIS_STRING, and their SDSs store the strings name and sowhat respectively.
Redis also applies the following optimizations to SDS:
Space preallocation: when an SDS is modified and grown, the system allocates more space than requested (for strings over 1 MB, a fixed amount of extra slack), so future growth does not always need a reallocation.
Lazy space release: when an SDS shrinks, the freed space is not returned to the system; it is recorded so the next modification can reuse it directly without a new allocation.
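The preallocation rule can be sketched as follows. The thresholds used here (double the new length below 1 MB, add a fixed 1 MB of slack at or above it) mirror the commonly documented behavior of sdsMakeRoomFor in the Redis source, but treat this as an illustration, not the real allocator:

```python
# Sketch of the SDS space-preallocation rule (illustrative, not the
# actual Redis allocator): under 1 MB, double the new length; at 1 MB
# or more, add a fixed 1 MB of slack.
SDS_MAX_PREALLOC = 1024 * 1024  # 1 MB

def sds_alloc_size(new_len: int) -> int:
    """Return the buffer size to allocate for a string of new_len bytes."""
    if new_len < SDS_MAX_PREALLOC:
        return new_len * 2
    return new_len + SDS_MAX_PREALLOC

print(sds_alloc_size(10))               # small string: doubled to 20
print(sds_alloc_size(2 * 1024 * 1024))  # large string: plus 1 MB slack
```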
1.2 List

Looking at adlist.h in the source, you'll find the underlying structure is a doubly linked list, with a maximum length of 2^32 - 1. Only a few command combinations are commonly used:
lpush + lpop = stack (last in, first out)
lpush + rpop = queue (first in, first out)
lpush + ltrim = capped collection (bounded list)
lpush + brpop = message queue
A list can serve as a simple message queue, and when the data volume is small the compact ziplist encoding may be used to improve performance. For serious messaging you still want a dedicated broker such as RabbitMQ or ActiveMQ.
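As a quick illustration (not Redis code), the four combinations map neatly onto a Python deque, with index 0 playing the role of the list head (the LPUSH side):

```python
from collections import deque

# Sketch of the command combinations above, modelled with a deque;
# index 0 plays the role of the list head (the LPUSH side).
items = deque()

# lpush + lpop = stack (LIFO)
items.appendleft('a'); items.appendleft('b')
assert items.popleft() == 'b'          # last pushed, first popped

# lpush + rpop = queue (FIFO)
items.clear()
items.appendleft('a'); items.appendleft('b')
assert items.pop() == 'a'              # first pushed, first popped

# lpush + ltrim = capped collection: keep only the 3 newest entries
items.clear()
for i in range(5):
    items.appendleft(i)
while len(items) > 3:                   # roughly: LTRIM key 0 2
    items.pop()
print(list(items))                      # the three most recent pushes
```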
1.3 Hash
Hashes are good for storing related fields together, for example a user's shopping cart. This type is quite common in daily use.
One point worth making explicit: each Redis entry is one K and one V. The K is always a string object, while the V can be any of String, List, Hash, Set, or ZSet.
The hash type is built on the dict (dictionary) structure underneath, as a series of nested layers. From the inside out:
1.3.1 dictEntry
The actual data node, containing the key, the value, and a next pointer.

1.3.2 dictht
1. table: an array of dictEntry pointers; each array slot may point to a linked list.
2. size: the length of the array.
3. sizemask: equal to size - 1, used to mask hashes into array indices.
4. used: how many nodes the dictEntry array currently holds.

1.3.3 dict
1. type: a dictType holding custom function pointers that let arbitrary key and value types be stored.
2. rehashidx: a marker; -1 means no rehash is in progress, any other value records how far the rehash has progressed.
3. ht: an array of two dictht hash tables.
4. iterators: records the iterators currently running over this dictionary.

The overall structure is as follows:

1.3.4 Incremental rehashing
Why does dictht ht[2] hold two tables? So that resizing does not block front-end CRUD: data is moved from ht[0] to ht[1] a little at a time, with rehashidx recording the progress, and once the move completes, ht[1] takes over as ht[0].
rehashidx = -1 means no rehash is in progress; rehashidx != -1 is the index of the bucket currently being migrated.
The new array's size is the smallest power of 2 greater than used * 2, similar to HashMap. Redis then walks the old array bucket by bucket, advancing rehashidx; for each dictEntry[i] it traverses the linked list, rehashes every node, and maps it into dictht[1]. dictht[0].used and dictht[1].used change dynamically as this happens.

The key to the whole process is rehashidx, the index in the first array that is currently being migrated. If memory is tight or the operating system is busy, the rehash can pause at any point.
If you operate on the dictionary while a rehash is paused, what happens?
1. Inserts go directly to the second array: adding to the first array would only mean moving the entry again later, so there is no point.
2. Deletes, updates, and lookups try the first array first, and fall back to the second array if the key is not found.
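The migration and lookup rules above can be sketched with a toy dictionary (hypothetical names, greatly simplified; the real dict migrates buckets piggybacked on normal operations rather than via an explicit loop):

```python
# Toy sketch of incremental rehash: two tables, a rehashidx cursor,
# new keys go to ht[1] while old buckets migrate one step at a time.
class Dict:
    def __init__(self, size=4):
        self.ht = [[[] for _ in range(size)], None]
        self.rehashidx = -1                 # -1: no rehash in progress

    def _bucket(self, table, key):
        return hash(key) % len(table)

    def start_rehash(self):
        self.ht[1] = [[] for _ in range(len(self.ht[0]) * 2)]
        self.rehashidx = 0

    def rehash_step(self):
        """Migrate one bucket from ht[0] to ht[1]."""
        if self.rehashidx == -1:
            return
        for key, val in self.ht[0][self.rehashidx]:
            self.ht[1][self._bucket(self.ht[1], key)].append((key, val))
        self.ht[0][self.rehashidx] = []
        self.rehashidx += 1
        if self.rehashidx == len(self.ht[0]):   # done: swap tables
            self.ht[0], self.ht[1] = self.ht[1], None
            self.rehashidx = -1

    def put(self, key, val):
        # During a rehash, inserts go straight to the second table.
        table = self.ht[1] if self.rehashidx != -1 else self.ht[0]
        table[self._bucket(table, key)].append((key, val))

    def get(self, key):
        # Look in the first table, then fall back to the second.
        tables = [self.ht[0]] + ([self.ht[1]] if self.rehashidx != -1 else [])
        for t in tables:
            for k, v in t[self._bucket(t, key)]:
                if k == key:
                    return v
        return None

d = Dict()
d.put('a', 1); d.put('b', 2)
d.start_rehash()
d.put('c', 3)                 # lands in ht[1] during the rehash
for _ in range(4):
    d.rehash_step()           # migrate all 4 old buckets
print(d.get('a'), d.get('b'), d.get('c'))
```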

1.4 Set
If you know that Java's HashSet is a thin wrapper over HashMap, Set is easy to understand: same trick. Think of it as a Dict with no values. A look at the source in t_set.c makes this plain.
int setTypeAdd(robj *subject, robj *value) {
    long long llval;
    if (subject->encoding == REDIS_ENCODING_HT) {
        // The underlying call is still dictAdd, just with NULL
        // as the third (value) argument.
        if (dictAdd(subject->ptr,value,NULL) == DICT_OK) {
            incrRefCount(value);
            return 1;
        }
    ....
1.5 ZSet
Range queries are the natural enemy of an ordered set. Look at redis.h and you'll find that ZSet achieves ordering with a skip list, which is comparable to a balanced binary tree. A skip list is a stack of linked lists divided into levels; each upper level can be seen as an index over the data, and those indexes exist to speed up lookups.
The data within each level is sorted, each level's data is a subset of the level below it, and the bottom level (level 1) contains all of the data. The higher the level, the bigger the jumps and the fewer the elements it contains. Whether a newly inserted element also becomes an index at higher levels is completely random, like rolling dice.
A skip list has a head node; a search proceeds from the top level down, scanning left to right within each level. Using a lookup for the value 37 as an example, compare the skip list with a plain linked list.
Without a skip list: to find 37 with no index levels, the route is as follows:

With a skip list: the route to find 37 is as follows:

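The top-down, left-to-right search can be sketched over the example levels (illustrative only; a real skip list stores per-node forward pointers rather than whole level lists, and the values here are made up):

```python
# Toy skip-list search: each level is a sorted sublist of the one below,
# level 0 holds all values; the search drops down a level whenever the
# next value to the right would overshoot the target.
levels = [
    [1, 27, 100],                          # top level (sparsest)
    [1, 21, 27, 37, 100],                  # middle level
    [1, 11, 21, 27, 31, 37, 41, 57, 100],  # level 0 (all data)
]

def search(target):
    visited = []                 # values stepped onto along the way
    value = levels[0][0]         # start at the head of the top level
    for level in levels:
        i = level.index(value)
        # move right while the next value does not overshoot
        while i + 1 < len(level) and level[i + 1] <= target:
            i += 1
            visited.append(level[i])
        value = level[i]
        if value == target:
            return value, visited
    return None, visited

found, path = search(37)
print(found, path)               # reaches 37 in two hops: 27 then 37
```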
Application scenarios:
Score leaderboards, time-sorted news feeds, delay queues.
1.6 Redis Geo
The core idea is to treat the earth as a sphere; GEO uses GeoHash to convert a two-dimensional longitude/latitude pair into a string, which makes it possible to partition locations and query within a given distance.
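The bit-interleaving idea behind GeoHash can be sketched like this (a simplified illustration that stops at the raw bits and skips the base32 encoding step real GeoHash adds; the coordinates are arbitrary examples):

```python
# GeoHash sketch: binary-subdivide longitude and latitude, interleaving
# the bits (longitude first), so that nearby points share a prefix.
def geohash_bits(lon, lat, nbits=20):
    bits = []
    lon_rng, lat_rng = [-180.0, 180.0], [-90.0, 90.0]
    for i in range(nbits):
        rng, val = (lon_rng, lon) if i % 2 == 0 else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append('1'); rng[0] = mid   # keep the upper half
        else:
            bits.append('0'); rng[1] = mid   # keep the lower half
    return ''.join(bits)

a = geohash_bits(116.390, 39.923)   # two nearby points in Beijing
b = geohash_bits(116.400, 39.920)
c = geohash_bits(2.350, 48.857)     # a point in Paris
print(a == b, a == c)               # neighbors collide, distant points don't
```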

1.7 HyperLogLog
HyperLogLog is a probabilistic data structure: it uses a probabilistic algorithm to compute the approximate cardinality of a set. The algorithm derives from a Bernoulli process plus bucketing plus the harmonic mean; see a dedicated HyperLogLog write-up for the details.
Purpose: it is very useful for cardinality statistics within an acceptable error margin (the cardinality is the number of distinct values in a set). Each HyperLogLog can count close to 2^64 distinct elements while occupying only 12 KB, with an error rate of about 0.81%. That makes it a good fit for UV counting.
Internals: HyperLogLog divides the input space into 2^14 = 16384 buckets. Each bucket (register) is 6 bits wide. A naive implementation might round each register up to a whole byte, wasting 2 bits per bucket, but Redis packs the 6-bit registers back to back to squeeze the most out of memory, ending up at 16384 * 6 / 8 / 1024 = 12 KB.
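The arithmetic above, together with the standard HyperLogLog error formula 1.04 / sqrt(m), reproduces both quoted numbers:

```python
import math

# Memory and error math for Redis's HyperLogLog parameters.
m = 2 ** 14                      # 16384 buckets (registers)
mem_kb = m * 6 / 8 / 1024        # 6 bits per register, packed
err = 1.04 / math.sqrt(m)        # standard error of the estimator
print(mem_kb, f"{err:.4%}")      # 12.0 KB and roughly 0.81%
```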

1.8 Bitmap
A bitmap maps each element's state to a single bit. Since one bit can only represent the two states 0 and 1, the states a bitmap can express are limited, but the payoff is an enormous saving in memory.
In Redis, bitmaps are built on the string type. You can think of a bitmap as a bit array whose cells hold only 0 or 1; the array index is called the offset, and the offset is limited to 2^32 - 1.

User sign-in
key = year:userid, offset = the day of the year.
Counting active users
Use the date as the key and the user id as the offset; set the bit to 1 for active and 0 for inactive.
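The sign-in scheme can be sketched with a Python integer standing in for the Redis bitmap (setbit/getbit/bitcount mimic SETBIT/GETBIT/BITCOUNT; the key name is illustrative):

```python
# One bit per day of the year; a Python int plays the Redis bitmap.
class Bitmap:
    def __init__(self):
        self.bits = 0

    def setbit(self, offset, value=1):       # SETBIT key offset value
        if value:
            self.bits |= 1 << offset
        else:
            self.bits &= ~(1 << offset)

    def getbit(self, offset):                # GETBIT key offset
        return (self.bits >> offset) & 1

    def bitcount(self):                      # BITCOUNT key
        return bin(self.bits).count('1')

signin = Bitmap()                  # key would be e.g. "2022:user42"
for day in (0, 1, 2, 5, 6):        # days of the year the user signed in
    signin.setbit(day)
print(signin.getbit(5), signin.getbit(3), signin.bitcount())
```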
PS: Redis's wire protocol is RESP (REdis Serialization Protocol), an application-layer protocol on top of TCP.
1.9 Bloom Filter
The guarantee a Bloom filter gives you: if it says an element is absent, it is definitely absent; if it says present, it is only probably present.
How a Bloom filter works:
When an element is added to the set, K hash functions map it to K positions in a bit array (multiple hashes effectively reduce the collision probability), and those bits are set to 1. On lookup, we just check whether all of those positions are 1: if any of them is 0, the element is definitely not in the set; if all are 1, the element is probably in the set. That is the basic idea of a Bloom filter.
If you want to experiment, Google's guava library ships a Bloom filter.
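A minimal sketch of the idea above (nbits and k are arbitrary illustrative choices, and deriving the k hashes by salting MD5 is a shortcut a production filter would not take):

```python
import hashlib

# Minimal Bloom filter: k hash functions set k bits on add; a lookup
# requires all k bits to be 1.
class BloomFilter:
    def __init__(self, nbits=1024, k=3):
        self.nbits, self.k = nbits, k
        self.bits = 0

    def _positions(self, item):
        # Derive k positions by salting one hash function k ways.
        for i in range(self.k):
            h = hashlib.md5(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.nbits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("user:1"); bf.add("user:2")
print(bf.might_contain("user:1"))   # True: present items always match
print(bf.might_contain("user:99"))  # almost certainly False here
```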

1.10 Publish/Subscribe
Redis provides a publish/subscribe messaging mechanism in which publishers and subscribers never talk to each other directly: a publisher pushes a message to a named channel, and every client subscribed to that channel receives it. Compared with a dedicated MQ (RabbitMQ, RocketMQ, ActiveMQ, Kafka), though, this feature barely counts.

2. Persistence
Redis keeps its data in memory, which vanishes on power loss, so persisting to disk is a must. Redis offers two modes: RDB and AOF.
2.1 RDB
The RDB persistence mechanism periodically snapshots the data in Redis. It is better suited to cold backups. Advantages:
1. A compressed binary file, good for full backups and replication. For disaster recovery, loading an RDB file restores data much faster than AOF, which suits large-scale recovery.
2. If the business does not demand strict data integrity and consistency, RDB is a good choice; recovery is faster than with AOF.
Disadvantages:
1. RDB snapshots are taken at intervals, so integrity and consistency are weaker: Redis may crash well after the last backup, losing everything written since then.
2. Backups consume memory: Redis forks a child process that writes the data to a temporary file (during which the dataset in memory can, in the worst case, approach twice its size), and finally replaces the previous backup with the temporary file. So you must budget for roughly 2x data expansion.
Notes on manual triggering and copy-on-write (COW):
1. SAVE calls rdbSave directly, blocking the Redis main process; the server cannot serve requests until it finishes.
2. BGSAVE forks a child process, which calls rdbSave and signals the parent when the save completes. Client requests are still handled while BGSAVE runs.
3. Copy-on-write means the backup reflects the data as it was when the snapshot started; only memory pages modified afterwards get copied, not all of memory.
4. Under copy-on-write, heavy writes from parent or child cause many page faults and page copies.
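For reference, the stock redis.conf snapshot triggers look like this; `save <seconds> <changes>` runs a BGSAVE once at least <changes> keys have changed within <seconds>:

```
# redis.conf: RDB snapshot settings (the shipped defaults, shown here
# as an illustration)
save 900 1        # after 900 s if at least 1 key changed
save 300 10       # after 300 s if at least 10 keys changed
save 60 10000     # after 60 s if at least 10000 keys changed
dbfilename dump.rdb
dir ./
```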

2.2 AOF
The AOF mechanism logs every write command, appending each one to a log file. Because it is append-only, there is no disk-seek overhead, which makes it fast, a bit like MySQL's binlog. AOF is better suited to hot backups.
Advantages:
With the everysec policy, a background thread fsyncs once per second, so at most about one second of data can be lost.
Disadvantages:
1. For the same dataset, the AOF file is usually larger than the RDB file, and RDB restores large datasets faster than AOF.
2. Depending on the sync policy, AOF often runs slower than RDB. Overall, the once-per-second policy offers comparatively good efficiency.
AOF works in two steps. The first is writing commands as they happen, which can lose up to 1 second of data: commands are appended to aof_buf and then synced to the AOF file on disk, because fsyncing every write in real time would cause very heavy disk IO and hurt overall performance.
The second step is rewriting the AOF file to shrink it. A rewrite can be triggered automatically or manually (BGREWRITEAOF) and is performed by a forked child process; the Redis service stays available throughout.

1. During a rewrite the main process keeps answering commands, and to guarantee the backup's completeness it still appends to the old AOF file; if the rewrite fails, no data is lost.
2. So that writes arriving during the rewrite also make it into the new file, a separate rewrite buffer is kept for the child process, preventing the new file from missing data.
3. A rewrite generates commands directly from the data currently in memory; it does not need to read and parse the old AOF file or merge commands.
4. Both RDB and AOF first write to a temporary file and then swap it into place with rename.
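The corresponding redis.conf settings (the everysec policy described above, plus the stock auto-rewrite thresholds) look like this:

```
# redis.conf: AOF settings
appendonly yes                    # enable AOF persistence
appendfsync everysec              # fsync once per second (the compromise mode)
auto-aof-rewrite-percentage 100   # rewrite when the file has doubled...
auto-aof-rewrite-min-size 64mb    # ...and is at least 64 MB
```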
Advice about fork:
1. Reduce the fork frequency, e.g. trigger RDB snapshots and AOF rewrites manually at quiet times.
2. Cap the memory Redis may use, to keep forks from taking too long.
3. Use capable hardware and configure the Linux memory-allocation policy sensibly, to avoid fork failures caused by insufficient physical memory.
4. While BGSAVE or BGREWRITEAOF is running, Redis only expands a hash table when its load factor is >= 5, versus >= 1 otherwise. The goal is to minimize writes during the fork and avoid unnecessary copy-on-write memory traffic.
5. Hash table load factor = number of nodes stored in the table / table size; it determines whether the table is expanded.
2.3 Recovery
On startup, Redis first checks whether an AOF file exists (its data is more complete) and loads it; only if there is none does it try to load the RDB file.

2.4 Recommendation
RDB alone risks losing a lot of data; AOF alone recovers more slowly than RDB. So the sound approach is: when something goes wrong, restore from RDB first, then use AOF to fill in the remaining data.
3. Why Redis Is So Fast
3.1 Memory-based:
The data lives in memory, roughly a hundred times faster than disk IO, so operations are very fast.
3.2 Efficient data structures:
Several purpose-built underlying structures back the different data types, such as HyperLogLog, which refuses to waste even a fraction of a byte.
3.3 Rich, well-chosen encodings:
Redis provides a rich set of encodings underneath; each of the five data types switches encoding based on value length and element count.
1. String: integer values are stored with the int encoding; non-integers use the raw (or embstr) encoding.
2. List: ziplist encoding when string lengths and the element count are below thresholds, converting to linkedlist encoding otherwise.
3. Hash: ziplist encoding when the key and value strings of every entry are short and the number of entries is small, hashtable encoding otherwise.
4. Set: intset encoding when every element is an integer and the element count is below a threshold; if either condition fails, hashtable encoding.
5. Zset: ziplist encoding when the element count and member lengths are below fixed thresholds; if either condition fails, skiplist encoding.
3.4 The right thread model:
I/O multiplexing lets a single thread monitor many client connections at once. Multithreading would require context switches, which are deadly for an in-memory database.

3.5 Multithreaded speedups introduced in Redis 6.0:
Bear in mind that network read/write system calls cost far more time than Redis's own command execution; Redis's bottleneck is mainly network IO. There are two broad directions for optimization:
1. Improve raw network IO performance, e.g. replacing the kernel network stack with DPDK.
2. Use multiple threads to exploit multiple cores, as Memcached does.
Protocol-stack optimization is largely out of Redis's hands, so supporting multithreading is the most effective and convenient option. Redis added multithreading for two main reasons:
1. To make full use of the server's CPU resources; previously the main thread could only use one core.
2. To spread the synchronous network IO read/write load across threads.
About the multithreading:
In Redis 6.0, IO multithreading is off by default (io-threads-do-reads no).
When it is enabled, the number of threads must also be set carefully.
Multithreading can roughly double performance, but the threads only handle network reads/writes and protocol parsing; commands are still executed sequentially by a single thread.
4. Common Problems
4.1 Cache avalanche
Definition:
A large number of keys in Redis expire at the same moment, so all requests land on MySQL, which cannot take the load and collapses across the board.
Solutions:
1. Add a random jitter to each key's expiration time so large batches don't expire together.
2. If the cache is deployed as a distributed cluster, spread the hot data evenly across the cache nodes.
3. Mark hot data as never expiring.
4.2 Cache penetration
Definition:
Cache penetration is querying data that exists in neither the cache nor the database. For example, if valid IDs are > 0 and an attacker keeps requesting ID = -12, every request goes through to the database; under enough pressure the database crashes.
Solutions:
1. Add user-authentication checks and parameter validation at the API layer.
2. Ban any single IP whose request rate exceeds a threshold (lock it in a dark room for a day); note that legitimate users behind shared proxy IPs can get caught by this.
3. When a key is found in neither the cache nor the database, cache it as key-null with a short TTL, say 15 seconds, to blunt repeated malicious hits.
4. Putting a Bloom filter in front of Redis also works.
4.3 Cache breakdown
Definition:
A single hot key receives massive concurrent traffic; at the instant that key expires, the continuing flood of concurrent requests punches through the cache and hits the database directly.
Solutions:
Mark the hot key as never expiring, or use a mutex so that only one request rebuilds the cache entry.
4.4 Double-write consistency
Double write: both the cache and the database are updated; how do we keep the data consistent?
1. Update the database first, then update the cache
Safety problem: thread A updates the database -> thread B updates the database -> thread B updates the cache -> thread A updates the cache, leaving stale data in the cache (dirty reads). Business problem: in write-heavy, read-light workloads the cache is constantly overwritten but rarely read; if building the cache value involves extra computation, that work is wasted on top of it.
2. Delete the cache first, then update the database
Request A deletes the cache intending to write an update. Request B then misses the cache, reads the old value from the database, and writes it back into the cache. A then writes the new value to the database, and the cache now disagrees with the database.
Facebook therefore proposed the Cache-Aside Pattern:
Miss: the application first reads the cache; on a miss it reads the database and, on success, puts the value into the cache. Hit: the application reads the value straight from the cache. Update: write the data to the database first and, on success, invalidate the cache entry.
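The pattern can be sketched with two dicts standing in for Redis and the database (illustrative only):

```python
# Cache-Aside sketch: read-miss loads from the DB and backfills the
# cache; update writes the DB first and then invalidates (never
# refills) the cache entry.
cache, db = {}, {"user:1": "sowhat"}

def read(key):
    if key in cache:                 # hit
        return cache[key]
    value = db.get(key)              # miss: load from the database
    if value is not None:
        cache[key] = value           # backfill the cache
    return value

def update(key, value):
    db[key] = value                  # 1. write the database first
    cache.pop(key, None)             # 2. then invalidate the cache

print(read("user:1"))    # miss: loads "sowhat" and fills the cache
update("user:1", "redis")
print(read("user:1"))    # miss again: loads the new value
```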
4.5 Split brain
Split brain happens when a network fault leaves the master, the slaves, and the sentinels in different network partitions. The sentinel cluster can no longer see the master, so it promotes a slave to master, and now there are two masters, like one brain split in two. The same thing happens in Hadoop and Spark clusters, just with different remedies (ZooKeeper plus forcibly killing the stale leader).
During a split, clients that still talk to the original master keep writing data the new master never sees. Once the network heals, the sentinels demote the original master to a slave, and when it resyncs from the new master all of those writes are lost.
Redis's mitigation is two parameters in the configuration file:
min-replicas-to-write 3    # minimum number of slaves that must be connected to the master
min-replicas-max-lag 10    # maximum replication lag (seconds) allowed for those slaves
If the number of slaves connected to the master drops below the first parameter, or their ping lag exceeds the second, the master refuses write requests. With these set, an original master caught in a split brain stops accepting client writes, which limits the data lost when it later resyncs.

4.6 Transactions
MySQL transactions involve a great deal of machinery; a Redis transaction has only three steps:

Conclusions about Redis transactions:
1. A Redis transaction is a one-shot, ordered, exclusive execution of a queue of commands.
2. Redis transactions have no concept of isolation levels: the batched commands are queued and not actually executed until EXEC is sent, so a query inside the transaction cannot see the transaction's own pending updates, and queries outside the transaction cannot see them either.
3. Redis does not guarantee atomicity: a single Redis command is atomic, but a transaction is not.
4. If a command fails to compile (a malformed or unknown command), none of the queued commands run. A runtime error (a valid command applied to the wrong type) aborts only that command; the others still execute normally.
5. WATCH behaves like an optimistic lock: at commit time, if the value of any watched KEY has been changed by another client, EXEC discards the whole transaction queue without executing it.
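What WATCH buys you can be simulated with a per-key version number (a simplification; real Redis tracks modifications to watched keys server-side rather than with explicit versions):

```python
# WATCH-style optimistic locking: EXEC applies the queued write only
# if the watched key was not modified in the meantime.
store = {"balance": 100}
version = {"balance": 0}

# Client A watches the key and queues a command (not yet executed).
watched = version["balance"]            # WATCH balance
queued_value = store["balance"] - 30    # DECRBY balance 30, queued

# Client B modifies the key before A calls EXEC.
store["balance"] -= 50
version["balance"] += 1

# Client A calls EXEC: the watched version changed, so the whole
# queue is discarded and nothing is written.
committed = version["balance"] == watched
if committed:
    store["balance"] = queued_value
print(committed, store["balance"])      # the transaction aborted
```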
4.7 A sane deployment checklist
Before launch: make Redis highly available (master/slave + sentinel, or Redis Cluster) to avoid a total collapse. At launch: local ehcache caching plus Hystrix rate limiting and degradation, so MySQL is not overwhelmed. After launch: persist with RDB + AOF so that data reloads from disk automatically after an outage and the cache recovers quickly.
5. Distributed Locks
In everyday development we use synchronized and Lock for concurrent programming, but Java's locks only work within a single JVM process. What about locking across a distributed cluster? Two options are common.
5.1 Distributed locks with ZooKeeper
A little ZooKeeper background is needed:
1. Persistent node: the node is not deleted when the client disconnects from zk.
2. Ephemeral node: the node is deleted when the client disconnects from zk.
3. Sequential node: the node name automatically gets an ordered suffix such as 0000001.
4. Watch notifications: when a client registers a watch on a node, its callback fires when the node changes.
The general flow is below. Note that each client watches only the node immediately ahead of it, which avoids the herd effect. Template code is easy to find online.

Drawback:
Nodes are created and deleted frequently, with watch events registered on top, which puts considerable pressure on the ZooKeeper cluster; performance is not as good as a Redis-based lock.
5.2 Distributed locks with Redis
The principle is fairly simple: Redis itself processes commands on a single thread, which gives it natural mutual exclusion. Commands like SETNX and EXISTS are enough to build a simple distributed lock, with a timeout to handle lock release.
SETNX
SETNX is short for SET if Not eXists. The usual form is SETNX key value: if the key does not exist, the set succeeds and returns 1; if the key already exists, it returns 0.
SETEX
SETEX key seconds value associates value with key and sets the key's time-to-live to the given number of seconds. If the key already exists, SETEX overwrites the old value. SETEX is an atomic operation.
Locking:
Typically a unique string such as a UUID is used as the value together with SETNX to acquire the lock.
Unlocking:
A Lua script is used here, because Lua execution in Redis is atomic. The idea: compare the key's value with the caller's token; if they match, delete the key and return 1 for success, otherwise return 0 for failure.
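The lock/unlock logic can be sketched against a plain dict standing in for Redis (illustrative only; a real implementation must rely on the server's atomicity, which this in-process version cannot provide):

```python
import time
import uuid

# key -> (owner token, expiry time); the dict plays the Redis server.
store = {}

def acquire(key, ttl=10.0):
    """SET key token NX EX ttl: returns the token, or None if held."""
    now = time.monotonic()
    entry = store.get(key)
    if entry and entry[1] > now:          # lock held and not expired
        return None
    token = str(uuid.uuid4())             # unique owner id
    store[key] = (token, now + ttl)
    return token

def release(key, token):
    """Compare-then-delete, which the Lua script does atomically."""
    entry = store.get(key)
    if entry and entry[0] == token:       # only the owner may delete
        del store[key]
        return 1
    return 0                              # someone else's lock, or gone

t1 = acquire("lock:order")
assert acquire("lock:order") is None      # a second client is rejected
assert release("lock:order", "wrong") == 0
assert release("lock:order", t1) == 1
```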
Drawbacks:
This lock is not reentrant, and a solid implementation has to handle all sorts of edge cases. Use this to understand the general approach, but in real projects reach for an established open-source toolkit.
5.3 Distributed locks with Redisson
Redisson is a service layered on Redis, built on the NIO-based Netty framework. It is not just a low-level Redis driver client: it also wraps the native Redis Hash, List, Set, String, Geo, and HyperLogLog structures as the Java structures we know best, such as Map, List, Set, object buckets (Object Bucket), geospatial buckets (Geospatial Bucket), and cardinality estimators (HyperLogLog).
Here we only care about its distributed-lock commands. Its rough underlying principle:

The rough flow of Redisson's lock and unlock is shown below:

6. Redis Expiration Policies and Memory Eviction Policies
6.1 Redis expiration policies
There are generally three expiration policies in Redis:
1. Timed expiration:
Every key with an expiration time gets its own timer, and the key is purged the instant its time arrives. This strategy clears expired data immediately and is very memory-friendly, but it burns a lot of CPU handling expirations, which hurts the cache's response time and throughput.
2. Lazy expiration:
A key's expiration is checked only when the key is accessed, and it is removed if expired. This saves the most CPU but is unfriendly to memory: in the extreme case, a large number of expired keys are never accessed again and so are never removed, occupying memory indefinitely.
3. Periodic expiration:
At intervals, Redis scans a batch of keys in each database's expires dictionary and removes the expired ones. This is a compromise between the first two: by tuning the scan interval and the time budget of each scan, the CPU/memory trade-off can be balanced for different situations. The expires dictionary stores the expiration time of every key that has one; its keys are pointers into the key space, and its values are millisecond-precision UNIX timestamps. (The key space means all the keys stored in the Redis instance.)
Redis uses lazy deletion + periodic deletion; memcached uses lazy deletion only.
6.2 The 6 memory eviction policies
The memory eviction policy governs what Redis does when cache memory is exhausted and a new write needs additional space.
1. volatile-lru: evict the least recently used key from the set of keys with an expiration time (server.db[i].expires).
2. volatile-ttl: evict the key closest to expiring from the set of keys with an expiration time (server.db[i].expires).
3. volatile-random: evict a random key from the set of keys with an expiration time (server.db[i].expires).
4. allkeys-lru: evict the least recently used key from the whole key space (server.db[i].dict).
5. allkeys-random: evict a random key from the whole key space (server.db[i].dict).
6. noeviction: evict nothing; writes that need more memory are rejected instead.
Interviewers love to ask about LRU. The familiar LinkedHashMap also implements the LRU algorithm, like this:
class SelfLRUCache<K, V> extends LinkedHashMap<K, V> {
    private final int CACHE_SIZE;

    /**
     * @param cacheSize how many entries the cache may hold
     */
    public SelfLRUCache(int cacheSize) {
        // true switches LinkedHashMap to access order: iteration runs
        // from least recently used to most recently used, so the
        // eldest (least recently used) entry sits at the head.
        super((int) Math.ceil(cacheSize / 0.75) + 1, 0.75f, true);
        CACHE_SIZE = cacheSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // When the map grows past the configured size,
        // the eldest entry is removed automatically.
        return size() > CACHE_SIZE;
    }
}
6.3 Summary
Redis's choice of eviction policy does not affect the handling of expired keys. The eviction policy deals with writes that need extra space when memory runs low, while the expiration policy deals with cached data whose time-to-live has elapsed.
7. Redis Cluster High Availability
A single machine suffers from machine failure, capacity bottlenecks, and QPS bottlenecks. In practice, multi-machine Redis deployments involve master/slave replication, Sentinel mode, and Redis Cluster.
| Mode | Advantages | Disadvantages |
|---|---|---|
| Standalone | Simple architecture, easy to deploy | Machine failure, capacity bottleneck, QPS bottleneck |
| Master/slave replication | High reliability, read/write separation | Failure recovery is complex; writes and storage are limited to a single master machine |
| Sentinel | Simple cluster deployment, HA | Complex mechanics; slave resources partly sit idle; does not itself solve read/write separation |
| Redis Cluster | Data dynamically sharded into slots, scalable, highly available | Clients must track back-end changes dynamically; limited support for batch operations |
7.1 Redis master/slave replication
This mode provides high availability and read/write separation, using two mechanisms: incremental synchronization and full synchronization.
7.1.1 Full synchronization

Redis full replication generally occurs during slave initialization, when the slave needs to copy all of the data on the master:
1. The slave connects to the master and sends the psync command.
2. On receiving psync, the master runs the bgsave command to generate an RDB snapshot, and buffers all write commands executed from that point on.
3. The master sends the snapshot file to the slave, and keeps buffering write commands executed during the transfer.
4. On receiving the snapshot, the slave discards all of its old data and loads the snapshot.
5. After the snapshot is sent, the master sends the buffered write commands to the slave.
6. The slave finishes loading the snapshot, starts accepting requests, and executes the write commands received from the master's buffer.
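The six steps above can be sketched in plain Java. This is a hypothetical in-memory simulation, not real Redis code: the master takes a point-in-time snapshot, buffers writes that arrive during the transfer, and the slave loads the snapshot and then replays the buffer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of full synchronization: snapshot + buffered writes.
public class FullSyncDemo {
    static Map<String, String> master = new HashMap<>();
    static Map<String, String> slave = new HashMap<>();

    public static void fullSync() {
        // Step 2: master "bgsave" - take a point-in-time snapshot.
        Map<String, String> snapshot = new HashMap<>(master);
        List<String[]> buffer = new ArrayList<>();

        // Writes arriving during the transfer are buffered (steps 2-3).
        master.put("k2", "v2");
        buffer.add(new String[]{"k2", "v2"});

        // Step 4: slave discards old data and loads the snapshot.
        slave.clear();
        slave.putAll(snapshot);

        // Steps 5-6: master sends the buffered writes; slave replays them.
        for (String[] cmd : buffer) slave.put(cmd[0], cmd[1]);
    }
}
```

After `fullSync()` returns, the slave holds both the snapshot data and the writes that happened during the transfer, so it mirrors the master.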
7.1.2、Incremental synchronization
Also called command propagation: the slave replays write commands received from the master. Redis stores these commands in a ring buffer (the replication backlog). Because memory is limited, the backlog cannot hold every command forever; if a slave stays offline too long, the commands it missed may be overwritten, and incremental sync is no longer possible.
Incremental replication is the process of synchronizing to the slave the write operations that occur on the master after the slave has initialized and is working normally: each time the master executes a write command, it sends the same write command to the slave.
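The replication backlog described above can be pictured as a fixed-size ring buffer. This is a hypothetical sketch, not the actual Redis implementation: commands are appended at a monotonically increasing offset, and a slave that has fallen further behind than the buffer's capacity can no longer catch up incrementally and must do a full resync.

```java
// Hypothetical sketch of a replication backlog as a ring buffer.
public class ReplBacklog {
    private final String[] ring;
    private long nextOffset = 0; // offset of the next command to append

    public ReplBacklog(int capacity) { this.ring = new String[capacity]; }

    public void append(String command) {
        ring[(int) (nextOffset % ring.length)] = command; // old entries get overwritten
        nextOffset++;
    }

    // A slave at 'slaveOffset' can sync incrementally only if the commands
    // it missed still live in the ring (i.e., were not overwritten yet).
    public boolean canPartialSync(long slaveOffset) {
        return nextOffset - slaveOffset <= ring.length;
    }
}
```

Once `append` wraps around, the oldest command is gone, which is exactly why a long-disconnected slave falls back to full synchronization.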

7.1.3、Redis master-slave synchronization strategy:
1. When master and slave first connect, a full synchronization is performed; once it finishes, incremental synchronization takes over. A slave can initiate a full synchronization at any time if needed. Redis's strategy is to attempt incremental synchronization first in all cases, and to request a full synchronization from the master if that fails.
2. If the slave loses its connection while synchronizing the master's data, it does not need to worry: it simply reconnects and resumes synchronization.
3. Read/write separation is generally achieved through master-slave replication, but if the master goes down, how is Redis HA guaranteed? Sentinel is introduced to elect a new master.
7.2、High-availability Sentinel mode

Redis Sentinel runs as an independent process. A sentinel cluster should generally have at least three nodes, in an odd number, and can monitor multiple master-slave clusters; when the sentinels detect that a master is down, they can switch over automatically. Sentinel can monitor any number of master servers and the slaves under them, and performs failover when a monitored master goes offline. Note that a single sentinel would itself be a single point of failure, which is why a cluster is used. Sentinel's main duties:
Cluster monitoring: periodically checks the master and slave nodes.
Notification: when a Redis instance is found to have failed, sentinel notifies the administrator.
Failover: a single sentinel deciding the master is down is a subjective offline; when enough sentinels agree (at least quorum), it becomes an objective offline and a switchover is triggered.
Configuration provider: after a failover, sentinel writes the new master's address to the configuration and informs the clients.
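The subjective/objective offline distinction above boils down to vote counting: one sentinel's opinion is subjective, and only when at least `quorum` sentinels agree is the master objectively down and failover allowed. A minimal, hypothetical sketch:

```java
// Hypothetical sketch: subjective vs. objective offline decision.
public class SentinelQuorum {
    // Each boolean is one sentinel's subjective view: true = "master looks down".
    public static boolean objectivelyDown(boolean[] subjectiveViews, int quorum) {
        int votes = 0;
        for (boolean down : subjectiveViews) if (down) votes++;
        return votes >= quorum; // failover proceeds only on objective offline
    }
}
```

With three sentinels and quorum 2, one false alarm never triggers a failover; two independent agreeing observations do.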
7.3、Redis Cluster
Redis Cluster is Redis's distributed solution, released in version 3.0, and effectively addresses the need for a distributed Redis.

7.3.1、Partitioning rules

Common partitioning rules:
Modulo partitioning: hash(key) % N
Consistent hashing: a consistent hash ring
Virtual slot hashing: CRC16(key) & 16383
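The virtual slot rule `CRC16(key) & 16383` can be reproduced directly. Redis Cluster uses the CRC-16/XMODEM variant (polynomial 0x1021, initial value 0); the sketch below implements that variant from scratch rather than calling Redis, so treat it as illustrative:

```java
import java.nio.charset.StandardCharsets;

// Illustrative CRC-16/XMODEM implementation, the variant Redis Cluster
// uses to map a key to one of 16384 virtual slots.
public class RedisSlot {
    public static int crc16(byte[] data) {
        int crc = 0x0000;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    public static int slot(String key) {
        // 16383 == 0x3FFF, so "& 16383" is the same as "% 16384".
        return crc16(key.getBytes(StandardCharsets.UTF_8)) & 16383;
    }
}
```

(Real Redis additionally honors hash tags: if a key contains `{...}`, only the tagged part is hashed so related keys land in the same slot; this sketch omits that.)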
Redis Cluster uses virtual slot partitioning. Implementation details:
1. Decentralized design: virtual slots cover all nodes, every node serves the same data-lookup flow, and nodes communicate using the lightweight Gossip protocol to reduce bandwidth consumption, so performance is high.
2. Automatic load balancing and high availability, automatic failover, and support for dynamic scaling; officially tested up to 1000 nodes. Implementation complexity is low.
3. Each master should also be configured with slaves; internally this works like sentinel mode: if more than half of the nodes find a node abnormal, they jointly decide to change that node's state.
4. If a master in the cluster has no slave, then when that master fails the whole cluster enters the fail state, because the slot mapping becomes incomplete. If more than half of the masters go down, the cluster also enters the fail state.
5. The official recommendation is to deploy a cluster with at least 3 master nodes.
8、Redis rate limiting
When taking the subway at Beijing's Xi'erqi station or a train from Beijing West Railway Station, you may encounter this: if the crowd is too large, the staff hold up a small sign and ask you to queue and enter in batches. That is rate limiting applied to huge flows of people in real life.
When developing high-concurrency systems, three weapons protect the system: caching, degradation, and rate limiting. So what is rate limiting? As the name suggests, it limits traffic, just like a broadband plan with 1 GB of data: when it is used up, it is gone. Through rate limiting we can control the system's QPS well, and thereby protect the system.
1、Based on Redis setnx and zset
1.2、setnx
For example, suppose we need to limit requests to 20 within 10 seconds. We can set an expiration of 10 seconds when calling setnx, and once the setnx-based count reaches 20, further requests are rejected, achieving the rate limit.
Drawback: the window is fixed. For example, it can count seconds 1-10 but cannot count seconds 2-11; and if we need to count M requests over every N seconds, Redis would have to keep N keys, and so on.
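In Redis this fixed-window counter is usually done with INCR plus EXPIRE on a per-window key. Here is a hypothetical in-memory Java analogue that shows the same logic without needing a Redis server:

```java
// Hypothetical in-memory analogue of the Redis fixed-window limiter
// (in Redis: INCR a per-window key, EXPIRE it to N seconds).
public class FixedWindowLimiter {
    private final int maxRequests;
    private final long windowMillis;
    private long windowStart = 0;
    private int count = 0;

    public FixedWindowLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow(long nowMillis) {
        if (nowMillis - windowStart >= windowMillis) { // window expired: reset
            windowStart = nowMillis;
            count = 0;
        }
        if (count < maxRequests) {
            count++;
            return true;
        }
        return false; // over the limit within the current window
    }
}
```

The drawback described above is visible here too: the counter resets at window boundaries, so bursts straddling two windows can briefly exceed the intended rate.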
1.3、zset
In fact, the essence of rate limiting is the sliding window: how does the 1-10 window above become 2-11? Simply advance both the start and the end by 1. We can record requests in a zset: for each incoming request the value must be unique (a UUID works), and the score is the current timestamp, because the score lets us count the requests within a given time range. The zset's range methods then make it easy to count how many requests fall between two timestamps.
Drawback: the zset data structure keeps growing unless old members are trimmed.
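The zset approach above translates to ZADD with a timestamp score, ZREMRANGEBYSCORE to drop entries older than the window, and ZCARD to count. A hypothetical in-memory analogue, keeping only timestamps in place of the zset members:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical in-memory analogue of the Redis zset sliding-window limiter:
// zset member = unique id, score = timestamp; here we just keep the timestamps.
public class SlidingWindowLimiter {
    private final int maxRequests;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow(long nowMillis) {
        // Like ZREMRANGEBYSCORE: evict entries that slid out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.pollFirst();
        }
        // Like ZCARD: count what remains inside the window.
        if (timestamps.size() < maxRequests) {
            timestamps.addLast(nowMillis); // like ZADD
            return true;
        }
        return false;
    }
}
```

Evicting on every call is also the cure for the drawback noted above: the structure only ever holds one window's worth of entries.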
2、 Leaky bucket algorithm
The idea of the leaky bucket algorithm: think of requests as water and the leaky bucket as the system's capacity limit. Water first enters the bucket and flows out at a fixed rate. When water flows in faster than it leaks out, the bucket, having limited capacity, overflows, and the excess water (requests) is rejected directly, thereby limiting the rate.
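A minimal sketch of this idea (illustrative, not from any particular library): the water level rises by one per request and drains at a fixed rate; a request that would overflow the bucket is rejected.

```java
// Hypothetical leaky bucket: inflow = requests, outflow = a fixed leak rate.
public class LeakyBucket {
    private final double capacity;     // bucket size (the system's limit)
    private final double leakPerMilli; // fixed outflow rate
    private double water = 0;
    private long lastTime = 0;

    public LeakyBucket(double capacity, double leakPerMilli) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerMilli;
    }

    public synchronized boolean allow(long nowMillis) {
        // Drain whatever leaked out since the last request.
        water = Math.max(0, water - (nowMillis - lastTime) * leakPerMilli);
        lastTime = nowMillis;
        if (water + 1 <= capacity) {
            water += 1; // the request enters the bucket
            return true;
        }
        return false;   // bucket is full: overflow, reject the request
    }
}
```

Note the defining property: no matter how bursty the inflow, accepted work is bounded by the leak rate plus the bucket's capacity.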

3、 Token bucket algorithm
The principle of the token bucket algorithm can be understood as hospital registration: only after taking a number can you see a doctor.

Roughly, the details are:
1. Every request must obtain an available token before it is processed.
2. Tokens are added to the bucket at a fixed rate according to the rate limit.
3. The bucket has a maximum capacity; when it is full, newly added tokens are discarded or rejected.
4. When a request arrives, it first takes a token from the bucket, carries the token through its business logic, and after the logic completes, the token is deleted.
5. The bucket can also keep a minimum reserve: when the tokens in the bucket drop to the minimum, tokens are not deleted after requests complete, ensuring the limiter retains enough tokens.
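Steps 1-4 above can be sketched as follows (a hypothetical implementation; step 5's minimum-reserve behavior is omitted for clarity):

```java
// Hypothetical token bucket: tokens refill at a fixed rate up to a cap;
// each request must take one token before it is processed (steps 1-4).
public class TokenBucket {
    private final double capacity;       // maximum tokens the bucket can hold
    private final double tokensPerMilli; // refill rate
    private double tokens;
    private long lastTime = 0;

    public TokenBucket(double capacity, double tokensPerMilli) {
        this.capacity = capacity;
        this.tokensPerMilli = tokensPerMilli;
        this.tokens = capacity; // start full
    }

    public synchronized boolean tryAcquire(long nowMillis) {
        // Steps 2-3: refill, discarding any tokens beyond capacity.
        tokens = Math.min(capacity, tokens + (nowMillis - lastTime) * tokensPerMilli);
        lastTime = nowMillis;
        if (tokens >= 1) {
            tokens -= 1; // step 4: consume one token
            return true;
        }
        return false;    // step 1: no token available, the request is not processed
    }
}
```

Unlike the leaky bucket, a full token bucket permits a burst of up to `capacity` requests at once, which is why it is the more common choice (it is also what Guava's RateLimiter is based on).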
In practice:
1. Implement rate limiting with custom annotations, AOP, and Redis + Lua.
2. Or use Guava's RateLimiter.
9、 Common knowledge points
Fuzzy key matching with KEYS may block the Redis thread; prefer the SCAN command, which iterates without blocking (deduplicate the results yourself, since SCAN may return a key more than once).
For multiple operations, remember to use pipelining to send all commands at once, avoiding the network overhead of frequent send/receive round trips and improving performance.
redis-cli --bigkeys can scan for large keys in Redis; under the hood it uses the SCAN command to traverse all keys and, depending on each key's type, runs STRLEN, LLEN, SCARD, HLEN, or ZCARD to get its length or element count. The caveat: run it carefully online, and a key with many elements is not necessarily large in bytes.
Remember to enable the Redis slow query log in production; the basic idea is similar to MySQL's.
Redis's memory allocation strategy, combined with constant data insertion and deletion, leads to memory fragmentation; you can restart the service or enable activedefrag to compact memory and solve this.

1. Ratio > 1 indicates memory fragmentation; the larger the value, the more severe it is. 2. Ratio < 1 indicates virtual memory (swap) is in use; swap is backed by disk and performs far worse than RAM, so add memory to the machine to restore performance. 3. Generally, a mem_fragmentation_ratio between 1 and 1.5 is considered healthy.
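The ratio in question is `mem_fragmentation_ratio = used_memory_rss / used_memory` from `INFO memory`. A small, hypothetical helper that classifies it along the three ranges above:

```java
// Hypothetical helper: classify mem_fragmentation_ratio = used_memory_rss / used_memory.
public class FragRatio {
    public static String classify(double usedMemoryRss, double usedMemory) {
        double ratio = usedMemoryRss / usedMemory;
        if (ratio < 1.0) return "swapping";  // virtual memory in use: add RAM
        if (ratio <= 1.5) return "healthy";  // roughly 1 to 1.5 is fine
        return "fragmented";                 // consider a restart or activedefrag
    }
}
```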
Original article: Redis: From application to bottom, one article will help you