
Redis data loss problem

2022-07-23 18:07:00 【Grab】

Common causes of Redis data loss

A DBA or developer mistakenly runs a command such as flushall or flushdb.
Expired keys are deleted.
The eviction policy deletes data.
Client buffers use too much memory, causing a large number of keys to be evicted by LRU.
The master restarts automatically after a failure, which may lose data.
A network partition may cause short-term loss of written data.
Data loss due to asynchronous replication.
Data loss due to split brain.

Client buffers use too much memory, causing a large number of keys to be evicted by LRU

The memory used by client buffers is hard to limit, and it is counted in used_memory. If clients are used improperly, buffer memory can grow until used_memory reaches the maxmemory limit. In caching scenarios this causes a large number of keys to be evicted. In the worst case every key is evicted, and once there are no keys left to evict, writes fail. This is equivalent to the entire cache failing and has a large impact on the business.
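
One way to spot clients whose buffers are growing is to inspect the output of CLIENT LIST, where the omem field reports each client's output buffer memory in bytes. The sketch below parses that text offline. The function name, the 1 MB threshold, and the sample lines are illustrative assumptions, not data from a real deployment.

```python
def clients_over_limit(client_list_text, omem_limit_bytes):
    """Return (addr, omem) pairs for clients whose output buffer exceeds the limit."""
    offenders = []
    for line in client_list_text.strip().splitlines():
        # Each CLIENT LIST line is a series of space-separated key=value fields.
        fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
        omem = int(fields.get("omem", 0))
        if omem > omem_limit_bytes:
            offenders.append((fields.get("addr", "?"), omem))
    return offenders


# Hypothetical sample, trimmed to a few fields for readability.
sample = """
id=3 addr=10.0.0.5:51234 name= qbuf=0 obl=0 oll=0 omem=0 cmd=get
id=4 addr=10.0.0.9:40022 name= qbuf=0 obl=0 oll=812 omem=16777216 cmd=monitor
"""

print(clients_over_limit(sample, 1024 * 1024))
```

Clients reported this way can then be investigated before their buffers push used_memory up to maxmemory.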

The master restarts automatically after a failure, which may lose data

At time T1 the master shuts down because of a failure. Because a daemon restarts it automatically, the master is brought back up at time T2. The interval (T2-T1) is too short to reach the failover detection time of Redis Cluster or Sentinel, so no master-slave switch happens. The slave then finds that the master's runid has changed, or that the connection was dropped, so it performs a full synchronization of the master's RDB and clears its own data. For performance reasons, persistence is often not configured on the master, so the master started at T2 is likely to be an empty instance (or one loaded from a very old RDB file). The whole incident generally lasts less than 1 minute, so monitoring alerts may not catch it.
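
The sequence above can be sketched as a toy model. This is a deliberate simplification rather than Redis's actual replication protocol: the Master and Replica classes are hypothetical, and the only rule modeled is "a changed runid triggers a full resynchronization that replaces the slave's data".

```python
import uuid


class Master:
    def __init__(self, data=None):
        self.runid = uuid.uuid4().hex  # a new run ID on every process start
        self.data = dict(data or {})


class Replica:
    def __init__(self):
        self.master_runid = None
        self.data = {}

    def sync(self, master):
        # Simplified model: a changed run ID forces a full resynchronization,
        # which discards local data and loads the master's snapshot.
        if master.runid != self.master_runid:
            self.data = dict(master.data)
            self.master_runid = master.runid


master = Master({"user:1": "alice", "user:2": "bob"})
replica = Replica()
replica.sync(master)
keys_before = len(replica.data)

# T1: the master crashes. T2: a daemon restarts it with no persistence configured.
master = Master()
replica.sync(master)
keys_after = len(replica.data)

print(keys_before, keys_after)
```

In this model the slave held a complete copy of the data until it synchronized with the restarted, empty master.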

A network partition may cause short-term loss of written data

Data loss from this problem is rare. When a network partition occurs, Redis Cluster or Sentinel needs a time window to decide on failover. Data written to the original master during that window, roughly 5 to 15 seconds' worth of writes, is lost.
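
As a rough back-of-the-envelope estimate, the exposure is the write rate multiplied by the failover window. The function name and the write rate of 2000 writes per second below are assumptions chosen only for illustration.

```python
def estimated_lost_writes(writes_per_second, window_seconds):
    """Upper-bound estimate of writes accepted by the old master during the window."""
    return writes_per_second * window_seconds


rate = 2000  # hypothetical write rate, in writes per second
low = estimated_lost_writes(rate, 5)
high = estimated_lost_writes(rate, 15)
print(low, high)
```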

Data loss due to asynchronous replication

Because master->slave replication is asynchronous, some data may not yet have been copied to the slave when the master goes down. That data is lost.
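
How much data is at risk at a given moment can be approximated from the replication offsets reported by INFO replication: the gap between master_repl_offset and each slave's offset is the number of bytes not yet acknowledged. The sketch below parses such text offline. The function name and the sample values are hypothetical.

```python
def replication_backlog(info_text):
    """Return {slave_name: bytes_not_yet_acknowledged} from INFO replication text."""
    master_offset = None
    slave_offsets = {}
    for line in info_text.strip().splitlines():
        key, _, value = line.strip().partition(":")
        if key == "master_repl_offset":
            master_offset = int(value)
        elif key.startswith("slave") and "offset=" in value:
            # Slave lines look like: slave0:ip=...,port=...,state=...,offset=...,lag=...
            fields = dict(item.split("=", 1) for item in value.split(","))
            slave_offsets[key] = int(fields["offset"])
    if master_offset is None:
        raise ValueError("master_repl_offset not found")
    return {name: master_offset - off for name, off in slave_offsets.items()}


# Hypothetical sample output from a master with two slaves.
sample_info = """
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6379,state=online,offset=104500,lag=0
slave1:ip=10.0.0.3,port=6379,state=online,offset=98000,lag=3
master_repl_offset:104800
"""

print(replication_backlog(sample_info))
```

If the master failed at that instant, the bytes in each gap would not have reached the corresponding slave.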

Data loss due to split brain

A master is suddenly cut off from the normal network and cannot reach the slaves, but it is in fact still running. Sentinel may then conclude that the master is down, start an election, and promote one of the slaves to master. The cluster now has two masters, which is called split brain.

At this point, although a slave has been promoted to master, some clients may not have switched to the new master yet and keep writing to the old one. When the old master recovers, it is attached to the new master as a slave: its own data is cleared and it copies the data from the new master. The writes it accepted during the split are therefore lost.
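
The following toy simulation walks through that scenario. It is not Redis code: the Node class and the key names are hypothetical, and the resynchronization is modeled as a plain overwrite of the old master's data.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


old_master = Node("old-master")
new_master = Node("new-master")  # a slave promoted by Sentinel

# Before the partition, both nodes hold the same replicated data.
old_master.data = {"order:1": "paid"}
new_master.data = dict(old_master.data)

# During the partition, a client that has not switched keeps writing to the
# old master, while other clients already write to the new master.
old_master.data["order:2"] = "paid"
new_master.data["order:3"] = "paid"

# When the partition heals, the old master is demoted to a slave and
# replaces its data with a copy of the new master's data.
lost_keys = sorted(set(old_master.data) - set(new_master.data))
old_master.data = dict(new_master.data)

print(lost_keys)
```

Only the write that reached the old master during the split disappears. Writes made before the partition or to the new master survive.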

Solution:
Configure redis.conf to bound the replication lag, which reduces data loss.

# Require at least 1 slave whose replication lag does not exceed 10 seconds
min-slaves-to-write 1

# If the replication lag of every slave exceeds 10 seconds, the master refuses write requests
min-slaves-max-lag 10
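
The combined effect of the two settings can be expressed as a small decision function. This is a sketch of the documented rule rather than Redis's own implementation. The function name is hypothetical, and replica_lags stands for the number of seconds since each slave last acknowledged the master.

```python
def master_accepts_writes(replica_lags, min_slaves_to_write, min_slaves_max_lag):
    """Return True if the master would accept writes under the two settings."""
    # Setting either value to 0 disables the check.
    if min_slaves_to_write == 0 or min_slaves_max_lag == 0:
        return True
    # Count slaves whose lag is within the allowed maximum.
    healthy = sum(1 for lag in replica_lags if lag <= min_slaves_max_lag)
    return healthy >= min_slaves_to_write


print(master_accepts_writes([2, 12], 1, 10))   # one slave is within 10 seconds
print(master_accepts_writes([11, 12], 1, 10))  # every slave lags too far
print(master_accepts_writes([], 1, 10))        # isolated master, no slaves
```

With these settings an isolated old master stops accepting writes once its slaves have been unreachable for longer than the lag limit, which bounds the loss from both asynchronous replication and split brain to about that many seconds of writes. In Redis 5 and later the same directives are also available under the names min-replicas-to-write and min-replicas-max-lag.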
