Solution to cache inconsistency

2022-06-22 22:33:00 Weixiaobao

I. Preface

Caching is widely used in projects because it supports high concurrency and delivers high performance. On the read side, the flow is straightforward: check the cache first; on a miss, read from the database and backfill the cache.
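A minimal sketch of that read path, assuming the common "check cache, fall back to the database, backfill" flow. The `database` and `cache` dictionaries and the `read` function are hypothetical in-memory stand-ins, not a real client API.

```python
# Hypothetical in-memory stand-ins for a real database and a real cache.
database = {"user:1": "alice"}
cache = {}


def read(key):
    """Serve from the cache when possible; on a miss, load from the database and backfill."""
    if key in cache:
        return cache[key]        # cache hit
    value = database.get(key)    # cache miss: fall back to the database
    if value is not None:
        cache[key] = value       # backfill so the next read is a hit
    return value


first = read("user:1")   # miss, loaded from the database
second = read("user:1")  # hit, served from the cache
```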

Updating the cache is a different matter, because data can become inconsistent. After updating the database, should we update the cache or delete it? Or should we delete the cache first and then update the database? There is a lot of debate about this. This article briefly compares the advantages and disadvantages of several approaches.

II. Main text

First, a note. In theory, setting an expiration time on cache entries is a solution that guarantees eventual consistency. Under this scheme, every cached value has an expiration time, the database is authoritative for all writes, and cache operations are best effort. That is, if the database write succeeds but the cache update fails, then once the entry expires, subsequent read requests will read the new value from the database and backfill the cache. The strategies discussed below therefore do not rely on setting an expiration time for the cache.
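The expiration-based behaviour can be illustrated with a toy cache. This is only a sketch: `ExpiringCache`, the `cache_available` flag, and the 30-second expiration are assumptions made for the example, and time is passed in explicitly so the result is deterministic.

```python
class ExpiringCache:
    """Toy cache with per-entry expiration; 'now' is supplied by the caller."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None or entry[1] <= now:
            self._entries.pop(key, None)  # expired entries count as misses
            return None
        return entry[0]

    def set(self, key, value, now):
        self._entries[key] = (value, now + self.ttl)


database = {"stock": 10}
cache = ExpiringCache(ttl=30)


def read(key, now):
    value = cache.get(key, now)
    if value is None:
        value = database[key]       # miss or expired: reload from the database
        cache.set(key, value, now)  # backfill
    return value


def write(key, value, now, cache_available=True):
    database[key] = value           # the database is the source of truth
    if cache_available:             # best effort: a failed cache update is tolerated
        cache.set(key, value, now)


read("stock", now=0)                             # cache holds 10 until t=30
write("stock", 9, now=5, cache_available=False)  # database updated, cache update "failed"
stale = read("stock", now=10)                    # still served from the old cache entry
fresh = read("stock", now=31)                    # entry expired, reloaded from the database
```

The stale value is visible only until the entry expires, which is the sense in which this scheme is eventually consistent.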
Here we discuss three update strategies:

  1. Update the database first, then update the cache
  2. Delete the cache first, then update the database
  3. Update the database first, then delete the cache

(1) Update the database first, then update the cache

This scheme is widely opposed because it is not thread-safe.

If Zhang San and Li Si update at the same time, the following may occur:

  • Zhang San updates the database
  • Li Si updates the database
  • Li Si updates the cache
  • Zhang San updates the cache

Li Si wrote the database last, so the database holds Li Si's value. But Li Si updated the cache before Zhang San did, so the cache ends up with Zhang San's older value. The cached data is dirty.
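The interleaving above can be replayed step by step. This is a deterministic simulation with plain dictionaries, not real concurrent code; the values `v1` and `v2` are placeholders for what Zhang San and Li Si write.

```python
database = {}
cache = {}

# Each step runs to completion; the list order is the interleaving from the text.
steps = [
    ("Zhang San", "db", "v1"),
    ("Li Si", "db", "v2"),
    ("Li Si", "cache", "v2"),
    ("Zhang San", "cache", "v1"),
]

for who, target, value in steps:
    store = database if target == "db" else cache
    store["item"] = value

# The database kept the last write (v2), the cache kept the earlier one (v1).
inconsistent = database["item"] != cache["item"]
```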

(2) Delete the cache first, then update the database

This scheme can lead to inconsistency when one request, Zhang San, performs an update while another request, Li Si, performs a query at the same time. The following can then happen:

  • Zhang San deletes the cache
  • Li Si finds that the cache entry does not exist
  • Li Si queries the database and gets the old value
  • Li Si writes the old value to the cache
  • Zhang San writes the new value to the database

The cache now holds the old value while the database holds the new one, so the data is inconsistent.
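Replaying these five steps in order shows the end state. As before, this is a single-threaded simulation with hypothetical dictionaries standing in for the cache and the database.

```python
database = {"item": "old"}
cache = {"item": "old"}

cache.pop("item", None)      # Zhang San deletes the cache
hit = cache.get("item")      # Li Si looks in the cache and misses
loaded = database["item"]    # Li Si reads the old value from the database
cache["item"] = loaded       # Li Si backfills the old value
database["item"] = "new"     # Zhang San writes the new value
```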

So what is the solution? We can use the delayed double-delete strategy:

  • Delete the cache first
  • Then write the database (these two steps are the same as before)
  • Sleep for a certain time, then delete the cache again

This way, any dirty cache data produced during that window is deleted again.

How should the sleep time be determined?

Measure how long the read path's business logic takes in your own project. The write request then sleeps for that read time plus a few hundred milliseconds. The purpose is to make sure the read request has finished, so that the write request can remove any dirty cache data the read request produced.
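A sketch of delayed double deletion under these assumptions: the cache and database are dictionaries, a `threading.Timer` plays the role of a slow reader that backfills a stale value, and the 0.05 s and 0.3 s figures are illustrative only (a real delay would be the measured read time plus a few hundred milliseconds).

```python
import threading
import time

database = {"item": "old"}
cache = {"item": "old"}
backfilled = []


def write_with_delayed_double_delete(key, value, delay_seconds):
    cache.pop(key, None)        # 1. delete the cache
    database[key] = value       # 2. write the database
    time.sleep(delay_seconds)   # 3. wait for in-flight reads to finish backfilling
    cache.pop(key, None)        # 4. delete again to drop any stale backfill


def slow_reader_backfill():
    """Simulates a read that fetched 'old' before the write and backfills it late."""
    cache["item"] = "old"
    backfilled.append("old")


reader = threading.Timer(0.05, slow_reader_backfill)  # fires during the sleep
reader.start()
write_with_delayed_double_delete("item", "new", delay_seconds=0.3)
reader.join()
```

Sleeping inside the write request adds latency to every write, which is why the second delete is often done asynchronously instead.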

What if you use a MySQL read/write separation architecture?

In that case, inconsistency can arise as follows. Again there are two requests: request A performs an update and request B performs a query.

  • (1) Request A starts its write and deletes the cache
  • (2) Request A writes the data to the master database
  • (3) Request B queries the cache and finds no value
  • (4) Request B queries the slave database; master-slave synchronization has not completed yet, so it gets the old value
  • (5) Request B writes the old value to the cache
  • (6) The database completes master-slave synchronization, and the slave now has the new value

This sequence leaves the cache and the database inconsistent. The fix is again the delayed double-delete strategy, except that the sleep time is now based on the master-slave synchronization delay plus a few hundred milliseconds.
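The six steps can be replayed with two dictionaries standing in for the master and the slave. The lag and margin values are assumptions chosen for illustration; in practice the lag would come from monitoring the replication delay.

```python
master = {"item": "old"}
slave = {"item": "old"}   # reads go here; it lags behind the master
cache = {"item": "old"}

cache.pop("item", None)           # (1) A deletes the cache
master["item"] = "new"            # (2) A writes the master
missed = "item" not in cache      # (3) B misses the cache
cache["item"] = slave["item"]     # (4)+(5) B reads the lagging slave and backfills "old"
slave["item"] = master["item"]    # (6) synchronization completes
stale_after_sync = cache["item"]  # the cache still holds "old"

# Delay for the second delete: replication lag plus a safety margin (assumed values).
REPLICATION_LAG_SECONDS = 0.5
SAFETY_MARGIN_SECONDS = 0.2
second_delete_delay = REPLICATION_LAG_SECONDS + SAFETY_MARGIN_SECONDS

cache.pop("item", None)           # what the delayed second delete would do
```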

At this point some readers will ask: what if the second deletion fails? It is a fair question. If the second deletion fails, the cache and the database are still inconsistent.

We will come back to this. Let's first look at the third strategy:

(3) Update the database first, then delete the cache

Is there a concurrency problem in this case? There can be. Suppose there are two requests: request A performs a query and request B performs an update. The following can happen:

  • (1) The cache entry has just expired
  • (2) Request A queries the database and gets the old value
  • (3) Request B writes the new value to the database
  • (4) Request B deletes the cache
  • (5) Request A writes the old value it read to the cache

If this happens, the cache does end up with dirty data.
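The same kind of step-by-step replay applies here. Note that the sequence only goes wrong because request A's backfill lands after request B's delete; the dictionaries are hypothetical stand-ins.

```python
database = {"item": "old"}
cache = {}                        # (1) the cache entry has just expired

loaded = database["item"]         # (2) A reads the old value from the database
database["item"] = "new"          # (3) B writes the new value
cache.pop("item", None)           # (4) B deletes the cache (nothing is there yet)
cache["item"] = loaded            # (5) A backfills the old value
```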

However, how likely is this? It requires a precondition: the database write in step (3) must take less time than the database read in step (2), so that step (4) can happen before step (5). But think about it: database reads are generally much faster than writes (this is part of the reasoning behind read/write separation: reads are faster and consume fewer resources). So it is very unlikely for step (3) to finish faster than step (2), and this situation rarely occurs.

Suppose someone is a perfectionist and insists on closing this gap. What can be done?

First, setting an expiration time for the cache is one option. Second, use the asynchronous delayed deletion described under strategy (2), so that the cache is deleted once more after the read request has finished.
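One way the asynchronous variant might look, assuming a background `threading.Timer` performs the extra delete so the write request does not block. The function name and the 0.1 s delay are hypothetical.

```python
import threading

database = {"item": "old"}
cache = {"item": "old"}


def write_then_delete_async(key, value, delay_seconds):
    """Update the database, delete the cache, and schedule one more delete in the background."""
    database[key] = value
    cache.pop(key, None)
    timer = threading.Timer(delay_seconds, cache.pop, args=(key, None))
    timer.daemon = True   # do not keep the process alive just for the cleanup
    timer.start()
    return timer          # returned so callers (or tests) can wait for it


timer = write_then_delete_async("item", "new", delay_seconds=0.1)
cache["item"] = "old"     # a slow reader backfills a stale value in the meantime
timer.join()              # wait until the delayed delete has run
```

Compared with sleeping inside the write request, the caller returns immediately and only the cleanup is delayed.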

Are there other causes of inconsistency? Yes. This one affects both strategy (2) and strategy (3): if deleting the cache fails, the data becomes inconsistent. For example, a write request updates the database, but the cache deletion fails, so the cache keeps the old value. This is also the question left open at the end of strategy (2).

How can this be solved?

Provide a retry mechanism that guarantees the deletion eventually succeeds. Here are two options.

First option :
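The details of this option are not spelled out in the text, so the following is only one plausible shape and an assumption on our part: the business code catches the failed deletion and hands the key to a retry queue, and a consumer retries until the deletion succeeds. `FlakyCache`, `retry_queue`, and `drain_retry_queue` are hypothetical names, and `queue.Queue` stands in for a real message queue.

```python
import queue


class FlakyCache:
    """Toy cache whose delete fails a fixed number of times, to exercise the retry path."""

    def __init__(self, failures):
        self.data = {}
        self.failures = failures

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache unavailable")
        self.data.pop(key, None)


database = {"item": "old"}
cache = FlakyCache(failures=2)
cache.data["item"] = "old"
retry_queue = queue.Queue()   # stand-in for a real message queue


def write(key, value):
    database[key] = value
    try:
        cache.delete(key)
    except ConnectionError:
        retry_queue.put(key)  # business code hands the failed key to the retry queue


def drain_retry_queue(max_attempts=5):
    """Consumer side: retry each queued key until the delete succeeds or attempts run out."""
    while not retry_queue.empty():
        key = retry_queue.get()
        for _ in range(max_attempts):
            try:
                cache.delete(key)
                break
            except ConnectionError:
                continue


write("item", "new")   # first delete fails, key is queued
drain_retry_queue()    # second attempt fails, third succeeds
```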

However, this option has a drawback: it intrudes heavily into the business code. Hence the second option: start a subscriber program that subscribes to the database binlog and extracts the data that needs to be handled. In the application, a separate piece of code receives this information from the subscriber and performs the cache deletion.

Second option :
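A rough sketch of the consuming side, assuming the binlog subscriber hands over change events as simple records. The event shape, `CACHED_TABLES`, and the `table:pk` key format are all assumptions made for illustration; a real subscriber would define its own format.

```python
from collections import deque

cache = {"user:1": "old", "user:2": "old"}

# Hypothetical change events, shaped the way a binlog subscriber might deliver them.
binlog_events = deque([
    {"table": "user", "pk": 1, "op": "UPDATE"},
    {"table": "user", "pk": 2, "op": "DELETE"},
    {"table": "audit_log", "pk": 1, "op": "INSERT"},  # not a cached table
])

CACHED_TABLES = {"user"}


def cache_key(event):
    return f"{event['table']}:{event['pk']}"


def invalidate_from_binlog(events):
    """Runs outside the business code: turn row changes into cache deletions."""
    deleted = []
    while events:
        event = events.popleft()
        if event["table"] not in CACHED_TABLES:
            continue              # ignore tables that are never cached
        key = cache_key(event)
        cache.pop(key, None)
        deleted.append(key)
    return deleted


deleted_keys = invalidate_from_binlog(binlog_events)
```

Because the deletion is driven by the binlog rather than by the write path, the business code only has to update the database. A failed deletion here could be retried in the same spirit as the first option.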


Copyright notice
This article was written by [Weixiaobao]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/173/202206222033413954.html
