Distributed cache breakdown
2022-06-24 06:33:00 【User 1348170】
1. What is cache breakdown
Before discussing cache breakdown, let's first recall how data is loaded through the cache, as shown in the figure below.
If an attacker deliberately queries data that does not exist in the cache on every request, each request falls through to the storage layer and the cache becomes pointless. Under heavy traffic the database may go down. This is cache breakdown (the same pattern is also commonly called cache penetration). The scenario is shown in the figure below:
When normal users log in to the home page, data is looked up by userID and usually hits the cache. An attacker, however, wants to bring your system down: they can randomly generate a batch of userIDs and send those requests to your server. None of these keys exist in the cache, so the requests pass straight through the cache to the database and can eventually cause database connection failures.
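The difference between the two access patterns can be sketched with a toy cache-aside read. This is a minimal illustration only: the two `HashMap`s stand in for Redis and the database, and the class name, the `dbQueries` counter and the key format are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// In-memory stand-ins for the cache and the database (both hypothetical).
class CacheBreakdownDemo {
    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> db = new HashMap<>();
    int dbQueries = 0;

    // Plain cache-aside read: try the cache first, fall back to the database.
    String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            dbQueries++;               // every cache miss costs one database query
            value = db.get(key);
            if (value != null) {
                cache.put(key, value); // only rows that exist can ever be cached
            }
        }
        return value;
    }

    public static void main(String[] args) {
        CacheBreakdownDemo demo = new CacheBreakdownDemo();
        demo.db.put("user:1", "alice");

        // A legitimate user: the first read fills the cache, later reads hit it.
        for (int i = 0; i < 1000; i++) {
            demo.get("user:1");
        }
        System.out.println("DB queries for a real userID: " + demo.dbQueries);

        // An attacker: random userIDs that do not exist always reach the database.
        demo.dbQueries = 0;
        for (int i = 0; i < 1000; i++) {
            long fakeId = ThreadLocalRandom.current().nextLong(1_000_000, Long.MAX_VALUE);
            demo.get("user:" + fakeId);
        }
        System.out.println("DB queries for random userIDs: " + demo.dbQueries);
    }
}
```

The real userID costs a single database query no matter how often it is read, while every request for a nonexistent userID reaches the database.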
2. Solutions
Here are three solutions; choose one according to the actual situation of your project.
Before going through the three schemes, let's recall the Redis SETNX command: SETNX key value
It sets key to value if and only if key does not exist. If the given key already exists, SETNX does nothing. SETNX is short for "SET if Not eXists".
Available since: 1.0.0. Time complexity: O(1). Return value: 1 if the key was set, 0 if it was not.
The effect is as follows:
redis> EXISTS job                # job does not exist
(integer) 0
redis> SETNX job "programmer"    # job is set successfully
(integer) 1
redis> SETNX job "code-farmer"   # trying to overwrite job fails
(integer) 0
redis> GET job                   # the value was not overwritten
"programmer"
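The same "set only if absent" semantics can be imitated inside a single JVM, which may help when reading the lock code later on. This is only a sketch: `SetnxSketch` is a hypothetical class backed by a `ConcurrentHashMap`, not a Redis client.

```java
import java.util.concurrent.ConcurrentHashMap;

// A single-JVM imitation of SETNX; the class and method names are illustrative.
class SetnxSketch {
    final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Mirrors SETNX: returns 1 if the key was set, 0 if it already existed.
    int setnx(String key, String value) {
        // putIfAbsent is atomic and returns null only when the key was absent.
        return store.putIfAbsent(key, value) == null ? 1 : 0;
    }

    String get(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        SetnxSketch redis = new SetnxSketch();
        System.out.println(redis.setnx("job", "programmer"));  // key absent, so it is set
        System.out.println(redis.setnx("job", "code-farmer")); // key present, nothing changes
        System.out.println(redis.get("job"));
    }
}
```

When SETNX is used as a lock in real Redis, `SET key value NX EX seconds` sets the value and its expiry in one atomic command, which avoids leaving a lock without an expiry if the client dies between SETNX and EXPIRE.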
Use a mutex
This is the most common approach: when the value fetched by key is empty, acquire a lock first, then load the value from the database, and release the lock once loading finishes. Threads that fail to acquire the lock sleep for 50 ms and retry.
As for the type of lock, in a stand-alone environment a Lock from java.util.concurrent is enough; cluster environments use a distributed lock (Redis SETNX).
The Redis code for a cluster environment is as follows:
String get(String key) throws InterruptedException {
    String value = redis.get(key);
    if (value == null) {
        String keyMutex = "mutex:" + key;
        if (redis.setnx(keyMutex, "1")) {
            // 3 min timeout so the lock is released even if the holder crashes
            redis.expire(keyMutex, 3 * 60);
            value = db.get(key);
            redis.set(key, value);
            redis.delete(keyMutex);
        } else {
            // Other threads sleep 50 ms and then retry
            Thread.sleep(50);
            return get(key);
        }
    }
    return value;
}
Advantages:
- Simple idea
- Guarantees consistency
Disadvantages:
- Code complexity increases
- There is a risk of deadlock
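For the stand-alone case mentioned above, a lock from java.util.concurrent can play the role of the Redis mutex. The sketch below is one possible shape, not the author's code: `LocalMutexLoader`, `loadFromDb` and the per-key lock map are assumptions, and waiting threads block on the lock instead of sleeping 50 ms and retrying.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Single-JVM variant of the mutex scheme; all names are illustrative.
class LocalMutexLoader {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    final AtomicInteger dbLoads = new AtomicInteger();

    // Hypothetical slow database read.
    String loadFromDb(String key) {
        dbLoads.incrementAndGet();
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-of-" + key;
    }

    String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;
        }
        // One lock per key, so loading one key does not block reads of other keys.
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            // Re-check: another thread may have filled the cache while we waited.
            value = cache.get(key);
            if (value == null) {
                value = loadFromDb(key);
                cache.put(key, value);
            }
            return value;
        } finally {
            lock.unlock(); // always released, even if the load throws
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LocalMutexLoader loader = new LocalMutexLoader();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> loader.get("user:1"));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("DB loads: " + loader.dbLoads.get());
    }
}
```

Releasing the lock in a `finally` block addresses the deadlock risk listed above for the in-process case. The lock map in this sketch grows with the number of distinct keys, so a real implementation would need to bound or clean it.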
Build the cache asynchronously
Under this scheme the cache is rebuilt asynchronously: a thread is taken from a thread pool to rebuild the cache, so requests never all go directly to the database. Redis stores a timeout alongside each value. When the timeout is less than or equal to System.currentTimeMillis(), the cache is refreshed in the background; otherwise the value is returned directly. The Redis code for a cluster environment is as follows:
String get(final String key) {
    V v = redis.get(key);
    String value = v.getValue();
    long timeout = v.getTimeout();
    if (timeout <= System.currentTimeMillis()) {
        // Expired: refresh asynchronously in the background
        threadPool.execute(new Runnable() {
            public void run() {
                String keyMutex = "mutex:" + key;
                if (redis.setnx(keyMutex, "1")) {
                    // 3 min timeout so the lock is released even if the holder crashes
                    redis.expire(keyMutex, 3 * 60);
                    String dbValue = db.get(key);
                    redis.set(key, dbValue);
                    redis.delete(keyMutex);
                }
            }
        });
    }
    return value;
}
Advantages:
- Best cost-performance ratio; users don't have to wait
Disadvantages:
- Cache consistency cannot be guaranteed
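The logical-timeout idea can be tried without Redis. The following sketch keeps everything in memory: `AsyncRefreshCache`, `Entry`, `loadFromDb` and the `refreshing` map (which stands in for the SETNX mutex) are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory sketch of the asynchronous rebuild scheme; all names are illustrative.
class AsyncRefreshCache {
    // A cached value plus its logical expiry time in epoch milliseconds.
    static final class Entry {
        final String value;
        final long timeout;
        Entry(String value, long timeout) {
            this.value = value;
            this.timeout = timeout;
        }
    }

    final Map<String, Entry> cache = new ConcurrentHashMap<>();
    final Map<String, Boolean> refreshing = new ConcurrentHashMap<>();
    final ExecutorService pool = Executors.newFixedThreadPool(2);
    final AtomicInteger version = new AtomicInteger();
    final long ttlMillis;

    AsyncRefreshCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Hypothetical database read; returns a new version on every call.
    String loadFromDb(String key) {
        return key + "@v" + version.incrementAndGet();
    }

    // Loads the key and stores it with a fresh logical timeout.
    void put(String key) {
        cache.put(key, new Entry(loadFromDb(key), System.currentTimeMillis() + ttlMillis));
    }

    String get(String key) {
        Entry e = cache.get(key);
        if (e == null) {
            return null; // the scheme assumes keys were warmed up beforehand
        }
        // Expired: let exactly one caller schedule the background rebuild.
        if (e.timeout <= System.currentTimeMillis()
                && refreshing.putIfAbsent(key, Boolean.TRUE) == null) {
            pool.execute(() -> {
                try {
                    put(key);
                } finally {
                    refreshing.remove(key);
                }
            });
        }
        return e.value; // returned immediately, possibly stale
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncRefreshCache c = new AsyncRefreshCache(200);
        c.put("user:1");
        System.out.println(c.get("user:1")); // fresh value
        Thread.sleep(250);
        System.out.println(c.get("user:1")); // old value again; a rebuild is scheduled
        Thread.sleep(50);
        System.out.println(c.get("user:1")); // normally the rebuilt value by now
        c.pool.shutdown();
    }
}
```

The second read shows the consistency cost named above: the caller gets the old value while the rebuild runs in the background.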
Bloom filter
1. Principle
The great strength of a Bloom filter is that it can quickly determine whether an element is in a set. It therefore has the following three usage scenarios:
- Web crawlers de-duplicating URLs, to avoid crawling the same URL twice
- Anti-spam: checking whether a mailbox appears in a list of billions of spam addresses (likewise for spam text messages)
- Cache breakdown: put the keys that exist into a Bloom filter, so that a request for a nonexistent key can return quickly and the cache and DB are kept from going down.
OK, next let's talk about the principle of the Bloom filter.
Internally it maintains a bit array that is initially all 0. Note that a Bloom filter has a false positive rate: the lower the false positive rate, the longer the array and the more space it takes; the higher the rate, the shorter the array and the less space it takes.
Suppose that, based on the false positive rate, we create a 10-bit array and 2 hash functions (f1, f2), as shown in the figure below (we don't need to care how the array length and the number of hash functions are derived; this has been rigorously proven in the mathematical literature).
Suppose the input set is (N1, N2). Computing f1(N1) gives 2 and f2(N1) gives 5, so the bits at index 2 and index 5 are set to 1, as shown in the figure below.
Likewise, f1(N2) gives 3 and f2(N2) gives 6, so the bits at index 3 and index 6 are set to 1, as shown in the figure below.
Now a third number N3 arrives and we want to decide whether it is in the set (N1, N2). We compute f1(N3) and f2(N3):
- If both values land on red positions in the figure above (bits already set to 1), we consider N3 to be in the set (N1, N2), although this may be a false positive
- If either value is not on a red position in the figure above, N3 is definitely not in the set (N1, N2)
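The procedure above fits in a few lines of Java. This is a teaching sketch under stated assumptions: `TinyBloomFilter` and its two hash functions are made up for illustration (f1 reuses `String.hashCode`, f2 is an FNV-1a style hash), and it is not how Guava implements its filter.

```java
import java.util.BitSet;

// A toy Bloom filter with two hash functions; names and hashes are illustrative.
class TinyBloomFilter {
    final BitSet bits;
    final int size;

    TinyBloomFilter(int size) {
        this.size = size;
        this.bits = new BitSet(size); // all bits start at 0
    }

    // f1: Java's built-in string hash, mapped to an index in [0, size).
    int f1(String s) {
        return Math.floorMod(s.hashCode(), size);
    }

    // f2: an FNV-1a style hash, so the two indexes are computed differently.
    int f2(String s) {
        int h = 0x811c9dc5;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x01000193;
        }
        return Math.floorMod(h, size);
    }

    // Adding an element sets the bit at each hashed index to 1.
    void put(String s) {
        bits.set(f1(s));
        bits.set(f2(s));
    }

    // true means "possibly in the set"; false means "definitely not in the set".
    boolean mightContain(String s) {
        return bits.get(f1(s)) && bits.get(f2(s));
    }

    public static void main(String[] args) {
        TinyBloomFilter filter = new TinyBloomFilter(1024);
        filter.put("N1");
        filter.put("N2");
        System.out.println("N1: " + filter.mightContain("N1"));
        System.out.println("N2: " + filter.mightContain("N2"));
        System.out.println("N3: " + filter.mightContain("N3"));
    }
}
```

Elements that were added always report true. An element that was never added reports false unless both of its indexes happen to collide with bits set by other elements, which is exactly the false positive case.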
That is the principle of the Bloom filter. Now let's run a performance test.
2. Performance test
The code is as follows:
(1) Create a new Maven project and add the Guava dependency
<dependencies>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>22.0</version>
</dependency>
</dependencies>
(2) Measure how long it takes to test whether an element belongs to a set of one million elements
package bloomfilter;

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class Test {
    private static int size = 1000000;
    private static BloomFilter<Integer> bloomFilter = BloomFilter.create(Funnels.integerFunnel(), size);

    public static void main(String[] args) {
        for (int i = 0; i < size; i++) {
            bloomFilter.put(i);
        }
        long startTime = System.nanoTime(); // start time
        // Check whether the one million numbers contain 29999
        if (bloomFilter.mightContain(29999)) {
            System.out.println("Hit");
        }
        long endTime = System.nanoTime(); // end time
        System.out.println("Program running time: " + (endTime - startTime) + " ns");
    }
}
The output is as follows:
Hit
Program running time: 219386 ns
In other words, checking whether a number belongs to a set of one million elements takes only about 0.219 ms, which is excellent performance. Note that this single measurement also includes the println call and JIT warm-up, so the lookup itself is faster than the figure suggests.
(3) Some concepts of the false positive rate
First, let's run a test without explicitly setting the false positive rate. The code is as follows:
package bloomfilter;

import java.util.ArrayList;
import java.util.List;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class Test {
    private static int size = 1000000;
    private static BloomFilter<Integer> bloomFilter = BloomFilter.create(Funnels.integerFunnel(), size);

    public static void main(String[] args) {
        for (int i = 0; i < size; i++) {
            bloomFilter.put(i);
        }
        List<Integer> list = new ArrayList<Integer>(1000);
        // Deliberately take 10000 values that are not in the filter
        // and count how many are reported as being in it
        for (int i = size + 10000; i < size + 20000; i++) {
            if (bloomFilter.mightContain(i)) {
                list.add(i);
            }
        }
        System.out.println("Number of false positives: " + list.size());
    }
}
The output is as follows:
Number of false positives: 330
As the code above shows, we deliberately took 10000 values that are not in the filter, yet 330 of them were reported as being in the filter, which means the false positive rate is about 0.03. In other words, without any configuration, the default false positive rate is 0.03. The source code confirms this:
Let's look at the length of the underlying bit array when the false positive rate is 0.03, as shown in the figure below.
Change the construction of bloomFilter to private static BloomFilter<Integer> bloomFilter = BloomFilter.create(Funnels.integerFunnel(), size, 0.01);
That is, the false positive rate is now 0.01. In this case, the length of the underlying bit array is shown in the figure below.
As you can see, the lower the false positive rate, the longer the underlying array and the more space it takes. The actual false positive rate should therefore be chosen according to the load the server can bear, not picked off the top of your head.
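The relationship between false positive rate and array length can be estimated with the standard Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below simply evaluates them; `BloomSizing` and its method names are hypothetical, and it is an assumption that the lengths shown in the figures come from the same formulas.

```java
// Evaluates the textbook Bloom filter sizing formulas; names are illustrative.
class BloomSizing {
    // Number of bits m for n expected elements and false positive rate p.
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    // Number of hash functions k for n elements and m bits.
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        for (double p : new double[] {0.03, 0.01}) {
            long m = optimalBits(n, p);
            System.out.println("p=" + p + " bits=" + m + " hashes=" + optimalHashes(n, m));
        }
    }
}
```

For one million elements this gives roughly 7.3 million bits at a rate of 0.03 and roughly 9.6 million bits at 0.01, which matches the direction described above: a lower false positive rate costs more memory.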
3. Actual use
The Redis pseudocode is shown below:
String get(String key) {
    String value = redis.get(key);
    if (value == null) {
        if (!bloomfilter.mightContain(key)) {
            // Definitely not a known key: return without touching the database
            return null;
        } else {
            value = db.get(key);
            // A false positive can still yield null, so only cache real values
            if (value != null) {
                redis.set(key, value);
            }
        }
    }
    return value;
}
Advantages:
- Simple idea
- Guarantees consistency
- Strong performance
Disadvantages:
- Code complexity increases
- Another collection must be maintained to store the cached keys
- The Bloom filter does not support deleting values