当前位置:网站首页>Bloom filter
Bloom filter
2022-06-25 19:09:00 【Daiyuanpei】
One 、 Introduce
The bloom filter is a binary array , Its function is to judge whether a number exists in this array ,0 Does not exist ,1 Indicates presence , As shown in the figure below :
Two 、 add to
If you want to deposit “ Hello ” This string , First of all, it will go through n A hash function calculates , The accountant calculates different hash values , Then map these hash values into the array , The binary number of the corresponding position is modified to 1, As shown in the figure below :
Subscript to be 3、5、7 The position of was changed to 1
3、 ... and 、 Inquire about
For example, query “ Hello ” This string , First, through n A hash function calculates that the corresponding array position is 3、5、7, The binary number of these three positions must be 1 To express “ Hello ” This string does not exist , As long as one of these three positions is not 1, Then this string does not exist .
Four 、 Delete
The bloon filter has a false deletion in the deletion operation .
For example, now there are two strings “ Hello ” and “hello”, If you go through a series of hash operations , Finally, it is concluded that the position of both in the array is subscript 2 The location of , Then this subscript position represents two data at the same time ,“ Hello ” and “hello”, As shown in the figure below :
When you want to delete “ Hello ” When this string , Will subscript 2 The position of the binary number is changed to 0, Said to delete , But it will also delete “hello” This string , Therefore, the bloom filter has the phenomenon of false deletion .
5、 ... and 、 Advantages and disadvantages
1、 advantage
1、 The bloom filter consists of binary arrays , So it takes up very little space
2、 Query and insert operations are fast , The position calculated according to the hash value can be found in the array
The corresponding time complexity is O(n),n Depends on the number of hash functions , If there is only one hash function , So time complexity is O(1)
3、 Confidentiality is good. , All stored are binary data , Don't store the data itself
2、 shortcoming
1、 There is a false deletion during the deletion operation
2、 Due to different data, the calculated hash value may be the same , So there are misjudgments
hypothesis “hello” and “ Hello ” The hash value calculated from the two data is the same , But at this time, only... Exists in the bloom filter “ Hello ” This data , When you want to query “hello” when , When the calculated hash value looks for a position in the array, it is found that the position is 1, The bloom filter will tell “hello” This data exists , But it doesn't really exist .
6、 ... and 、 Miscalculation rate
When using Bloom filter, you can set a parameter error rate :
1、 The lower the misjudgment rate , The less likely it is to misjudge , But the more hash functions , The more space it takes up
2、 The greater the misjudgment rate , The more likely it is to misjudge , But the fewer hash functions , The smaller the occupied space
Why use more than one hash function , And to use multiple hash functions ?
1、 If there is only one hash function , Then the hash values calculated from different data have a great probability of being the same , It is easy to misjudge
2、 If you use multiple hash functions , Each hash function uses a different hash algorithm , In this case , The calculated subscript values are not easy to be the same
So the more hash functions , The more hash values are calculated , The lower the error rate , But the larger the corresponding space
7、 ... and 、 solve Redis Cache penetration problem
1、 Cache penetration refers to , The requested data does not exist in Redis in , Causes the request to reach the database directly , The database cannot carry too much traffic, resulting in downtime .
2、 The bloom filter can be used to store all the data that can be accessed key, There is no the key Directly filtered , There is key Then further query the cache and database
When the request arrives, first look for Redis The bloom filter , If you don't hit , Go straight back to ( If the data is not in the bloom filter, the probability is not in the database ), In this way, a large number of invalid requests will not reach the database
If it hits , Just to go Redis And the database
边栏推荐
- [C language practice - print the upper triangle and its deformation (with blank version)]
- shell-跳出循环-shift参数左移-函数的使用
- Analysis on China's aluminum foil output, trade and enterprise leading operation in 2021: dongyangguang aluminum foil output is stable [figure]
- Tcp/ip test questions (III)
- PHP Chinese regular
- Oriental Wealth function (the most complete edition of Childe Yong)
- Current situation of China's hydraulic cylinder industry in 2020 (with application fields, policies and regulations, supply and demand status and enterprise pattern) [figure]
- Server journey from scratch - Yu Zhongxian integrated version (IP access server, LNMP compilation and installation, Lua environment and socket expansion)
- SEO outsourcing reliable company, enterprise SEO outsourcing company which reliable?
- English name of each stage of software version
猜你喜欢
Analysis on the development trend of China's intense pulsed light equipment industry in 2021: the market scale is growing, and the proportion of imported brands is large [figure]
What is an operator?
[today in history] June 25: the father of notebook was born; Windows 98 release; First commercial use of generic product code
Analysis of global tea production, consumption and import and export trade: China's tea production ranks first in the world [figure]
Svn introduction and Usage Summary
Apifox简单了解——WEB端测试的集大成者
On location and scale in CNN
158 Bar _ Modèle Power bi utilise Dax + SVG pour créer des diagrammes d'affaires presque toutes les possibilités
最新數據挖掘賽事方案梳理!
Ali visual AI training camp -day03- construction of electronic photo album (face and expression recognition)
随机推荐
PHP database connection version1.1
[elt.zip] openharmony paper Club - memory compression for data intensive applications
Genicam gentl standard ver1.5 (1)
Leetcode-101-symmetric binary tree
ECS 7-day practical training camp (Advanced route) -- day04 -- build a portal using ECs and polardb
Google cloud SSH enable root password login
Guangzhou Sinovel interactive creates VR Exhibition Hall panoramic online virtual exhibition hall
Network security detection and prevention test questions (V)
Guangzhou Sinovel interactive VR panorama brings development to all walks of life
Embark on a new journey and reach the world with wisdom
QQ机器人闪照转发/撤回消息转发【最新beta2版本】
MySQL prompt performance_ Schema missing table
Favorite PHP debugging methods
Elastic high-performance computing on the cloud supports the rapid development of the life science industry, reducing costs and increasing efficiency
Paddleocr learning (II) paddleocr detection model training
Web development solution to cross domain problems
Solve the problem that sublime Text3 package control cannot install plug-ins
On Oracle full stack virtual machine -- graalvm
QQ robot: self forbidden words management of group members [latest beta2 version]
PHP Chinese regular