当前位置:网站首页>Bloom filter
Bloom filter
2022-06-25 19:09:00 【Daiyuanpei】
One 、 Introduce
The bloom filter is a binary array , Its function is to judge whether a number exists in this array ,0 Does not exist ,1 Indicates presence , As shown in the figure below :

Two 、 add to
If you want to deposit “ Hello ” This string , First of all, it will go through n A hash function calculates , The accountant calculates different hash values , Then map these hash values into the array , The binary number of the corresponding position is modified to 1, As shown in the figure below :

Subscript to be 3、5、7 The position of was changed to 1
3、 ... and 、 Inquire about
For example, query “ Hello ” This string , First, through n A hash function calculates that the corresponding array position is 3、5、7, The binary number of these three positions must be 1 To express “ Hello ” This string does not exist , As long as one of these three positions is not 1, Then this string does not exist .
Four 、 Delete
The bloon filter has a false deletion in the deletion operation .
For example, now there are two strings “ Hello ” and “hello”, If you go through a series of hash operations , Finally, it is concluded that the position of both in the array is subscript 2 The location of , Then this subscript position represents two data at the same time ,“ Hello ” and “hello”, As shown in the figure below :

When you want to delete “ Hello ” When this string , Will subscript 2 The position of the binary number is changed to 0, Said to delete , But it will also delete “hello” This string , Therefore, the bloom filter has the phenomenon of false deletion .
5、 ... and 、 Advantages and disadvantages
1、 advantage
1、 The bloom filter consists of binary arrays , So it takes up very little space
2、 Query and insert operations are fast , The position calculated according to the hash value can be found in the array
The corresponding time complexity is O(n),n Depends on the number of hash functions , If there is only one hash function , So time complexity is O(1)
3、 Confidentiality is good. , All stored are binary data , Don't store the data itself
2、 shortcoming
1、 There is a false deletion during the deletion operation
2、 Due to different data, the calculated hash value may be the same , So there are misjudgments
hypothesis “hello” and “ Hello ” The hash value calculated from the two data is the same , But at this time, only... Exists in the bloom filter “ Hello ” This data , When you want to query “hello” when , When the calculated hash value looks for a position in the array, it is found that the position is 1, The bloom filter will tell “hello” This data exists , But it doesn't really exist .
6、 ... and 、 Miscalculation rate
When using Bloom filter, you can set a parameter error rate :
1、 The lower the misjudgment rate , The less likely it is to misjudge , But the more hash functions , The more space it takes up
2、 The greater the misjudgment rate , The more likely it is to misjudge , But the fewer hash functions , The smaller the occupied space
Why use more than one hash function , And to use multiple hash functions ?
1、 If there is only one hash function , Then the hash values calculated from different data have a great probability of being the same , It is easy to misjudge
2、 If you use multiple hash functions , Each hash function uses a different hash algorithm , In this case , The calculated subscript values are not easy to be the same
So the more hash functions , The more hash values are calculated , The lower the error rate , But the larger the corresponding space
7、 ... and 、 solve Redis Cache penetration problem
1、 Cache penetration refers to , The requested data does not exist in Redis in , Causes the request to reach the database directly , The database cannot carry too much traffic, resulting in downtime .
2、 The bloom filter can be used to store all the data that can be accessed key, There is no the key Directly filtered , There is key Then further query the cache and database
When the request arrives, first look for Redis The bloom filter , If you don't hit , Go straight back to ( If the data is not in the bloom filter, the probability is not in the database ), In this way, a large number of invalid requests will not reach the database
If it hits , Just to go Redis And the database
边栏推荐
- On Oracle full stack virtual machine -- graalvm
- JVM|运行时数据区(堆空间)
- 中金财富安全吗? 开户需要多久
- Web development solution to cross domain problems
- 网络安全检测与防范 测试题(一)
- shell-跳出循环-shift参数左移-函数的使用
- Detailed explanation of oauth2 - Introduction (I)
- QQ机器人闪照转发/撤回消息转发【最新beta2版本】
- Server journey from scratch - Yu Zhongxian integrated version (IP access server, LNMP compilation and installation, Lua environment and socket expansion)
- 想知道新股民怎样炒股票开户?在线开户安全么?
猜你喜欢

Ruffian Heng embedded semimonthly issue 57

What should I pay attention to in GoogleSEO content station optimization?

Laravel validation rule followed Role of auth:: id()

【C语言练习——打印上三角及其变形(带空格版)】

Cutting feet to fit shoes - talking about the ramp reconstruction on the track

Divine reversion EA

PHP Chinese regular
![Current situation of China's hydraulic cylinder industry in 2020 (with application fields, policies and regulations, supply and demand status and enterprise pattern) [figure]](/img/2e/439b5dce9634d4015430c9cf06c5de.jpg)
Current situation of China's hydraulic cylinder industry in 2020 (with application fields, policies and regulations, supply and demand status and enterprise pattern) [figure]

Why are life science enterprises on the cloud in succession?

Kotlin Compose 终结toDo项目 点击可以编辑修改todo
随机推荐
2017 reading (word memory)
158_模型_Power BI 使用 DAX + SVG 打通制作商業圖錶幾乎所有可能
User management and permissions
Solve the problem that sublime Text3 package control cannot install plug-ins
Svn introduction and Usage Summary
TCP/IP 测试题(二)
Apifox simple understanding -- the integrator of web side testing
MySQL view explanation
Laravel validation rule followed Role of auth:: id()
广州华锐互动VR全景为各行各业带来发展
请问通达信开户安全吗?
Google cloud SSH enable root password login
ECS 7-day practical training camp (Advanced route) -- day04 -- build a portal using ECs and polardb
Kotlin compose terminate todo project Click to edit and modify todo
网络安全检测与防范 测试题(一)
Cutting feet to fit shoes - talking about the ramp reconstruction on the track
最新數據挖掘賽事方案梳理!
Tcp/ip test questions (I)
Regular expression summary
[elt.zip] openharmony paper Club - memory compression for data intensive applications