Redis source code reading (I): general overview

2022-06-23 02:49:00 【Star sink】

Learning source code is a process of going from the simple to the profound, and then of gradually simplifying again: the so-called process of "reading the book until it is thick, then reading it until it is thin".

"Last night the west wind withered the green trees. Alone I climbed the tall tower and gazed down the road to the end of the world." Let's start by introducing Redis through its characteristics, uses and data types, to get a general understanding of it.

1. Redis characteristics and applications

Definition (keywords): C language, open source, key-value, NoSQL, in-memory

Characteristics:

High performance: in-memory, efficient reads and writes, concurrency on the order of 100,000 QPS
Single process, single-threaded/multi-threaded, thread safe, I/O multiplexing. "Single-threaded" here means that Redis uses one thread to process requests. Strictly speaking, since Redis 4.0 it is no longer purely single-threaded: besides the main thread, background threads handle some slow operations, such as cleaning up dirty data, releasing unused connections and deleting big keys. The multi-threading introduced in Redis 6.0 is used only for reading and writing network data and for protocol parsing; command execution is still single-threaded.
Rich data types (string/list/hash/set/zset/stream)
Data persistence (RDB, AOF)
High availability (master-slave replication, Sentinel, Cluster)

Uses:

Database
Cache
Distributed lock
Message middleware

2. Redis data types

(1) Six basic data types

Data type	Underlying data structure	Common commands	Application scenarios	Remarks
String	int, embstr, raw	set, mset, get, append, setbit ...	[Strings] almost anything can be stored as a String	String; binary safe
Hash	ht, ziplist	hget, hset, hgetall, hkeys, hexists ...	[Object storage] managing user attributes	Hash; a set of key-value pairs (map)
List	linkedlist, ziplist	lpush/lpop, rpush/rpop, lrange ...	[Fast insertion and deletion] latest news; [ordered] message queue	List; doubly linked list
Set	intset, ht	sadd, spop, sinter, sunion, sdiff ...	[Intersection/union/difference] mutual friends; [no duplicates] counting the IPs that visit a website	Set; no duplicate members
ZSet	ziplist, skiplist	zadd, zrange, zrem, zcard, zrank ...	[Ordered] leaderboards; [score] weighted message queues, delay queues	Sorted set; ordered by score
Stream	listpack, rax	xadd, xrange, xrevrange, xgroup, xread ...	[Message queue] IRC / real-time chat systems, IoT data collection	Log-like data structure; essentially a message queue

(2) Three extended data types built on the basic data types

Data type	Underlying data type	Common commands	Application scenarios	Remarks
Bitmap	String	setbit, getbit, bitop, bitcount, bitpos	Counting active users; counting the users who logged in on a given day	Space-efficient, high-performance Boolean information associated with object IDs
HyperLogLog	String	pfadd, pfcount, pfmerge	Cardinality estimation; counting daily visiting IPs / page UVs / online users, etc.	An "upgraded" bitmap; a probabilistic algorithm that does not store the data set itself but estimates the cardinality statistically
Geo	ZSet	geoadd, geohash, geopos, geodist, georadius	Location-based services (LBS)	Redis 3.2 and later

Notes:

In general, bitmap and HyperLogLog can be used together: the bitmap identifies which users are active, and HyperLogLog does the counting.
For Geo, a separate Redis instance is recommended (in project development there is often a dedicated Redis for longitude/latitude data, kept apart from the Redis used by the rest of the business).

A. Bitmap

A bitmap is not an actual data type; it is a set of bit-oriented operations defined on the string type.

Because strings are binary safe and their maximum length is 512 MB, a bitmap can address up to 2^32 distinct bits.

8 * 1024 * 1024 * 512 = 2^32 = 4294967296 (a little over 4 billion)

Advantage:

Bitmaps save a great deal of space when storing information. For example, in a system where each user is represented by an incrementing user ID, 512 MB of memory is enough to record one bit of information, such as whether the user has ever logged in, for more than 4 billion users (2^32 = 4 * 1024 * 1024 * 1024 ≈ 4.3 billion).

Operations:

Constant-time operations on a single bit, such as setting a bit to 0 or 1, or reading its value.
- SETBIT: set the value of a bit
- GETBIT: get the value of a bit
Operations on a group of bits, such as counting the set bits within a given range (for example, population counting).
- BITOP: perform bitwise operations between strings, including AND, OR, XOR and NOT
- BITCOUNT: count the number of bits set to 1
- BITPOS: find the position of the first bit set to 0 or 1
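
To make the bit addressing concrete, here is a minimal pure-Python sketch of the single-bit and counting operations. The `Bitmap` class and its method names are hypothetical stand-ins, not a Redis client API. The sketch assumes the Redis convention that offset 0 is the most significant bit of the first byte, and its `bitpos` is simplified: it returns -1 when the bit is not found in the stored bytes.

```python
class Bitmap:
    """Hypothetical in-memory stand-in for a Redis string used as a bitmap."""

    def __init__(self):
        self.data = bytearray()

    def setbit(self, offset, value):
        """Set the bit at `offset` to 0 or 1 and return the previous bit."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            # Grow the string with zero bytes so the offset becomes addressable.
            self.data.extend(b"\x00" * (byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit_index  # offset 0 = most significant bit of byte 0
        previous = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset):
        """Return the bit at `offset`; bits past the end read as 0."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            return 0
        return 1 if self.data[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self):
        """Count the bits set to 1 across the whole string."""
        return sum(bin(byte).count("1") for byte in self.data)

    def bitpos(self, bit):
        """Return the offset of the first bit equal to `bit`, or -1 (simplified)."""
        for offset in range(len(self.data) * 8):
            if self.getbit(offset) == bit:
                return offset
        return -1


logins = Bitmap()
logins.setbit(7, 1)    # user ID 7 has logged in
logins.setbit(100, 1)  # user ID 100 has logged in
print(logins.getbit(7), logins.getbit(8), logins.bitcount(), logins.bitpos(1))
```

Setting bit 100 grows the backing string to 13 bytes, which is why memory use depends on the largest ID rather than on the number of set bits.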

Use scenarios:

Real-time analytics of various kinds
Storing space-efficient, high-performance Boolean information associated with object IDs. For example: finding the longest streak of consecutive days on which a user visited the website, or counting the users who logged in on a given day (use the date plus a fixed prefix as the key and build a bitmap in which each bit position is the flag for one user ID).
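
The per-day bitmap idea can be sketched with plain Python integers used as bit sets. The key names (`login:2022-06-21` and so on), the sample events and the helper functions are all hypothetical. The helpers only mirror the roles that SETBIT, BITCOUNT and BITOP AND play, and they ignore Redis's byte-level bit order, which does not affect the counts.

```python
from functools import reduce


def mark_login(bitmap: int, user_id: int) -> int:
    """Return the bitmap with the bit for user_id set (the role of SETBIT)."""
    return bitmap | (1 << user_id)


def bitcount(bitmap: int) -> int:
    """Count the set bits (the role of BITCOUNT)."""
    return bin(bitmap).count("1")


def bitop_and(bitmaps) -> int:
    """AND several bitmaps together (the role of BITOP AND)."""
    return reduce(lambda a, b: a & b, bitmaps)


# Hypothetical keys: a fixed prefix plus the date, one bitmap per day.
daily_logins = {
    "login:2022-06-21": 0,
    "login:2022-06-22": 0,
    "login:2022-06-23": 0,
}

# Hypothetical (key, user ID) login events.
events = [
    ("login:2022-06-21", 3), ("login:2022-06-21", 8),
    ("login:2022-06-22", 3), ("login:2022-06-22", 5),
    ("login:2022-06-23", 3), ("login:2022-06-23", 8),
]
for key, user_id in events:
    daily_logins[key] = mark_login(daily_logins[key], user_id)

# Users who logged in on one given day.
users_on_23rd = bitcount(daily_logins["login:2022-06-23"])

# Users who logged in on every one of the three days.
every_day = bitop_and(daily_logins.values())
print(users_on_23rd, bitcount(every_day))
```

ANDing consecutive daily bitmaps is one simple way to find users with an unbroken visit streak over those days.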

B. HyperLogLog

HyperLogLog can be seen as an upgraded version of bitmap for counting. It is essentially a probabilistic algorithm: it does not store the data set itself, but uses a probabilistic and statistical method to estimate the cardinality. This saves a great deal of memory while keeping the error within a bounded range.

A HyperLogLog (HLL) is encoded as a Redis string, so you can call GET to serialize an HLL and call SET to deserialize it back into the Redis server. The HLL API resembles using a Set for the same task. With a Set, you add each observed element with SADD and check the number of elements with SCARD; the elements in the set are unique, so an existing element is not added again.

With an HLL, however, items are not really added to the HLL (this is very different from a Set), because the HLL data structure contains only a state, not the actual elements.

Operations:

PFADD: adds a new element to the count.
PFCOUNT: returns the approximate number of unique elements added so far with PFADD.
PFMERGE: performs a union of multiple HLLs.
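
To illustrate how a structure can count unique elements without storing them, here is a simplified HyperLogLog sketch in Python. It is not Redis's implementation: the hash function (SHA-1 truncated to 64 bits), the register layout (a plain list) and the class and method names are assumptions made for the example. It uses the textbook estimator, with linear counting for small cardinalities.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HLL: 2**p registers, each holding the maximum rank seen."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        """Update the state for one item (the role of PFADD)."""
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        width = 64 - self.p
        index = h >> width                           # first p bits pick a register
        rest = h & ((1 << width) - 1)                # remaining bits
        rank = width - rest.bit_length() + 1         # position of the leftmost 1-bit
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        """Estimate the number of distinct items (the role of PFCOUNT)."""
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)             # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return round(m * math.log(m / zeros))    # linear counting, small range
        return round(raw)

    def merge(self, other):
        """Union with another HLL of the same size (the role of PFMERGE)."""
        if other.p != self.p:
            raise ValueError("both HLLs must use the same p")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]


hll = HyperLogLog()
for i in range(10000):
    hll.add(f"user:{i}")
    hll.add(f"user:{i}")  # duplicates do not change the state
print(hll.count())        # an estimate close to 10000, not an exact count
```

The state is only the register array, so its size is fixed by `p` and does not grow with the number of elements added.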

Use scenarios:

Cardinality counting (large volumes)
- Counting registered IPs
- Counting daily visiting IPs
- Counting real-time page UVs
- Counting online users
- Counting the distinct terms users search for each day
In general, bitmap and HyperLogLog can be used together: the bitmap identifies which users are active, and HyperLogLog does the counting.

C. Geo

Underlying data type: zset

The GEO feature was released in Redis 3.2. It lets users store geographic positions (longitude and latitude) and operate on that information.

The original GEO command set has only 6 commands:

GEOADD: GEOADD key longitude latitude member [longitude latitude member …]. Adds the specified geospatial items (longitude, latitude, name) to the specified key.
GEOHASH: GEOHASH key member [member …]. Returns the standard Geohash string of one or more position members; it can be used at http://geohash.org/.
GEOPOS: GEOPOS key member [member …]. Returns the positions (longitude and latitude) of the given members stored at key.
GEODIST: GEODIST key member1 member2 [unit]. Returns the distance between two given positions. GEODIST assumes the earth is a perfect sphere when calculating distance; in extreme cases this assumption leads to an error of up to 0.5%.
GEORADIUS: GEORADIUS key longitude latitude radius m|km|ft|mi [WITHCOORD][WITHDIST] [WITHHASH][COUNT count]. Taking the given longitude and latitude as the center, returns all position members of the key whose distance from the center does not exceed the given maximum distance. This command can be used, for example, to query the cities surrounding a given city.
GEORADIUSBYMEMBER: GEORADIUSBYMEMBER key member radius m|km|ft|mi [WITHCOORD][WITHDIST] [WITHHASH][COUNT count]. Like GEORADIUS, this command finds the members within a specified range, but its center point is determined by a given position member rather than by an input longitude and latitude.
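
The "perfect sphere" assumption behind GEODIST can be illustrated with the haversine formula. This is a sketch, not Redis's code: the function name is hypothetical, and the earth radius constant is the value Redis is understood to use, which should be treated as an assumption here. The coordinates are the Shenzhen and Fuzhou values from the example session in this article.

```python
import math

# Assumed spherical earth radius in meters.
EARTH_RADIUS_M = 6372797.560856


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Shenzhen and Fuzhou, as (longitude, latitude).
distance_km = haversine_m(114.06667, 22.61667, 119.30000, 26.08333) / 1000
print(f"{distance_km:.1f} km")
```

A radius query such as GEORADIUS can then be thought of as keeping only the members whose distance from the center is within the given radius.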

Use scenarios:

Location-based services (LBS)

The GEO type is implemented on top of ZSet.

For example, to store the longitude and latitude of vehicles or stores, the member is the vehicle/store ID and the member's score should carry the longitude and latitude. But a score is a single floating-point number, so the longitude/latitude pair has to be encoded into one value (that is, GeoHash encoding).

Brief steps:

Step 1: Repeatedly bisect the longitude range and the latitude range (which yields a binary-tree structure), encode each choice as 0 or 1, and store the result in N bits (the larger N is, the higher the precision);

Step 2: Interleave the N bits of longitude with the N bits of latitude to obtain the GeoHash value.

The basic principle of GeoHash encoding is "bisect the interval, encode the interval": first encode longitude and latitude separately, then combine the two codes into a final code. Put simply, GeoHash divides space into small cells, and we can query the 4 or 8 cells surrounding the cell of a given longitude and latitude to "look around". [Similar GeoHash values do not necessarily mean adjacent locations, and two points on either side of a cell boundary can be close yet have different codes, so the neighboring cells must also be computed to improve LBS accuracy.]
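
The two steps above can be sketched as follows, using the standard base32 Geohash string form (the form GEOHASH returns). The function name and the `precision` parameter are hypothetical. Internally Redis keeps the interleaved bits as an integer zset score rather than as a string, so this shows the encoding idea, not Redis's storage format.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(latitude, longitude, precision=11):
    """Encode a point as a standard Geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    use_lon = True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if longitude >= mid:      # upper half of the interval -> 1
                bits.append(1)
                lon_lo = mid
            else:                     # lower half of the interval -> 0
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if latitude >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        use_lon = not use_lon

    # Pack every 5 interleaved bits into one base32 character.
    chars = []
    for i in range(0, len(bits), 5):
        value = 0
        for bit in bits[i:i + 5]:
            value = (value << 1) | bit
        chars.append(BASE32[value])
    return "".join(chars)


print(geohash_encode(22.61667, 114.06667, 5))  # Shenzhen
print(geohash_encode(26.08333, 119.30000, 5))  # Fuzhou
```

Each extra character adds 5 bits, so a longer hash describes a smaller cell, and hashes sharing a prefix lie within the same larger cell.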

(Figure: GeoHash encoding rules)

Notes:

In project development you will often see a Redis instance set aside just for longitude/latitude calculations. If the data volume exceeds 100 million entries, the Geo data needs to be split by country/province/city, or even by district, to reduce the size of any single zset.

[Reason: a map application may hold millions of entries. If you use Geo, all the position information is placed in one zset. In a Redis Cluster environment, a key may be migrated from one node to another, and if a single key holds too much data, the migration can stall and affect the normal operation of online services. It is therefore recommended to deploy Geo on a separate Redis instance.]

Example:

# Add locations
127.0.0.1:6379> geoadd city 114.06667 22.61667 "shenzhen" 119.30000 26.08333 "fuzhou"
(integer) 2

# The key's type is zset
127.0.0.1:6379> TYPE city
zset

# Standard Geohash values; can be used at http://geohash.org/
127.0.0.1:6379> geohash city shenzhen fuzhou
1) "ws10ethzdh0"
2) "wssu6srd7k0"

# Get the longitude and latitude of members
127.0.0.1:6379> geopos city shenzhen fuzhou
1) 1) "114.06667023897171021"
   2) "22.61666928352524764"
2) 1) "119.29999798536300659"
   2) "26.08332883679719316"

# Calculate the distance
127.0.0.1:6379> geodist city shenzhen fuzhou km
"655.5342"

# Query members within a radius
127.0.0.1:6379> georadius city 116 15 1000 km
1) "shenzhen"
127.0.0.1:6379> georadius city 116 15 2000 km
1) "shenzhen"
2) "fuzhou"
127.0.0.1:6379> georadius city 116 15 2000 km asc
1) "shenzhen"
2) "fuzhou"

Copyright notice
This article was written by [Star sink]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/01/202201281618543870.html