LeTV Group Payment System Architecture: Processing 100,000 Highly Concurrent Orders per Second
2022-06-22 16:42:00 【MarshalEagle】
With each new round of LeEco hardware flash sales, the LeTV group payment system faces demand pressure 100 or even 1,000 times normal. As the last step of a purchase, it is critical that users can complete payment quickly and reliably. In November 2015 we therefore overhauled the architecture of the entire payment system so that it can stably process 100,000 orders per second, providing solid support for the various flash-sale and seckill campaigns across the LeEco ecosystem.
1. Sharding the Database and Tables
In an Internet era where caching systems such as Redis and Memcached are everywhere, building a system that supports 100,000 reads per second is not complicated: it mostly comes down to scaling cache nodes through consistent hashing and scaling web servers horizontally. A payment system that must process 100,000 orders per second, however, needs hundreds of thousands of database writes per second (inserts plus updates), which no standalone database can handle. So the first thing we did was shard the order table (order for short) across databases and tables.
Database operations usually carry a user ID (uid for short) field, so we chose to shard by uid.
For database sharding we adopted a "binary tree" strategy: every time we expand capacity, we double the number of databases, e.g. from 1 to 2, from 2 to 4, from 4 to 8, and so on. The advantage is that during expansion the DBA only needs to perform table-level data synchronization, and nobody has to write scripts for row-level migration.
Sharding the databases alone was not enough. Continuous stress testing showed that, within one database, concurrently updating several tables is much more efficient than concurrently updating a single table, so in every database shard we further split the order table into 10 tables: order_0, order_1, ..., order_9.
In the end the order table is spread across 8 databases (numbered 1 to 8, corresponding to DB1 to DB8), each holding 10 tables (numbered 0 to 9, corresponding to order_0 to order_9). The deployment structure is shown in the figure below:

Database number from uid:
Database number = (uid / 10) % 8 + 1
Table number from uid:
Table number = uid % 10
When uid = 9527, the algorithm above effectively splits the uid into 952 and 7: 952 mod 8 plus 1 equals 1, which is the database number, and 7 is the table number. So the order information for uid = 9527 is looked up in table order_7 of database DB1. The algorithm flow is shown in the figure below:

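The routing rules above can be expressed in a few lines of Java. The sketch below is purely illustrative; the class and method names are ours and are not part of the Mango framework.

```java
/** Minimal sketch of the uid-based routing rules described above. */
public final class OrderShardRouter {

    private static final int DB_COUNT = 8;     // databases DB1..DB8
    private static final int TABLE_COUNT = 10; // tables order_0..order_9

    /** Database number = (uid / 10) % 8 + 1, range 1..8. */
    public static int databaseNo(long uid) {
        return (int) ((uid / TABLE_COUNT) % DB_COUNT) + 1;
    }

    /** Table number = uid % 10, range 0..9. */
    public static int tableNo(long uid) {
        return (int) (uid % TABLE_COUNT);
    }

    public static void main(String[] args) {
        long uid = 9527L;
        // Prints: uid=9527 -> DB1.order_7
        System.out.printf("uid=%d -> DB%d.order_%d%n", uid, databaseNo(uid), tableNo(uid));
    }
}
```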
With the sharding structure and algorithm settled, the last step was to pick a tool to implement the sharding. Sharding tools on the market fall roughly into two categories:
- Client-side sharding: the shard is resolved in the client, which connects to the databases directly.
- Sharding middleware: the client connects to the middleware, and the middleware performs the shard routing.
Both categories are readily available on the market and will not be enumerated here; each has its pros and cons. Because the client connects to the database directly, client-side sharding performs roughly 15% to 20% better than middleware. Middleware-based sharding, in turn, offers unified management, keeps the sharding logic out of the client, yields a cleaner module split, and is easier for DBAs to administer centrally.
We chose client-side sharding, because we had already developed and open-sourced a data-access-layer framework code-named "Mango". Mango natively supports database and table sharding, and its configuration is very simple.
- Mango homepage: mango.jfaster.org
- Mango source code: github.com/jfaster/mango
2. Order ID
The order ID must be globally unique. The simplest approach is a database sequence: each call returns a globally unique auto-increment ID. But to support 100,000 orders per second we must generate at least 100,000 order IDs per second, which a database-generated auto-increment ID clearly cannot deliver. So we have to compute globally unique order IDs in memory.
The best-known ID in the Java world is probably the UUID, but a UUID is too long and contains letters, so it is not suitable as an order ID. After repeated comparison and screening we borrowed Twitter's Snowflake algorithm to generate globally unique IDs. Below is the simplified structure of an order ID:

The figure above consists of 3 parts:
- Timestamp
The timestamp has millisecond granularity; when generating an order ID we use System.currentTimeMillis() as the timestamp.
- Machine number
Each order server is assigned a unique number, which is used directly as the machine number when generating an order ID.
- Sequence number
When several order-ID requests arrive on the same server within the same millisecond, this sequence number is incremented within that millisecond and restarts from 0 in the next millisecond. For example, if 3 order-ID requests are generated on the same server within the same millisecond, their sequence numbers will be 0, 1 and 2.
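A minimal Snowflake-inspired sketch of the three-part ID follows. The bit widths and field layout here are illustrative assumptions, not the exact production layout; the point is only how timestamp, machine number and per-millisecond sequence are combined.

```java
/**
 * Snowflake-style order ID sketch: timestamp | machine number | sequence.
 * Bit widths are illustrative assumptions, not the production layout.
 */
public class OrderIdGenerator {

    private static final int MACHINE_BITS = 10;   // up to 1024 order servers
    private static final int SEQUENCE_BITS = 12;  // up to 4096 IDs per ms per server
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public OrderIdGenerator(long machineId) {
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now == lastTimestamp) {
            // Same millisecond: bump the sequence; on overflow, wait for the next millisecond.
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                while ((now = System.currentTimeMillis()) <= lastTimestamp) { /* spin */ }
            }
        } else {
            // New millisecond: the sequence restarts from 0.
            sequence = 0;
        }
        lastTimestamp = now;
        return (now << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }
}
```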
Combining the 3 parts above, we can quickly generate globally unique order IDs. But uniqueness alone is not enough: much of the time we query order information by order ID alone, with no uid available, and then we would not know which database and table to query. Scanning every table of every database is obviously not an option, so we need to embed the sharding information into the order ID. Below is the simplified structure of an order ID carrying the sharding information:

When generating the global order ID we prepend the sharding information to its head, so that with nothing but the order ID we can still quickly look up the corresponding order.
What exactly does the sharding information contain? As discussed in the first section, the order table is split by uid into 8 databases with 10 tables each, so the simplest sharding information fits in a string of length 2: the 1st character stores the database number (1 to 8) and the 2nd character stores the table number (0 to 9).
Using the same database/table numbering algorithm as in the first section: when uid = 9527, the database part is 1 and the table part is 7, so the combined two-digit sharding information is "17". The algorithm flow is shown in the figure below:

Using the table number as the table part is fine, but using the database number as the database part hides a trap. Consider future expansion: if we grow from 8 databases to 16, sharding information whose value range is 1 to 8 cannot express databases 1 to 16, and shard routing can no longer be resolved correctly. We call this problem the loss of sharding-information precision.
To avoid this loss, the sharding information must be stored with redundant precision, i.e. the value saved today must already support future expansion. Here we assume that we will eventually scale out to 64 databases, so the new database-part algorithm is:
Database part = (uid / 10) % 64 + 1
When uid = 9527, the new algorithm gives a database part of 57. This 57 is not a real database number; it is the database part with its precision redundantly extended to 64 databases. Since we only have 8 databases today, the actual database number is derived with the following formula:
Actual database number = (database part - 1) % 8 + 1
When uid = 9527, the database part is 57, the actual database number is 1, and the combined sharding information is "577".
Because we store the precision-redundant database part modulo 64, its length grows from 1 character to 2, and the total length of the sharding information becomes 3. The algorithm flow is shown in the figure below:

As shown above, computing the database part modulo 64 makes its precision redundant, so that later expanding the system to 16, 32 or 64 databases causes no problems.
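The precision-redundant prefix can be sketched in Java as follows; the names are ours and only for illustration.

```java
/** Sketch of the 3-character sharding prefix with precision redundancy (modulo 64). */
public final class ShardInfo {

    private static final int MAX_DB_COUNT = 64;     // precision reserved for future expansion
    private static final int CURRENT_DB_COUNT = 8;  // databases actually deployed today
    private static final int TABLE_COUNT = 10;

    /** Database part stored in the order ID: (uid / 10) % 64 + 1, range 1..64. */
    public static int dbPart(long uid) {
        return (int) ((uid / TABLE_COUNT) % MAX_DB_COUNT) + 1;
    }

    /** Actual database number derived from the stored part: (part - 1) % 8 + 1. */
    public static int actualDbNo(int dbPart) {
        return (dbPart - 1) % CURRENT_DB_COUNT + 1;
    }

    /** 3-character prefix: 2 digits of database part followed by 1 digit of table number. */
    public static String prefix(long uid) {
        return String.format("%02d%d", dbPart(uid), uid % TABLE_COUNT);
    }

    public static void main(String[] args) {
        long uid = 9527L;
        // Prints: prefix=577, actual DB=1
        System.out.printf("prefix=%s, actual DB=%d%n", prefix(uid), actualDbNo(dbPart(uid)));
    }
}
```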
The order ID structure above serves our current and future expansion needs well, but given the uncertainty of the business we also prepend 1 character to the order ID that identifies its version. This version character is redundant data for now and is not yet used. Below is the final simplified order ID structure:

Snowflake algorithm: github.com/twitter/snowflake
3. Eventual Consistency
So far, by sharding the order table on the uid dimension, we have achieved extremely high-concurrency writes and updates of the order table and can query order information by uid and order ID. But as an open group payment system we also need to query orders by business-line ID (also called merchant ID, bid for short), so we introduced a bid-dimension order table cluster: the data in the uid-dimension order cluster is redundantly copied into the bid-dimension cluster, and queries by bid simply go to the bid-dimension cluster.
The scheme above is simple, but keeping the two order table clusters consistent is the troublesome part. The two clusters obviously live in different database clusters; introducing strongly consistent distributed transactions for every write and update would reduce system throughput and increase response time, which we cannot accept. Instead we use a message queue for asynchronous data synchronization and settle for eventual consistency. Of course, various message-queue failures can still leave the data inconsistent, so we also built a real-time monitoring service that continuously computes the difference between the two clusters and synchronizes them back into consistency.
Below is a simplified consistency-synchronization diagram:

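As an illustration only (the class names and the in-process queue below are assumptions standing in for the real message queue, not the production implementation), the write path roughly amounts to: commit to the uid-dimension cluster first, then publish a sync message that a consumer applies to the bid-dimension cluster; a reconciliation job covers lost or failed messages.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Toy sketch of eventual consistency between the uid-dimension and bid-dimension
 * order clusters. A BlockingQueue stands in for the real message queue and two
 * maps stand in for the clusters; all names are illustrative.
 */
public class OrderSyncSketch {

    record Order(long orderId, long uid, long bid, long amountInCents) {}

    private final Map<Long, Order> uidCluster = new ConcurrentHashMap<>();
    private final Map<Long, Order> bidCluster = new ConcurrentHashMap<>();
    private final BlockingQueue<Order> syncQueue = new LinkedBlockingQueue<>();

    /** Write path: commit to the uid-dimension cluster, then enqueue a sync message. */
    public void createOrder(Order order) {
        uidCluster.put(order.orderId(), order);  // source of truth
        syncQueue.offer(order);                  // asynchronous propagation
    }

    /** Consumer: applies sync messages to the bid-dimension cluster (idempotent upsert). */
    public void startConsumer() {
        Thread consumer = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Order order = syncQueue.take();
                    bidCluster.put(order.orderId(), order);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }
}
```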
4. Database High Availability
No machine or service can be guaranteed to run online without ever failing. Suppose, for example, that at some moment the primary of one database goes down: that database can no longer be read or written, and the online service is affected.
Database high availability means: when a database runs into trouble for whatever reason, the database service can be restored and the data repaired in real time or very quickly, so that from the perspective of the whole cluster it is as if nothing happened. Note that restoring the database service does not necessarily mean repairing the original database; it also includes switching the service over to a standby database.
The main work of database high availability is service recovery and data patching, and we generally measure high availability by how long these two tasks take. There is a vicious circle here: the longer the database takes to recover, the more data becomes inconsistent, the longer data patching takes, and the longer the overall repair time grows. Fast database recovery is therefore the top priority of database high availability: imagine restoring the database service within 1 second of a failure; the amount of inconsistent data and the cost of repairing it would shrink dramatically as well.
The figure below shows a classic master-slave structure:

The figure shows 1 web server and 3 databases, where DB1 is the primary and DB2 and DB3 are replicas. Assume here that the web servers are maintained by the project team and the database servers by the DBAs.
When replica DB2 fails, the DBA notifies the project team, which removes DB2 from the web service's configuration list and restarts the web servers. The failed node DB2 is then no longer accessed and the database service is fully restored; once the DBA has repaired DB2, the project team adds it back to the web service.
When primary DB1 fails, the DBA promotes DB2 to primary and notifies the project team, which replaces the original primary DB1 with DB2 and restarts the web servers. The web service then uses the new primary DB2, DB1 is no longer accessed, and the database service is restored; once the DBA has repaired DB1, it becomes a replica of DB2.
The classic structure above has a big drawback: whether the primary or a replica fails, the DBA and the project team must cooperate to restore the database service, which is hard to automate, and recovery is far too slow.
We believe database operations should be decoupled from the project team: when a database has a problem, the DBA should restore it in a uniform way without the project team having to touch the service. This is easy to automate and shortens service recovery time.
Let's first look at the high-availability structure for the replicas:

As shown above, the web servers no longer connect to replicas DB2 and DB3 directly; instead they connect to an LVS load balancer, and LVS connects to the replicas. The advantage is that LVS automatically senses whether a replica is available: after replica DB2 goes down, LVS stops sending read requests to it. And when the DBA needs to add or remove replica nodes, they only operate on LVS; the project team no longer has to update configuration files and restart servers to cooperate.
Now the high-availability structure for the primary:

As shown above, the web servers no longer connect to primary DB1 directly; instead they connect to a virtual IP managed by Keepalived, which is mapped onto primary DB1. We also add a DB_bak replica that replicates DB1's data in real time. Normally the web servers still read and write through DB1; when DB1 goes down, a script automatically promotes DB_bak to primary and remaps the virtual IP onto it, so the web service reads and writes through the healthy DB_bak as the new primary. Restoring the primary database service takes only a few seconds.
Combining the two structures above gives the overall master-slave high-availability structure:

Database high availability also includes data patching. Because we log first and then update whenever we operate on core data, and because the database service recovers in near real time, the amount of data to patch is small and a simple recovery script can repair it quickly.
5. Data Tiering
Besides the core payment order table and payment flow table, the payment system also has configuration tables and user-related tables. If every read operation went to the database, system performance would drop sharply, so we introduced a data-tiering mechanism.
We simply divide the payment system's data into 3 tiers:
Tier 1: order data and payment flow data. These require high real-time accuracy, so no cache is added; reads and writes go straight to the database.
Tier 2: user-related data. This data is read often and written rarely, so we cache it in Redis.
Tier 3: payment configuration information. It is not user-specific, small in volume, read frequently and almost never modified, so we cache it in local memory.
Using a local in-memory cache raises a data-synchronization problem: because the configuration is cached in memory, local memory cannot sense changes to the configuration in the database, which would leave the database and local memory inconsistent.
To solve this, we built a highly available message push platform. When configuration is modified, the push platform pushes a configuration-update message to every server of the payment system; on receiving the message, a server refreshes its configuration and reports success.
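A minimal Java sketch of the tier-3 local cache refreshed by push messages is shown below; ConfigRepository and the update hook are hypothetical names used only for illustration, not the actual push-platform API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch of a tier-3 local config cache that is refreshed when the push
 * platform delivers a configuration-update message. Names are illustrative.
 */
public class LocalConfigCache {

    /** Hypothetical stand-in for loading the configuration rows from the database. */
    public interface ConfigRepository {
        Map<String, String> loadAll();
    }

    private final ConfigRepository repository;
    private volatile Map<String, String> cache = Map.of();

    public LocalConfigCache(ConfigRepository repository) {
        this.repository = repository;
        refresh(); // warm the cache at startup
    }

    /** Reads are served entirely from local memory. */
    public String get(String key) {
        return cache.get(key);
    }

    /** Called by the push-platform client when a config-update message arrives. */
    public void onConfigUpdated() {
        refresh();
        // In the real system the server would also report success back to the push platform.
    }

    private void refresh() {
        cache = new ConcurrentHashMap<>(repository.loadAll());
    }
}
```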
6. Thick-and-Thin Pipes
Hacker attacks, front-end retries and other causes can make the request volume spike. If our service were knocked out by such a surge, restoring it would be a painful and tedious process.
A simple example: our current order processing capacity averages 100,000 orders per second, with a peak of 140,000 per second. If 1,000,000 order requests suddenly hit the payment system, the whole system would undoubtedly collapse, and the continuing flood of requests would prevent the service cluster from even starting up again; the only way out would be to cut off all traffic, restart the entire cluster and then slowly let the traffic back in.
Adding a layer of "thick-and-thin pipes" in front of the web servers solves this problem nicely.
Below is a simple structure diagram of the thick-and-thin pipe:

As the diagram shows, HTTP requests pass through the thick-and-thin pipe before entering the web cluster. The entrance is thick: we configure it to accept up to 1,000,000 requests per second, and anything beyond that is dropped outright. The exit is thin: we configure it to release 100,000 requests per second to the web cluster. The remaining 900,000 requests queue inside the pipe; as the web cluster finishes old requests, new ones flow out of the pipe into the cluster. This way the web cluster never handles more than 100,000 requests per second; under that load all services in the cluster run efficiently, and the cluster never stops serving because of ever-growing request volume.
How do you implement a thick-and-thin pipe? The commercial edition of nginx already supports it; search for information on nginx max_conns. Note that max_conns limits the number of active connections, so besides the target maximum TPS you also need to take the average response time into account.
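For illustration only, the same idea can be sketched at the application level with a bounded queue feeding a fixed-size worker pool: the queue capacity plays the role of the thick entrance (overflow is discarded) and the worker count plays the role of the thin exit. This is a toy under those assumptions, not the nginx-based setup described above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/**
 * Toy application-level "thick-and-thin pipe": a bounded queue is the thick
 * entrance (overflow is discarded), a fixed worker pool is the thin exit.
 * The sizes passed in are illustrative, not the production numbers.
 */
public class ThickThinPipe {

    private final ExecutorService workers;

    public ThickThinPipe(int exitWorkers, int entranceCapacity) {
        // Requests beyond the queue capacity are silently dropped, mirroring
        // the pipe discarding redundant requests at the thick end.
        this.workers = new ThreadPoolExecutor(
                exitWorkers, exitWorkers,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(entranceCapacity),
                new ThreadPoolExecutor.DiscardPolicy());
    }

    /** Submit a request; it is processed, queued, or dropped when the queue is full. */
    public void submit(Runnable request) {
        workers.execute(request);
    }

    public void shutdown() {
        workers.shutdown();
    }
}
```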