
LeEco Group Payment System: Architecture for Processing 100,000 High-Concurrency Orders per Second

2022-06-22 16:42:00 MarshalEagle


With each round of LeEco hardware flash sales, LeEco's group payment system faces demand that spikes to 100 or even 1000 times the normal load. As the final step of a purchase, it is especially important that users can complete payment quickly and reliably. So in November 2015 we comprehensively upgraded the architecture of the whole payment system, enabling it to stably process 100,000 orders per second and providing strong support for the various flash-sale and seckill activities across the LeEco ecosystem.

1. Database and Table Sharding

In an internet era where caching systems such as Redis and Memcached are everywhere, building a system that supports 100,000 reads per second is not complicated: extend cache nodes with consistent hashing, scale out the web servers horizontally, and so on. A payment system that processes 100,000 orders per second, however, needs hundreds of thousands of database updates per second (inserts plus updates), which is an impossible task for any single standalone database. So the first thing we did was shard the order table (hereafter "order") across multiple databases and tables.

Database operations usually carry a user ID (uid for short) field, so we chose uid as the sharding key.

For the database sharding we chose a "binary tree" strategy: every time we scale out, we double the number of databases, e.g. from 1 to 2, from 2 to 4, from 4 to 8, and so on. The advantage is that when scaling out, the DBA only has to perform table-level data synchronization and never has to write scripts for row-level data migration.

Splitting into databases alone is not enough. Continuous stress testing showed that, within the same database, concurrently updating multiple tables is far more efficient than concurrently updating a single table, so in each database we further split the order table into 10 tables: order_0, order_1, ..., order_9.

In the end we deployed the order table across 8 databases (numbered 1 to 8, corresponding to DB1 to DB8), each holding 10 tables (numbered 0 to 9, corresponding to order_0 to order_9). The deployment structure is shown in the following figure:

[Figure: deployment of the order table across 8 databases (DB1 to DB8) with 10 tables each (order_0 to order_9)]

Computing the database number from uid:

Database number = (uid / 10) % 8 + 1

Computing the table number from uid:

Table number = uid % 10

For uid=9527, the algorithm above effectively splits the uid into two parts, 952 and 7: 952 mod 8 plus 1 equals 1, which is the database number, and 7 is the table number. So the order data for uid=9527 is found in table order_7 of database DB1. The algorithm flow is shown in the following figure:

[Figure: routing flow from uid to database number and table number]
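To make the routing concrete, here is a minimal Java sketch of the calculation above; the class and method names are ours for illustration, and the division is integer division:

```java
public final class OrderShardRouter {

    /** Database number in [1, 8]: (uid / 10) % 8 + 1, using integer division. */
    static int databaseNo(long uid) {
        return (int) ((uid / 10) % 8) + 1;
    }

    /** Table number in [0, 9]: uid % 10. */
    static int tableNo(long uid) {
        return (int) (uid % 10);
    }

    public static void main(String[] args) {
        long uid = 9527L;
        // Prints "uid=9527 -> DB1.order_7", matching the worked example above.
        System.out.printf("uid=%d -> DB%d.order_%d%n", uid, databaseNo(uid), tableNo(uid));
    }
}
```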

With the sharding structure and algorithm in place, the last step was to pick a tool to implement the sharding. There are roughly two kinds of sharding tools on the market:

  1. Client-side sharding: the sharding logic runs in the client, which connects directly to the databases.
  2. Sharding middleware: the client connects to the middleware, and the middleware performs the sharding operations.

Both kinds of tools are available on the market and we will not list them here; overall, each has its pros and cons. Because client-side sharding connects directly to the database, it performs about 15% to 20% better than sharding middleware. Middleware-based sharding offers unified management, keeps sharding logic out of the client, gives a cleaner module split, and is easier for DBAs to manage centrally.

We chose client-side sharding because we had already developed and open-sourced a data access framework code-named "Mango". The Mango framework natively supports sharding, and its configuration is very simple.

  • Mango homepage: mango.jfaster.org
  • Mango source code: github.com/jfaster/mango

2. Order ID

Order IDs must be globally unique. The simplest approach is a database sequence, where each operation obtains a globally unique auto-increment ID. But to support 100,000 orders per second we need to generate at least 100,000 order IDs per second, and generating auto-increment IDs through the database clearly cannot meet that requirement, so we had to generate globally unique order IDs purely in memory.

The best-known ID in the Java world is probably the UUID, but UUIDs are too long and contain letters, which makes them unsuitable as order IDs. After repeated comparison and screening, we borrowed from Twitter's Snowflake algorithm to generate globally unique IDs. Below is a simplified structure diagram of the order ID:

[Figure: simplified structure of the order ID: timestamp, machine number, sequence number]

The diagram above has 3 parts:

  1. Timestamp

The timestamp has millisecond granularity; when generating an order ID, System.currentTimeMillis() is used as the timestamp.

  2. Machine number

Each order server is assigned a unique number, and that number is used directly as the machine number when generating an order ID.

  3. Sequence number

When the same server receives several order ID requests within the same millisecond, this sequence number is incremented within the current millisecond and restarts from 0 in the next millisecond. For example, if the same server receives 3 order ID requests in the same millisecond, the sequence numbers of those 3 order IDs will be 0, 1, and 2.
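Putting the three parts together, a minimal Snowflake-style generator might look like the sketch below. The article does not give exact field widths, so the 10-bit machine number, 12-bit sequence, and custom epoch used here are assumptions borrowed from Twitter's original layout:

```java
public class OrderIdGenerator {

    // Assumed field widths (Twitter's original Snowflake layout), not taken from the article.
    private static final long MACHINE_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    private static final long EPOCH = 1420041600000L; // 2015-01-01, an arbitrary custom epoch

    private final long machineId;   // unique number assigned to each order server
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public OrderIdGenerator(long machineId) {
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now == lastTimestamp) {
            // Same millisecond: bump the sequence; on overflow, wait for the next millisecond.
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                while ((now = System.currentTimeMillis()) <= lastTimestamp) { /* spin */ }
            }
        } else {
            sequence = 0L; // new millisecond: the sequence restarts from 0
        }
        lastTimestamp = now;
        // timestamp | machine number | sequence
        return ((now - EPOCH) << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }
}
```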

Combining the 3 parts above lets us generate globally unique order IDs quickly. But uniqueness alone is not enough: much of the time we need to query order information by order ID alone, with no uid available, so we would not know which shard to query. Scanning every table in every database is obviously not an option. Therefore we need to embed the sharding information in the order ID itself. Below is a simplified structure diagram of an order ID carrying sharding information:

[Figure: simplified structure of an order ID with sharding information prepended]

We prepend the sharding information to the globally unique order ID, so that we can quickly locate the corresponding order using only the order ID.

What exactly does the sharding information contain? As discussed in the first section, we split the order table by uid into 8 databases with 10 tables each. The simplest sharding information therefore needs only a string of length 2: the first character stores the database number (1 to 8) and the second stores the table number (0 to 9).

Using the algorithm from the first section, for uid=9527 the database part is 1 and the table part is 7; combined, the two-character sharding information is "17". The algorithm flow is shown in the following figure:

[Figure: flow for computing the two-character sharding information from uid]

Using the table number as the table part is fine, but using the real database number as the database part hides a danger. Considering future capacity expansion, we may need to grow from 8 databases to 16; at that point a database part with a value range of 1 to 8 cannot cover 1 to 16, and database routing cannot complete correctly. We call this problem the loss of database-part precision.

To solve it, we make the database part precision-redundant, i.e. the sharding information we store now must already support future expansion. Assuming we will eventually expand to 64 databases, the new database-part algorithm is:

Database part = (uid / 10) % 64 + 1

For uid=9527 the new algorithm gives a database part of 57. This 57 is not a real database number; it redundantly carries enough precision for up to 64 databases. Since we currently have only 8 databases, the actual database number is computed as:

Actual database number = (database part - 1) % 8 + 1

For uid=9527: database part = 57, actual database number = 1, sharding information = "577".

Because we chose modulo 64 for the precision-redundant database part, its stored length grows from 1 to 2 characters, so the total sharding information length becomes 3. The algorithm flow is shown in the following figure:

[Figure: flow for computing the precision-redundant sharding information]

As shown above, computing the database part modulo 64 makes the sharding information precision-redundant, so when the system later expands to 16, 32, or 64 databases, routing will still work without any changes.
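A minimal Java sketch of the precision-redundant encoding, matching the worked example above (the method names and the zero-padded string layout are our assumptions; the article does not show how the prefix is actually packed into the order ID):

```java
public final class ShardInfo {

    /** Database part with redundancy for up to 64 databases: (uid / 10) % 64 + 1. */
    static int databasePart(long uid) {
        return (int) ((uid / 10) % 64) + 1;
    }

    /** Table part: uid % 10. */
    static int tablePart(long uid) {
        return (int) (uid % 10);
    }

    /** Three-character prefix stored at the head of the order ID, e.g. "577" for uid=9527. */
    static String prefix(long uid) {
        return String.format("%02d%d", databasePart(uid), tablePart(uid));
    }

    /** Map the redundant database part back onto the 8 databases we run today. */
    static int actualDatabaseNo(int databasePart) {
        return (databasePart - 1) % 8 + 1;
    }

    public static void main(String[] args) {
        long uid = 9527L;
        // Prints "prefix=577, actual DB=1", matching the example in the text.
        System.out.printf("prefix=%s, actual DB=%d%n", prefix(uid), actualDatabaseNo(databasePart(uid)));
    }
}
```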

The order ID structure above satisfies our current and future expansion needs, but given business uncertainty we also reserve 1 digit at the very front of the order ID to identify the order ID version. This version field is redundant and is not used at present. Below is the final simplified structure of the order ID:

[Figure: final simplified structure of the order ID, with a leading version field]

Snowflake algorithm: github.com/twitter/snowflake

3. Eventual Consistency

So far, by sharding the order table along the uid dimension, we can write and update the order table under very high concurrency and query order information by uid or by order ID. But as an open group payment system, we also need to query orders by business line ID (also called merchant ID, bid for short). So we introduced a bid-dimension order table cluster and redundantly copy the uid-dimension order data into it; when querying orders by bid, we simply query the bid-dimension cluster.

The scheme above is simple, but keeping the two order table clusters consistent is troublesome. The two clusters obviously live in different database clusters, and introducing strongly consistent distributed transactions on writes and updates would reduce system throughput and increase response time, which we cannot accept. So we introduced a message queue for asynchronous data synchronization, achieving eventual consistency. Of course, various message-queue failures can still cause inconsistency, so we also added a real-time monitoring service that computes the difference between the two clusters in real time and performs consistency repair.

Below is a simplified diagram of the consistency synchronization:

[Figure: simplified consistency synchronization between the uid-dimension and bid-dimension order clusters]
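To make the asynchronous path in the diagram concrete, here is a heavily simplified Java sketch; the message-queue client and DAO interfaces are placeholders, since the article does not name a specific MQ product or show its schema:

```java
import java.util.function.Consumer;

// Illustrative sketch only: commit to the uid-dimension cluster, then replicate
// asynchronously into the bid-dimension cluster via a message queue.
public class OrderReplicationSketch {

    /** Placeholder for whatever message-queue client is used in production. */
    interface MessageQueue {
        void publish(String topic, String payload);
        void subscribe(String topic, Consumer<String> handler);
    }

    /** Placeholder DAOs for the two order-table clusters. */
    interface UidOrderDao { void insert(String orderJson); }
    interface BidOrderDao { void upsert(String orderJson); }

    private final MessageQueue mq;
    private final UidOrderDao uidCluster;

    OrderReplicationSketch(MessageQueue mq, UidOrderDao uidCluster, BidOrderDao bidCluster) {
        this.mq = mq;
        this.uidCluster = uidCluster;
        // Consumer side: replay every order event into the bid-dimension cluster.
        // Upsert keeps the handler idempotent, so redelivered messages do no harm.
        mq.subscribe("order-events", bidCluster::upsert);
    }

    /** Write path: synchronous, authoritative write first, then an asynchronous event. */
    void createOrder(String orderJson) {
        uidCluster.insert(orderJson);
        mq.publish("order-events", orderJson);
    }
}
```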

4. Database High Availability

No machine or service can guarantee failure-free operation online. Suppose that at some moment the primary node of one of the databases goes down: we can then neither read from nor write to that database, and online services are affected.

Database high availability means that when a database runs into trouble for whatever reason, the database service can be restored and the data repaired in real time or very quickly, so that from the perspective of the whole cluster it looks as if nothing happened. Note that restoring the database service does not necessarily mean repairing the original database; it also includes switching the service over to a standby database.

The main work of database high availability is service recovery and data repair, and we generally measure availability by how long these two tasks take. There is a vicious circle here: the longer the database takes to recover, the more data becomes inconsistent, the longer data repair takes, and the longer the overall recovery becomes. Fast database recovery is therefore the top priority: imagine that if we could restore the database service within 1 second of a failure, the amount of inconsistent data and the cost of fixing it would also drop dramatically.

The figure below shows a classic primary-replica structure:

[Figure: classic primary-replica structure with one web server, primary DB1, and replicas DB2 and DB3]

The figure shows 1 web server and 3 databases, where DB1 is the primary and DB2 and DB3 are replicas. Assume here that the web server is maintained by the project team and the database servers by the DBA.

When replica DB2 fails, the DBA notifies the project team, which removes DB2 from the web service's configuration and restarts the web server; the failed node DB2 is no longer accessed and the database service as a whole is restored. Once the DBA has repaired DB2, the project team adds it back to the web service.

When primary DB1 fails, the DBA promotes DB2 to primary and notifies the project team, which replaces the original primary DB1 with DB2 in the configuration and restarts the web server; the web service now uses the new primary DB2, DB1 is no longer accessed, and the database service is restored. Once the DBA has repaired DB1, it becomes a replica of DB2.

This classic structure has a serious drawback: whether the primary or a replica fails, the DBA and the project team must cooperate to restore the database service, which is hard to automate and recovers far too slowly.

We believe database operations should be decoupled from the project team: when a database fails, the DBA should be able to restore it uniformly without the project team having to touch the service. This is easy to automate and shortens recovery time.

Let's first look at the high-availability structure for the replicas:

[Figure: replica high-availability structure with LVS between the web server and the replicas]

As shown above, the web server no longer connects directly to replicas DB2 and DB3; instead it connects to an LVS load balancer, which connects to the replicas. The benefit is that LVS automatically detects whether a replica is available: after replica DB2 goes down, LVS stops sending read requests to it. And when the DBA needs to add or remove replica nodes, they only operate on LVS; the project team no longer has to update configuration files and restart servers to cooperate.

Now let's look at the high-availability structure for the primary:

[Figure: primary high-availability structure with a keepalived virtual IP in front of DB1 and DB_bak]

As shown above, the web server no longer connects directly to primary DB1; instead it connects to a virtual IP managed by keepalived, which is mapped to DB1. We also add a DB_bak replica that synchronizes DB1's data in real time. Under normal circumstances the web servers still read and write DB1; when DB1 goes down, a script automatically promotes DB_bak to primary and remaps the virtual IP to it, so the web service reads and writes the healthy DB_bak as the new primary. The whole primary failover takes only a few seconds.
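For reference only, a rough keepalived configuration for the DB_bak host might look like the sketch below; the interface name, virtual IP, priority values, and promotion script path are placeholders, and the article does not show its actual scripts:

```
vrrp_instance DB_VIP {
    state BACKUP              # DB_bak starts as the standby holder of the virtual IP
    interface eth0
    virtual_router_id 51
    priority 90               # lower than the priority configured on DB1
    advert_int 1
    virtual_ipaddress {
        192.168.10.100        # the virtual IP the web servers connect to
    }
    # When DB1 stops advertising and this node takes over the virtual IP,
    # run a script that promotes the local database instance to primary.
    notify_master "/etc/keepalived/promote_to_primary.sh"
}
```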

Combining the two structures above gives the full primary-replica high-availability diagram:

[Figure: combined primary-replica high-availability structure]

Database high availability also covers data repair. Because we write a log before updating whenever we operate on core data, and because database service recovery is near real time, the amount of data that needs repair is small, and a simple recovery script can fix it quickly.

5. Data Tiering

Besides the core payment order table and payment flow table, the payment system also has configuration tables and user-related tables. If every read went to the database, system performance would suffer greatly, so we introduced a data tiering mechanism.

We simply divide the payment system's data into 3 tiers:

Tier 1: order data and payment flow data. These require high real-time accuracy, so no cache is added; reads and writes go directly to the database.

Tier 2: user-related data. This data is read far more often than it is written, so we cache it in Redis.

Tier 3: payment configuration information. It is not user-specific, small in volume, read frequently, and almost never modified, so we cache it in local memory.

Using a local in-memory cache raises a data synchronization problem: the configuration is cached in memory, and local memory cannot perceive modifications to the configuration in the database, so the data in the database and the data in local memory can become inconsistent.

To solve this, we developed a highly available message push platform. When configuration is modified, the push platform pushes a configuration-update message to every server of the payment system; each server updates its configuration on receiving the message and reports success.
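A minimal Java sketch of the tier-3 local-memory configuration cache described above; the push platform client, DAO, and message format are placeholders, and only the refresh-on-push idea comes from the text:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LocalConfigCache {

    /** Placeholder for loading the full configuration table from the database. */
    interface ConfigDao {
        Map<String, String> loadAll();
    }

    private final ConfigDao dao;
    private volatile Map<String, String> cache = new ConcurrentHashMap<>();

    public LocalConfigCache(ConfigDao dao) {
        this.dao = dao;
        refresh(); // warm the cache at startup
    }

    /** Reads are served purely from local memory, never from the database. */
    public String get(String key) {
        return cache.get(key);
    }

    /**
     * Invoked by the push platform's client when a configuration-update message
     * arrives; reloads from the database and returns success so the server can
     * acknowledge the push.
     */
    public boolean onConfigUpdateMessage() {
        refresh();
        return true;
    }

    private void refresh() {
        cache = new ConcurrentHashMap<>(dao.loadAll());
    }
}
```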

6. Thick and Thin Pipes

Hacker attacks, front-end retries, and other causes can make the request volume surge. If our service is overwhelmed by such a surge, recovering it is a very painful and tedious process.

A simple example: our current order processing capacity averages 100,000 orders per second with a peak of 140,000 per second. If 1,000,000 order requests hit the payment system at once, the whole payment system will undoubtedly collapse; the unrelenting stream of requests will keep our service cluster from even starting up, and the only way out is to cut off all traffic, restart the entire cluster, and then slowly let traffic back in.

Adding a layer of "thick and thin pipes" in front of the web servers solves this problem nicely.

Below is a simple structure diagram of the thick and thin pipes:

[Figure: simplified structure of the thick and thin pipes in front of the web cluster]

As the diagram shows, HTTP requests pass through the thick-and-thin pipe layer before entering the web cluster. The inlet is thick: we configure it to accept at most 1,000,000 requests per second, and any excess requests are dropped immediately. The outlet is thin: we configure it to release 100,000 requests per second to the web cluster. The remaining 900,000 requests queue inside the pipe and flow out only after the web cluster has finished processing older requests. This way the web cluster never handles more than 100,000 requests per second; under that load every service in the cluster runs efficiently, and the cluster never stops serving because of ever-growing request volume.

How do we implement the thick and thin pipes? The nginx commercial edition already supports this; see the documentation for nginx max_conns. Note that max_conns limits the number of active connections, so besides the target maximum TPS you also need to know the average response time to set it correctly.
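As a rough sketch of what such a configuration might look like (server addresses and numbers are placeholders; max_conns is available in open-source nginx since 1.11.5, while the queue directive that holds excess requests is an NGINX Plus feature):

```nginx
upstream payment_backend {
    # Cap the active connections each web node will receive.
    server 10.0.0.1:8080 max_conns=2500;
    server 10.0.0.2:8080 max_conns=2500;

    # NGINX Plus only: queue requests that exceed max_conns instead of
    # rejecting them, waiting up to 30 seconds for a free connection.
    queue 100000 timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://payment_backend;
    }
}
```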


Copyright notice
This article was posted by MarshalEagle. Please include a link to the original when reposting.
https://yzsam.com/2022/173/202206221525109865.html