
Practice of dynamic load balancing based on open source TARS

2022-06-24 12:20:00 【2020labs assistant】

1. Background

In its microservices practice, vivo chose the TARS microservice framework for some of its Internet businesses after weighing a number of factors.

The official description reads: TARS is a microservice framework that supports multiple languages, has built-in service governance, and works well with DevOps. On top of the open source version we have done a great deal of work to fit it into our internal systems, for example integrating it with our CI/CD build and release system and our single sign-on system, but that is not the focus here. This article focuses on the dynamic load balancing algorithm we implemented in addition to the existing load balancing algorithms.

2. What is load balancing

Wikipedia defines it as follows: load balancing is a computing technique used to distribute load across multiple computers (a computer cluster), network connections, CPUs, disk drives, or other resources, with the aim of optimizing resource use, maximizing throughput, minimizing response time, and avoiding overload. Using multiple server components with load balancing in place of a single component improves reliability through redundancy. Load balancing is usually provided by dedicated software and hardware. Its main job is to distribute a large number of jobs sensibly across multiple processing units, which addresses the high-concurrency and high-availability problems of Internet architectures.

In short, it is a way of deciding how traffic is allocated when a distributed service handles a large number of concurrent requests.

3. Which load balancing algorithms TARS supports

TARS supports three load balancing algorithms: round robin, weighted round robin, and consistent hashing. The entry point is the function selectAdapterProxy in the TarsCpp code; interested readers can start from that function to learn more.

3.1 Round-robin load balancing

Round-robin load balancing is very simple to implement. All available IPs of a service form a call list. Incoming requests are assigned to the machines in that list one by one in order of arrival; once the last node in the list has been assigned a request, the cycle starts again from the first node. This spreads the traffic, balances the load across machines as far as possible, and improves machine utilization. The algorithm is adequate for a large share of distributed scenarios, and it is the default load balancing algorithm in TARS.

But what if the nodes differ in processing power? Even though traffic is spread evenly, the weaker nodes can still be overloaded. That is what the next algorithm addresses.
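As a minimal sketch of the idea described above (not the TARS implementation), a round-robin selector only needs the node list and a cursor that wraps around. The class name RoundRobinSelector and its interface are hypothetical, chosen for illustration only.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of TARS: hands out nodes in list order
// and wraps around to the first node after the last one.
class RoundRobinSelector {
public:
    explicit RoundRobinSelector(std::vector<std::string> nodes)
        : nodes_(std::move(nodes)) {
        if (nodes_.empty()) {
            throw std::invalid_argument("no available nodes");
        }
    }

    // Returns the node for the next request and advances the cursor.
    const std::string& next() {
        const std::string& node = nodes_[cursor_];
        cursor_ = (cursor_ + 1) % nodes_.size();
        return node;
    }

private:
    std::vector<std::string> nodes_;
    std::size_t cursor_ = 0;
};
```

With three nodes, four consecutive calls to next() return the first, second, and third node and then the first node again. A real client would also need thread safety and a way to refresh the node list, both of which are left out here.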

3.2 Weighted round-robin load balancing

As the name suggests, each node is assigned a fixed weight that represents the share of traffic it should receive. For example, with 5 nodes whose configured weights are 4, 1, 1, 1, 3, if 100 requests arrive they are distributed as 40, 10, 10, 10, 30. Client requests are thus allocated according to the configured weights. One detail matters here: weighted round robin must be smooth. In other words, out of 10 requests, the first 4 must not all go to the first node.

The industry has many smooth weighted round-robin algorithms; interested readers can look them up.
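One widely used variant, the approach popularized by nginx, can be sketched as follows. This is an illustration under assumptions, not the algorithm TARS uses; the class name SmoothWeightedSelector is hypothetical. Each node keeps a running "current" value: on every pick, each node's current value grows by its weight, the node with the largest current value is chosen, and the total weight is subtracted from the chosen node.

```cpp
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not part of TARS: smooth weighted round robin.
class SmoothWeightedSelector {
public:
    explicit SmoothWeightedSelector(std::vector<int> weights)
        : weights_(std::move(weights)),
          current_(weights_.size(), 0),
          total_(std::accumulate(weights_.begin(), weights_.end(), 0)) {
        if (weights_.empty() || total_ <= 0) {
            throw std::invalid_argument("weights must be non-empty with a positive sum");
        }
    }

    // Returns the index of the node that should serve the next request.
    std::size_t next() {
        std::size_t best = 0;
        for (std::size_t i = 0; i < weights_.size(); ++i) {
            current_[i] += weights_[i];           // every node earns its weight
            if (current_[i] > current_[best]) {   // ties go to the earlier node
                best = i;
            }
        }
        current_[best] -= total_;                 // the chosen node pays the total
        return best;
    }

private:
    std::vector<int> weights_;
    std::vector<int> current_;
    int total_;
};
```

With the weights 4, 1, 1, 1, 3 from the example, ten consecutive picks give the nodes 4, 1, 1, 1, and 3 requests respectively, and the picks for the heaviest node are interleaved with the others rather than issued back to back.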

3.3 Consistent hashing

In many business scenarios that use a cache, there is a second requirement besides spreading traffic evenly: requests from the same client should land on the same node as far as possible.

Consider a scenario: a business has 10 million users, and each user has an ID and a set of user information. User IDs map one-to-one to user information; the mapping is stored in the DB, and every other module needs to query it to obtain the user fields it requires. Under heavy concurrency, querying the DB directly would certainly overwhelm it, so a cache is the natural solution. Does every node need to store the full set of user information? It could, but that is not the best plan: what if the user base grows from 10 million to 100 million? As the number of users increases, this approach becomes strained, soon hits bottlenecks, and eventually cannot meet demand. A consistent hashing algorithm is needed to solve this: it guarantees that, for the same input, requests land on the same node as far as possible.

Why only as far as possible? Because nodes may fail and go offline, or be added during capacity expansion, and consistent hashing minimizes the cache rebuilding that such changes cause. TARS uses two hash algorithms: one computes the MD5 of the key and then XORs the result by address offset; the other is ketama hash.
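The "as far as possible" property can be illustrated with a small hash ring. This is a sketch under assumptions and not the TARS code: it uses std::hash instead of MD5 or ketama, and the class name HashRing, the virtual-node count, and the "node#i" naming of virtual nodes are all hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of TARS: a hash ring with virtual nodes.
class HashRing {
public:
    explicit HashRing(int virtualNodes = 100) : virtualNodes_(virtualNodes) {}

    // Places several points for the node on the ring to even out the distribution.
    void addNode(const std::string& node) {
        for (int i = 0; i < virtualNodes_; ++i) {
            ring_[hash_(node + "#" + std::to_string(i))] = node;
        }
    }

    // Removes every point that belongs to the node.
    void removeNode(const std::string& node) {
        for (auto it = ring_.begin(); it != ring_.end();) {
            if (it->second == node) {
                it = ring_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Returns the owner of the first point at or after the key's hash,
    // wrapping around to the start of the ring if necessary.
    const std::string& locate(const std::string& key) const {
        if (ring_.empty()) {
            throw std::runtime_error("ring is empty");
        }
        auto it = ring_.lower_bound(hash_(key));
        if (it == ring_.end()) {
            it = ring_.begin();
        }
        return it->second;
    }

private:
    int virtualNodes_;
    std::hash<std::string> hash_;
    std::map<std::size_t, std::string> ring_;
};
```

When a node is removed from such a ring, only the keys that it owned are remapped; keys owned by the remaining nodes stay where they were, which is why the cached data on those nodes remains valid.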

4. Why dynamic load balancing is needed

Most of our services currently run on virtual machines, so co-located deployment (several services deployed on one node) is common. With co-located deployment, if one service has a bug that consumes a lot of CPU or memory, the services deployed alongside it are affected.

If the three load balancing algorithms above are still used, there is a problem: the affected machines keep receiving traffic according to the fixed rules. One might ask whether weighted round robin could handle this by lowering the weight of the problem nodes so that they receive less traffic. It could, but this approach is often not timely: what if the problem occurs in the middle of the night? The weight also has to be restored manually once the fault is cleared, which adds operations cost. We therefore need a dynamic load balancing algorithm that adjusts traffic distribution automatically and preserves service quality as far as possible in such abnormal situations.

From this it is clear that the core of dynamic load balancing is simply to adjust the weights of the nodes dynamically according to service load. This is also common practice in the industry: server status information is collected periodically and used to calculate the current weight of each server.

5. Dynamic load balancing strategy

We take the same approach: weights for the available nodes are calculated dynamically from several load factors, and the resulting weights are fed into the existing TARS static-weight node selection algorithm. The load factors we chose are: 5-minute average latency of the interface, 5-minute timeout rate of the interface, 5-minute exception rate of the interface, CPU load, memory usage, and network card load. The set of load factors can be extended dynamically.

The overall function diagram is as follows ：

5.1 Overall interaction sequence diagram

During RPC calls, EndpointManager periodically fetches the set of available nodes, each carrying weight information. When the business side initiates a call, a node is selected according to the load balancing algorithm the business side specified;
RegistryServer periodically pulls the timeout rate, average latency, and similar data from the db/monitoring system, and pulls machine load data such as CPU and memory from other platforms (for example the CMDB). All calculation threads run asynchronously and cache their results locally;
EndpointManager executes the selection strategy according to the weights it obtained. The following figure shows how a change in node weights affects the distribution of request traffic:

5.2 Node update and load balancing strategy

All performance data of each node is refreshed every 60 seconds by a timer thread;
The weight value and value range of every node are calculated and kept in an in-memory cache;
After obtaining the node weights, the caller runs the existing static-weight load balancing algorithm to select a node;
Fallback strategy: if all nodes have the same weight, or an exception occurs, round robin is used by default;
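The fallback rule in the last item can be sketched as a small decision function. This is only an illustration: the names Strategy and chooseStrategy are hypothetical, and treating an empty weight list or a non-positive weight as the "abnormal" case is an assumption, since the article does not spell out what counts as abnormal.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical names for illustration; not taken from the TARS code base.
enum class Strategy { RoundRobin, StaticWeight };

// Decides whether the computed weights are usable for weighted selection.
Strategy chooseStrategy(const std::vector<int>& weights) {
    // Assumption: missing weight data counts as abnormal.
    if (weights.empty()) {
        return Strategy::RoundRobin;
    }
    // Assumption: a non-positive weight counts as abnormal.
    const bool anyInvalid = std::any_of(weights.begin(), weights.end(),
                                        [](int w) { return w <= 0; });
    // If every node has the same weight, weighting adds nothing over round robin.
    const bool allEqual = std::all_of(weights.begin(), weights.end(),
                                      [&weights](int w) { return w == weights.front(); });
    return (anyInvalid || allEqual) ? Strategy::RoundRobin : Strategy::StaticWeight;
}
```

Falling back to round robin keeps traffic flowing even when the weight data is unavailable, so a failure in the weight pipeline degrades to the default behaviour rather than to an outage.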

5.3 How the load is calculated

Load calculation method: each load factor has a weight value and an importance level (expressed as a percentage) that can be tuned as needed. The total is the sum of each factor's weight value multiplied by its percentage. For example, if the latency weight is 10 and the timeout-rate weight is 20, with importance levels of 40% and 60% respectively, the total is 10 * 0.4 + 20 * 0.6 = 16. Each load factor is calculated as follows (at present we use only two load factors, average latency and timeout rate, which are also the data most readily available in the current TARS system):

1. Average-latency weight: weights are distributed in inverse proportion to each machine's share of the total latency: weight = initial weight * (sum of latencies - average latency of this machine) / sum of latencies. (The drawback is that traffic is not allocated exactly in proportion to the latency ratio.)
2. Timeout-rate weight: timeout-rate weight = initial weight - timeout rate * initial weight * 90%. The 90% factor is used because even a 100% timeout rate may be caused by excessive traffic, so a small amount of probing traffic is kept.

The corresponding code is implemented as follows ：

void LoadBalanceThread::calculateWeight(LoadCache &loadCache)
{
    for (auto &loadPair : loadCache)
    {
        ostringstream log;
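        const auto ITEM_SIZE(static_cast<int>(loadPair.second.vtBalanceItem.size()));
        if (ITEM_SIZE == 0)
        {
            // No nodes reported for this entry; skip it to avoid dividing by zero
            continue;
        }
        int aveTime(loadPair.second.aveTimeSum / ITEM_SIZE);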
        log << "aveTime: " << aveTime << "|"
            << "vtBalanceItem size: " << ITEM_SIZE << "|";
        for (auto &loadInfo : loadPair.second.vtBalanceItem)
        {
            // Average-latency weight, inversely proportional to each machine's share of the total latency: weight = initial weight * (sum of latencies - average latency of this machine) / sum of latencies
            TLOGDEBUG("loadPair.second.aveTimeSum: " << loadPair.second.aveTimeSum << endl);
            int aveTimeWeight(loadPair.second.aveTimeSum ? (DEFAULT_WEIGHT * ITEM_SIZE * (loadPair.second.aveTimeSum - loadInfo.aveTime) / loadPair.second.aveTimeSum) : 0);
            aveTimeWeight = aveTimeWeight <= 0 ? MIN_WEIGHT : aveTimeWeight;
            // Timeout-rate weight = initial weight - timeout rate * initial weight * 90%. The 90% factor is used because a 100% timeout rate may be caused by excessive traffic, so a small amount of probing traffic is kept
            int timeoutRateWeight(loadInfo.succCount ? (DEFAULT_WEIGHT - static_cast<int>(loadInfo.timeoutCount * TIMEOUT_WEIGHT_FACTOR / (loadInfo.succCount + loadInfo.timeoutCount))) : (loadInfo.timeoutCount ? MIN_WEIGHT : DEFAULT_WEIGHT));
            // Multiply each weight by its proportion and add the results together
            loadInfo.weight = aveTimeWeight * getProportion(TIME_CONSUMING_WEIGHT_PROPORTION) / WEIGHT_PERCENT_UNIT
                              + timeoutRateWeight * getProportion(TIMEOUT_WEIGHT_PROPORTION) / WEIGHT_PERCENT_UNIT ;
 
            log << "aveTimeWeight: " << aveTimeWeight << ", "
                << "timeoutRateWeight: " << timeoutRateWeight << ", "
                << "loadInfo.weight: " << loadInfo.weight << "; ";
        }
 
        TLOGDEBUG(log.str() << "|" << endl);
    }
}
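To make the arithmetic concrete, here is a self-contained sketch that mirrors the listing above, including the multiplication by the node count that the listing applies to the latency weight. The constant values are assumptions chosen for illustration (the article does not give the real values of DEFAULT_WEIGHT, MIN_WEIGHT, TIMEOUT_WEIGHT_FACTOR, or the proportions), and NodeStat and calculateWeights are hypothetical names.

```cpp
#include <vector>

// Hypothetical per-node statistics for illustration.
struct NodeStat {
    int aveTime;       // average latency of this node
    int succCount;     // successful calls in the window
    int timeoutCount;  // timed-out calls in the window
    int weight;        // computed result
};

// Assumed values; the real constants in RegistryServer may differ.
constexpr int kDefaultWeight = 100;      // initial weight
constexpr int kMinWeight = 1;            // floor that keeps probing traffic
constexpr int kTimeoutFactor = 90;       // initial weight * 90%
constexpr int kTimeProportion = 40;      // importance of latency, in percent
constexpr int kTimeoutProportion = 60;   // importance of timeout rate, in percent
constexpr int kPercentUnit = 100;

void calculateWeights(std::vector<NodeStat>& nodes) {
    const int n = static_cast<int>(nodes.size());
    int sum = 0;
    for (const auto& s : nodes) {
        sum += s.aveTime;
    }
    for (auto& s : nodes) {
        // Latency weight: the slower the node, the smaller its weight.
        int timeWeight = sum ? kDefaultWeight * n * (sum - s.aveTime) / sum : 0;
        if (timeWeight <= 0) {
            timeWeight = kMinWeight;
        }
        // Timeout-rate weight: shrink the initial weight by 90% of the timeout rate.
        const int total = s.succCount + s.timeoutCount;
        const int timeoutWeight =
            s.succCount ? kDefaultWeight - s.timeoutCount * kTimeoutFactor / total
                        : (s.timeoutCount ? kMinWeight : kDefaultWeight);
        // Combine both weights according to their importance.
        s.weight = timeWeight * kTimeProportion / kPercentUnit
                 + timeoutWeight * kTimeoutProportion / kPercentUnit;
    }
}
```

Under these assumed constants, a node with an average latency of 10 and no timeouts, placed next to a node with an average latency of 30 and a 50% timeout rate, ends up with weights of 120 and 53 respectively, so the healthy node receives a little over twice the traffic.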

The related code is implemented in RegistryServer; the code files are shown below:

The core implementation is the LoadBalanceThread class. Corrections are welcome.

5.4 Usage

Configure the -w -v parameters in Servant management to enable dynamic load balancing; if they are not configured, it is not enabled.

As shown below:

Note: this must be enabled on all nodes to take effect. Otherwise, when the RPC framework finds that different nodes use different load balancing algorithms, it forces all nodes back to round robin.

6. Scenarios for dynamic load balancing

If your service runs in Docker containers, you may not need dynamic load balancing: you can rely on Docker's scheduling capabilities to scale the service automatically, or deploy at a finer granularity so that each service has a container to itself and services cannot affect one another. If services are co-located and a service's throughput may be affected by other services, for example when one service saturates the CPU, we recommend enabling this feature.

7. Next steps

The current implementation considers only two factors, average latency and timeout rate. These reflect service capacity to a certain extent, but not completely. We therefore plan to add indicators such as CPU usage that better reflect node load, as well as strategies that let the caller adjust weights according to return codes.

Finally, you are welcome to discuss with us and contribute to TARS open source together.