
Distributed background task load balancing

2022-06-24 05:23:00 tech

1. Introduction

System services can be broadly divided, by how they run, into interface services (which expose TCP, HTTP, or similar endpoints) and background tasks (one-off or periodically triggered jobs). For interface services, load balancing is a standard way to improve availability and scalability in a distributed deployment; DNS-, NAT-, and ARP-based methods are the common software options.

Figure: Load balancing in the interface scenario

Is there a comparable load balancing method for background tasks in a distributed deployment, one that likewise improves the availability and scalability of the system?

Figure: How to realize load balancing in task scenarios

2. Background task load balancing

A typical application scenario for background tasks is the producer-consumer model. The discussion below assumes that tasks are published through a relational database.

Preemptive

In the preemptive approach, each consumer fetches a task and locks it, which ensures that the task is not consumed again by another consumer.

When tasks are published through a relational database, a status field can be added to each task to prevent repeated consumption. A typical set of state transitions looks like this: during publishing, a task passes through the CREATED, DELETED, and READY states; during consumption, it passes through the RUNNING, SUCCESS, and FAIL states. Only READY tasks can be claimed. A consumer sets the task to RUNNING immediately after claiming it, then sets it to SUCCESS or FAIL according to the result.
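The claim step can be sketched as a conditional update, so that only one consumer succeeds in moving a task from READY to RUNNING. This is a minimal illustration using SQLite; the table name `tasks`, the `owner` column, and the function names are assumptions made for the example, not part of the original design.

```python
import sqlite3

def claim_next_task(conn, consumer):
    """Try to claim one READY task; return its id, or None if nothing was claimed."""
    row = conn.execute(
        "SELECT id FROM tasks WHERE status = 'READY' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # Conditional update: it only matches while the task is still READY, so two
    # consumers that selected the same row cannot both claim it.
    cur = conn.execute(
        "UPDATE tasks SET status = 'RUNNING', owner = ? "
        "WHERE id = ? AND status = 'READY'",
        (consumer, row[0]),
    )
    conn.commit()
    return row[0] if cur.rowcount == 1 else None

def finish_task(conn, task_id, ok):
    """Record the result of a RUNNING task as SUCCESS or FAIL."""
    conn.execute(
        "UPDATE tasks SET status = ? WHERE id = ? AND status = 'RUNNING'",
        ("SUCCESS" if ok else "FAIL", task_id),
    )
    conn.commit()

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, owner TEXT)"
    )
    conn.executemany(
        "INSERT INTO tasks (status) VALUES (?)",
        [("READY",), ("CREATED",), ("READY",)],
    )
    conn.commit()
    first = claim_next_task(conn, "consumer-a")
    second = claim_next_task(conn, "consumer-b")
    third = claim_next_task(conn, "consumer-a")  # no READY task is left
    finish_task(conn, first, ok=True)
    statuses = [r[0] for r in conn.execute("SELECT status FROM tasks ORDER BY id")]
    return first, second, third, statuses
```

A consumer that loses the race simply gets `None` back and can try again on its next polling cycle.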

Negotiation

The preemptive approach requires changing the task itself by adding a status field. The negotiation approach leaves tasks unchanged and instead introduces a registry, then uses consistent hashing to distribute tasks among the consumers.

The following is an example that uses a relational database table as the registry. service_name identifies the consumer type, instance_name identifies each consumer process in the form {host_ip}:{process_id}, expired_time_stamp is the time until which the registration is valid, and created_time_stamp is the time at which the consumer process registered.

Every consumer can read the global registration view and use consistent hashing to determine which tasks belong to it.

| instance_name | expired_time_stamp | created_time_stamp |
| 10.104.63.157/1082 | 2021-08-09 01:22:39 | 2021-08-08 13:22:39 |
| 10.104.63.157/1083 | 2021-08-09 01:22:38 | 2021-08-08 13:22:38 |
| 10.104.63.157/1085 | 2021-08-09 01:22:38 | 2021-08-08 13:22:38 |
| 10.104.63.157/1087 | 2021-08-09 01:22:38 | 2021-08-08 13:22:38 |
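One way such a registry might be maintained is for each consumer to renew its own row periodically and to treat rows whose expired_time_stamp has passed as dead. The sketch below uses SQLite and integer timestamps for simplicity; the table name `registry`, the `heartbeat` and `live_instances` functions, and the 60-second lease are assumptions for illustration only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE registry (
    service_name TEXT NOT NULL,
    instance_name TEXT NOT NULL,
    expired_time_stamp INTEGER NOT NULL,
    created_time_stamp INTEGER NOT NULL,
    PRIMARY KEY (service_name, instance_name)
)
"""

def heartbeat(conn, service_name, instance_name, now, ttl=60):
    """Register the instance, or extend its lease if it is already registered."""
    cur = conn.execute(
        "UPDATE registry SET expired_time_stamp = ? "
        "WHERE service_name = ? AND instance_name = ?",
        (now + ttl, service_name, instance_name),
    )
    if cur.rowcount == 0:
        # First registration: created_time_stamp is set once and never changed
        conn.execute(
            "INSERT INTO registry VALUES (?, ?, ?, ?)",
            (service_name, instance_name, now + ttl, now),
        )
    conn.commit()

def live_instances(conn, service_name, now):
    """Return the instances whose registration has not expired, in sorted order."""
    rows = conn.execute(
        "SELECT instance_name FROM registry "
        "WHERE service_name = ? AND expired_time_stamp > ? "
        "ORDER BY instance_name",
        (service_name, now),
    )
    return [r[0] for r in rows]

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    heartbeat(conn, "task_worker", "10.104.63.157/1082", now=0)
    heartbeat(conn, "task_worker", "10.104.63.157/1083", now=0)
    heartbeat(conn, "task_worker", "10.104.63.157/1082", now=50)  # renew the lease
    return (
        live_instances(conn, "task_worker", now=30),
        live_instances(conn, "task_worker", now=90),
    )
```

Because a failed consumer stops renewing its row, it drops out of the view once its lease runs out, and the remaining consumers pick up its share of the tasks.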

Consistent hashing maps both consumers and pending tasks onto the same hash ring; each pending task belongs to the nearest consumer in the clockwise direction. Compared with ordinary hashing, consistent hashing has good monotonicity: when a new consumer is added, each pending task either stays with its original consumer or moves to the new one, and is never moved between existing consumers.

The following figure shows how data is mapped when a consumer is added and when a consumer fails. Consistent hashing greatly reduces data migration when the set of consumers grows or shrinks.
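The clockwise lookup described above can be sketched in a few lines. This is an illustrative implementation, not the author's code: the class name `HashRing`, the use of MD5 as the ring hash, and the virtual points per consumer (`replicas`, added to even out the split) are all assumptions. The ring is assumed to contain at least one consumer.

```python
import bisect
import hashlib

def ring_hash(key):
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

class HashRing:
    def __init__(self, consumers, replicas=100):
        # Each consumer is placed on the ring at several virtual points
        points = sorted(
            (ring_hash(f"{c}#{i}"), c) for c in consumers for i in range(replicas)
        )
        self._hashes = [h for h, _ in points]
        self._owners = [c for _, c in points]

    def owner(self, task_id):
        """Return the first consumer clockwise from the task's position."""
        idx = bisect.bisect_left(self._hashes, ring_hash(str(task_id)))
        # Wrap around to the first point when the task hashes past the last one
        return self._owners[idx % len(self._owners)]

def my_tasks(task_ids, consumers, me):
    """Filter the pending tasks down to those owned by consumer `me`."""
    ring = HashRing(consumers)
    return [t for t in task_ids if ring.owner(t) == me]
```

Each consumer would build the ring from the same registry view and call `my_tasks` with its own instance_name, so that every task is picked up by exactly one consumer as long as all consumers see the same view.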

3. Extensions and summary

When the methods above are used for load balancing in a distributed task system, two further issues need to be considered.

Exactly-once problem

Both preemptive and negotiated task distribution provide at-least-once semantics. In most cases a task is in fact processed exactly once, but in extreme cases it may still be consumed more than once. One option is to introduce a task lock mechanism to ensure exactly-once processing; another is for the business logic to tolerate repeated consumption, for example by making task handling idempotent.
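A task lock could, for instance, be built on a unique constraint, so that a second consumer that receives the same task skips it. The sketch below is one possible shape under that assumption; the `task_lock` table and the `run_once` function are hypothetical names introduced for the example.

```python
import sqlite3

def run_once(conn, task_id, handler):
    """Run handler(task_id) only if no lock row exists yet for this task."""
    try:
        # The primary key makes this insert fail if the task is already locked
        conn.execute("INSERT INTO task_lock (task_id) VALUES (?)", (task_id,))
        conn.commit()
    except sqlite3.IntegrityError:
        conn.rollback()
        return False
    handler(task_id)
    return True

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_lock (task_id TEXT PRIMARY KEY)")
    done = []
    results = [run_once(conn, t, done.append) for t in ["t1", "t2", "t1"]]
    return results, done
```

Note that this sketch takes the lock before running the handler, so a consumer that crashes mid-task leaves the task locked and unfinished. A real system would also need lock expiry or a retry path, which is why idempotent handling on the business side remains useful.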

Multi-queue problem

Multi-queue means that there are several types of task, each with its own task queue. In the preemptive approach, this can be supported by adding a task_type field to the task. In the negotiation approach, a task_type field can be added to the registry, so that the consumers registered for each task_type form their own hash ring.

| task_type | instance_name | expired_time_stamp | created_time_stamp |
| task_a | 10.104.63.157/1082 | 2021-08-09 01:22:39 | 2021-08-08 13:22:39 |
| task_a | 10.104.63.157/1083 | 2021-08-09 01:22:38 | 2021-08-08 13:22:38 |
| task_b | 10.104.63.157/1085 | 2021-08-09 01:22:38 | 2021-08-08 13:22:38 |
| task_b | 10.104.63.157/1087 | 2021-08-09 01:22:38 | 2021-08-08 13:22:38 |
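With a registry like the one above, the per-type rings can be sketched by grouping the rows by task_type and looking up each task only within its own ring. The function names `build_rings` and `owner`, and the MD5-based ring hash, are assumptions for illustration; virtual points are left out to keep the example short.

```python
import bisect
import hashlib
from collections import defaultdict

def ring_hash(key):
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

def build_rings(registry_rows):
    """Group (task_type, instance_name) rows into one sorted ring per task_type."""
    rings = defaultdict(list)
    for task_type, instance_name in registry_rows:
        rings[task_type].append((ring_hash(instance_name), instance_name))
    return {task_type: sorted(points) for task_type, points in rings.items()}

def owner(rings, task_type, task_id):
    """Return the first consumer clockwise from the task within its type's ring."""
    points = rings[task_type]
    idx = bisect.bisect_left(points, (ring_hash(str(task_id)), ""))
    # Wrap around to the first point when the task hashes past the last one
    return points[idx % len(points)][1]
```

Since each ring only contains the consumers registered for that task_type, adding or removing a task_a consumer has no effect on how task_b tasks are distributed.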

In summary, both the preemptive and negotiation approaches are used in practice. They provide high availability and easy horizontal scaling, require no communication between task processes, and are simple to implement.

| Approach | Implementation mechanism | Multiple machines and processes | Exactly once | Multiple queues |
| Preemptive | State machine | Supported | At least once | Supported by adding task_type to the task |
| Negotiation | Service registration and consistent hashing | Supported | At least once | Supported by adding task_type to the registry |

Original site

Copyright notice
This article was written by [tech]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2021/08/20210814222110952d.html