Implementing a task scheduling system: this one article is enough
2022-06-23 03:05:00 【Yongge Java actual combat sharing】
While reading the article "Choosing a Scheduled-Task Framework", one reader's comment struck me:
I have seen so many so-called tutorials. Most of them teach "how to use tools"; few teach "how to make tools", and even those that teach "how to copy tools" are rare. What China's software industry lacks is programmers who can "make tools"; it has absolutely no shortage of programmers who "use tools"! ...... What the industry needs least is "engineers who can use tool XX"; what it needs is "creative software engineers"! Every job in the industry is, in essence, created by "creative software engineers"!
I wrote this article to discuss task scheduling from top to bottom. I hope that after reading it you will understand the core logic of a task scheduling system.
1 Quartz
Quartz is an open-source Java task scheduling framework, and it is where many Java engineers first encounter task scheduling.
The following figure shows the overall process of task scheduling :
Quartz has three core components.
- Job: a Job represents a scheduled task;
- Trigger: a Trigger defines the scheduling time, that is, the time rule by which the task runs. A Job can be associated with multiple Triggers, but a Trigger can be associated with only one Job;
- Scheduler: a Scheduler is created by a factory class and schedules tasks according to the time rules defined by the triggers.
With the default configuration, Quartz's JobStore is RAMJobStore, so Triggers and Jobs are kept in memory.
The core class that performs task scheduling is QuartzSchedulerThread.
- The scheduling thread fetches the list of due triggers from the JobStore and updates their state;
- It fires each trigger, updates the trigger information (next fire time and trigger state), and stores it;
- Finally, it creates the concrete job execution object and runs the job on the worker thread pool.
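The three steps above can be sketched in plain Java. This is only a minimal sketch under stated assumptions: `SketchTrigger` and `SchedulerLoopSketch` are hypothetical names, the "JobStore" is an in-memory priority queue, and triggers repeat at a fixed interval. It is not how QuartzSchedulerThread is actually written.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Executor;

// Hypothetical stand-in for a trigger; this is not a Quartz class.
class SketchTrigger {
    final String jobName;
    final Runnable job;
    final long intervalMillis;
    long nextFireTime;

    SketchTrigger(String jobName, Runnable job, long firstFireTime, long intervalMillis) {
        this.jobName = jobName;
        this.job = job;
        this.nextFireTime = firstFireTime;
        this.intervalMillis = intervalMillis;
    }
}

// Hypothetical stand-in for one iteration of the scheduling thread.
class SchedulerLoopSketch {
    // In-memory "JobStore": triggers ordered by next fire time.
    private final PriorityQueue<SketchTrigger> jobStore =
            new PriorityQueue<>(Comparator.comparingLong((SketchTrigger t) -> t.nextFireTime));
    private final Executor workerPool;

    SchedulerLoopSketch(Executor workerPool) {
        this.workerPool = workerPool;
    }

    void schedule(SketchTrigger trigger) {
        jobStore.add(trigger);
    }

    // One pass of the loop at time `now`; returns the names of the jobs fired.
    List<String> runOnce(long now) {
        // Step 1: take every due trigger out of the store.
        List<SketchTrigger> due = new ArrayList<>();
        while (!jobStore.isEmpty() && jobStore.peek().nextFireTime <= now) {
            due.add(jobStore.poll());
        }
        List<String> fired = new ArrayList<>();
        for (SketchTrigger trigger : due) {
            // Step 2: fire the trigger: compute the next fire time and store it again.
            trigger.nextFireTime += trigger.intervalMillis;
            jobStore.add(trigger);
            // Step 3: hand the job to the worker pool.
            workerPool.execute(trigger.job);
            fired.add(trigger.jobName);
        }
        return fired;
    }
}
```

In a real scheduler the worker pool would be a thread pool; passing `Runnable::run` as the `Executor` runs jobs inline, which is convenient for experimenting with the loop.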
Next, Quartz's cluster deployment scheme.
Quartz's cluster deployment requires creating the Quartz tables on a database instance, using the script for the database type in use (MySQL, Oracle); the JobStore is JobStoreSupport.
This scheme is decentralized: no node is responsible for central management. Instead, database row-level locks provide concurrency control in the cluster environment.
In cluster mode, a scheduler instance first acquires a row lock in the {0}LOCKS table by issuing a row-locking statement against MySQL.
{0} is replaced with the table prefix from the configuration file, QRTZ_ by default. sched_name is the instance name of the application cluster, and lock_name is the name of the row-level lock. Quartz mainly uses two row-level locks: the trigger access lock (TRIGGER_ACCESS) and the state access lock (STATE_ACCESS).
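To make the placeholder substitution concrete, here is a small sketch that expands such a template with `java.text.MessageFormat`. The template text is written from my memory of the Quartz source and should be treated as an assumption, as should the class name `QuartzLockSqlSketch` and the scheduler name used in the example.

```java
import java.text.MessageFormat;

class QuartzLockSqlSketch {
    // Assumed template: {0} is the table prefix, {1} the quoted scheduler name.
    static final String SELECT_FOR_LOCK =
            "SELECT * FROM {0}LOCKS WHERE SCHED_NAME = {1} AND LOCK_NAME = ? FOR UPDATE";

    // Expands the placeholders; LOCK_NAME stays a JDBC parameter ("?").
    static String expand(String tablePrefix, String schedName) {
        return MessageFormat.format(SELECT_FOR_LOCK, tablePrefix, "'" + schedName + "'");
    }

    public static void main(String[] args) {
        // Prints the statement a node would run before binding TRIGGER_ACCESS or STATE_ACCESS.
        System.out.println(expand("QRTZ_", "MyClusteredScheduler"));
    }
}
```

Because the statement ends in FOR UPDATE, the row stays locked until the surrounding transaction commits or rolls back, which is what serializes the nodes.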
This architecture solves distributed task scheduling: a given task runs on only one node at a time, and the other nodes do not execute it. With many short tasks, however, the nodes compete frequently for the database locks, and the more nodes there are, the worse the performance.
2 Distributed lock mode
Quartz's cluster mode scales horizontally and schedules in a distributed way, but the business side must add the corresponding tables to its database, which is somewhat invasive.
Many developers want to avoid this intrusion, so they have also explored the distributed lock approach.
Business scenario: in an e-commerce project, if a user has not paid within a certain time after placing an order, the system closes the order on timeout.
Typically, a scheduled task runs every two minutes, checks the orders from the previous half hour, queries the unpaid ones, restores the inventory of the goods in those orders, and then marks the orders as invalid.
We use Spring Schedule to implement the scheduled task.
@Scheduled(cron = "0 */2 * * * ?")
public void doTask() {
    log.info("Scheduled task started");
    // Close the orders that timed out without payment
    orderService.closeExpireUnpayOrders();
    log.info("Scheduled task finished");
}
A single server runs this without trouble. For high availability and to cope with growing traffic, however, the architecture evolves into a cluster, where multiple service instances run the same scheduled task at the same moment, which can corrupt business data.
The solution is to take a Redis distributed lock when the task runs.
@Scheduled(cron = "0 */2 * * * ?")
public void doTask() {
    log.info("Scheduled task started");
    String lockName = "closeExpireUnpayOrdersLock";
    RedisLock redisLock = redisClient.getLock(lockName);
    // Try to lock: wait at most 3 seconds; the lock is released automatically after 5 minutes
    boolean locked = redisLock.tryLock(3, 300, TimeUnit.SECONDS);
    if (!locked) {
        log.info("Distributed lock not acquired: {}", lockName);
        return;
    }
    try {
        // Close the orders that timed out without payment
        orderService.closeExpireUnpayOrders();
    } finally {
        redisLock.unlock();
    }
    log.info("Scheduled task finished");
}
Redis has excellent read and write performance, and its distributed locks are more lightweight than Quartz's database row-level locks. The Redis lock can of course be replaced by a ZooKeeper lock; the mechanism is the same.
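The listing depends on two properties of the lock: only one node can hold it at a time, and it expires on its own if the holder never releases it. The sketch below imitates those semantics in memory so they can be tried without a Redis server. `LeaseLockSketch` and its methods are hypothetical and are not the API of any Redis client.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a lock with a lease; this is NOT a Redis client.
class LeaseLockSketch {
    private static final class Lease {
        final String owner;
        final long expiresAt;

        Lease(String owner, long expiresAt) {
            this.owner = owner;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();

    // Mimics "set if absent, with expiry": succeeds only if no unexpired lease exists.
    synchronized boolean tryLock(String name, String owner, long now, long leaseMillis) {
        Lease current = leases.get(name);
        if (current != null && current.expiresAt > now) {
            return false;
        }
        leases.put(name, new Lease(owner, now + leaseMillis));
        return true;
    }

    // Only the current owner may release the lock.
    synchronized boolean unlock(String name, String owner) {
        Lease current = leases.get(name);
        if (current == null || !current.owner.equals(owner)) {
            return false;
        }
        leases.remove(name);
        return true;
    }
}
```

Note the consequence of the lease: if the task runs longer than the lease time, a second node can acquire the lock while the first is still working, so the task body should tolerate being executed twice.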
In small projects, combining a scheduled-task framework (Quartz / Spring Schedule) with a distributed lock (Redis / ZooKeeper) works well.
But this combination has two problems:
- In a distributed scenario the scheduled task sometimes fires without doing any work (on the nodes that fail to get the lock), and tasks cannot be sharded;
- Triggering a task manually requires extra code.
3 ElasticJob-Lite framework
ElasticJob-Lite is positioned as a lightweight, decentralized solution that provides coordination services for distributed tasks in the form of a jar.
The application defines the task class internally by implementing the SimpleJob interface and writing the actual business logic of the task.
public class MyElasticJob implements SimpleJob {
    @Override
    public void execute(ShardingContext context) {
        switch (context.getShardingItem()) {
            case 0:
                // do something by sharding item 0
                break;
            case 1:
                // do something by sharding item 1
                break;
            case 2:
                // do something by sharding item 2
                break;
            // case n: ...
        }
    }
}
For example: application A has five tasks to run, namely A, B, C, D, and E. Task E must be split into four sub-tasks, and the application is deployed on two machines.
After application A starts, the five tasks are coordinated through ZooKeeper and allocated to the two machines, where Quartz Schedulers run the different tasks separately.
In essence, ElasticJob still schedules tasks through Quartz underneath. Compared with a Redis distributed lock or a Quartz cluster deployment, its advantage is that it relies on ZooKeeper, a powerful tool, to distribute tasks through a load-balancing algorithm to the Quartz Scheduler containers inside the application.
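To give a feel for how sharding items might be spread over machines, here is a minimal sketch of an average-allocation rule: each instance gets an equal contiguous block of items, and any remainder is handed out one item at a time starting from the first instance. The class and method names are hypothetical, and ElasticJob's real strategy classes may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ShardAllocationSketch {
    // Assigns sharding items 0..shardingTotalCount-1 to the given instances.
    static Map<String, List<Integer>> allocate(List<String> instances, int shardingTotalCount) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (instances.isEmpty()) {
            return result;
        }
        int perInstance = shardingTotalCount / instances.size();
        // Equal contiguous blocks first.
        for (int i = 0; i < instances.size(); i++) {
            List<Integer> items = new ArrayList<>();
            for (int item = i * perInstance; item < (i + 1) * perInstance; item++) {
                items.add(item);
            }
            result.put(instances.get(i), items);
        }
        // Leftover items go to the first instances, one each.
        int next = perInstance * instances.size();
        for (int i = 0; next < shardingTotalCount; i++, next++) {
            result.get(instances.get(i)).add(next);
        }
        return result;
    }

    public static void main(String[] args) {
        // Task E from the example: four sub-tasks on two machines.
        System.out.println(allocate(Arrays.asList("machine-1", "machine-2"), 4));
    }
}
```

For task E in the example, each of the two machines receives two sharding items; inside `execute`, `context.getShardingItem()` then tells the job which of them it is handling.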
From the user's point of view it is very easy to use. Architecturally, however, the scheduler and the executor still live in the same application-side JVM, and load balancing is still required after the container starts. If the application restarts frequently, the repeated leader elections and re-sharding are relatively heavy operations.
In addition, the ElasticJob console is rather rough: it shows job status by reading registry data, and it changes the global task configuration by updating registry data.
4 The centralized school
The principle of centralization is to split scheduling and task execution into two parts: the dispatch center and the executor. The dispatch center only manages the scheduling attributes of tasks and issues trigger commands. The executor receives the commands and runs the specific business logic. Both can scale out in a distributed way.
4.1 MQ mode
First, the first centralized architecture I came across, in eLong's promotion team.
The dispatch center relies on Quartz cluster mode. When a task is due, it sends a message to RabbitMQ; the business application receives the task message and consumes it.
This model makes full use of MQ decoupling: the dispatch center sends tasks, and the application acts as the executor, receiving and executing them.
But this design depends heavily on the message queue: scalability, functionality, and system load are all closely tied to it. The architecture requires architects who know the message queue very well.
4.2 XXL-JOB
XXL-JOB is a distributed task scheduling platform whose core design goals are rapid development, easy learning, light weight, and easy extension. It is open source, runs in the production product lines of many companies, and works out of the box.
xxl-job 2.3.0 Architecture diagram
Let's focus on the architecture diagram :
▍ Network communication: the server-worker model
The dispatch center and the executor communicate in server-worker mode. The dispatch center itself is a Spring Boot project that listens on port 8080 once started.
After the executor starts, it launches a built-in service (EmbedServer) that listens on port 9994, so the two sides can send commands to each other.
How does the dispatch center learn the executors' addresses? As the figure above shows, each executor sends a registration command periodically, so the dispatch center can obtain the list of online executors.
With the executor list, a node can be chosen to run the task according to the routing strategy configured for the task. There are three common routing strategies:
- Random node execution: pick one available executor node in the cluster to run the task. Typical scenario: offline order settlement.
- Broadcast execution: dispatch the task to every executor node in the cluster. Typical scenario: batch-updating the applications' local caches.
- Sharded execution: split the work by user-defined sharding logic and distribute it to different nodes in the cluster for parallel execution, improving resource utilization. Typical scenario: massive log statistics.
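The three strategies differ only in which executors receive the trigger and what sharding parameters travel with it. A minimal sketch, in which `RoutingSketch`, `Dispatch`, and the addresses are hypothetical and not XXL-JOB classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class RoutingSketch {
    // One trigger request sent to one executor (hypothetical shape).
    static final class Dispatch {
        final String address;
        final int shardIndex;
        final int shardTotal;

        Dispatch(String address, int shardIndex, int shardTotal) {
            this.address = address;
            this.shardIndex = shardIndex;
            this.shardTotal = shardTotal;
        }
    }

    // Random node execution: exactly one executor gets the whole task.
    static List<Dispatch> random(List<String> executors, Random rnd) {
        List<Dispatch> out = new ArrayList<>();
        if (!executors.isEmpty()) {
            out.add(new Dispatch(executors.get(rnd.nextInt(executors.size())), 0, 1));
        }
        return out;
    }

    // Broadcast execution: every executor runs the same, unsplit task.
    static List<Dispatch> broadcast(List<String> executors) {
        List<Dispatch> out = new ArrayList<>();
        for (String address : executors) {
            out.add(new Dispatch(address, 0, 1));
        }
        return out;
    }

    // Sharded execution: every executor runs the task with its own (index, total) pair.
    static List<Dispatch> sharded(List<String> executors) {
        List<Dispatch> out = new ArrayList<>();
        for (int i = 0; i < executors.size(); i++) {
            out.add(new Dispatch(executors.get(i), i, executors.size()));
        }
        return out;
    }
}
```

With sharded execution, the business code typically uses the pair to pick its slice of the data, for example by processing only the records whose id modulo `shardTotal` equals `shardIndex`.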
▍ Scheduler
The scheduler is the core component of a task scheduling system. Early versions of XXL-JOB depended on Quartz.
Version v2.1.0, however, removed the Quartz dependency entirely, and the Quartz tables that previously had to be created were replaced by self-developed tables.
The core scheduling class is JobScheduleHelper. Calling its start method starts two threads: scheduleThread and ringThread.
First, scheduleThread periodically loads the tasks to be scheduled from the database. In essence, a database row lock ensures that only one dispatch center node triggers task scheduling at a time.
Connection conn = XxlJobAdminConfig.getAdminConfig()
        .getDataSource().getConnection();
boolean connAutoCommit = conn.getAutoCommit();
conn.setAutoCommit(false);
PreparedStatement preparedStatement = conn.prepareStatement(
        "select * from xxl_job_lock where lock_name = 'schedule_lock' for update");
preparedStatement.execute();
// Trigger task scheduling (pseudo code)
for (XxlJobInfo jobInfo : scheduleList) {
    // code omitted
}
// Commit the transaction, which releases the row lock
conn.commit();
Depending on each task's next trigger time, the scheduling thread takes different actions:
Overdue tasks that must run immediately go straight into the thread pool to be triggered; tasks due within the next five seconds are put into the ringData object.
After ringThread starts, it periodically takes the list of due tasks from the ringData object and puts them into the thread pool to be triggered.
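The ringData idea can be sketched as a 60-slot ring keyed by the second of the minute. The sketch below is a simplification under stated assumptions: the class name `TimeRingSketch`, the method names, and the one-slot lookback are mine and only follow the description above, not the XXL-JOB source line by line.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TimeRingSketch {
    // Slot (second of the minute, 0-59) -> ids of the jobs to trigger in that second.
    private final Map<Integer, List<Integer>> ringData = new HashMap<>();

    // scheduleThread side: park a job that is due within the next few seconds.
    synchronized void push(long triggerTimeMillis, int jobId) {
        int slot = (int) ((triggerTimeMillis / 1000) % 60);
        ringData.computeIfAbsent(slot, k -> new ArrayList<>()).add(jobId);
    }

    // ringThread side: called about once per second; returns the jobs to trigger now.
    synchronized List<Integer> pollDue(long nowMillis) {
        int nowSecond = (int) ((nowMillis / 1000) % 60);
        List<Integer> due = new ArrayList<>();
        // Also drain the previous slot in case the last tick came late.
        for (int i = 0; i < 2; i++) {
            List<Integer> jobs = ringData.remove((nowSecond + 60 - i) % 60);
            if (jobs != null) {
                due.addAll(jobs);
            }
        }
        return due;
    }
}
```

Because only tasks due within a few seconds are ever pushed, a slot never holds entries from a later lap of the ring, which is why one ring of 60 slots is enough here.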
5 Building our own on the shoulders of giants
In 2018, I built a task scheduling system in-house.
The background: it had to be compatible with the RPC framework developed by the technical team, so that teams did not need to change code. Methods annotated for RPC could be hosted in the task scheduling system and run directly as tasks.
While building it, I studied the XXL-JOB source code and also drew a great deal from Alibaba Cloud's distributed task scheduler, SchedulerX.
SchedulerX 1.0 Architecture diagram
- Schedulerx-console is the task scheduling console, used to create and manage scheduled tasks. It handles data creation, modification, and queries, and interacts with schedulerx-server inside the product.
- Schedulerx-server is the task scheduling server and the core component of SchedulerX. It schedules and triggers client tasks and monitors their execution status.
- Schedulerx-client is the task scheduling client. Each application process that integrates the client is a Worker. The Worker establishes communication with schedulerx-server so that the server can discover the client machine, and it registers the group of the current application with schedulerx-server so that the server can trigger tasks on the client on schedule.
We modeled our modules on SchedulerX. The architecture is designed as follows:
I chose remoting, the communication module in the RocketMQ source code, as the communication framework of the in-house scheduling system, for two reasons:
- I was not familiar with Dubbo, well known as it is in the industry, whereas I had built several wheels on remoting and believed I could handle it;
- While reading the SchedulerX 1.0 client source code, I found that SchedulerX's communication framework resembles RocketMQ Remoting in many places. Its source code is a ready-made engineering implementation, a real treasure.
I stripped the name-service code from RocketMQ's remoting module and customized it to some extent.
In RocketMQ's remoting, the server uses the Processor pattern.
The dispatch center registers two processors: the callback result processor CallBackProcessor and the heartbeat processor HeartBeatProcessor. The executor registers the trigger-task processor TriggerTaskProcessor.
public void registerProcessor(
        int requestCode,
        NettyRequestProcessor processor,
        ExecutorService executor);
The processor interface:
public interface NettyRequestProcessor {
    RemotingCommand processRequest(
            ChannelHandlerContext ctx,
            RemotingCommand request) throws Exception;
    boolean rejectRequest();
}
With this communication framework I do not need to care about communication details; I only implement the processor interface.
The trigger-task processor, TriggerTaskProcessor, is a typical example: the executor registers it to receive trigger commands from the dispatch center.
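The Processor pattern itself is easy to see without Netty: the server keeps a table from request code to handler and looks the handler up for every incoming command. The following is a Netty-free sketch with hypothetical names (`ProcessorTableSketch`, `RequestProcessor`, the request codes, and the string payloads); it is not the RocketMQ remoting API and not the code of my in-house system.

```java
import java.util.HashMap;
import java.util.Map;

class ProcessorTableSketch {
    // Hypothetical request codes.
    static final int TRIGGER_TASK = 1;
    static final int HEART_BEAT = 2;

    // Simplified processor: takes a request body and returns a response body.
    interface RequestProcessor {
        String processRequest(String requestBody);
    }

    private final Map<Integer, RequestProcessor> processorTable = new HashMap<>();

    // Registers the handler for one request code.
    void registerProcessor(int requestCode, RequestProcessor processor) {
        processorTable.put(requestCode, processor);
    }

    // Routes an incoming request to the processor registered for its code.
    String dispatch(int requestCode, String requestBody) {
        RequestProcessor processor = processorTable.get(requestCode);
        if (processor == null) {
            return "REQUEST_CODE_NOT_SUPPORTED";
        }
        return processor.processRequest(requestBody);
    }

    public static void main(String[] args) {
        ProcessorTableSketch executorSide = new ProcessorTableSketch();
        // The executor registers a trigger-task processor that would run the named task.
        executorSide.registerProcessor(TRIGGER_TASK, body -> "triggered:" + body);
        System.out.println(executorSide.dispatch(TRIGGER_TASK, "closeExpireUnpayOrders"));
    }
}
```

In the real framework each processor is also paired with its own thread pool, so that a slow handler for one request code does not block the others.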
With network communication done, how should the scheduler be designed? In the end I chose Quartz cluster mode, mainly for these reasons:
- When the scheduling volume is small, Quartz cluster mode is stable enough, and it is compatible with the existing XXL-JOB tasks;
- I did not have enough hands-on experience with time wheels and worried about problems. Also, triggering tasks through different scheduling services (schedule-server) needs a coordinator, which made me think of ZooKeeper, but that would introduce a new component;
- The development cycle could not be too long; I wanted results quickly.
The in-house scheduling service went live after a month and a half. It ran very stably, and onboarding the R&D teams went smoothly. The scheduling volume was not large: roughly 40 to 50 million dispatches in four months.
Frankly, I could often see the bottleneck of the in-house version in my mind. For large data volumes I could handle sharding databases and tables, but Quartz clustering is built on row-level locks, so its ceiling is bound not to be high.
To settle my doubts, I wrote a wheel demo to see whether it would work:
- Remove the external registry and let the scheduling service (schedule-server) manage sessions;
- Introduce ZooKeeper and coordinate the scheduling services through zk. The HA mechanism is crude, however: effectively one scheduling service runs tasks while another is on standby;
- Replace Quartz with a time wheel (referring to the time wheel source code in Dubbo).
This demo version ran in the development environment, but many details needed optimizing. It was just a toy and never got the chance to run in production.
I recently read an Alibaba Cloud article, "How to implement million-rule alerting through task scheduling". The high-availability architecture of SchedulerX 2.0 is shown in the figure below:
The article mentions:
Each application has three replicas that compete for a lock through zk, giving one master and two standbys. If a Server goes down, failover takes place and another Server takes over the scheduling tasks.
In terms of architecture, the in-house task scheduling system is not complicated. It implemented the core functions of XXL-JOB and was compatible with the technical team's RPC framework, but it did not implement workflow or MapReduce sharding.
After upgrading to 2.0, SchedulerX is based on a new Akka architecture, which is said to implement a high-performance workflow engine, provide inter-process communication, and reduce network communication code.
Among the open-source task scheduling systems I surveyed, PowerJob is also based on the Akka architecture, and it implements both workflow and MapReduce execution modes.
I am very interested in PowerJob and will publish related articles after studying and practicing with it. Stay tuned.
6 Technology selection
First, we put the open-source task scheduling products and the commercial product SchedulerX side by side in a comparison table:
Quartz and ElasticJob essentially belong at the framework level.
Centralized products are clearer architecturally and more flexible at the scheduling level; they can support more complex scheduling (MapReduce dynamic sharding, workflow).
XXL-JOB is minimalist at the product level and works out of the box, and its scheduling modes meet the needs of most R&D teams. It is simple, easy to use, and battle-tested, so it is very popular.
In practice, every technical team has different technical reserves and different scenarios, so technology selection cannot be one-size-fits-all.
Whichever technology you choose, there are two points to note when writing task business code:
- Idempotency. When a task is executed more than once, or when the distributed lock fails, the program must still produce correct results;
- If a task stops running, don't panic. Check the scheduling log, use the jstack command at the JVM level to inspect the thread stacks, and add timeouts to network communication. This usually solves most problems.
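For the idempotency point, a common approach is to make the state change conditional: close an order only if it is still unpaid, and restore inventory only when that change actually happened. Below is a minimal in-memory sketch with hypothetical names; in a database the same idea is usually an UPDATE that includes the expected status in its WHERE clause, followed by a check of the affected row count.

```java
import java.util.HashMap;
import java.util.Map;

class IdempotentCloseSketch {
    enum Status { UNPAID, PAID, CLOSED }

    private final Map<Long, Status> orders = new HashMap<>();
    private final Map<String, Integer> stock = new HashMap<>();

    synchronized void addStock(String sku, int quantity) {
        stock.merge(sku, quantity, Integer::sum);
    }

    // Placing an order reserves inventory.
    synchronized void placeOrder(long orderId, String sku, int quantity) {
        orders.put(orderId, Status.UNPAID);
        stock.merge(sku, -quantity, Integer::sum);
    }

    // Returns true only for the call that actually closed the order.
    synchronized boolean closeIfUnpaid(long orderId, String sku, int quantity) {
        if (orders.get(orderId) != Status.UNPAID) {
            return false; // already closed or paid: repeating the task changes nothing
        }
        orders.put(orderId, Status.CLOSED);
        stock.merge(sku, quantity, Integer::sum); // inventory is restored exactly once
        return true;
    }

    synchronized int stockOf(String sku) {
        return stock.getOrDefault(sku, 0);
    }
}
```

With this shape, two nodes that both run the close-order task for the same order leave the inventory correct, whichever of them runs first.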
7 Final words
2015 was actually a very interesting year: ElasticJob and XXL-JOB, two task scheduling projects from different schools, were both open-sourced.
The XXL-JOB source code still preserves a screenshot of a post by Xu Xueli on OSChina:
Just finished writing the task scheduling framework: tasks are managed dynamically on the Web, changes take effect in real time, fresh out of the oven. If nothing goes wrong, it will be pushed to git.osc at noon tomorrow. Haha, going downstairs for fried noodles with a poached egg to celebrate.
Seeing this screenshot, I feel a kind of empathy, and the corners of my mouth can't help rising.
It also reminds me: in 2016, Zhang Liang, the author of ElasticJob, open-sourced sharding-jdbc. I created a private project on GitHub and, referring to the sharding-jdbc source code, implemented database and table sharding myself. The first class was named ShardingDataSource, and the date is fixed at 2016/3/29.
I don't know how to define a "creative software engineer", but I believe that an engineer who is curious, studies hard, is willing to share, and is willing to help others will surely not have bad luck.