Talent Column | Can't get started with Apache DolphinScheduler? The most complete introductory tutorial, written by an expert over the course of a month

2022-07-23 12:23:00 Dolphin scheduler community

Author | Ouyang Tao, big data development engineer at Zhaolian Financial

Apache DolphinScheduler (hereinafter "DS") is a distributed, easily extensible, visual DAG workflow task scheduling system. It aims to resolve the complex dependencies that arise during data processing, so that the scheduling system works out of the box in data processing pipelines. As an Apache top-level open source project, DolphinScheduler resembles other open source projects in one respect: running and installing it starts with scripts.

The scripts are located in the script folder under the root directory. They execute in the following order:

1. Look at the startup script start-all.sh. It starts the four most important services, namely dolphinscheduler-daemon.sh start master-server/worker-server/alert-server/api-server.

2. dolphinscheduler-daemon.sh first executes dolphinscheduler-env.sh. This script sets up the environment, including the Hadoop, Spark, Flink and Hive environments. DS has to schedule these kinds of tasks, so if their environments are not set up, the tasks cannot execute successfully even when the scheduling itself succeeds.

3. dolphinscheduler-daemon.sh then runs bin/start.sh for each of the four modules above, as shown in the figure below.

As the figure below shows, running dolphinscheduler-daemon.sh start master-server goes to src/main/bin in the master module and executes start.sh. Opening start.sh shows that it starts a MasterServer. The Worker, Alert and API modules work the same way.
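The call chain described above can be modelled as a small sketch. This is only an illustration of the order in which the scripts run, not the content of the real shell scripts; the function name `startup_plan` and the nesting depths are assumptions made for this example, and it assumes dolphinscheduler-daemon.sh is invoked once per service.

```python
# Hypothetical model of the start-up order described in the text.
# It does not run anything; it only lists which script calls which.
SERVICES = ("master-server", "worker-server", "alert-server", "api-server")


def startup_plan(services=SERVICES):
    """Return (depth, step) pairs in the order the scripts are invoked."""
    plan = [(0, "start-all.sh")]
    for service in services:
        # start-all.sh calls the daemon script once per service (assumption).
        plan.append((1, f"dolphinscheduler-daemon.sh start {service}"))
        # The daemon script first loads the Hadoop/Spark/Flink/Hive environment.
        plan.append((2, "dolphinscheduler-env.sh"))
        # Then it hands over to the start.sh of the module itself.
        plan.append((2, f"bin/start.sh of the {service} module"))
    return plan


if __name__ == "__main__":
    for depth, step in startup_plan():
        print("  " * depth + step)
```

Printing the plan gives an indented tree with start-all.sh at the top and, for each of the four services, the environment script followed by the module's start.sh.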

That covers how the scripts lead into the code. Next, let us look at the main purpose of each of these four modules. Master is mainly responsible for splitting the DAG into tasks, submitting and monitoring tasks, and watching the health of the other Masters and Workers. Worker is mainly responsible for executing tasks. Alert is responsible for the alerting service. API is responsible for the create, read, update and delete business logic of DS, that is, the project management, resource management, security management and so on that you see in the web UI.
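To make the Master/Worker split concrete, here is a minimal sketch of dependency-ordered task submission. It is not DS's actual implementation: the function `schedule`, the `run_task` callback standing in for a Worker, and the sample task names are all assumptions made for illustration. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter


def schedule(dag, run_task):
    """Run every task after its upstream tasks; return the execution order.

    dag maps each task name to the set of tasks it depends on.
    run_task plays the Worker role; this function plays the Master role.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    executed = []
    while sorter.is_active():
        # Tasks whose upstream tasks have all finished; sorted for a stable order.
        for task in sorted(sorter.get_ready()):
            run_task(task)      # hand the task to a "Worker"
            executed.append(task)
            sorter.done(task)   # report completion so downstream tasks unlock
    return executed


if __name__ == "__main__":
    # Hypothetical workflow: b and c both depend on a, d depends on b and c.
    workflow = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
    print(schedule(workflow, run_task=lambda name: None))
```

In a real scheduler the ready tasks would be dispatched to Workers in parallel and completion would be reported asynchronously; the sketch runs them one by one to keep the ordering logic visible.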

In fact, if you have worked with other big data projects such as Flink, HDFS and HBase, you will find that their architectures are similar. HDFS has a NameNode and DataNode architecture, HBase has an HMaster and HRegionServer architecture, and Flink has a JobManager and TaskManager architecture. If you have mastered these frameworks, DS will be easier to master.

Both Master and Worker are started through Spring Boot, and the objects they create are managed by Spring. If you work with Spring a lot, I think you will find DS easier to understand than other open source projects.

Notes:

1. The startup scripts also include a python-gateway-server module. This module is for writing workflows in Python code. It is outside the scope of this article, so it is ignored for now. If you want to know this module in detail, ask other members of the community.

2. The script that starts Alert runs the alert-server script under the Alert module. Because Alert is itself a parent module, I will not cover alert-server here. I believe that after seeing the execution flow of Master and Worker, the Alert module should not be hard to understand.

3. In addition, people new to DS will notice that the Alert module contains an alert-api module. This alert-api has nothing to do with the api-server mentioned earlier. api-server is the script that starts ApiApplicationServer in the api module, which handles the business logic of the whole of DS, whereas alert-api holds the SPI plug-in interfaces for alerting. If you open the alert-api module, you will find that all of its code is interfaces and definitions and that it handles no logic, so the two are easy to tell apart. Likewise, task-api under the task module has the same role as alert-api, only for a different function.

4. DS is managed entirely by Spring Boot. If you have not worked with Spring Boot or Spring before, you can refer to the following website and other related online materials.

https://spring.io/quickstart

If you want to learn more about the alert module, refer to the link below or ask other members of the community.

https://dolphinscheduler.apache.org/zh-cn/blog/Hangzhou_cisco.html
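The separation between an interface-only API module and the server that uses it can be sketched as follows. This is a generic plug-in pattern written in Python for brevity, not the real alert-api code (which is Java); `AlertPlugin`, `InMemoryAlertPlugin` and `AlertServer` are hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod


class AlertPlugin(ABC):
    """Plays the role of the API module: a contract only, no logic."""

    @abstractmethod
    def send(self, title: str, content: str) -> bool:
        """Deliver one alert and return True on success."""


class InMemoryAlertPlugin(AlertPlugin):
    """Hypothetical plug-in: records alerts instead of sending them anywhere."""

    def __init__(self):
        self.sent = []

    def send(self, title, content):
        self.sent.append((title, content))
        return True


class AlertServer:
    """Plays the role of the server module: it only knows the interface."""

    def __init__(self):
        self._plugins = {}

    def register(self, name: str, plugin: AlertPlugin) -> None:
        self._plugins[name] = plugin

    def alert(self, name: str, title: str, content: str) -> bool:
        # Dispatch through the interface; the concrete plug-in is unknown here.
        return self._plugins[name].send(title, content)


if __name__ == "__main__":
    server = AlertServer()
    server.register("memory", InMemoryAlertPlugin())
    print(server.alert("memory", "Task failed", "workflow 42, task clean"))
```

Because the server depends only on the abstract interface, a new alert channel can be added by writing another plug-in class, without touching the server code.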

The Apache DolphinScheduler project is hosted at: https://github.com/apache/dolphinscheduler

In the next chapter, the author will introduce the two most important modules of DS, Master and Worker, and how they communicate. Stay tuned.
