Talent column | Can't use Apache DolphinScheduler? The most complete introductory tutorial, written by an expert over a month
2022-07-23 12:23:00 【DolphinScheduler community】

Author | Ouyang Tao, big data development engineer at Zhaolian Financial
Apache DolphinScheduler (hereinafter "DS") is a distributed, easily extensible, visual DAG workflow task scheduling system. It aims to untangle the complex dependencies that arise in data processing, so that the scheduling system works out of the box in data processing pipelines. DS is an Apache top-level open source project and, like many other open source projects, it is installed and run through scripts, so the scripts are a natural place to start.
The scripts live in the script folder under the project root. They execute in the following order:
1. Look at the startup script start-all.sh. It starts the four most important services by calling dolphinscheduler-daemon.sh start master-server/worker-server/alert-server/api-server.
2. dolphinscheduler-daemon.sh first runs dolphinscheduler-env.sh, which sets up the environment: Hadoop, Spark, Flink, Hive and so on. DS has to schedule tasks of these types, so if their environments are not loaded, a task may be scheduled successfully and still fail to execute.
3. dolphinscheduler-daemon.sh then runs bin/start.sh for each of the four modules above, as shown in the figure below:

As the figure below shows, running dolphinscheduler-daemon.sh start master-server goes to src/main/bin in the master module and runs start.sh there. Opening start.sh shows that it starts a MasterServer. The Worker, Alert and API modules work the same way.
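The script chain described above can be modelled in a few lines of Python. This is only an illustrative sketch of the order of calls, not the content of the real shell scripts. The Step class, the function names and the relative path `<service>/bin/start.sh` are assumptions made for the example; the article only states that each module has its own bin/start.sh.

```python
from dataclasses import dataclass

# The four services the article says start-all.sh launches.
SERVICES = ["master-server", "worker-server", "alert-server", "api-server"]


@dataclass(frozen=True)
class Step:
    script: str  # script that runs at this step
    action: str  # what that script does


def startup_plan(service: str) -> list[Step]:
    """Return the ordered steps for starting one service, as the article describes them."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    return [
        Step("dolphinscheduler-daemon.sh", f"start {service}"),
        Step("dolphinscheduler-env.sh", "load Hadoop/Spark/Flink/Hive environment"),
        # Hypothetical path: the real location of start.sh depends on the module layout.
        Step(f"{service}/bin/start.sh", "launch the module's server class"),
    ]


def start_all() -> list[Step]:
    """Model start-all.sh: invoke the daemon script once per service, in order."""
    plan: list[Step] = []
    for service in SERVICES:
        plan.extend(startup_plan(service))
    return plan


if __name__ == "__main__":
    for step in start_all():
        print(f"{step.script}: {step.action}")
```

Running the module prints three steps per service, which makes it easy to see that the environment script is loaded before every module's own start.sh.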

That covers how the scripts lead into the code. Next, the main purpose of each of the four modules. Master is mainly responsible for splitting the DAG into tasks, submitting and monitoring tasks, and watching the health of the other Masters and Workers. Worker is mainly responsible for executing tasks. Alert provides the alerting service. API handles DS's create, read, update and delete business logic, that is, the project management, resource management, security management and so on that you see in the web UI.
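To make the Master/Worker split concrete, here is a minimal single-process sketch. It is not DolphinScheduler's scheduling code: master_run and worker_execute are hypothetical names, and the standard-library graphlib.TopologicalSorter stands in for whatever DAG handling the real Master performs. It assumes Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def worker_execute(task: str) -> str:
    """Stand-in for a Worker: 'run' the task and report a status."""
    return f"{task}: SUCCESS"


def master_run(dag: dict[str, set[str]]) -> list[str]:
    """Stand-in for a Master.

    `dag` maps each task to the set of tasks it depends on. The master
    splits the DAG into tasks, dispatches each task once its upstream
    tasks have finished, and collects the results.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results: list[str] = []
    while sorter.is_active():
        # sorted() only makes the order deterministic for this example
        for task in sorted(sorter.get_ready()):
            results.append(worker_execute(task))  # "submit" to a worker
            sorter.done(task)  # mark finished so downstream tasks unlock
    return results


if __name__ == "__main__":
    workflow = {
        "extract": set(),
        "clean": {"extract"},
        "load": {"clean"},
        "report": {"clean"},
    }
    for line in master_run(workflow):
        print(line)
```

In a real deployment the dispatch would go over the network to separate Worker processes, and the Master would also track failures and node health, which this sketch leaves out.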
In fact, if you have worked with other big data projects such as Flink, HDFS or HBase, you will notice that the architectures are similar: HDFS uses NameNode and DataNode, HBase uses HMaster and HRegionServer, and Flink uses JobManager and TaskManager. If you know these frameworks, DS will be easier to pick up.
Both Master and Worker are started through Spring Boot, and the objects they create are managed by Spring. If you work with Spring regularly, you will probably find DS easier to understand than other open source projects.
Remarks:
1. The startup scripts also include a python-gateway-server module, which lets you write workflows in Python code. It is outside the scope of this article, so I will skip it for now. If you want to know this module in detail, ask others in the community.
2. The script that starts Alert is the one under the alert-server submodule of the Alert module, because Alert is itself a parent module. I will not cover alert-server here. I believe that once you have followed the execution flow of Master and Worker, the Alert module will not be hard to understand.
3. Newcomers to DS will also notice that the Alert module contains an alert-api module. This alert-api has nothing at all to do with the api-server mentioned earlier. api-server is the script that starts ApiApplicationServer in the api module and is responsible for the business logic of the whole of DS, whereas alert-api holds the SPI plugin interfaces for alerting. Open the alert-api module and you will find that all the code in it is interfaces and definitions, with no logic, so the two are easy to tell apart. Likewise, task-api under the task module plays the same role as alert-api, only for a different function.
4. DS is managed entirely by Spring Boot. If you have not worked with Spring Boot or Spring, you can refer to the sites below and other relevant online material.
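The point in remark 3, that an api module holds only interfaces while the logic lives elsewhere, can be illustrated with a small Python sketch. All names here (AlertChannel, ConsoleAlertChannel, notify) are hypothetical and are not DolphinScheduler's actual SPI, which is written in Java.

```python
from abc import ABC, abstractmethod


# "api" side: interface and definitions only, no logic.
class AlertChannel(ABC):
    """Hypothetical plugin contract that every alert plugin must satisfy."""

    @abstractmethod
    def send(self, title: str, content: str) -> bool:
        """Deliver one alert and return True on success."""


# Plugin side: a concrete implementation, supplied separately.
class ConsoleAlertChannel(AlertChannel):
    def __init__(self) -> None:
        self.sent: list[str] = []  # keeps delivered messages for inspection

    def send(self, title: str, content: str) -> bool:
        self.sent.append(f"[{title}] {content}")
        return True


# Server side: depends only on the interface, never on a concrete plugin.
def notify(channels: list[AlertChannel], title: str, content: str) -> int:
    """Send the alert through every channel and return how many succeeded."""
    return sum(1 for channel in channels if channel.send(title, content))


if __name__ == "__main__":
    console = ConsoleAlertChannel()
    print(notify([console], "task failed", "workflow 1"))
    print(console.sent)
```

Because notify only knows about AlertChannel, a new alert type can be added by writing another subclass without touching the server code, which is the usual motivation for keeping an interface-only module.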
If you want to learn more about the alert module, see the link below or ask others in the community.
https://dolphinscheduler.apache.org/zh-cn/blog/Hangzhou_cisco.html
The Apache DolphinScheduler project's GitHub repository is: https://github.com/apache/dolphinscheduler
In the next chapter, I will introduce the two most important modules in DS, Master and Worker, and how they communicate. Stay tuned.