Expert column | Still can't use Apache DolphinScheduler? The most complete introductory tutorial, written by an expert over the course of a month
2022-07-23 12:23:00 [DolphinScheduler community]

Author | Ouyang Tao, big data development engineer at Zhaolian Finance
Apache DolphinScheduler (hereinafter "DS") is a distributed, easily extensible, visual DAG workflow task scheduling system. It aims to untangle the complex dependencies that arise in data processing, so that the scheduling system works out of the box in data processing flows. DS is a top-level Apache open source project and, like many other open source projects, running and installing it starts with scripts.
The scripts live in the script folder under the root directory. They execute in the following order:
1. Look at the startup script start-all.sh. It starts the four most important services: dolphinscheduler-daemon.sh start master-server/worker-server/alert-server/api-server
2. dolphinscheduler-daemon.sh first runs dolphinscheduler-env.sh, which sets up the environment, including Hadoop, Spark, Flink, and Hive. DS has to schedule tasks of these kinds, so if their environments are not set up, a task can be scheduled successfully and still fail to execute.
3. dolphinscheduler-daemon.sh then loops over the four modules above and runs each module's bin/start.sh (shown in a figure in the original post).

As the figure in the original post shows, running dolphinscheduler-daemon.sh start master-server goes to src/main/bin in the master module and runs start.sh. Open start.sh and you will find that it starts a MasterServer. The Worker, Alert, and API modules work the same way.
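The startup chain described above can be modeled in a few lines. This is only a sketch in Python, not the project's shell code: it builds the command strings and runs nothing. The service names and script names come from the text; the function names and the exact relative path of each start.sh are assumptions made for illustration.

```python
# Illustrative model only: it builds command strings and never executes anything.
# Service and script names come from the article; the path layout is an assumption.
SERVICES = ["master-server", "worker-server", "alert-server", "api-server"]


def start_all_commands():
    """List the daemon invocations that start-all.sh is described as issuing."""
    return [f"dolphinscheduler-daemon.sh start {service}" for service in SERVICES]


def daemon_steps(service):
    """List what the daemon script is described as doing for one service, in order."""
    return [
        "run dolphinscheduler-env.sh",   # load the Hadoop/Spark/Flink/Hive environment first
        f"run {service}/bin/start.sh",   # then hand over to the module's own start script
    ]
```

Calling start_all_commands() yields one daemon invocation per service, and daemon_steps("master-server") shows that the environment script always comes before the module's start.sh.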

That covers how the scripts launch the code. Next, a closer look at what each of the four modules is for. Master is mainly responsible for splitting the DAG into tasks and for submitting and monitoring those tasks, and it also watches the health of the other Masters and Workers. Worker is mainly responsible for executing tasks. Alert provides the alerting service. API handles DS's create, read, update, and delete business logic, that is, the project management, resource management, security management, and so on that you see in the web UI.
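To make "splitting the DAG into tasks" a little more concrete, here is a toy sketch using Python's standard graphlib module. It is not how DS implements its Master; the workflow, the task names, and the function dispatch_waves are all hypothetical. It only shows the general idea that a scheduler can hand out a task once everything it depends on has finished.

```python
from graphlib import TopologicalSorter


def dispatch_waves(dependencies):
    """Group tasks into waves; every task in a wave has all its upstream tasks done.

    `dependencies` maps each task name to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all finished
        waves.append(ready)
        sorter.done(*ready)                 # pretend the workers completed this wave
    return waves


# Hypothetical workflow: extract -> clean -> (aggregate, report)
workflow = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"clean"},
}
```

With this workflow, dispatch_waves(workflow) produces three waves: extract first, then clean, then aggregate and report together, since both depend only on clean.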
In fact, if you have worked with other big data projects such as Flink, HDFS, or HBase, you will notice that their architectures are similar: HDFS is built on NameNode and DataNode, HBase on HMaster and HRegionServer, and Flink on JobManager and TaskManager. If you already know these frameworks, DS will be easier to master.
Master and Worker are both started through Spring Boot, and the objects they create are managed by Spring. If you work with Spring a lot, I think you will find DS easier to understand than other open source projects.
Notes:
1. The startup scripts also include a python-gateway-server module, which lets you write workflows in Python code. It is outside the scope of this article and is skipped for now. If you want to know this module in detail, ask others in the community.
2. The script that starts Alert is the one under Alert's alert-server submodule. Because Alert is itself a parent module, I will not cover alert-server here. Once you have followed the execution flow of Master and Worker, the Alert module should not be hard to understand.
3. Newcomers to DS will also notice that the Alert module contains an alert-api module. This alert-api has nothing to do with the api-server mentioned earlier. api-server is the script that starts ApiApplicationServer in the api module, which handles the business logic of the whole of DS, whereas alert-api holds the SPI plugin interfaces for alerting. Open the alert-api module and you will find that all the code in it is interfaces and definitions, with no processing logic, so the two are easy to tell apart. Likewise, task-api under the task module plays the same role as alert-api, only for a different function.
4. DS is managed entirely by Spring Boot. If you have not worked with Spring Boot or Spring, you can refer to the following websites and other relevant material online.
If you want to learn more about the alert module, see the link below or ask others in the community.
https://dolphinscheduler.apache.org/zh-cn/blog/Hangzhou_cisco.html
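The point made in the notes about alert-api, that it contains only interfaces and definitions while the logic lives elsewhere, is the usual shape of an SPI. The sketch below illustrates that split in Python. Every name in it (Notifier, ConsoleNotifier, notify) is hypothetical and is not taken from the DS code base, whose real interfaces are written in Java.

```python
from abc import ABC, abstractmethod


class Notifier(ABC):
    """Hypothetical contract: the kind of logic-free definition an api module holds."""

    @abstractmethod
    def send(self, title: str, content: str) -> bool:
        """Deliver one alert and report whether delivery succeeded."""


class ConsoleNotifier(Notifier):
    """Hypothetical plugin: the implementation lives outside the api module."""

    def __init__(self):
        self.sent = []  # keeps delivered messages so the behavior is easy to inspect

    def send(self, title: str, content: str) -> bool:
        self.sent.append(f"[{title}] {content}")
        return True


def notify(channel: Notifier, title: str, content: str) -> bool:
    """Caller code depends only on the contract, never on a concrete plugin."""
    return channel.send(title, content)
```

Because notify only knows about Notifier, a new channel can be added by writing another subclass, without touching the caller. That is the benefit of keeping the api module free of logic.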
The Apache DolphinScheduler project's source repository is at: https://github.com/apache/dolphinscheduler
In the next chapter, I will introduce DS's two most important modules, Master and Worker, and how they communicate. Stay tuned.