Detailed explanation of Flink operation architecture
2022-07-24 10:42:00 【InfoQ】
1. Flink program structure

Transformation operators: Map / FlatMap / Filter / KeyBy / Reduce / Fold / Aggregations / Window / WindowAll / Union / Window join / Split / Select

2. Flink parallel data stream
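In a parallel data stream, each operator runs as several parallel subtasks, and each subtask handles one partition of the stream. The sketch below models that idea in plain Python (it is not the Flink API) with a word count: FlatMap and Filter feed a KeyBy step that routes records to subtasks, and each subtask runs a keyed Reduce. The function names and the CRC32-modulo routing are illustrative assumptions; Flink's own key routing is more elaborate.

```python
import zlib


def stable_hash(key):
    """Deterministic hash (Python's built-in str hash varies between runs)."""
    return zlib.crc32(key.encode("utf-8"))


def flat_map(lines):
    """FlatMap: one input line produces zero or more (word, 1) pairs."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)


def key_by(records, key_fn, parallelism):
    """KeyBy: route each record to one of `parallelism` partitions by key."""
    partitions = [[] for _ in range(parallelism)]
    for record in records:
        # Assumption: plain modulo routing, chosen only for illustration.
        index = stable_hash(key_fn(record)) % parallelism
        partitions[index].append(record)
    return partitions


def run_subtask(partition):
    """One parallel subtask: a keyed Reduce (sum) over its own partition."""
    totals = {}
    for word, count in partition:
        totals[word] = totals.get(word, 0) + count
    return totals


def word_count(lines, parallelism=2):
    pairs = flat_map(lines)
    # Filter: keep purely alphabetic words only.
    filtered = (pair for pair in pairs if pair[0].isalpha())
    partitions = key_by(filtered, lambda pair: pair[0], parallelism)
    return [run_subtask(partition) for partition in partitions]


if __name__ == "__main__":
    for index, totals in enumerate(word_count(["to be or not to be", "be 42"], 3)):
        print(f"subtask {index}: {totals}")
```

Because records with the same key always land in the same partition, each subtask can reduce its keys without coordinating with the others.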

3. Task and operator chain
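As a rough illustration of operator chaining, the sketch below merges adjacent operators into a single task when they are connected one-to-one and have the same parallelism. It is a simplified model in plain Python with hypothetical names (`Operator`, `build_chains`, `count_subtasks`); Flink's actual chaining conditions include further checks that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    parallelism: int
    # How the operator receives data from its predecessor:
    # "forward" (one-to-one) or "redistribute" (for example after KeyBy).
    input_edge: str = "forward"


def build_chains(operators):
    """Group a linear pipeline of operators into chains (tasks)."""
    chains = []
    for op in operators:
        can_chain = (
            bool(chains)
            and op.input_edge == "forward"
            and chains[-1][-1].parallelism == op.parallelism
        )
        if can_chain:
            chains[-1].append(op)
        else:
            chains.append([op])
    return chains


def count_subtasks(chains):
    """Each chain runs as `parallelism` subtasks, one thread per subtask."""
    return sum(chain[0].parallelism for chain in chains)


if __name__ == "__main__":
    pipeline = [
        Operator("Source", 2),
        Operator("Map", 2, "forward"),
        Operator("KeyBy/Window", 2, "redistribute"),
        Operator("Sink", 1, "redistribute"),
    ]
    chains = build_chains(pipeline)
    for chain in chains:
        print(" -> ".join(op.name for op in chain))
    print("subtasks:", count_subtasks(chains))
```

Chaining Source and Map into one task avoids a thread handover and buffering between them, which is the usual motivation for the optimization.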

4. Task scheduling and execution

- When Flink runs the executor, it automatically generates a DAG dataflow graph from the program code;
- The ActorSystem creates an Actor, which sends the dataflow graph to the Actor in the JobManager;
- The JobManager continuously receives heartbeats from the TaskManagers, and from them knows which TaskManagers are alive;
- The JobManager uses its scheduler to dispatch tasks to the TaskManagers for execution (in Flink, the smallest scheduling unit is a task, which corresponds to one thread);
- While the program is running, tasks can exchange data with one another.
Job Client:
- Its main responsibility is to submit jobs; after submission, the client process can either exit or wait for the result to be returned;
- The Job Client is not an internal part of Flink program execution, but it is the starting point of job execution;
- The Job Client receives the user's program code, creates the dataflow, and submits it to the JobManager for further execution. After execution completes, the Job Client returns the result to the user.
JobManager:
- Its main responsibility is to schedule work and to coordinate tasks in taking checkpoints;
- A cluster has at least one master, which is responsible for scheduling tasks and for coordinating checkpoints and fault tolerance;
- A high-availability setup can have more than one master, but exactly one of them is the leader and the others are standby;
- The JobManager contains three important components: Actor System, Scheduler, and CheckPoint;
- After receiving a job from the client, the JobManager first generates an optimized execution plan and then schedules it onto the TaskManagers for execution.
TaskManager:
- Its main responsibility is to receive tasks from the JobManager, deploy and start them, and receive and process upstream data;
- A TaskManager is a worker node that executes tasks in one or more threads inside a JVM;
- The number of slots is set when the TaskManager is created, and each slot can execute one task.
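The interplay described above (heartbeats tell the JobManager which TaskManagers are alive, and tasks are then scheduled onto their slots) can be sketched as a toy model. This is plain Python, not Flink code: the class, its methods, the timeout value, and the first-fit placement are all assumptions made for illustration, and time is passed in explicitly to keep the example deterministic.

```python
class JobManager:
    """Toy model: tracks heartbeats and assigns tasks to free slots."""

    def __init__(self, heartbeat_timeout):
        self.heartbeat_timeout = heartbeat_timeout
        self.last_heartbeat = {}  # TaskManager id -> time of last heartbeat
        self.free_slots = {}      # TaskManager id -> number of free slots

    def register(self, tm_id, slots, now):
        self.free_slots[tm_id] = slots
        self.last_heartbeat[tm_id] = now

    def heartbeat(self, tm_id, now):
        self.last_heartbeat[tm_id] = now

    def alive_task_managers(self, now):
        """TaskManagers whose last heartbeat is within the timeout."""
        return sorted(
            tm_id
            for tm_id, seen in self.last_heartbeat.items()
            if now - seen <= self.heartbeat_timeout
        )

    def schedule(self, tasks, now):
        """Place each task in a free slot on a live TaskManager (first fit)."""
        alive = self.alive_task_managers(now)
        capacity = sum(self.free_slots[tm_id] for tm_id in alive)
        if len(tasks) > capacity:
            raise RuntimeError(
                f"need {len(tasks)} slots, only {capacity} free on live TaskManagers"
            )
        assignment = {}
        pending = list(tasks)
        for tm_id in alive:
            while pending and self.free_slots[tm_id] > 0:
                assignment[pending.pop(0)] = tm_id
                self.free_slots[tm_id] -= 1
        return assignment


if __name__ == "__main__":
    jm = JobManager(heartbeat_timeout=10)
    jm.register("tm-1", slots=2, now=0)
    jm.register("tm-2", slots=2, now=0)
    jm.heartbeat("tm-1", now=8)  # tm-2 sends no further heartbeats
    print(jm.alive_task_managers(now=15))
    print(jm.schedule(["source-0", "map-0"], now=15))
```

In this run only tm-1 is still considered alive at time 15, so both tasks land on its two slots.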
5. Task slot and slot sharing

1) Task slot
- The maximum number of tasks a TaskManager can execute concurrently is controllable: here it is 3, because it cannot exceed the number of slots.
- Each slot has its own exclusive memory space, so a single TaskManager can run several different jobs without the jobs affecting one another.
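A minimal sketch of these two properties, assuming a TaskManager that splits its managed memory evenly across its slots and refuses work once every slot is occupied. The class and the 3072 MB figure are hypothetical, and the model uses one task per slot, which is the case before slot sharing is considered.

```python
class TaskManager:
    """Toy model of a TaskManager with a fixed number of slots."""

    def __init__(self, num_slots, managed_memory_mb):
        self.num_slots = num_slots
        # Assumption: managed memory is divided evenly between the slots.
        self.slot_memory_mb = managed_memory_mb / num_slots
        self.slots = [None] * num_slots  # None marks a free slot

    def deploy(self, task_name):
        """Put the task into the first free slot and return the slot index."""
        for index, occupant in enumerate(self.slots):
            if occupant is None:
                self.slots[index] = task_name
                return index
        raise RuntimeError(f"all {self.num_slots} slots are occupied")

    def release(self, index):
        self.slots[index] = None


if __name__ == "__main__":
    tm = TaskManager(num_slots=3, managed_memory_mb=3072)
    print("memory per slot (MB):", tm.slot_memory_mb)
    for name in ["job-a/map-0", "job-b/map-0", "job-b/window-0"]:
        print(name, "-> slot", tm.deploy(name))
```

A fourth `deploy` call would raise an error until a slot is released, which mirrors the rule that concurrency cannot exceed the slot count.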
2) Slot sharing
- You only need to find the highest parallelism in the job to know how many task slots it needs; as long as that many slots are available, the job's other operators can be satisfied as well.
- Resources are distributed more fairly: if there are more free slots, more tasks can be assigned to them. Without slot sharing, the low-load Source/Map subtasks would occupy many resources, while the heavier window subtasks would run short of resources.
- With slot sharing, the base parallelism can be raised from 2 to 6, which improves the utilization of slot resources. It also makes the assignment of subtasks to slots across the TaskManagers fairer.
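The slot arithmetic behind these points can be sketched as follows. The operator names, their parallelisms, and the cluster size of 6 slots (for instance two TaskManagers with three slots each) are assumptions for illustration, and all operators are assumed to sit in one slot sharing group.

```python
def slots_required(parallelisms, slot_sharing=True):
    """Slots a job needs, given each operator chain's parallelism.

    With slot sharing, one slot can hold one subtask of every operator,
    so the job needs as many slots as its highest parallelism.
    Without it, every subtask needs a slot of its own.
    """
    if slot_sharing:
        return max(parallelisms.values())
    return sum(parallelisms.values())


def fits(parallelisms, total_slots, slot_sharing=True):
    """Whether the job can run on a cluster with `total_slots` slots."""
    return slots_required(parallelisms, slot_sharing) <= total_slots


if __name__ == "__main__":
    low = {"source/map": 2, "keyBy/window": 2, "sink": 1}
    high = {"source/map": 6, "keyBy/window": 6, "sink": 1}
    for job in (low, high):
        for sharing in (False, True):
            print(job, "sharing" if sharing else "no sharing",
                  slots_required(job, sharing), fits(job, 6, sharing))
```

With 6 slots and no sharing, only the base parallelism of 2 fits (5 slots needed); with sharing, the base parallelism of 6 needs just 6 slots instead of 13.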
