Understanding of Flink parallelism
2022-07-24 06:15:00 【sf_ www】
Concept description
A Flink program consists of multiple operators (source, transformation, and sink).
An operator is executed by multiple parallel tasks (threads). The number of parallel tasks (threads) of an operator is called the parallelism of that operator. In other words, parallelism is defined per operator.
The official explanation of Operator is quoted below:
Operator
Node of a Logical Graph. An Operator performs a certain operation, which is usually executed by a Function. Sources and Sinks are special Operators for data ingestion and data egress.
Logical Graph
A logical graph is a directed graph where the nodes are Operators and the edges define input/output-relationships of the operators and correspond to data streams or data sets. A logical graph is created by submitting jobs from a Flink Application.
Logical graphs are also often referred to as dataflow graphs.
The description in the source code:
/**
 * Abstract base class for all operators. An operator is a source, sink, or it applies an operation
 * to one or more inputs, producing a result.
 *
 * @param <OUT> Output type of the records output by this operator
 */
@Internal
public abstract class Operator<OUT> implements Visitable<Operator<?>> {

    /**
     * Sets the parallelism for this contract instance. The parallelism denotes how many parallel
     * instances of the user function will be spawned during the execution.
     *
     * @param parallelism The number of parallel instances to spawn. Set this value to {@link
     *     ExecutionConfig#PARALLELISM_DEFAULT} to let the system decide on its own.
     */
    public void setParallelism(int parallelism) {
        this.parallelism = parallelism;
    }
Setting of parallelism
Operator parallelism can be set at four levels:
1. Operator Level
2. Execution Environment Level
3. Client Level
4. System Level (the system default; not recommended, because it affects all jobs)
1. Operator Level
Call setParallelism(xxx) on the corresponding operator.
2. Execution Environment Level
Call env.setParallelism(xxx), where env is the StreamExecutionEnvironment.
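The first two levels can be sketched together as follows. This is a minimal example, assuming the Flink 1.x DataStream API is on the classpath; the class name ParallelismLevelsExample, the helper buildPipeline, and the sample elements are made up for illustration.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismLevelsExample {

    // Hypothetical helper: builds a tiny pipeline and returns its map operator.
    public static SingleOutputStreamOperator<String> buildPipeline(StreamExecutionEnvironment env) {
        return env.fromElements("flink", "parallelism", "operator")
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                // Operator level: applies to this map operator only.
                .setParallelism(4);
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Execution environment level: used by operators that set nothing themselves.
        env.setParallelism(2);

        // The print sink sets no parallelism of its own, so it uses the environment value.
        buildPipeline(env).print();

        env.execute("parallelism-levels-example");
    }
}
```

In this sketch the map operator runs with 4 parallel tasks, while operators without an explicit setting use the environment value of 2.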
3. Client Level
Parallelism can also be set on the client when the job is submitted to Flink.
For the CLI client, the parallelism can be specified with the -p parameter:
./bin/flink run -p 3 ...
4. System Level
At the system level, set the parallelism.default property in flink-conf.yaml to specify the default parallelism for all execution environments.
Priority of the four settings
Priority of parallelism: operator level > env level > client level > system default level
That is, a value set at a higher-priority level overrides a value set at a lower-priority level.
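The override rule can be modeled in plain Java. This is only an illustration of the ordering stated above, not Flink's internal implementation; the class ParallelismPrecedence and the method resolve are hypothetical names.

```java
import java.util.OptionalInt;

public class ParallelismPrecedence {

    /**
     * Returns the value of the highest-priority level that is set:
     * operator level, then env level, then client level, then the system default.
     */
    public static int resolve(OptionalInt operatorLevel,
                              OptionalInt envLevel,
                              OptionalInt clientLevel,
                              int systemDefault) {
        if (operatorLevel.isPresent()) {
            return operatorLevel.getAsInt();
        }
        if (envLevel.isPresent()) {
            return envLevel.getAsInt();
        }
        if (clientLevel.isPresent()) {
            return clientLevel.getAsInt();
        }
        return systemDefault;
    }

    public static void main(String[] args) {
        OptionalInt unset = OptionalInt.empty();

        // All levels set: the operator-level value wins.
        System.out.println(resolve(OptionalInt.of(4), OptionalInt.of(2), OptionalInt.of(3), 1)); // prints 4
        // No operator-level value: the env-level value wins over -p.
        System.out.println(resolve(unset, OptionalInt.of(2), OptionalInt.of(3), 1)); // prints 2
        // Only the client (-p 3) is set.
        System.out.println(resolve(unset, unset, OptionalInt.of(3), 1)); // prints 3
        // Nothing set: fall back to parallelism.default.
        System.out.println(resolve(unset, unset, unset, 1)); // prints 1
    }
}
```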
In addition, the configured parallelism is not always the parallelism actually used at execution time. For example, if a source cannot run in parallel, specifying a parallelism greater than 1 has no effect; reading from Kafka is another case where the two can differ.
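A small check of the non-parallel source case, assuming the Flink 1.x DataStream API, where env.fromElements creates a non-parallel source. The class NonParallelSourceExample and the helper sourceParallelism are hypothetical names used only for this sketch.

```java
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class NonParallelSourceExample {

    // Hypothetical helper: returns the parallelism Flink assigns to a fromElements source.
    public static int sourceParallelism(int envParallelism) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Ask for several parallel tasks at the execution environment level.
        env.setParallelism(envParallelism);

        // fromElements is backed by a non-parallel source function (assumption: Flink 1.x).
        DataStreamSource<String> source = env.fromElements("a", "b", "c");
        return source.getParallelism();
    }

    public static void main(String[] args) {
        // Under the assumption above, this reports the source's own parallelism,
        // not the value requested for the environment.
        System.out.println(sourceParallelism(4));
    }
}
```

Under these assumptions the source keeps a parallelism of 1 even though the environment asks for 4.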
In production, it is recommended to set parallelism at the operator level, which makes the setting explicit and allows precise control over resources.