Adaptive batch job scheduler: automatically derive parallelism for Flink batch jobs

2022-07-16 05:12:00 【Robert's house of Technology】

01 Introduction

For most users, configuring an appropriate parallelism for Flink operators is not easy. For batch jobs, a parallelism that is too low makes the job run for a long time and recover slowly from failures, while an unnecessarily high parallelism wastes resources and increases the cost of task deployment and data shuffling.

To keep the execution time of a batch job under control, the parallelism of an operator should be proportional to the amount of data it needs to process. Users therefore have to configure parallelism by estimating how much data each operator will process. Estimating this accurately is very difficult: the amount of data to process may change from day to day, and a job may contain many UDFs and complex operators whose output volume is hard to predict.

To solve this problem, we introduced a new scheduler in Flink 1.15: the adaptive batch job scheduler (Adaptive Batch Scheduler). While the job is running, it automatically derives the parallelism of each operator from the actual amount of data the operator needs to process. This brings the following benefits:

It greatly reduces the complexity of tuning parallelism for batch jobs;
Different operators can get different parallelism according to the amount of data they process, which is especially helpful for SQL jobs, where only a global parallelism can be configured;
It adapts better to data volumes that change from day to day.
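To make the idea of data-proportional parallelism concrete, here is a minimal Java sketch. It is not Flink's implementation: the class `ParallelismSketch`, the method `deriveParallelism`, and the parameters `bytesPerTask`, `minParallelism`, and `maxParallelism` are hypothetical names chosen for illustration. It only shows the arithmetic of picking a parallelism proportional to the amount of data and keeping it within bounds.

```java
public class ParallelismSketch {

    /**
     * Hypothetical helper (not a Flink API): returns a parallelism proportional
     * to inputBytes, clamped to the range [minParallelism, maxParallelism].
     */
    public static int deriveParallelism(long inputBytes, long bytesPerTask,
                                        int minParallelism, int maxParallelism) {
        if (inputBytes < 0 || bytesPerTask <= 0
                || minParallelism < 1 || maxParallelism < minParallelism) {
            throw new IllegalArgumentException("invalid input size, task size, or bounds");
        }
        // Ceiling division: enough tasks so that each handles at most bytesPerTask.
        long wanted = inputBytes / bytesPerTask + (inputBytes % bytesPerTask == 0 ? 0 : 1);
        // Clamp to the configured bounds.
        return (int) Math.max(minParallelism, Math.min(maxParallelism, wanted));
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long perTask = 1024 * mib; // assume roughly 1 GiB of input per task

        System.out.println(deriveParallelism(10_000 * mib, perTask, 1, 128));    // prints 10
        System.out.println(deriveParallelism(100 * mib, perTask, 1, 128));       // prints 1
        System.out.println(deriveParallelism(1_000_000 * mib, perTask, 1, 128)); // prints 128
    }
}
```

With a fixed target volume per task, a larger input yields a higher parallelism and a smaller input a lower one, so the same job can follow day-to-day changes in data volume without manual retuning.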

02 Usage

To have Flink derive operator parallelism automatically, the following configuration is required:

Enable the adaptive batch job scheduler;
Set the parallelism of operators to -1.
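As a rough illustration, the two steps might look like this in `flink-conf.yaml`. The key names are not given in the text above; they are as I recall them from the Flink 1.15 configuration documentation, so treat them as an assumption and check them against the documentation for your Flink version.

```yaml
# Assumed key names (from the Flink 1.15 docs, not from this article).

# Step 1: select the adaptive batch job scheduler.
jobmanager.scheduler: AdaptiveBatch

# Step 2: leave operator parallelism undecided (-1) so the scheduler can derive it.
parallelism.default: -1
```

An operator whose parallelism is set explicitly in the job would presumably keep that value, so explicit settings should be avoided for operators whose parallelism is meant to be derived.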

2.1 Enable the adaptive batch job scheduler

Enable the adaptive batch job scheduler …

Copyright notice
This article was written by [Robert's house of Technology]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/197/202207131726529372.html
