Flink restart policy
2022-07-24 06:16:00 【sf_ www】
When a task fails, Flink needs to restart the failed task and any other affected tasks to bring the job back to a normal state.
Restart strategies and failover strategies control how tasks are restarted. The restart strategy decides whether and when failed or affected tasks can be restarted. The failover strategy decides which tasks must be restarted to recover the job.
A cluster is started with a default restart strategy, which is always used when a job does not define its own. If a job is submitted with a restart strategy, that strategy overrides the cluster default.
The default restart strategy is set in Flink's configuration file, flink-conf.yaml. The restart-strategy option defines which strategy is used. If checkpointing is not enabled, the "no restart" strategy is used. If checkpointing is enabled and no restart strategy is configured, the fixed delay strategy is used with Integer.MAX_VALUE restart attempts.
Besides setting a default restart strategy in flink-conf.yaml, you can define a specific restart strategy for each Flink job. It is set programmatically by calling the setRestartStrategy method on StreamExecutionEnvironment.
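The precedence described above can be sketched as a small decision function. This is an illustrative model only, not Flink's own code: the class name RestartStrategyResolution, the method resolve, and the string labels are hypothetical, and null stands for "not configured".

```java
// Illustrative model only; names and string labels are hypothetical, not Flink API.
final class RestartStrategyResolution {

    /**
     * @param jobStrategy     strategy set in code via setRestartStrategy, or null if none
     * @param clusterStrategy value of restart-strategy in flink-conf.yaml, or null if unset
     * @param checkpointing   whether checkpointing is enabled for the job
     */
    static String resolve(String jobStrategy, String clusterStrategy, boolean checkpointing) {
        if (jobStrategy != null) {
            return jobStrategy;       // a job-specific strategy overrides the cluster default
        }
        if (clusterStrategy != null) {
            return clusterStrategy;   // otherwise the cluster default applies
        }
        // Nothing configured anywhere: the default depends on checkpointing.
        return checkpointing
                ? "fixed-delay(attempts=" + Integer.MAX_VALUE + ")"
                : "none";
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, null, false));
        System.out.println(resolve(null, null, true));
        System.out.println(resolve("failure-rate", "fixed-delay", true));
    }
}
```

In this model the last call returns "failure-rate", because the job-level setting wins over the cluster default.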
There are currently 5 restart strategies
(see the org.apache.flink.api.common.restartstrategy.RestartStrategies class):
1) Fixed delay restart strategy (Fixed Delay Restart Strategy)
2) Exponential delay restart strategy (Exponential Delay Restart Strategy)
3) Failure rate restart strategy (Failure Rate Restart Strategy)
4) No restart strategy (No Restart Strategy)
5) Fallback restart strategy (Fallback Restart Strategy)
1. Fixed delay restart strategy (Fixed Delay Restart Strategy)
The fixed delay restart strategy attempts to restart the job a given number of times. If the maximum number of attempts is exceeded, the job eventually fails. The strategy waits a fixed amount of time between two consecutive restart attempts.
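The behaviour described above can be modelled in a few lines of plain Java. This is a simplified sketch rather than Flink's implementation; the class FixedDelaySketch and its method onFailure are hypothetical names, and -1 is an arbitrary marker for "give up".

```java
// Simplified model of a fixed-delay restart policy; class and method names are hypothetical.
final class FixedDelaySketch {
    private final int maxAttempts;
    private final long delayMillis;
    private int failures = 0;

    FixedDelaySketch(int maxAttempts, long delayMillis) {
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    /** Records a failure; returns the delay before the restart, or -1 if the job should fail for good. */
    long onFailure() {
        failures++;
        return failures <= maxAttempts ? delayMillis : -1;
    }

    public static void main(String[] args) {
        // 3 attempts with a 10 second delay.
        FixedDelaySketch sketch = new FixedDelaySketch(3, 10_000);
        for (int i = 1; i <= 4; i++) {
            System.out.println("failure " + i + " -> " + sketch.onFailure());
        }
    }
}
```

With 3 attempts, the first three failures each return the 10 000 ms delay and the fourth returns -1, which is the point where the job fails permanently in this model.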
Set in flink-conf.yaml:
# Restart strategy
restart-strategy: fixed-delay
# Number of restart attempts
restart-strategy.fixed-delay.attempts: 3
# Fixed delay between two restart attempts
restart-strategy.fixed-delay.delay: 10 s
Set in code:
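// Imports used by the restart strategy snippets in this article
import java.util.concurrent.TimeUnit;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
3, // number of restart attempts
Time.of(10, TimeUnit.SECONDS) // delay
));
2. Exponential delay restart strategy (Exponential Delay Restart Strategy)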
The exponential delay restart strategy attempts to restart the job indefinitely, with a delay that grows up to a maximum; the job never fails permanently. Between two consecutive restart attempts, the waiting time increases exponentially until it reaches the maximum delay, and then stays at that maximum. Once the job has run correctly for a certain time (restart-strategy.exponential-delay.reset-backoff-threshold), the delay is reset to its initial value (restart-strategy.exponential-delay.initial-backoff).
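To make the growth of the delay concrete, here is a minimal sketch that lists the delays for successive restarts. It assumes an initial backoff of 10 s, a maximum of 120 s and a multiplier of 2.0, and it deliberately ignores jitter and the reset threshold. The class ExponentialBackoffSketch and the method delaysSeconds are hypothetical names, not part of Flink.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model: jitter and the reset threshold are deliberately left out.
final class ExponentialBackoffSketch {

    /** Returns the delay (in seconds) before each of the first `restarts` restart attempts. */
    static List<Long> delaysSeconds(long initialBackoff, long maxBackoff, double multiplier, int restarts) {
        List<Long> delays = new ArrayList<>();
        double current = initialBackoff;
        for (int i = 0; i < restarts; i++) {
            delays.add((long) Math.min(current, maxBackoff));
            // Grow the delay for the next attempt, but never beyond the maximum.
            current = Math.min(current * multiplier, maxBackoff);
        }
        return delays;
    }

    public static void main(String[] args) {
        System.out.println(delaysSeconds(10, 120, 2.0, 6)); // prints [10, 20, 40, 80, 120, 120]
    }
}
```

The delay doubles until it would pass 120 s and is then held at that cap, which matches the description of the waiting time staying at the maximum.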
Set in flink-conf.yaml:
# Restart strategy
restart-strategy: exponential-delay
# Restart delay after the first failure (initial value)
restart-strategy.exponential-delay.initial-backoff: 10 s
# Maximum restart delay; once it is reached, the delay no longer increases
restart-strategy.exponential-delay.max-backoff: 2 min
# After each failure, the restart delay is the previous delay multiplied by this value
restart-strategy.exponential-delay.backoff-multiplier: 2.0
# How long the job must run without failure before the restart delay is reset to the initial value (initial-backoff)
restart-strategy.exponential-delay.reset-backoff-threshold: 10 min
# Maximum jitter applied to each restart delay (a random value within this fraction is added or subtracted), to keep many jobs from restarting at the same time
restart-strategy.exponential-delay.jitter-factor: 0.1
Set in code:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRestartStrategy(RestartStrategies.exponentialDelayRestart(
Time.milliseconds(1), // initial backoff
Time.milliseconds(1000), // maximum backoff
1.1, // exponential multiplier
Time.milliseconds(2000), // threshold duration to reset delay to its initial value
0.1 // jitter
));
3. Failure rate restart strategy (Failure Rate Restart Strategy)
The failure rate restart strategy restarts the job after a failure, but once the failure rate (failures per time interval) is exceeded, the job eventually fails. The strategy waits a fixed amount of time between two consecutive restart attempts. In other words, the job fails once more than restart-strategy.failure-rate.max-failures-per-interval failures occur within restart-strategy.failure-rate.failure-rate-interval.
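The rule can be illustrated with a sliding window over failure timestamps. This is a simplified sketch under stated assumptions, not Flink's implementation: timestamps are passed in explicitly and must be non-decreasing, the class FailureRateSketch and method onFailure are hypothetical names, and Flink's own bookkeeping may treat boundary cases differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified sliding-window model; Flink's own bookkeeping may treat boundary cases differently.
final class FailureRateSketch {
    private final int maxFailuresPerInterval;
    private final long intervalMillis;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    FailureRateSketch(int maxFailuresPerInterval, long intervalMillis) {
        this.maxFailuresPerInterval = maxFailuresPerInterval;
        this.intervalMillis = intervalMillis;
    }

    /** Records a failure at nowMillis; returns true if the job may restart, false if it should fail. */
    boolean onFailure(long nowMillis) {
        failureTimes.addLast(nowMillis);
        // Forget failures that have fallen out of the measuring interval.
        while (!failureTimes.isEmpty() && nowMillis - failureTimes.peekFirst() > intervalMillis) {
            failureTimes.removeFirst();
        }
        return failureTimes.size() <= maxFailuresPerInterval;
    }

    public static void main(String[] args) {
        // At most 3 failures per 5 minutes; failures arrive one minute apart.
        FailureRateSketch sketch = new FailureRateSketch(3, 5 * 60_000L);
        for (long minute = 0; minute < 4; minute++) {
            System.out.println("failure at minute " + minute + " -> restart allowed: "
                    + sketch.onFailure(minute * 60_000L));
        }
    }
}
```

In this model the first three failures are tolerated and the fourth one inside the same 5 minute window makes the job fail. If the failures are spread out so that older ones leave the window, restarts keep being allowed.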
Set in flink-conf.yaml:
# Restart strategy
restart-strategy: failure-rate
# Maximum number of restarts in the given time interval before the job fails
restart-strategy.failure-rate.max-failures-per-interval: 3
# Time interval for measuring the failure rate
restart-strategy.failure-rate.failure-rate-interval: 5 min
# Delay between two consecutive restart attempts
restart-strategy.failure-rate.delay: 10 s
Set in code:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRestartStrategy(RestartStrategies.failureRateRestart(
3, // max failures per interval
Time.of(5, TimeUnit.MINUTES), // time interval for measuring failure rate
Time.of(10, TimeUnit.SECONDS) // delay
));
4. No restart strategy (No Restart Strategy)
The job fails directly, with no restart attempt.
Set in flink-conf.yaml:
restart-strategy: none
Set in code:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRestartStrategy(RestartStrategies.noRestart());
5. Fallback restart strategy (Fallback Restart Strategy)
The cluster-defined restart strategy is used. This is helpful for streaming programs that enable checkpointing. By default, if no other restart strategy is defined, the fixed delay restart strategy is chosen.
It is especially useful when you have a custom restart strategy implementation that is configured in flink-conf.yaml.
Failover strategy
There are only 2 failover strategies, full and region:
full: if any one task fails, all tasks are restarted
region: only the tasks affected by the failure are restarted
The strategy is set in flink-conf.yaml through the jobmanager.execution.failover-strategy option; the default value is region.
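The difference between the two strategies can be shown with a toy model. The sketch below assumes that the regions are handed in explicitly as sets of task names, whereas Flink derives them from the job graph, and it ignores any dependent regions that may also need a restart. The class FailoverSketch, the method tasksToRestart and the task names are all hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model: regions are given explicitly and dependencies between regions are ignored.
final class FailoverSketch {

    /** Returns the tasks to restart when failedTask fails, under strategy "full" or "region". */
    static Set<String> tasksToRestart(String strategy, List<Set<String>> regions, String failedTask) {
        Set<String> result = new TreeSet<>(); // sorted, so the output order is stable
        for (Set<String> region : regions) {
            // "full" restarts every region; "region" restarts only the one containing the failed task.
            if (strategy.equals("full") || region.contains(failedTask)) {
                result.addAll(region);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Set<String>> regions = List.of(
                Set.of("source-1", "map-1"),
                Set.of("source-2", "map-2"));
        System.out.println(tasksToRestart("full", regions, "map-1"));   // prints [map-1, map-2, source-1, source-2]
        System.out.println(tasksToRestart("region", regions, "map-1")); // prints [map-1, source-1]
    }
}
```

With two independent regions, a failure of map-1 restarts all four tasks under full but only the two tasks of its own region under region.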
The official documentation is linked here:
Task Failure Recovery
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/state/task_failure_recovery/#task-failure-recovery