Spark on YARN memory and resource calculation and analysis (reference): optimized configuration of executor cores, num-executors, and memory
2022-06-22 08:40:00, by Alex_81D
Spark on YARN: optimizing the executor cores, num-executors, and memory configuration
Three knobs are involved: executor cores, executor count, and executor memory. The driver-memory parameter is flexible, typically 1-8 GB, so it is not covered in detail here.
Besides counting the cluster's nodes, the cores per node, and the memory per node, there are four more factors to consider when setting these three parameters:
Spark uses YARN for resource management, and Hadoop daemons run in the background on the nodes, such as the NameNode, Secondary NameNode, DataNode, ResourceManager, and NodeManager. So when setting num-executors, reserve 1 core per node to keep these daemons running smoothly.
YARN ApplicationMaster (AM):
The AM requests resources from the ResourceManager, communicates with the NodeManagers to start and stop tasks, and monitors resource usage. When running Spark on YARN, also budget for the AM's own needs (about 1 GB of memory and 1 executor's worth of cores).
HDFS throughput:
The HDFS client handles large numbers of concurrent write threads poorly; an executor reaches full concurrent write throughput at about 5 tasks. So it is best to keep each executor's core count at 5 or below.
MemoryOverhead:
[Figure: spark-yarn-memory-usage]
Memory needed per executor = spark.executor.memory + spark.yarn.executor.memoryOverhead
spark.yarn.executor.memoryOverhead = max(384 MB, 7% * spark.executor.memory), i.e. it is always at least 384 MB.
If we request 20 GB per executor, the AM actually obtains 20 GB + 7% * 20 GB ≈ 21.4 GB, as the sketch below shows.
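To make the rule concrete, here is a minimal sketch of the overhead formula in plain Python (not Spark API code), using the article's 20 GB example:

```python
# A minimal sketch of the overhead rule above: what YARN actually grants
# for one executor, all values in GB.
def yarn_grant_gb(executor_memory_gb):
    overhead_gb = max(384 / 1024, 0.07 * executor_memory_gb)  # at least 384 MB
    return executor_memory_gb + overhead_gb

print(yarn_grant_gb(20))  # ~21.4: a 20 GB executor occupies about 21.4 GB
```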
Case study: 3 nodes, each with 8 cores + 32 GB RAM (a real cluster, 1 master plus slaves).
Since each executor is a JVM instance, we can assign multiple executors to each node.
Preparation: reserve resources for the operating system
The original article reserves 1 core + 1 GB of memory per node for the operating system and Hadoop processes, which leaves each node with 7 cores + 30 GB RAM available.
If you do not want the system to fall into a high-load situation, reserve more; my personal experience is to reserve up to 4 cores + 4 GB per node. For this walkthrough I use 6 usable cores + 28 GB per node (numbers that are easy to calculate with).
1. Decide the number of cores per executor (the "magic number")
Executor cores = the number of tasks the executor can run concurrently. Studies have shown that allocating more cores per executor is not always better: any application given more than 5 cores per executor only suffers degraded performance, so executor cores is usually set to 5 or below. In the case walkthrough below we set it to 3.
2. Decide the number of executors
Executors per node = 6 (usable cores per node) / 3 (cores per executor) = 2.
So the total executor count is 2 (executors per node) * 3 (nodes) = 6. Since YARN's ApplicationMaster occupies the equivalent of one executor, we set num-executors to 6 - 1 = 5.
Note: a single job submission needs at least 2 cores, that is, one executor plus one driver (which is, in essence, also an executor-style JVM). The sketch below reruns this step's arithmetic.
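A quick check of steps 1-2 in plain Python, with the example's numbers hard-coded (3 nodes, 8 cores each, 2 cores reserved per node):

```python
# Steps 1-2 rerun as arithmetic, using this article's example values.
nodes = 3
usable_cores_per_node = 8 - 2            # cores left after the OS/daemon reserve
cores_per_executor = 3                   # the "magic number" chosen in step 1

executors_per_node = usable_cores_per_node // cores_per_executor   # 2
num_executors = executors_per_node * nodes - 1                     # minus 1 for the YARN AM
print(num_executors)  # 5
```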
3. Decide the memory allocated to each executor
In the previous step we assigned 2 executors to each node, and each node has 28 GB of usable RAM, so each executor's share is 28 / 2 = 14 GB.
However, when Spark requests memory from YARN, YARN allocates extra overhead memory on top of each executor's memory: overhead = max(384 MB, 0.07 * spark.executor.memory).
In our case overhead = 0.07 * 14 GB = 0.98 GB > 384 MB, so the memory we actually request is 14 - 0.98 ≈ 13 GB (see the sketch below).
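The same step as a short Python sketch, mirroring the article's arithmetic (split each node's usable RAM across its executors, then carve out the max(384 MB, 7%) overhead):

```python
# Step 3 rerun as arithmetic, all values in GB.
usable_mem_per_node_gb = 32 - 4               # 4 GB reserved per node
executors_per_node = 2

share_gb = usable_mem_per_node_gb / executors_per_node   # 14.0 GB per executor
overhead_gb = max(384 / 1024, 0.07 * share_gb)           # 0.98 GB here
executor_memory_gb = share_gb - overhead_gb              # ~13 GB
print(round(executor_memory_gb))  # 13 -> --executor-memory 13G
```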
Driver memory can then be set anywhere from 1 to 13 GB.
The final allocation scheme: each executor gets 3 cores, we run 5 executors, and each executor gets 13 GB of memory. Written as a spark-submit command:
```bash
# 5 executors, 3 cores per executor, 13 GB per executor
spark-submit \
  --master yarn \
  --num-executors 5 \
  --executor-cores 3 \
  --executor-memory 13G
```
This only shows one way of calculating the settings; it is not necessarily optimal, and you can work through the same arithmetic for your own cluster.

Note: the driver memory (1-13 GB here) needs attention. The YARN container size parameters must be set to accommodate it, otherwise the job will fail with an error; see the details below.
Digression: understanding 2 related YARN configuration items:
yarn.nodemanager.resource.memory-mb
Let us start with this NodeManager configuration item. It sets how much memory the NodeManager claims from its machine for allocating and scheduling containers. The parameter acts as a threshold that caps the memory a NodeManager server can use, preventing the NodeManager from consuming so much system memory that the server eventually goes down. It can be resized appropriately according to the actual server configuration.
For the worked example above: we allocated 28 GB of memory to the NodeManager.
yarn.scheduler.maximum-allocation-mb
The maximum memory resource a single container can request; the memory an application requests at runtime cannot exceed this value. Because this item only specifies a per-container maximum (actual allocation is not based on it), it can simply be set equal to the NodeManager's available memory (yarn.nodemanager.resource.memory-mb). In that case, even if a node's NodeManager only has enough memory left to run one container, that container can still start. A sample excerpt follows.
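For concreteness, here is what the two items might look like in yarn-site.xml for the example's 28 GB-per-node setup. The values are illustrative for this walkthrough, not defaults:

```xml
<!-- yarn-site.xml: illustrative values for the 28 GB-per-node example -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>28672</value>  <!-- 28 GB usable for containers on this NodeManager -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>28672</value>  <!-- a single container may request up to the whole node -->
</property>
```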


Reference link: https://blog.csdn.net/wx6gml18/article/details/111433509