YARN Performance Tuning for a CDH Cluster
2022-06-27 22:07:00 【javastart】
This article discusses YARN tuning for a CDH cluster, focusing on CPU and memory. Here CPU capacity means the number of physical CPUs multiplied by the number of cores per CPU, i.e. Vcores = CPU count × cores per CPU. YARN packages resources as containers, and tasks execute inside those containers.
Cluster configuration
Configuring the cluster involves three steps: first, plan the worker hosts and the hardware of each host; second, plan the components installed on each host and the resources allocated to them; third, plan the size of the cluster.
Worker host configuration
As the table below shows, each host has 256 GB of RAM, four 6-core CPUs with hyper-threading enabled, and 2 Gb of network bandwidth.
| Host component | Count | Size | Total | Description |
|---|---|---|---|---|
| RAM | 1 | 256 GB | 256 GB | Memory size |
| CPU | 4 | 6 cores | 48 | Total CPU core count |
| Hyper-threading CPU | YES | | | Hyper-threading makes the operating system see twice the physical core count, so the 24 physical cores appear as 48 logical cores |
| Network | 2 | 1 Gb | 2 Gb | Network bandwidth |
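The core arithmetic above is simple enough to sketch; the numbers below mirror the table (a minimal illustration, not part of any CDH tooling):

```python
def total_vcores(sockets: int, cores_per_socket: int, hyperthreading: bool) -> int:
    """Logical cores per host: sockets x cores, doubled when hyper-threading is on."""
    physical = sockets * cores_per_socket
    return physical * 2 if hyperthreading else physical

print(total_vcores(4, 6, hyperthreading=True))   # 48 logical cores
print(total_vcores(4, 6, hyperthreading=False))  # 24 physical cores
```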
Worker host component allocation
The first step defined the memory and CPU of each host; next, allocate those resources, mainly CPU and memory, to the services running on each node.
| Service | Category | CPU cores | Memory (MB) | Description |
|---|---|---|---|---|
| Operating system | Overhead | 1 | 8192 | Reserve 1 core and 8 GB for the OS; 4–8 GB is typical |
| Other services | Overhead | 0 | 0 | Resources used by non-CDH, non-OS services |
| Cloudera Manager agent | Overhead | 1 | 1024 | 1 core, 1 GB |
| HDFS DataNode | CDH | 1 | 1024 | Default: 1 core, 1 GB |
| YARN NodeManager | CDH | 1 | 1024 | Default: 1 core, 1 GB |
| Impala daemon | CDH | 0 | 0 | Optional service; allocate at least 16 GB to the Impala daemon if used |
| HBase RegionServer | CDH | 0 | 0 | Optional service; 12–16 GB recommended |
| Solr Server | CDH | 0 | 0 | Optional service; minimum 1 GB |
| Kudu Server | CDH | 0 | 0 | Optional service; a Kudu tablet server needs at least 1 GB |
| Available container resources | | 44 | 250880 | The remainder is available for YARN containers |
Container resource allocation:
- Physical Cores to Vcores Multiplier: the number of vcores exposed per physical core; set to 1 in this article.
- YARN Available Vcores = Available Container Resources × Physical Cores to Vcores Multiplier = 44.
- YARN Available Memory: 250880 MB.
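The per-host remainder follows by subtracting the overhead and CDH allocations from the host totals. A minimal sketch using the figures from the tables above:

```python
# Per-host resources left for YARN containers, per the allocation table above.
TOTAL_CORES = 48
TOTAL_MEMORY_MB = 256 * 1024  # 262144 MB

# (cores, memory_mb) reserved by each non-container service
reserved = {
    "operating system":       (1, 8192),
    "cloudera manager agent": (1, 1024),
    "hdfs datanode":          (1, 1024),
    "yarn nodemanager":       (1, 1024),
}

avail_cores = TOTAL_CORES - sum(c for c, _ in reserved.values())
avail_mem_mb = TOTAL_MEMORY_MB - sum(m for _, m in reserved.values())

vcore_multiplier = 1  # physical-cores-to-vcores multiplier used in this article
print(avail_cores * vcore_multiplier, avail_mem_mb)  # 44 250880
```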
Cluster size
Number of worker nodes in the cluster: 10
YARN configuration
YARN NodeManager properties
| Configuration parameter | Value | Description |
|---|---|---|
| yarn.nodemanager.resource.cpu-vcores | 44 | Vcores the NodeManager offers to containers: the cores remaining on each node |
| yarn.nodemanager.resource.memory-mb | 250880 | Memory the NodeManager offers to containers: the memory remaining on each node |
Verifying the YARN configuration
Open the YARN ResourceManager web UI at http://<ResourceManagerIP>:8088/ and check 'Memory Total' and 'VCores Total'. With all ten nodes healthy, VCores Total should be 440 and Memory Total should be 2450 GB, i.e. 250880 / 1024 × 10.
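Beyond eyeballing the web UI, the same totals can be checked programmatically: the ResourceManager exposes cluster metrics over its REST API at /ws/v1/cluster/metrics (fields such as totalMB and totalVirtualCores). The sketch below parses a response of that shape; the embedded payload is an illustrative stand-in for the HTTP call, and the hostname is a placeholder:

```python
import json

# In a live cluster you would fetch this with, e.g.:
#   urllib.request.urlopen("http://<ResourceManagerIP>:8088/ws/v1/cluster/metrics")
# Here a sample response of the same shape stands in for the HTTP call.
sample_response = json.dumps({
    "clusterMetrics": {"totalMB": 2508800, "totalVirtualCores": 440, "activeNodes": 10}
})

def check_cluster_totals(body: str, nodes: int, node_mb: int, node_vcores: int) -> bool:
    """True if the reported cluster totals match nodes x per-node resources."""
    m = json.loads(body)["clusterMetrics"]
    return (m["totalMB"] == nodes * node_mb
            and m["totalVirtualCores"] == nodes * node_vcores)

print(check_cluster_totals(sample_response, nodes=10, node_mb=250880, node_vcores=44))  # True
```

A shortfall in either total usually means one or more NodeManagers are down or misconfigured.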
YARN container configuration
Container vcore settings

| Configuration parameter | Value | Description |
|---|---|---|
| yarn.scheduler.minimum-allocation-vcores | 1 | Minimum vcores per container |
| yarn.scheduler.maximum-allocation-vcores | 44 | Maximum vcores per container |
| yarn.scheduler.increment-allocation-vcores | 1 | Container vcore allocation increment |
Container memory settings

| Configuration parameter | Value | Description |
|---|---|---|
| yarn.scheduler.minimum-allocation-mb | 1024 | Minimum memory per container: 1 GB |
| yarn.scheduler.maximum-allocation-mb | 250880 | Maximum memory per container: 245 GB, the memory remaining on each node |
| yarn.scheduler.increment-allocation-mb | 512 | Container memory allocation increment; default 512 MB |
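The minimum/increment settings mean a container request is normalized before allocation: raised to at least the minimum, then rounded up to a multiple of the increment. A sketch of that rounding (assuming the Fair Scheduler's increment behavior; exact normalization can vary by scheduler):

```python
import math

def normalize(request_mb: int, minimum_mb: int = 1024, increment_mb: int = 512) -> int:
    """Round a container memory request up to the scheduler's allocation grid."""
    request_mb = max(request_mb, minimum_mb)
    return math.ceil(request_mb / increment_mb) * increment_mb

print(normalize(800))   # 1024 -> raised to the minimum allocation
print(normalize(1500))  # 1536 -> rounded up to the next 512 MB step
```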
Estimating cluster resource allocation

| Description | Minimum | Maximum |
|---|---|---|
| Largest number of containers in the cluster, given the minimum memory per container | | 2450 |
| Largest number of containers, given the minimum vcores per container | | 440 |
| Smallest number of containers, given the maximum memory per container | 10 | |
| Smallest number of containers, given the maximum vcores per container | 10 | |
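The bounds in this table follow directly from the per-node totals and the per-container limits; a minimal sketch of the arithmetic:

```python
# Cluster-wide container count bounds from the per-container limits above.
NODES = 10
NODE_MEM_MB = 250880
NODE_VCORES = 44

min_alloc_mb, max_alloc_mb = 1024, 250880
min_alloc_vcores, max_alloc_vcores = 1, 44

max_by_mem = NODES * NODE_MEM_MB // min_alloc_mb         # 2450
max_by_vcores = NODES * NODE_VCORES // min_alloc_vcores  # 440
min_by_mem = NODES * NODE_MEM_MB // max_alloc_mb         # 10
min_by_vcores = NODES * NODE_VCORES // max_alloc_vcores  # 10
print(max_by_mem, max_by_vcores, min_by_mem, min_by_vcores)
```

Note that the binding limit is the smaller of the memory-based and vcore-based counts: with these settings the cluster runs at most 440 concurrent one-vcore containers even though memory alone would allow 2450.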
Container configuration sanity checks

| Constraint | Description |
|---|---|
| The maximum vcore allocation must be >= the minimum vcore allocation | yarn.scheduler.maximum-allocation-vcores >= yarn.scheduler.minimum-allocation-vcores |
| The maximum memory allocation must be >= the minimum memory allocation | yarn.scheduler.maximum-allocation-mb >= yarn.scheduler.minimum-allocation-mb |
| The minimum vcore allocation must be >= 0 | yarn.scheduler.minimum-allocation-vcores >= 0 |
| The maximum vcore allocation must be >= 1 | yarn.scheduler.maximum-allocation-vcores >= 1 |
| The vcores given to each host's NodeManager must be >= the minimum vcore allocation | yarn.nodemanager.resource.cpu-vcores >= yarn.scheduler.minimum-allocation-vcores |
| The vcores given to each host's NodeManager must be >= the maximum vcore allocation | yarn.nodemanager.resource.cpu-vcores >= yarn.scheduler.maximum-allocation-vcores |
| The memory given to each host's NodeManager must be >= the maximum scheduler allocation | yarn.nodemanager.resource.memory-mb >= yarn.scheduler.maximum-allocation-mb |
| The memory given to each host's NodeManager must be >= the minimum scheduler allocation | yarn.nodemanager.resource.memory-mb >= yarn.scheduler.minimum-allocation-mb |
| Minimum container size | If yarn.scheduler.minimum-allocation-mb is below 1 GB, containers may be killed by YARN when tasks hit OutOfMemory errors |
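The constraint table can be turned into a small checker; the sketch below validates the values used in this article (an illustrative helper, not a CDH tool):

```python
# Sanity-check scheduler/NodeManager settings against the constraints above.
conf = {
    "yarn.nodemanager.resource.cpu-vcores": 44,
    "yarn.nodemanager.resource.memory-mb": 250880,
    "yarn.scheduler.minimum-allocation-vcores": 1,
    "yarn.scheduler.maximum-allocation-vcores": 44,
    "yarn.scheduler.minimum-allocation-mb": 1024,
    "yarn.scheduler.maximum-allocation-mb": 250880,
}

def validate(c: dict) -> list:
    """Return a list of violated constraints (empty means the config passes)."""
    errors = []
    if c["yarn.scheduler.maximum-allocation-vcores"] < c["yarn.scheduler.minimum-allocation-vcores"]:
        errors.append("max vcores < min vcores")
    if c["yarn.scheduler.maximum-allocation-mb"] < c["yarn.scheduler.minimum-allocation-mb"]:
        errors.append("max memory < min memory")
    if c["yarn.scheduler.minimum-allocation-vcores"] < 0:
        errors.append("min vcores < 0")
    if c["yarn.scheduler.maximum-allocation-vcores"] < 1:
        errors.append("max vcores < 1")
    if c["yarn.nodemanager.resource.cpu-vcores"] < c["yarn.scheduler.maximum-allocation-vcores"]:
        errors.append("NodeManager vcores < scheduler max vcores")
    if c["yarn.nodemanager.resource.memory-mb"] < c["yarn.scheduler.maximum-allocation-mb"]:
        errors.append("NodeManager memory < scheduler max memory")
    if c["yarn.scheduler.minimum-allocation-mb"] < 1024:
        errors.append("min allocation below 1 GB risks OOM-killed containers")
    return errors

print(validate(conf))  # [] -> configuration passes all checks
```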
MapReduce configuration
ApplicationMaster settings

| Configuration parameter | Value | Description |
|---|---|---|
| yarn.app.mapreduce.am.resource.cpu-vcores | 1 | Virtual CPU cores for the ApplicationMaster |
| yarn.app.mapreduce.am.resource.mb | 1024 | Physical memory for the ApplicationMaster (MiB) |
| yarn.app.mapreduce.am.command-opts | 800 | Java command-line options passed to the MapReduce ApplicationMaster; sets the AM Java heap to 800 MB |
Ratio of heap size to container size

| Configuration parameter | Value | Description |
|---|---|---|
| Automatic task heap sizing | yes | |
| mapreduce.job.heap.memory-mb.ratio | 0.8 | Ratio of the map/reduce task heap size to the container size. The heap must be smaller than the container to leave room for JVM overhead; the default is 0.8 |
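With automatic heap sizing, the task heap is simply the container size times this ratio; a minimal sketch of the arithmetic:

```python
def heap_mb(container_mb: int, ratio: float = 0.8) -> int:
    """Task heap derived from container size and the heap-to-container ratio."""
    return int(container_mb * ratio)

print(heap_mb(1024))        # 819 MB heap inside a 1 GB container (default 0.8)
print(heap_mb(1024, 0.75))  # 768 MB at a 0.75 ratio
```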
Map task settings

| Configuration parameter | Value | Description |
|---|---|---|
| mapreduce.map.cpu.vcores | 1 | Vcores per map task |
| mapreduce.map.memory.mb | 1024 | Memory per map task: 1 GB |
| mapreduce.task.io.sort.mb | 400 | I/O sort buffer (MiB); default 256 MB, usually no need to change |
Reduce task settings

| Configuration parameter | Value | Description |
|---|---|---|
| mapreduce.reduce.cpu.vcores | 1 | Vcores per reduce task |
| mapreduce.reduce.memory.mb | 1024 | Memory per reduce task: 1 GB |
MapReduce configuration sanity checks
ApplicationMaster checks

yarn.scheduler.minimum-allocation-vcores <= yarn.app.mapreduce.am.resource.cpu-vcores <= yarn.scheduler.maximum-allocation-vcores
yarn.scheduler.minimum-allocation-mb <= yarn.app.mapreduce.am.resource.mb <= yarn.scheduler.maximum-allocation-mb
The AM Java heap size should be 75–90% of the container size: lower wastes resources, higher risks OOM.

Map task checks

yarn.scheduler.minimum-allocation-vcores <= mapreduce.map.cpu.vcores <= yarn.scheduler.maximum-allocation-vcores
yarn.scheduler.minimum-allocation-mb <= mapreduce.map.memory.mb <= yarn.scheduler.maximum-allocation-mb
Spill/sort memory should be 40–60% of the task heap.

Reduce task checks

yarn.scheduler.minimum-allocation-vcores <= mapreduce.reduce.cpu.vcores <= yarn.scheduler.maximum-allocation-vcores
yarn.scheduler.minimum-allocation-mb <= mapreduce.reduce.memory.mb <= yarn.scheduler.maximum-allocation-mb
Summary of YARN and MapReduce configuration parameters

| YARN/MapReduce parameter | Description |
|---|---|
| yarn.nodemanager.resource.cpu-vcores | Vcores on each node available to containers |
| yarn.nodemanager.resource.memory-mb | Memory on each node available to containers |
| yarn.scheduler.minimum-allocation-vcores | Minimum vcores per container |
| yarn.scheduler.maximum-allocation-vcores | Maximum vcores per container |
| yarn.scheduler.increment-allocation-vcores | Container vcore allocation increment |
| yarn.scheduler.minimum-allocation-mb | Minimum memory per container |
| yarn.scheduler.maximum-allocation-mb | Maximum memory per container |
| yarn.scheduler.increment-allocation-mb | Container memory allocation increment |
| yarn.app.mapreduce.am.resource.cpu-vcores | ApplicationMaster vcores |
| yarn.app.mapreduce.am.resource.mb | ApplicationMaster memory |
| mapreduce.map.cpu.vcores | Vcores per map task |
| mapreduce.map.memory.mb | Memory per map task |
| mapreduce.reduce.cpu.vcores | Vcores per reduce task |
| mapreduce.reduce.memory.mb | Memory per reduce task |
| mapreduce.task.io.sort.mb | I/O sort buffer size |
Note: in CDH 5.5 and later, the parameters mapreduce.map.java.opts, mapreduce.reduce.java.opts, and yarn.app.mapreduce.am.command-opts are configured automatically from the heap-to-container ratio.