Spark Memory Management Mechanism (New Version)
2022-07-25 15:10:00 【南风知我意丿】
Spark Memory Management mechanism
I. Memory parameters
| Property Name | Default |
|---|---|
| spark.memory.fraction | 0.6 |
| spark.memory.storageFraction | 0.5 |
| RESERVED_SYSTEM_MEMORY_BYTES | 300M |
1. Diagram:

2. Example:
Calculate the Memory for 5GB executor memory:
To calculate Reserved memory, User memory, Spark memory, Storage memory, and Execution memory, we will use the following parameters:
spark.executor.memory=5g
spark.memory.fraction=0.6
spark.memory.storageFraction=0.5
Java Heap Memory = 5 GB
= 5 * 1024 MB
= 5120 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory)
= 5120 MB - 300 MB
= 4820 MB
User Memory = Usable Memory * (1.0 - spark.memory.fraction)
= 4820 MB * (1.0 - 0.6)
= 4820 MB * 0.4
= 1928 MB
Spark Memory = Usable Memory * spark.memory.fraction
= 4820 MB * 0.6
= 2892 MB
Spark Storage Memory = Spark Memory * spark.memory.storageFraction
= 2892 MB * 0.5
= 1446 MB
Spark Execution Memory = Spark Memory * (1.0 - spark.memory.storageFraction)
= 2892 MB * ( 1 - 0.5)
= 2892 MB * 0.5
= 1446 MB

Share of the 5120 MB heap:
Reserved Memory: 300 MB (5.86%)
User Memory: 1928 MB (37.66%)
Spark Memory: 2892 MB (56.48%)
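The arithmetic above can be reproduced with a short Java sketch. The class and method names (`MemoryRegions`, `usableMb`, and so on) are made up for illustration and are not Spark APIs. The sketch works in whole MB and takes the 5 GB heap, the 300 MB reserve, and the two default fractions from the example as given.

```java
// Illustrative sketch only: class and method names are hypothetical, not Spark APIs.
final class MemoryRegions {
    static final long RESERVED_MB = 300;         // reserved memory from the parameter table
    static final double MEMORY_FRACTION = 0.6;   // spark.memory.fraction
    static final double STORAGE_FRACTION = 0.5;  // spark.memory.storageFraction

    // Heap minus the fixed reserve.
    static long usableMb(long heapMb) {
        return heapMb - RESERVED_MB;
    }

    // Unified Spark memory (storage + execution), rounded to whole MB.
    static long sparkMb(long heapMb) {
        return Math.round(usableMb(heapMb) * MEMORY_FRACTION);
    }

    // Whatever is left of usable memory after Spark memory.
    static long userMb(long heapMb) {
        return usableMb(heapMb) - sparkMb(heapMb);
    }

    static long storageMb(long heapMb) {
        return Math.round(sparkMb(heapMb) * STORAGE_FRACTION);
    }

    static long executionMb(long heapMb) {
        return sparkMb(heapMb) - storageMb(heapMb);
    }

    public static void main(String[] args) {
        long heapMb = 5 * 1024; // spark.executor.memory=5g
        System.out.println("Usable Memory    = " + usableMb(heapMb) + " MB");
        System.out.println("User Memory      = " + userMb(heapMb) + " MB");
        System.out.println("Spark Memory     = " + sparkMb(heapMb) + " MB");
        System.out.println("Storage Memory   = " + storageMb(heapMb) + " MB");
        System.out.println("Execution Memory = " + executionMb(heapMb) + " MB");
    }
}
```

For a 5120 MB heap this gives 4820, 1928, 2892, 1446 and 1446 MB, matching the hand calculation.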
II. How Spark memory allocation appears in the Spark UI
1. Spark UI with On Heap
- 1. Submit parameters
spark-shell \
--driver-memory 5g \
--executor-memory 5g
- 2. Spark UI display

The Spark UI shows a Storage Memory of only 2.7 GB. Let's work out where this number comes from.
- 3. Storage Memory calculation
Java Heap Memory = 5 GB
Reserved Memory = 300 MB
Usable Memory = 4820 MB
User Memory = 1928 MB
Spark Memory = 2892 MB = 2.8242 GB
Spark Storage Memory = 1446 MB = 1.4121 GB
Spark Execution Memory = 1446 MB = 1.4121 GB
The Spark UI reports a Storage Memory value of 2.7 GB, but the Storage Memory we calculated is 1.4121 GB. This tells us that Spark UI Storage Memory = Storage Memory + Execution Memory.
Storage Memory = Spark Storage Memory + Spark Execution Memory
= 1.4121 GB + 1.4121 GB
= 2.8242 GB
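The Spark UI Storage Memory (2.7 GB) still differs from our calculated Storage Memory (2.8242 GB). This is because, although we set --executor-memory 5g, the maximum heap the executor actually obtains at runtime (the value returned by Runtime.getRuntime.maxMemory) is smaller than 5 GB. In this run, Java Heap Memory is only 4772593664 bytes.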
Java Heap Memory = 4772593664 bytes = 4772593664/(1024 * 1024) = 4551 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (4551 - 300) MB = 4251 MB
User Memory = (Usable Memory * (1 -spark.memory.fraction)) = 1700.4 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 2550.6 MB
Spark Storage Memory = 1275.3 MB
Spark Execution Memory = 1275.3 MB
Spark Memory (2550.6 MB / 2.4908 GB) still does not match the Spark UI (2.7 GB). This is because we converted the Java Heap Memory from bytes to MB by dividing by 1024 * 1024, whereas the Spark UI converts bytes to MB by dividing by 1000 * 1000.
Java Heap Memory = 4772593664 bytes = 4772593664/(1000 * 1000) = 4772.593664 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (4772.593664 - 300) MB = 4472.593664 MB
User Memory = (Usable Memory * (1 -spark.memory.fraction)) = 1789.0374656 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 2683.5561984 MB = ~ 2.7 GB
Spark Storage Memory = 1341.7780992 MB
Spark Execution Memory = 1341.7780992 MB
With this, Spark Memory = (Usable Memory * spark.memory.fraction) = 2683.5561984 MB = ~ 2.7 GB, which matches the Spark UI.
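This last step can be checked with a small Java sketch. It takes the 4772593664 bytes quoted above as input and, like the calculation above, treats the reserve as 300 decimal MB; that choice and the names `UiStorageMemory`, `sparkMemoryMb` and `asDecimalGb` are assumptions made for illustration, not Spark code.

```java
// Illustrative sketch only: reproduces the decimal-unit calculation above.
final class UiStorageMemory {
    static final double DECIMAL_MB = 1000.0 * 1000.0; // bytes per MB as used in the calculation above
    static final double RESERVED_MB = 300.0;          // reserve, taken as decimal MB (assumption)
    static final double MEMORY_FRACTION = 0.6;        // spark.memory.fraction

    // Spark memory (storage + execution) in decimal MB for a given heap size in bytes.
    static double sparkMemoryMb(long heapBytes) {
        double heapMb = heapBytes / DECIMAL_MB;
        return (heapMb - RESERVED_MB) * MEMORY_FRACTION;
    }

    // Render a decimal-MB value as GB with one decimal place.
    static String asDecimalGb(double mb) {
        return String.format(java.util.Locale.ROOT, "%.1f GB", mb / 1000.0);
    }

    public static void main(String[] args) {
        double mb = sparkMemoryMb(4772593664L);
        System.out.println("Spark Memory = " + mb + " MB, displayed as " + asDecimalGb(mb));
    }
}
```

With the heap size from this run, the value comes out at roughly 2683.56 MB and is displayed as 2.7 GB.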
- 4. Bytes-to-MB conversion rules in different versions
- 4.1 Spark 2.x
function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes == 0) return '0.0 B';
    var k = 1000; // decimal units: 1 KB = 1000 B
    var dm = 1;
    var sizes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'];
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}
- 4.2 Spark 3.x
function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes <= 0) return '0.0 B';
    var k = 1024; // binary units: 1 KiB = 1024 B
    var dm = 1;
    var sizes = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB'];
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}
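To see how much the unit base changes the displayed value, the two functions above can be approximated in Java. This is a rough port for illustration, not Spark code: the `ByteFormat` class is hypothetical, it divides repeatedly instead of using logarithms, and unlike the JavaScript it always keeps one decimal place.

```java
// Illustrative sketch only: a rough Java port of the two formatBytes variants above.
final class ByteFormat {
    private static final String[] DECIMAL = {"B", "KB", "MB", "GB", "TB", "PB", "EB"};
    private static final String[] BINARY = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

    // binary = false mimics the 2.x rule (k = 1000); binary = true mimics the 3.x rule (k = 1024).
    static String format(long bytes, boolean binary) {
        if (bytes <= 0) return "0.0 B";
        int k = binary ? 1024 : 1000;
        String[] units = binary ? BINARY : DECIMAL;
        double value = bytes;
        int i = 0;
        // Step up one unit at a time until the value fits below the base.
        while (value >= k && i < units.length - 1) {
            value /= k;
            i++;
        }
        return String.format(java.util.Locale.ROOT, "%.1f %s", value, units[i]);
    }

    public static void main(String[] args) {
        // 2683.5561984 MB from the calculation above, expressed in bytes (1 MB = 1000 * 1000 B).
        long sparkMemoryBytes = 2683556198L;
        System.out.println("k = 1000: " + format(sparkMemoryBytes, false));
        System.out.println("k = 1024: " + format(sparkMemoryBytes, true));
    }
}
```

The same byte count is rendered as 2.7 GB with the decimal rule and as 2.5 GiB with the binary rule, so the figure in the UI depends on which Spark version produced it.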
2. Spark UI with OffHeap Enabled
- 1. Submit parameters
spark-shell \
--driver-memory 1g \
--executor-memory 1g \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=5g
- 2. Spark UI display

- 3. Storage Memory calculation
Storage Memory = On Heap Memory + Off Heap Memory
- 3.1.On Heap Memory
Java Heap Memory = 954728448 bytes = 954728448/1000/1000 = 954 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (954 - 300) MB = 654 MB
User Memory = (Usable Memory * (1 -spark.memory.fraction)) = 261.6 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 392.4 MB
Spark Storage Memory = 196.2 MB
Spark Execution Memory = 196.2 MB
- 3.2.Off Heap Memory
spark.memory.offHeap.size = 5 GB = 5 * 1000 MB = 5000 MB
- 3.3 Storage Memory
Storage Memory = On Heap Memory + Off Heap Memory
= 392.4 MB + 5000 MB
= 5392.4 MB
= 5.4 GB
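The sum above can be written as a small Java sketch. It takes the article's figures as inputs exactly as stated (a 954 MB heap and an off-heap size counted as 5000 MB); the `OffHeapTotal` class and its method names are hypothetical and only mirror the arithmetic, not Spark's internals.

```java
// Illustrative sketch only: mirrors the on-heap + off-heap sum above.
final class OffHeapTotal {
    static final double RESERVED_MB = 300.0;   // reserved memory
    static final double MEMORY_FRACTION = 0.6; // spark.memory.fraction

    // On-heap Spark memory (storage + execution) in MB.
    static double onHeapSparkMb(double heapMb) {
        return (heapMb - RESERVED_MB) * MEMORY_FRACTION;
    }

    // Total shown as Storage Memory: on-heap Spark memory plus the off-heap size.
    static double totalMb(double heapMb, double offHeapMb) {
        return onHeapSparkMb(heapMb) + offHeapMb;
    }

    public static void main(String[] args) {
        double total = totalMb(954.0, 5000.0); // figures as used in the article
        System.out.println(String.format(java.util.Locale.ROOT,
                "Storage Memory = %.1f MB = %.1f GB", total, total / 1000.0));
    }
}
```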
3. Demo program for calculating Spark Storage Memory
// JVM Arguments: -Xmx5g
public class SparkMemoryCalculation {
    private static final long MB = 1024 * 1024;
    private static final long RESERVED_SYSTEM_MEMORY_BYTES = 300 * MB;
    private static final double SparkMemoryStorageFraction = 0.5;
    private static final double SparkMemoryFraction = 0.6;

    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to use; this can be less than -Xmx
        long systemMemory = Runtime.getRuntime().maxMemory();
        long usableMemory = systemMemory - RESERVED_SYSTEM_MEMORY_BYTES;
        long sparkMemory = convertDoubleToLong(usableMemory * SparkMemoryFraction);
        long userMemory = convertDoubleToLong(usableMemory * (1 - SparkMemoryFraction));
        long storageMemory = convertDoubleToLong(sparkMemory * SparkMemoryStorageFraction);
        long executionMemory = convertDoubleToLong(sparkMemory * (1 - SparkMemoryStorageFraction));

        // Values in binary MB (divided by 1024 * 1024)
        printMemoryInMB("Heap Memory\t\t", systemMemory);
        printMemoryInMB("Reserved Memory", RESERVED_SYSTEM_MEMORY_BYTES);
        printMemoryInMB("Usable Memory\t", usableMemory);
        printMemoryInMB("User Memory\t\t", userMemory);
        printMemoryInMB("Spark Memory\t", sparkMemory);
        printMemoryInMB("Storage Memory\t", storageMemory);
        printMemoryInMB("Execution Memory", executionMemory);
        System.out.println();

        // Values in decimal MB (divided by 1000 * 1000), as the Spark 2.x UI displays them
        printStorageMemoryInMB("Spark Storage Memory", sparkMemory);
        printStorageMemoryInMB("Storage Memory UI \t", storageMemory);
        printStorageMemoryInMB("Execution Memory UI", executionMemory);
    }

    private static void printMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / MB) + " MB");
    }

    private static void printStorageMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / (1000 * 1000)) + " MB");
    }

    // Truncates toward zero, the same result as the deprecated new Double(val).longValue()
    private static long convertDoubleToLong(double val) {
        return (long) val;
    }
}
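Summary
References:
https://community.cloudera.com/t5/Community-Articles/Spark-Memory-Management/ta-p/317794
The original author's article reads like a work of art.
Finally, a line for everyone: "Knowledge, even the mere shadow of knowledge, will become your armor and protect you from being devoured by ignorance." (from Zhihu, "Why Read Books?")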