Spark memory management mechanism new version
2022-07-24 01:34:00 【The south wind knows what I mean】
Spark Memory Management mechanism
I. Memory parameters
| Property Name | Default |
|---|---|
| spark.memory.fraction | 0.6 |
| spark.memory.storageFraction | 0.5 |
| RESERVED_SYSTEM_MEMORY_BYTES (internal constant) | 300 MB |
1. Schematic diagram:

2. Example:
Calculate the memory regions for an executor with 5 GB of memory.
To calculate the Reserved, User, Spark, Storage, and Execution memory, we use the following parameters:
spark.executor.memory=5g
spark.memory.fraction=0.6
spark.memory.storageFraction=0.5
Java Heap Memory = 5 GB
                 = 5 * 1024 MB
                 = 5120 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory)
              = 5120 MB - 300 MB
              = 4820 MB
User Memory = Usable Memory * (1.0 - spark.memory.fraction)
            = 4820 MB * (1.0 - 0.6)
            = 4820 MB * 0.4
            = 1928 MB
Spark Memory = Usable Memory * spark.memory.fraction
             = 4820 MB * 0.6
             = 2892 MB
Spark Storage Memory = Spark Memory * spark.memory.storageFraction
                     = 2892 MB * 0.5
                     = 1446 MB
Spark Execution Memory = Spark Memory * (1.0 - spark.memory.storageFraction)
                       = 2892 MB * (1.0 - 0.5)
                       = 2892 MB * 0.5
                       = 1446 MB

As a share of the 5120 MB heap:
Reserved Memory: 300 MB (5.86%)
User Memory: 1928 MB (37.66%)
Spark Memory: 2892 MB (56.48%)
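The arithmetic above can be sketched as a small Java program. This is a minimal illustration rather than Spark's own code: the class name `UnifiedMemorySplit` and the method `split` are hypothetical, and the 300 MB reserve and the two fractions are simply the defaults from the parameter table.

```java
import java.util.Locale;

// Minimal sketch of the on-heap split walked through above (not Spark's own code).
// UnifiedMemorySplit and split() are hypothetical names used only for illustration.
final class UnifiedMemorySplit {
    static final double RESERVED_MB = 300.0;     // reserved memory from the parameter table
    static final double MEMORY_FRACTION = 0.6;   // spark.memory.fraction
    static final double STORAGE_FRACTION = 0.5;  // spark.memory.storageFraction

    /** Returns {usable, user, spark, storage, execution} in MB for a heap size in MB. */
    static double[] split(double heapMb) {
        double usable = heapMb - RESERVED_MB;
        double spark = usable * MEMORY_FRACTION;
        double user = usable * (1.0 - MEMORY_FRACTION);
        double storage = spark * STORAGE_FRACTION;
        double execution = spark * (1.0 - STORAGE_FRACTION);
        return new double[] {usable, user, spark, storage, execution};
    }

    public static void main(String[] args) {
        double[] r = split(5 * 1024); // spark.executor.memory=5g
        // Prints: Usable=4820 User=1928 Spark=2892 Storage=1446 Execution=1446 MB
        System.out.printf(Locale.ROOT,
                "Usable=%.0f User=%.0f Spark=%.0f Storage=%.0f Execution=%.0f MB%n",
                r[0], r[1], r[2], r[3], r[4]);
    }
}
```

Changing `MEMORY_FRACTION` or `STORAGE_FRACTION` in the sketch shows how the regions shift when the corresponding properties are tuned.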
II. How Spark memory allocation appears in the Spark UI
1. Spark UI with on-heap memory
- 1. Submit parameters
spark-shell \
--driver-memory 5g \
--executor-memory 5g
- 2. Spark UI display

The UI shows a Storage Memory of only 2.7 GB. Let's work out where this number comes from.
- 3. Storage Memory calculation
Java Heap Memory = 5 GB
Reserved Memory = 300 MB
Usable Memory = 4820 MB
User Memory = 1928 MB
Spark Memory = 2892 MB = 2.8242 GB
Spark Storage Memory = 1446 MB = 1.4121 GB
Spark Execution Memory = 1446 MB = 1.4121 GB
The Spark UI reports a Storage Memory of 2.7 GB, but we calculated a Storage Memory of 1.4121 GB. The reason is that the Storage Memory shown in the Spark UI is the sum of storage memory and execution memory:
Storage Memory = Spark Storage Memory + Spark Execution Memory
               = 1.4121 GB + 1.4121 GB
               = 2.8242 GB
The calculated value (2.8242 GB) still does not match the Spark UI (2.7 GB). We set --executor-memory 5g, but the maximum heap that Spark actually obtains at runtime (the value of Runtime.getRuntime().maxMemory(), as used in the demo program below) is smaller than 5 GB: the Java heap is only 4772593664 bytes.
Java Heap Memory = 4772593664 bytes = 4772593664 / (1024 * 1024) ≈ 4551 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (4551 - 300) MB = 4251 MB
User Memory = (Usable Memory * (1 - spark.memory.fraction)) = 1700.4 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 2550.6 MB
Spark Storage Memory = 1275.3 MB
Spark Execution Memory = 1275.3 MB
Spark Memory (2550.6 MB = 2.4908 GB) still does not match the Spark UI (2.7 GB). This is because we converted the Java heap size from bytes to MB by dividing by 1024 * 1024, whereas the Spark UI here divides by 1000 * 1000.
Java Heap Memory = 4772593664 bytes = 4772593664 / (1000 * 1000) = 4772.593664 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (4772.593664 - 300) MB = 4472.593664 MB
User Memory = (Usable Memory * (1 - spark.memory.fraction)) = 1789.0374656 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 2683.5561984 MB ≈ 2.7 GB
Spark Storage Memory = 1341.7780992 MB
Spark Execution Memory = 1341.7780992 MB
Thus Spark Memory = (Usable Memory * spark.memory.fraction) = 2683.5561984 MB ≈ 2.7 GB, which matches the value shown in the Spark UI.
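The same result can be checked entirely in bytes, which avoids mixing 1024-based and 1000-based units. The sketch below is an illustration under stated assumptions, not Spark's implementation: the class name `UiSparkMemory` is hypothetical, the heap size is the value reported above, and the reserve is taken as 300 * 1024 * 1024 bytes, as in the demo program at the end of this article. The byte count therefore differs slightly from the MB-based arithmetic above, but it rounds to the same 2.7 GB.

```java
import java.util.Locale;

// Hypothetical helper: on-heap Spark memory (storage + execution) computed in bytes.
final class UiSparkMemory {
    // Assumption: 300 MB reserve expressed in bytes, as in the article's demo program.
    static final long RESERVED_BYTES = 300L * 1024 * 1024;
    static final double MEMORY_FRACTION = 0.6; // spark.memory.fraction

    static long sparkMemoryBytes(long maxHeapBytes) {
        return (long) ((maxHeapBytes - RESERVED_BYTES) * MEMORY_FRACTION);
    }

    public static void main(String[] args) {
        long heap = 4772593664L; // max heap reported for --executor-memory 5g
        long spark = sparkMemoryBytes(heap);
        System.out.println(spark + " bytes"); // 2674812518 bytes
        // 1000-based units: 2.7 GB
        System.out.println(String.format(Locale.ROOT, "%.1f GB", spark / 1e9));
        // 1024-based units: 2.5 GiB
        System.out.println(String.format(Locale.ROOT, "%.1f GiB",
                spark / (1024.0 * 1024 * 1024)));
    }
}
```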
- 4. Byte-to-MB conversion rules in different versions
- 4.1 Spark 2.x
function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes == 0) return '0.0 B';
    var k = 1000;  // decimal units: 1 KB = 1000 B
    var dm = 1;    // one decimal place
    var sizes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'];
    // index of the largest unit that does not exceed the value
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}
- 4.2 Spark 3.x
function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes <= 0) return '0.0 B';
    var k = 1024;  // binary units: 1 KiB = 1024 B
    var dm = 1;    // one decimal place
    var sizes = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB'];
    // index of the largest unit that does not exceed the value
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}
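To see the effect of the two conversion rules side by side, here is a rough Java port of the formatting logic. It is only a sketch: the class `ByteFormatSketch` and its `format` method are hypothetical names, the base and unit labels are passed in as parameters, and unlike the JavaScript original it does not strip a trailing ".0".

```java
import java.util.Locale;

// Hypothetical Java port of the formatBytes logic, parameterised by base.
final class ByteFormatSketch {
    static final String[] DECIMAL_UNITS = {"B", "KB", "MB", "GB", "TB", "PB", "EB"};
    static final String[] BINARY_UNITS = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

    static String format(long bytes, int k, String[] units) {
        if (bytes <= 0) return "0.0 B";
        // Index of the largest unit that does not exceed the value.
        int i = (int) Math.floor(Math.log(bytes) / Math.log(k));
        double value = bytes / Math.pow(k, i);
        return String.format(Locale.ROOT, "%.1f %s", value, units[i]);
    }

    public static void main(String[] args) {
        long sparkMemory = 2683556198L; // about 2683.556 MB, from the calculation above
        System.out.println(format(sparkMemory, 1000, DECIMAL_UNITS)); // 2.7 GB
        System.out.println(format(sparkMemory, 1024, BINARY_UNITS));  // 2.5 GiB
    }
}
```

The same byte count is therefore displayed as 2.7 GB under the 1000-based rule and as 2.5 GiB under the 1024-based rule.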
2. Spark UI with off-heap enabled
- 1. Submit parameters
spark-shell \
--driver-memory 1g \
--executor-memory 1g \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=5g
- 2. Spark UI display

- 3. Storage Memory calculation
Storage Memory = On Heap Memory + Off Heap Memory
- 3.1 On-heap memory
Java Heap Memory = 954728448 bytes = 954728448 / 1000 / 1000 ≈ 954 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (954 - 300) MB = 654 MB
User Memory = (Usable Memory * (1 - spark.memory.fraction)) = 261.6 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 392.4 MB
Spark Storage Memory = 196.2 MB
Spark Execution Memory = 196.2 MB
- 3.2 Off-heap memory
spark.memory.offHeap.size = 5 GB = 5 * 1000 MB = 5000 MB
- 3.3 Storage Memory
Storage Memory = On Heap Memory + Off Heap Memory
               = 392.4 MB + 5000 MB
               = 5392.4 MB
               ≈ 5.4 GB
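The off-heap case can be sketched the same way. This follows the MB-based arithmetic of this section, including its treatment of the 5g off-heap size as 5000 MB, which is the article's simplification rather than something verified against Spark. `OffHeapUiSketch` and its methods are hypothetical names.

```java
import java.util.Locale;

// Hypothetical sketch: UI "Storage Memory" = on-heap Spark memory + off-heap size.
final class OffHeapUiSketch {
    static final double RESERVED_MB = 300.0;   // reserved memory
    static final double MEMORY_FRACTION = 0.6; // spark.memory.fraction

    /** On-heap Spark memory (storage + execution) in MB. */
    static double onHeapSparkMb(double heapMb) {
        return (heapMb - RESERVED_MB) * MEMORY_FRACTION;
    }

    /** Total shown in the UI, in MB, when off-heap memory is enabled. */
    static double uiStorageMb(double heapMb, double offHeapMb) {
        return onHeapSparkMb(heapMb) + offHeapMb;
    }

    public static void main(String[] args) {
        // 954 MB heap and 5000 MB off-heap, as used in this section.
        double totalMb = uiStorageMb(954.0, 5000.0);
        // Prints: 5392.4 MB = 5.4 GB
        System.out.println(String.format(Locale.ROOT, "%.1f MB = %.1f GB",
                totalMb, totalMb / 1000.0));
    }
}
```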
3. Demo program for the Spark storage memory calculation
// JVM arguments: -Xmx5g
public class SparkMemoryCalculation {
    private static final long MB = 1024 * 1024;
    // Reserved system memory: 300 MB
    private static final long RESERVED_SYSTEM_MEMORY_BYTES = 300 * MB;
    private static final double SparkMemoryStorageFraction = 0.5; // spark.memory.storageFraction
    private static final double SparkMemoryFraction = 0.6;        // spark.memory.fraction

    public static void main(String[] args) {
        // Maximum heap the JVM will use; smaller than the -Xmx value
        long systemMemory = Runtime.getRuntime().maxMemory();
        long usableMemory = systemMemory - RESERVED_SYSTEM_MEMORY_BYTES;
        long sparkMemory = convertDoubleToLong(usableMemory * SparkMemoryFraction);
        long userMemory = convertDoubleToLong(usableMemory * (1 - SparkMemoryFraction));
        long storageMemory = convertDoubleToLong(sparkMemory * SparkMemoryStorageFraction);
        long executionMemory = convertDoubleToLong(sparkMemory * (1 - SparkMemoryStorageFraction));

        // Values in MB using 1024 * 1024 bytes per MB
        printMemoryInMB("Heap Memory\t\t", systemMemory);
        printMemoryInMB("Reserved Memory", RESERVED_SYSTEM_MEMORY_BYTES);
        printMemoryInMB("Usable Memory\t", usableMemory);
        printMemoryInMB("User Memory\t\t", userMemory);
        printMemoryInMB("Spark Memory\t", sparkMemory);
        printMemoryInMB("Storage Memory\t", storageMemory);
        printMemoryInMB("Execution Memory", executionMemory);
        System.out.println();

        // Values in MB using 1000 * 1000 bytes per MB, as in the Spark 2.x UI
        printStorageMemoryInMB("Spark Storage Memory", sparkMemory);
        printStorageMemoryInMB("Storage Memory UI \t", storageMemory);
        printStorageMemoryInMB("Execution Memory UI", executionMemory);
    }

    private static void printMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / MB) + " MB");
    }

    private static void printStorageMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / (1000 * 1000)) + " MB");
    }

    // Truncates the fractional part
    private static long convertDoubleToLong(double val) {
        return (long) val;
    }
}
summary
References:
https://community.cloudera.com/t5/Community-Articles/Spark-Memory-Management/ta-p/317794
The referenced article, written by an expert, reads like a work of art.
Finally, a quote to share: "Knowledge, even the illusion of knowledge, becomes your armor and protects you from ignorance." (From Zhihu, "Why read books?")