Troubleshooting frequent Full GCs in production
2022-07-23 17:47:00 【pilaf1990】
Background
One day, a node of one of our production services suddenly raised an alarm for frequent Full GCs. The service had not been released or restarted for half a month.
Processing steps
- First isolate the alarming container: remove the node from the gateway so that traffic no longer reaches it.
- Use jmap to dump a memory snapshot of the node.
- Analyze the resulting hprof file with JProfiler.
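We took the dump with jmap from outside the process. As a side note, a HotSpot JVM can also write an hprof file from inside the process through the HotSpotDiagnosticMXBean. The sketch below shows that alternative, not the command we actually ran; the HeapDumper class and its method are hypothetical names, and it assumes a HotSpot-based JDK.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

// Hypothetical helper: dumps the heap of the JVM it runs in (HotSpot only).
final class HeapDumper {

    // target must not exist yet; recent JDKs also require the ".hprof" extension.
    // liveOnly = true dumps only reachable objects, which forces a full GC first.
    static Path dump(Path target, boolean liveOnly) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.toString(), liveOnly);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Either way, take the dump only after the node is out of rotation: dumping a multi-gigabyte heap pauses the application for a noticeable time.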
Root cause analysis
Many tools can analyze an hprof file, for example MAT and JProfiler.
We used JProfiler. In JProfiler's Start Center, choose "Open a Single Snapshot".
Select the hprof file downloaded to the local machine, wait a moment for it to be parsed, then go straight to the "Biggest Objects" view.
There we found a single static field occupying 1851 MB, almost 2 GB!
Our JVM startup parameters include the following memory settings:
-Xms8192m
-Xmx8192m
-Xmn5120m
-XX:MetaspaceSize=384m
-XX:MaxMetaspaceSize=512m
That is, the Java heap is 8 GB and the young generation is 5 GB, which leaves only 3 GB for the old generation. The static field, of type ConcurrentHashMap, takes up nearly 2 GB of that (an object this large and this long-lived obviously sits in the old generation).
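The arithmetic can be checked with a small sketch. The class and method names are made up for illustration; the inputs are the -Xmx and -Xmn values above and the 1851 MB reading from the snapshot. It assumes a generational collector in which the old generation is whatever the heap has left after the young generation.

```java
// Hypothetical helper for the heap budget discussed above.
final class HeapBudget {

    // Old generation size = total heap (-Xmx) minus young generation (-Xmn).
    static long oldGenMb(long xmxMb, long xmnMb) {
        return xmxMb - xmnMb;
    }

    // Percentage of the old generation taken by one object graph.
    static double shareOfOldGenPercent(long objectMb, long oldGenMb) {
        return 100.0 * objectMb / oldGenMb;
    }

    public static void main(String[] args) {
        long oldGen = oldGenMb(8192, 5120);
        System.out.println("old generation MB: " + oldGen);
        System.out.println("static map share %: " + shareOfOldGenPercent(1851, oldGen));
    }
}
```

With our settings this works out to 3072 MB of old generation, of which the map alone holds roughly 60%.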
When the JVM collects garbage, it uses reachability analysis to decide whether an object is still alive: anything that can be reached from a GC root is not collected. Static fields are one kind of GC root, so everything held by our static field can never be reclaimed. A Full GC therefore cannot free enough space in the old generation, the JVM soon triggers another Full GC, that one cannot reclaim enough space either, and the service is stuck in a vicious circle.
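Reachability analysis can be illustrated with a toy model. The sketch below is not how HotSpot implements marking; it only walks a hand-written reference graph from a set of roots. All object names in it are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy mark phase: objects are names, references are edges in a map.
final class ReachabilityDemo {

    // Returns every object reachable from the given roots.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    // A made-up heap: a static field holds a map with two entries;
    // "orphan" and its child are referenced by nothing reachable.
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("staticField", Arrays.asList("map"));
        refs.put("map", Arrays.asList("entry1", "entry2"));
        refs.put("orphan", Arrays.asList("orphanChild"));
        return refs;
    }
}
```

Starting from the root "staticField", the map and both entries are marked as live, while "orphan" and "orphanChild" are not and could be reclaimed. In our case nearly 2 GB of entries sat on the live side of that line.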
Solving the problem
Looking at the static ConcurrentHashMap in the code, we found that it is a field of a class in one of our middleware libraries and is used to store the gray-release (canary) identifiers of traffic. The code only ever puts entries into it; nothing ever removes them. So if the keys are mostly distinct, the map keeps growing over time, until a Full GC can no longer reclaim the old generation.
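The pattern can be reproduced in a few lines. This is a minimal sketch, not the middleware's real code: the class name, the key format and the "gray" value are all assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the middleware class: put-only static map.
final class GrayTagRegistry {

    // Reachable from a static field, so its entries are never collected.
    static final ConcurrentMap<String, String> TAGS = new ConcurrentHashMap<>();

    // Entries are only ever added; there is no remove or eviction path.
    static void record(String key, String tag) {
        TAGS.put(key, tag);
    }

    // Simulates `requests` calls spread over `distinctKeys` different keys
    // and returns how many entries the map holds afterwards.
    static int simulate(int requests, int distinctKeys) {
        TAGS.clear(); // reset only so the simulation can be repeated
        for (int i = 0; i < requests; i++) {
            record("key-" + (i % distinctKeys), "gray");
        }
        return TAGS.size();
    }
}
```

With a handful of repeating keys the map stays tiny no matter how many requests arrive; when every request brings a new key, the map grows by one entry per request and never shrinks.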
Other services had run into the same problem some time earlier and reported it to the middleware team, who provided a fixed version and asked me to upgrade.
So why was there no problem before the upgrade? My guess is that the upstream gateway changed its rules for gray-release identifiers, so the keys put into the map suddenly became far more varied. The map could no longer deduplicate them effectively and kept growing. The people who wrote the middleware code probably assumed there would never be that many keys and that the map would stay small, so they never removed anything from it.
I checked the upgraded middleware version: it uses Guava's cache with a maximum size configured, so the number of entries never exceeds that maximum and the old generation can no longer be filled up this way. This is really a cache eviction policy: roughly, only the most recently used entries, up to the maximum size, are kept.
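The idea of a size-bounded cache can be sketched with the standard library alone. The middleware fix uses Guava (in Guava this kind of bound is configured with CacheBuilder.newBuilder().maximumSize(...)); the code below is a substitute illustration built on LinkedHashMap, and the class name is hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache with a hard upper bound on the number of entries.
final class BoundedTagCache {

    static <K, V> Map<K, V> create(final int maxSize) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each insertion: drop the least recently used entry when over the bound.
                return size() > maxSize;
            }
        };
        // In access order even get() modifies the map, so every call must be synchronized.
        return Collections.synchronizedMap(lru);
    }
}
```

After inserting a thousand distinct keys into a cache created with a bound of 3, only the last three remain. A single lock around the whole map is coarser than what Guava does internally, so treat this as a demonstration of the eviction rule rather than a drop-in replacement.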
Reflections
- JVM memory problems usually cannot be solved just by making the heap a bit larger. If the code itself is at fault, memory will run out sooner or later, and the only real fix is to change the code.
- Some memory leaks do not show up quickly but only after a long time. Our service, for example, had not been restarted for half a month, and the old generation slowly filled up with objects that could not be reclaimed. It seems that frequent releases can mask this kind of hidden memory leak.
- A local cache can improve speed, but use it carefully and check whether unused entries need to be evicted. Using Guava's cache with a fixed maximum size, for example, is an effective way to avoid this kind of leak.