Preface
In modern containerization and microservice applications , Because of the distributed environment and the complicated calling relationship ,APM(Application Performance Monitoring Application performance monitoring ) It's very important , It collects various indicators of the application and requests links , Let you know the current state of the system and the points worthy of optimization , In addition, it can help you find exceptions in the application , Help you locate the problem more conveniently .
about .NET Take it like this GC(Garbage Collector Garbage collector ) In terms of platform ,GC The indicators are also particularly important , Collection can help us analyze memory leaks 、 Optimize system performance, etc . A comprehensive collection of .NET GC indicators , As shown in the figure below .
In most scenarios it meets the requirements , But if you come across a certain moment P95 The delay increases suddenly , Asynchronous task suddenly timed out , We want to check whether these exceptions are caused by GC Of STW Time(Stop The World Time finger GC The time during which all threads are suspended ) Caused by too long , There is no way , Because these indicators are not collected at present .
So this article will take you to know , How to collect .NET GC STW Time.
Method
Such as .NET Memory performance analysis guide As mentioned in ,.NET Runtime Many events will be released during the operation , These events represent the current Runtime Operating state , Again GC Many events will be released during the operation , We can use PerfView
Tools to collect such events . Here is WorkStationGC
happen GC A sequence of events .
Microsoft-Windows-DotNETRuntime/GC/SuspendEEStart // Start pausing managed threads
Microsoft-Windows-DotNETRuntime/GC/SuspendEEStop // Pause managed thread complete
Microsoft-Windows-DotNETRuntime/GC/Start // GC Start recycling
Microsoft-Windows-DotNETRuntime/GC/Stop // GC Recycling is over
Microsoft-Windows-DotNETRuntime/GC/RestartEEStart // Resume the managed thread that was suspended before
Microsoft-Windows-DotNETRuntime/GC/RestartEEStop // Resume managed thread run complete
PS: All events can be in .NET Document official Find , Very comprehensive .
and SuspendEEStart( Pause managed threads )
To RestartEEStop( Resume managed thread run complete )
The time elapsed in the process is STW Time, We just need to record the difference between the two events , You can know this time GC STW How long does it take .
BGC It's better than WorkStationGC It's a lot more complicated , But the same thing is to measure the time spent on these two events to collect STW Time, This article does not cover too much .
Use EventSource collection
So we know which two indicators can be obtained by calculating the difference STW Time , So how to collect through code ?
Here we need to know EventSource
and EventListener
Two classes , As the name suggests, we can pass EventSource
To post events , Use EventListener
To listen for events , In this article, we mainly use EventListener
To collect GC event , For this EventSource
Class, you can see the following Microsoft documentation link , I won't give you too much introduction here .
- EventSource
- EventListener
Let's see how to useEventListener
Class listening GC event , The code is as follows :
using System.Diagnostics.Tracing;
// Turn on GC Event monitoring
var gc = new GcStwMetricsCollector();
// Create some objects
var array = Enumerable.Range(0, 1000).Select(s => (decimal)s).ToArray();
// Do it manually GC
GC.Collect();
Console.ReadLine();
public class GcStwMetricsCollector : EventListener
{
// GC keyword
private const int GC_KEYWORD = 0x0000001;
// We need to focus on GC event
private const int GCSuspendEEBegin = 9;
private const int GCRestartEEEnd = 3;
private EventSource? _eventSource;
public void Stop()
{
if (_eventSource == null)
return;
DisableEvents(_eventSource);
}
protected override void OnEventSourceCreated(EventSource eventSource)
{
_eventSource = eventSource;
// GC The incident happened in Microsoft-Windows-DotNETRuntime Under namespace
if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
{
// Enable events , The event level is Informational, Monitor only GC event
EnableEvents(eventSource, EventLevel.Informational, (EventKeywords) (GC_KEYWORD));
}
}
private long _currentStwStartTime = 0;
protected override void OnEventWritten(EventWrittenEventArgs e)
{
switch (e.EventId)
{
// Freeze managed thread start , Record current time
case GCSuspendEEBegin:
_currentStwStartTime = e.TimeStamp.Ticks;
break;
// Resume managed thread ends , Calculate the difference between the current time and the start time of the frozen managed thread
case GCRestartEEEnd:
if (_currentStwStartTime > 0)
{
var ms = TimeSpan.FromTicks(e.TimeStamp.Ticks - _currentStwStartTime).TotalMilliseconds;
_currentStwStartTime = 0;
// Output results
Console.WriteLine($"STW: {ms}ms");
}
break;
}
}
}
Running results :
STW: 0.2568ms
as for GC The enumeration value corresponding to the event , You can find it in the document I gave above .
.NET7 new API
When implementing this requirement , I noticed .NET7 There's a new one issue, Directly provides a API, So we can get the total GC STW Time, I excerpted and translated the key information .
Background and motivation
Today we are already in GetGCMemoryInfo Open access to GC Percentage value of processing time and pause time API.
Specifically, through GCMemoryInfo Of PauseTimePercentage
Field .
This is very useful , But if I just want one molecule ( namely : Total since the program was run GC Pause time ). There is no way to get it now .
API The proposal
I suggest that System.GC Add one such as the following on the API:
TimeSpan System.GC.GetTotalPauseDuration()
It will be returned GC Total pause time .
API Use
TimeSpan start = System.GC.GetTotalPauseDuration();
// ... Perform some work ...
TimeSpan end= System.GC.GetTotalPauseDuration();
Console.WriteLine(end - start + " was spent pausing in GC");
I see this API Already with the latest .NET7 Release together with the preview , We download the latest .NET7 SDK, Then change the project into .NET7, Let's try this API, The code is as follows :
using System.Diagnostics.Tracing;
// Turn on GC Event monitoring
var gc = new GcStwMetricsCollector();
// Create some objects
var array = Enumerable.Range(0, 1000).Select(s => (decimal)s).ToArray();
// Do it manually GC
GC.Collect();
Console.WriteLine($"API STW:{GC.GetTotalPauseDuration().TotalMilliseconds}ms");
Console.ReadLine();
// Omit the same code as above
Running results :
API STW: 0.223ms
Event STW: 0.296ms
API The statistics should be more accurate , We use events to get more or less a little extra overhead , However, the error is within the acceptable range .
summary
There are two ways to get .NET GC STW Time, We just need a little modification , It can be STW The monitoring function is added APM in , The following chart shows some data collected during the local test .
Of course by EventListener
More can be achieved APM Information collection , You can also study it if you are interested .
This article code link Github: https://github.com/InCerryGit/BlogCodes/tree/main/Get-GC-STW-Time
The articles :
.NET performance optimization - Recommended Collections.Pooled( Add )
.NET performance optimization - Use ValueStringBuilder String concatenation
.NET performance optimization - Use structs instead of classes