Capturing Garbage Collection Traces in NodeJS Applications
Garbage Collection (GC) is a crucial aspect of memory management in any NodeJS application. When an application is managed efficiently, it ensures smooth operation and optimal performance. However, when there is inefficient garbage collection, it can cause performance issues, leading to application slowdowns and a degraded user experience. Analyzing the garbage collection traces is vital for diagnosing memory problems and maintaining optimal performance. This small guide will explore a few methods for capturing GC
traces in a NodeJS application which I have worked on before.
Methods for Capturing
I have used three primary methods that allow me to be able to capture garbage collection traces. They are listed right below.
- Using the
--trace-gc
flag - Utilizing the
v8
module - Leveraging the
perf_hooks
module
I am going to explore each method in more detail, including their advantages, limitations, and practical implementations from my experience of using them.
Using the --trace-gc
Flag
The --trace-gc
flag is the most straightforward method for capturing GC
traces. This flag instructs NodeJS to output garbage collection information directly to the console, providing immediate insights into memory usage and GC
events. Here's the easy way to enable this right below.
node --trace-gc my-script.mjs
I responsible for maintaining and optimizing a extremely high-traffic e-commerce application for a client. It was built using TypeScript. One day, I started receiving complaints from other members of the team about the application slowing down during some of the peak shopping hours. I had investigated the issue. I chose to use the garbage collection traces to do so.
For my first debugging task to see what was wrong, I opted for the simplest and quickest method: using the --trace-gc
flag. By adding this flag to the application's startup command, I instantly saw GC
related information in the console. Here's a good idea of what this will look like.
[721159:0x61f0210] 1201125 ms: Scavenge 27.7 (28.8) -> 26.8 (29.8) MB, 0.5 / 0.2 ms (average mu = 0.999, current mu = 0.970) allocation failure
[721166:0x5889210] 1201338 ms: Scavenge 30.7 (32.1) -> 29.7 (33.1) MB, 0.6 / 0.3 ms (average mu = 0.998, current mu = 0.972) allocation failure
Each line provides detailed insights into the memory usage before and after the GC
event, the time taken for the GC
process, and whether there were any allocation failures. By going over the data that was presented, I identified that frequent garbage collection events (scavenges) are occurring, indicating a possible memory allocation issue – which turned out to be the issue in the long run.
With the information gathered from the --trace-gc
output, I had to dive deeper into the application code. The frequent GC events suggest that the application might be creating a lot of short-lived objects, leading to frequent minor garbage collections (scavenges). After reviewing parts of the source dode, I saw a part of the code focusing on sections that handle large datasets or frequent object creations.
I saw several areas where the code can be optimized to reduce memory allocation. As an example of an instance, a loop that processes user orders was creating numerous temporary objects, which could be minimized. I went in to refactor the code to reuse objects where possible and reduces the creation of unnecessary temporary objects.
Advantages:
- Simplicity: Easy to implement by adding a single flag.
- Immediate Feedback: Provides real-time
GC
information directly in the console.
Limitations:
- Performance Overhead: Continuous logging can impact performance.
- Limited Control: The flag remains active for the entire duration of the application.
Utilizing the perf_hooks
Module
The perf_hooks
module is a built-in NodeJS module designed to capture various performance metrics, including garbage collection events. This method provides a robust and flexible way to monitor and log GC
activities, allowing for detailed analysis and performance tuning. Right below is an example of how to use perf_hooks
to capture GC
Traces.
Yet again, there was issues with performance for this application which I was working on. I went back to see what the garbage collector reporting. Everything seemed just fine for the time being – till once again peak times hit then everything seemed to go to hell. This time I chose to go with the perf_hooks
module to see what was going on. My goal here was to catch performance metrics that can be more analyzed and easier to understand the root cause of the issue.
I started by importing the perf_hooks
module into the application code that I was working with. The important looked something like the following.
const { performance, PerformanceObserver } = require('perf_hooks');
Next for all of this to work, I needed to setup a simple PerformanceObserver
to listen for the events that could be causing our issue. It also provided insights into when and how often garbage collection is happening.
const obs = new PerformanceObserver((list) => {
const entries = list.getEntries();
for (const entry of entries) {
// Log GC events to the console
console.log(entry);
}
});
obs.observe({ entryTypes: ['gc'], buffered: true });
function processData(data) {
let processedData = [];
for (let item of data) {
let temp = { ...item };
processedData.push(temp);
}
return processedData;
}
setInterval(() => {
processData(generateData());
}, 1000);
obs.disconnect();
After I managed to setup an observer to work the way I needed – the source code of the application was a mess, but that is beside the point here – to run and start monitoring the application in the console. Here is an example output that was provided to be in JSON
formatting.
{
"kind": "mark_sweep_compact",
"startTime": 864.659982532,
"duration": 7.824,
"entryType": "gc",
"name": "GC Event"
}
{
"kind": "scavenge",
"startTime": 874.589382193,
"duration": 3.245,
"entryType": "gc",
"name": "GC Event"
}
Each of the entries include the type of GC
event (e.g., mark_sweep_compact
or scavenge
), the start time, duration, and additional metadata. By examining these logs, I saw the frequency and duration of GC
events, which provides crucial insights into the memory management behavior of the application.
I noticed that the scavenge
events are occurring frequently, with relatively short durations, while mark_sweep_compact
events take longer and happen less often. The frequent scavenge
events suggest that many short-lived objects are being created and quickly discarded.
To address this major issue the code was given me, I had to review the data processing logic and identifies opportunities to optimize memory usage. For example, instead of creating a new object for each item, I chose to modify the code to reuse existing objects where possible.
function processData(data) {
let processedData = [];
let temp;
for (let item of data) {
temp = { ...item };
processedData.push(temp);
}
return processedData;
}
After I was done implementing these optimizations, I decided to run the application again with the perf_hooks
monitoring enabled. The new logs show a reduced frequency of scavenge
events, indicating that the optimizations have helped decrease the number of short-lived objects and thus reduced the pressure on the garbage collector.
Advantages:
- Detailed Insights: The
perf_hooks
module provides comprehensive data about GC events, including types, durations, and timings. - Flexible Monitoring: GC tracing can be enabled or disabled programmatically, allowing for targeted performance monitoring without impacting the entire application.
Limitations:
- Initial Setup: Setting up
perf_hooks
requires more effort compared to simply using the--trace-gc
flag. - Performance Impact: Monitoring GC events can introduce performance overhead, particularly during high-frequency GC events, so it's essential to balance monitoring needs with performance considerations.
Utilizing the v8
Module
The v8
module offers a lot more granular control over when garbage collection traces are captured. This module allows you to dynamically enable or disable the --trace-gc
flag at the runtime. This will make it useful for debugging specific parts of your code without affecting the entire application's performance.
This is the last trick I have to use. I was tasked with general optimizations for the application code to make it way faster for my client. I am able to tell you that the garbage collection could sometimes be a contributing factor to intermittent slowdowns during peak time hours of the application running.
The v8
module provides the ability to enable and disable the --trace-gc
flag dynamically. This allows Jordan to activate GC
tracing only when necessary, such as during specific transaction processing routines, rather than having it run continuously.
I first start out by importing the v8
module into the code.
import v8 from 'v8';
Next, I needed to write code to enable and disable the --trace-gc
flag around the critical sections of the application. For instance, I needed to monitor GC
activity during the peak transaction processing phase once again.
function processTransactions(transactions) {
v8.setFlagsFromString('--trace-gc');
transactions.forEach(transaction => {
let processedTransaction = { ...transaction, processed: true };
console.log('Processing transaction:', processedTransaction);
});
// Disable GC tracing
v8.setFlagsFromString('--notrace-gc');
}
const transactions = generateTransactions();
processTransactions(transactions);
After running the application, I saw the console output that I got to see how transactions were processed detailed in the GC
traces.
Each line provides insights into memory usage before and after the GC
event, the time taken, and any allocation failures. This information helps me identify that frequent scavenging is occurring during high transaction volumes, suggesting that the application creates many short-lived objects.
Based on the GC
traces which I got back from the console, I managed to identify specific parts of the transaction processing code that can be optimized. As an example, I had to refactor the code to minimize the creation of temporary objects.
After implementing these optimizations, Jordan runs the application again with GC tracing enabled during transaction processing. The frequency of GC
events decreases, and the application’s performance improves significantly.
Advantages:
- Dynamic Control: The
v8
module allows enabling and disablingGC
tracing as needed, providing flexibility to target specific code sections. - Focused Debugging: By monitoring
GC
activity only during critical operations, developers can pinpoint and resolve memory-related issues more efficiently.
Limitations:
- Complexity: Using the
v8
module requires additional code and understanding of when and where to enableGC
tracing. - Performance Impact: While more controlled than global tracing, enabling
GC
tracing still introduces some performance overhead.
The Conclusion
Capturing and analyzing garbage collection traces is essential for maintaining optimal performance and diagnosing memory issues in NodeJS applications. By using the --trace-gc
flag, the v8
module, and the perf_hooks
module, developers can gain valuable insights into GC
behavior and address performance bottlenecks effectively. Each method offers unique advantages and flexibility, making it possible to tailor the GC
tracing approach to the specific needs of the application.
Incorporating these practices into your development and debugging workflow can lead to more efficient memory management, smoother application performance, and a better overall user experience. By understanding and leveraging the tools available for GC
tracing, you can ensure that your NodeJS applications run efficiently and reliably.