
How JVM Garbage Collection Works: Types, Memory Impact, and Tuning

From Minor GC to ZGC: how each Java garbage collector manages heap memory and what it means for your application's latency and throughput.

Abstract Algorithms · 26 min read

TLDR: JVM garbage collection automatically reclaims unused heap memory, but every algorithm makes a different trade-off between throughput, latency, and memory footprint. The default G1GC targets 200 ms pause goals and works well for most services. For sub-millisecond pauses on large heaps, use ZGC (Java 15+). For pure batch throughput, Parallel GC wins. Understanding heap regions (Eden, Survivor, Old Gen, Metaspace) tells you why OutOfMemoryError happens, what triggers a full GC, and which JVM flags to tune first.

🚨 When the JVM Stops the World: A Production Incident

It is 2:47 AM. Your payment microservice is paging the on-call engineer. P99 latency has spiked from 180 ms to 34,000 ms. The load balancer is timing out requests. Customers cannot complete checkout. The service never crashed - it just went completely unresponsive for 30 seconds at a time, every three minutes, with clockwork regularity.

The APM dashboard shows the problem clearly once you know what to look for: full GC pause, 30,478 ms, Old Generation 99.8% full. The service is spending more time doing garbage collection than serving traffic.

This scenario has played out at dozens of companies. A fintech startup running a product recommendation engine on Java 11 with 4 GB of heap and default JVM settings saw the same pattern. Their Old Generation was filling with cached user session objects. The JVM's response was to trigger a full stop-the-world GC, halting every application thread until the collection finished. Switching to G1GC with explicit heap sizing (-Xms4g -Xmx4g) and setting -XX:MaxGCPauseMillis=100 cut those pauses from 30 seconds to under 200 ms without any code changes.

Understanding how JVM garbage collection actually works - which memory regions exist, what each GC algorithm does internally, and what its pause characteristics are - is the difference between guessing at JVM flags and making targeted, effective tuning decisions. This post builds that understanding from the ground up.

📖 Inside the JVM Heap: Memory Regions and Their Roles

Before examining any GC algorithm, you need a clear model of the heap's internal structure. The JVM divides memory into several distinct regions, each with a different object lifecycle role.

The diagram below shows the full JVM memory layout. The heap (where GC operates) is split into Young Generation and Old Generation. Metaspace sits outside the heap in native memory and holds class metadata rather than object instance data.

graph TD
    JVM[JVM Process Memory] --> Heap[Heap - GC Managed]
    JVM --> NonHeap[Non-Heap Memory]

    Heap --> YoungGen[Young Generation]
    Heap --> OldGen[Old Generation - Tenured Space]

    YoungGen --> Eden[Eden Space - new allocations]
    YoungGen --> S0[Survivor Space S0]
    YoungGen --> S1[Survivor Space S1]

    NonHeap --> Metaspace[Metaspace - class metadata]
    NonHeap --> CodeCache[Code Cache - JIT compiled code]
    NonHeap --> Stacks[Thread Stacks - per-thread frames]

The Young Generation is where every new object starts. Eden Space is the primary allocation area - the JVM uses fast pointer-bump allocation here, making object creation essentially free. The two Survivor spaces (S0 and S1) are used in a ping-pong fashion: after each minor GC, live objects from Eden and the active Survivor space are copied into the inactive Survivor space, incrementing the object's age counter. Objects that survive enough GC cycles (default threshold: 15) are promoted to the Old Generation.

The Old Generation holds long-lived objects - caches, session state, database connection pools. It is larger than Young Gen (typically 2-3x in ratio). When Old Gen fills, the JVM must run a major or full GC, which is far more expensive than a minor GC.

Metaspace (introduced in Java 8 to replace PermGen) holds class definitions, method bytecode, and constant pool data. It grows dynamically in native memory. Without an explicit -XX:MaxMetaspaceSize cap, a class loader leak can exhaust native memory entirely.
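You can inspect these regions from inside a running JVM through the standard management beans - a quick way to confirm which pools your collector actually creates. A minimal sketch (note that pool names like "G1 Eden Space" are collector-specific, so the exact output depends on the GC in use):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.List;
import java.util.stream.Collectors;

// Lists the memory pools the running JVM exposes: heap pools (Eden, Survivor,
// Old Gen) plus non-heap pools (Metaspace, Code Cache segments).
public class MemoryRegions {
    static List<String> poolNames() {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .map(MemoryPoolMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.printf("%-30s type=%-8s used=%d bytes%n",
                    pool.getName(), pool.getType(), pool.getUsage().getUsed());
        }
    }
}
```

Running this under G1 shows pools like "G1 Eden Space" and "G1 Old Gen"; under ZGC you see a single "ZHeap" pool, which reflects its non-generational region model.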

โš™๏ธ Young Generation Collections: How Minor GC Keeps Eden Flowing

A Minor GC fires whenever Eden Space is full. It operates exclusively on the Young Generation, which is why it is fast - typically 5 to 50 ms even on large heaps. The collection algorithm is mark-copy: live objects are identified (marked) by tracing from GC Roots, then copied into the currently empty Survivor space, which becomes the active one. The old Eden and the previously active Survivor space are then cleared in a single sweep.

GC Roots are the anchor points the JVM uses to determine object reachability. An object is "live" if it is reachable from any GC Root - everything else is garbage. GC Roots include:

  • Local variables and parameters in active stack frames
  • Static fields of loaded classes
  • Active JNI references (objects held by native code)
  • Monitor objects held by synchronized threads
  • Objects referenced from the JVM's own internal structures

The key insight about Minor GC: it is a stop-the-world event - all application threads pause - but because the Young Generation is small and most allocations die young (the "generational hypothesis"), the pause is short and predictable. In a healthy JVM, you want most objects to die in Eden before they ever reach a Survivor space.

Object promotion happens when an object's age counter (incremented on each surviving GC) reaches the tenuring threshold, or when the Survivor space is too full to hold all surviving objects. Premature promotion - where short-lived but large objects get pushed to Old Gen before they should - is a common driver of unnecessary Full GCs and should be on your radar when diagnosing GC problems.
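The generational hypothesis is easy to observe from code: allocate a burst of objects that die immediately and watch the JVM's collection counters. A small sketch using the standard GarbageCollectorMXBean (collector bean names such as "G1 Young Generation" or "PS Scavenge" vary by GC, so the sketch just sums all collectors):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Allocates a burst of short-lived objects and reports how many GC cycles ran.
// Nearly all of this garbage dies in Eden, so the cycles are cheap Minor GCs.
public class MinorGcDemo {
    static long totalCollections() {
        long n = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            n += Math.max(0, gc.getCollectionCount()); // -1 means "not supported"
        }
        return n;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        for (int i = 0; i < 500_000; i++) {
            byte[] shortLived = new byte[1024]; // dies on the next iteration
            if (shortLived.length == 0) System.out.println("unreachable");
        }
        System.out.println("collections during burst: " + (totalCollections() - before));
    }
}
```

The dead-looking branch keeps the JIT from eliminating the allocation entirely; on a default-sized heap the ~500 MB of ephemeral garbage typically triggers several Minor GCs and no Full GC.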

๐Ÿ” Old Generation, Metaspace, and GC Roots: The Memory Management Foundation

The Old Generation accumulates promoted objects. Its collection algorithms vary by GC choice, but all share the same trigger: Old Gen fill percentage crosses a threshold (default 45% for G1, configurable for others).

Unlike Minor GC, which uses mark-copy on a small region, Old Generation collection must handle a much larger space with a different algorithm:

  • Mark-Sweep-Compact: mark live objects → sweep dead ones → compact remaining objects to eliminate fragmentation. Used by Serial and Parallel GC. Always stop-the-world.
  • Concurrent collection: CMS, G1, ZGC, and Shenandoah do most marking and sweeping concurrently with application threads running, reducing or eliminating stop-the-world pauses.

A Full GC is the worst case - it collects both Young and Old Generation together, using a stop-the-world compaction. Full GC events are typically triggered by:

  1. Explicit System.gc() calls in application code
  2. Concurrent GC mode failure (CMS or G1 can't complete a cycle before Old Gen fills)
  3. Humongous object allocation failures (in G1)
  4. Metaspace expansion triggering a full collection

The 30-second pause in the opening scenario was a Full GC event. The Old Generation had no concurrent collector keeping up, so the JVM fell back to a monolithic stop-the-world scan and compaction of the entire heap.

🧬 Metaspace: The Class Metadata Region That Lives Outside the Heap

Metaspace replaced PermGen in Java 8. Before this change, class metadata was stored in a fixed-size heap region, and an OutOfMemoryError: PermGen space failure was a common deployment problem in heavy frameworks like JBoss or OSGi.

Metaspace solves the fixed-size problem by allocating class metadata in native (OS-managed) memory. It grows on demand. The downside: without an explicit cap, a class loader leak will grow Metaspace until the JVM crashes with OutOfMemoryError: Metaspace.

Metaspace is most often a problem in environments with dynamic class loading: hot-deploy application servers (Tomcat, JBoss), OSGi containers, scripting engines (Groovy, JRuby), or annotation processors that generate classes at runtime. Each redeploy that does not properly unload the old class loader's classes leaves Metaspace fragmented with orphaned class metadata.
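Metaspace occupancy is exposed through the same MemoryPoolMXBean API as the heap pools, so the slow upward creep of a class loader leak can be watched without external tooling. A minimal sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Reads current Metaspace occupancy from the standard MX beans - the same data
// jcmd and most APM agents report. A steadily climbing "used" value across
// redeploys is the classic class loader leak signature.
public class MetaspaceWatch {
    static MemoryUsage metaspace() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().equals("Metaspace")) {
                return pool.getUsage();
            }
        }
        throw new IllegalStateException("Metaspace pool not found");
    }

    public static void main(String[] args) {
        MemoryUsage u = metaspace();
        // max is -1 when no -XX:MaxMetaspaceSize cap is set
        System.out.printf("Metaspace used=%d committed=%d max=%d%n",
                u.getUsed(), u.getCommitted(), u.getMax());
    }
}
```

Logging this value once per redeploy cycle is often enough to catch a leak weeks before it exhausts native memory.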

🧠 The Six GC Algorithms: Serial, Parallel, CMS, G1, ZGC, and Shenandoah

The JVM ships with six distinct garbage collection implementations. Each makes a specific engineering trade-off. Here is how to understand each one.

GC Algorithm Internals: Mark, Copy, Sweep, and Compact

Every GC algorithm is built from four fundamental operations: mark (identify live objects by tracing from GC roots), copy (move live objects to a new region - used in Young Gen), sweep (scan a region and reclaim dead objects in place - used in Old Gen), and compact (slide live objects together to eliminate fragmentation - the most expensive step).

Young Gen collectors universally use mark-copy because it is fast: copying live objects to Survivor space implicitly clears the source (Eden) in a single pass. Old Gen collectors diverge here - Serial and Parallel use mark-sweep-compact (stop-the-world), CMS uses mark-sweep without compaction (leading to fragmentation), while G1, ZGC, and Shenandoah use concurrent marking with incremental or background compaction to avoid long stop-the-world phases.
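The mark operation is, at its core, a graph traversal starting from the GC Roots. A toy sketch of that reachability pass (real collectors traverse raw heap words with tricolor marking, but the logic is the same traversal):

```java
import java.util.*;

// A toy mark phase: objects are named graph nodes, GC Roots are the starting
// set, and anything the traversal never reaches is garbage.
public class ToyMark {
    // "heap": each object maps to the objects it references
    static Map<String, List<String>> heap = new HashMap<>();

    static Set<String> mark(List<String> roots) {
        Set<String> live = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            String obj = stack.pop();
            if (live.add(obj)) {                       // first visit: mark it
                stack.addAll(heap.getOrDefault(obj, List.of()));
            }
        }
        return live;                                   // everything else is garbage
    }

    public static void main(String[] args) {
        heap.put("stackFrame", List.of("order"));
        heap.put("order", List.of("lineItem"));
        heap.put("orphanCache", List.of("staleEntry")); // nothing references this
        Set<String> live = mark(List.of("stackFrame"));
        System.out.println("live = " + live);
        System.out.println("orphanCache is garbage: " + !live.contains("orphanCache"));
    }
}
```

Note that the orphaned pair is unreachable even though its members reference each other - reachability from roots, not reference counting, is what decides liveness.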

Performance Analysis: Stop-the-World Pause Times by Algorithm

The performance of a GC algorithm is measured primarily by two metrics: STW pause duration (how long application threads stop) and throughput (percentage of CPU time spent on application work vs. GC). These two metrics are in direct tension โ€” reducing pause times typically requires more concurrent GC work, which consumes CPU that could otherwise execute application code.

The table in the following section provides precise pause ranges per algorithm. The key insight is that Serial and Parallel GC trade pause time for throughput simplicity, while G1 uses region-based incremental collection to bound pauses, and ZGC/Shenandoah use pointer-level barriers to push nearly all GC work off the stop-the-world path entirely.

Serial GC - Single-Threaded Simplicity for Tiny Heaps

Problem it solves: Minimum GC overhead and memory footprint for embedded or microcontainer environments.

How it works: Serial GC uses a single GC thread for all collections. Young Gen uses mark-copy; Old Gen uses mark-sweep-compact. Because everything is single-threaded, there is zero synchronization overhead between GC threads - but application threads are fully paused for the entire collection.

When to use: Heap sizes under 256 MB, containers with a single CPU, development environments, serverless functions with fast cold starts. Never use in a service handling concurrent user traffic.

Enable with: -XX:+UseSerialGC

Parallel GC - Throughput-First for Batch Workloads

Problem it solves: Maximizing the percentage of CPU time spent doing actual work (not GC) for batch or analytics jobs.

How it works: Parallel GC uses multiple GC threads for both Young and Old Generation collections. Every collection is still stop-the-world, but with N GC threads (default: number of CPUs), collection time drops proportionally. Young Gen uses parallel mark-copy; Old Gen uses parallel mark-sweep-compact.

When to use: ETL pipelines, data processing jobs, batch report generation - any workload where total throughput matters more than individual request latency. LinkedIn's offline data pipelines historically used Parallel GC for exactly this reason. It was the default JVM GC before Java 9.

Enable with: -XX:+UseParallelGC
Tune with: -XX:ParallelGCThreads=<n>

CMS (Concurrent Mark-Sweep) - Low-Pause Pioneer, Now Legacy

Problem it solves: Reducing Old Generation pause times for interactive applications. Young Gen is still collected by the stop-the-world ParNew collector; only the Old Generation phases run concurrently.

How it works: CMS splits Old Gen collection into phases:

  1. Initial Mark (STW) - mark objects directly reachable from GC Roots
  2. Concurrent Mark - trace the object graph while application threads run
  3. Concurrent Preclean - find objects modified during step 2
  4. Remark (STW) - finalize marking for objects changed during concurrent phase
  5. Concurrent Sweep - reclaim dead objects while application threads run
  6. Concurrent Reset - prepare for next cycle

The critical limitation: CMS does not compact the Old Generation. Over time, dead objects leave gaps ("fragmentation"). When a large object cannot find a contiguous free region despite enough total free memory, CMS falls into "concurrent mode failure" and triggers a full stop-the-world compaction - often the longest pause you will ever see.

Status: Deprecated in Java 9, removed in Java 14. Avoid for new systems.
Enable with: -XX:+UseConcMarkSweepGC (Java 8โ€“13 only)

G1 GC (Garbage-First) - The Production-Ready Default

Problem it solves: Predictable pause times across large heaps without the fragmentation problems of CMS.

How it works: G1 abandons the contiguous Young/Old Gen model entirely. Instead, it divides the heap into hundreds or thousands of equal-sized regions (1-32 MB each, depending on total heap size). Each region is dynamically classified as Eden, Survivor, Old, or Humongous (for objects ≥ 50% of a region).

During a Young GC, G1 evacuates all Eden and Survivor regions - fully stop-the-world, but limited to only Young regions. During Mixed GC, G1 evacuates Young regions plus the Old regions with the highest ratio of dead objects (hence "Garbage-First": it prioritizes the richest garbage). This allows G1 to reclaim Old Gen incrementally without a full sweep.

G1's concurrent marking phase runs in parallel with application threads, building a liveness map of all regions. This lets the collector predict pause times and limit collection to fit within the target pause window.

When to use: Default choice for any heap between 4 GB and 100 GB serving interactive traffic. LinkedIn's feed ranking service uses G1GC with -XX:MaxGCPauseMillis=150 to maintain sub-200ms response times under sustained load.

Enable with: -XX:+UseG1GC (default since Java 9)
Tune with: -XX:MaxGCPauseMillis=200, -XX:G1HeapRegionSize=16m, -XX:G1NewSizePercent=20

ZGC - Sub-Millisecond Pauses at Terabyte Scale

Problem it solves: Applications that cannot tolerate even 50 ms pauses regardless of heap size - financial trading, real-time recommendation engines, gaming backends.

How it works: ZGC achieves its pause goals through two novel techniques:

  1. Colored pointers: ZGC reserves unused bits of each 64-bit reference to track GC metadata (in the original design, four color bits: marked0, marked1, remapped, and finalizable). This allows ZGC to process object references concurrently without stopping threads.
  2. Load barriers: The JVM injects a small code check at every object reference load. When an application thread reads a reference, the load barrier checks whether the pointer needs to be updated (if the object was relocated during concurrent compaction). This happens inline, nanoseconds per operation.

With these two mechanisms, ZGC performs all major work - marking, relocation, and remapping - while application threads run. The remaining stop-the-world phases (mark start, mark end, and relocate start) each take under 1 ms regardless of heap size.

Trade-offs: ZGC's load barriers add approximately 5-15% throughput overhead compared to G1. For latency-sensitive workloads, this is a worthwhile trade. For pure throughput batch jobs, Parallel GC still wins.

Discord reported moving JVM services to ZGC after G1GC's occasional 50-100 ms pauses caused missed heartbeat timeouts in their real-time message routing layer.

Enable with: -XX:+UseZGC (Java 15+ for production)
Tune with: -Xms<size> -Xmx<size> (always set equal), -XX:ConcGCThreads=<n>

Shenandoah GC - Concurrent Compaction from Red Hat

Problem it solves: The same sub-millisecond latency goals as ZGC, but with a different implementation strategy that includes concurrent compaction - avoiding the fragmentation that plagued CMS.

How it works: Shenandoah uses Brooks forwarding pointers - each object gets an extra header word pointing to its current location. During concurrent evacuation, Shenandoah copies live objects to new locations while application threads continue. Threads accessing a moved object find the forwarding pointer and transparently use the new address.

Unlike ZGC (which uses colored pointers), Shenandoah keeps its relocation state in the forwarding pointer and guards accesses with read and write barriers. Because it does not depend on spare bits in 64-bit pointers, it is compatible with a wider range of platforms (32-bit included).
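The forwarding-pointer idea can be sketched in plain Java, using hypothetical names (Cell, resolve, evacuate): every access goes through a resolve step that follows the forwardee reference, so a stale reference still lands on the object's current copy even while "evacuation" is in progress:

```java
// Toy model of a Brooks forwarding pointer. This is a conceptual sketch, not
// how the real collector lays out memory - in HotSpot the forwardee is an
// extra header word maintained by the GC, not an application-visible field.
public class BrooksPointerDemo {
    static class Cell {
        Cell forwardee = this; // points to itself until the object is moved
        int value;
        Cell(int value) { this.value = value; }
    }

    // The read barrier: one extra dereference on every access.
    static Cell resolve(Cell c) {
        return c.forwardee;
    }

    // Concurrent evacuation: copy the object, then redirect future accesses.
    static Cell evacuate(Cell from) {
        Cell to = new Cell(from.value);
        from.forwardee = to;
        return to;
    }

    public static void main(String[] args) {
        Cell obj = new Cell(42);
        Cell stale = obj;          // a reference held by "another thread"
        evacuate(obj);             // the collector moves the object
        System.out.println(resolve(stale).value); // still sees the live copy: 42
    }
}
```

The cost model is visible in the sketch: one extra pointer hop per access buys the freedom to move objects while application threads keep running.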

When to use: When you need concurrent compaction on platforms where ZGC is unavailable, or prefer Red Hat's maintenance and support model. Available in OpenJDK 12+ via Red Hat's upstream contribution.

Enable with: -XX:+UseShenandoahGC
Tune with: -XX:ShenandoahGCMode=iu (incremental-update mode), -XX:ShenandoahGarbageThreshold=25

📊 Object Lifecycle from Allocation to Collection

The sequence diagram below traces a single object's lifecycle from the moment it is allocated to the moment it is either collected as garbage or promoted to Old Gen. Reading this diagram helps you understand exactly which events cause which GC phases.

sequenceDiagram
    participant App as Application Thread
    participant Eden as Eden Space
    participant Surv as Survivor Spaces
    participant Old as Old Generation
    participant GC as GC Thread

    App->>Eden: allocate new object
    Note over Eden: Eden fills to capacity
    GC->>App: Minor GC starts - all threads pause
    GC->>Eden: trace from GC Roots - mark live objects
    GC->>Surv: copy live objects to active Survivor - age = 1
    GC->>Eden: reclaim entire Eden space
    GC->>App: Minor GC ends - threads resume
    Note over Surv: object survives multiple Minor GC cycles
    GC->>Surv: Minor GC again - age increments each cycle
    GC->>Old: promote object when age reaches threshold (default 15)
    Note over Old: Old Gen fill threshold crossed
    GC->>App: concurrent marking begins - threads keep running
    GC->>Old: concurrent sweep - reclaim dead tenured objects
    GC->>App: short remark pause - finalize marking

Each step in this diagram corresponds to a measurable JVM metric. The pause between "Minor GC starts" and "threads resume" is what appears in the GC log as Pause Young. The concurrent phases in Old Gen are what make G1 and ZGC fundamentally different from Serial/Parallel GC - those concurrent phases eliminate most of the stop-the-world time that would otherwise become a Full GC pause.

๐ŸŒ Real-World Performance: How Each Collector Behaves Under Production Load

Choosing the right GC algorithm requires understanding each one's performance profile. The table below summarises the key dimensions:

| GC Algorithm | Java Version | STW Pause | Throughput | Heap Range | Best Fit |
| --- | --- | --- | --- | --- | --- |
| Serial GC | Any | Very high (100 ms-10 s) | Low | < 256 MB | Containers, dev, tools |
| Parallel GC | Any (pre-9 default) | High (100 ms-5 s) | Highest | Any | Batch jobs, ETL |
| CMS | Java 6-13 (deprecated) | Medium (10-100 ms) | Medium | Any | Legacy low-latency |
| G1 GC | Java 9+ (default) | Low-medium (10-200 ms) | High | 4 GB-100 GB | General services |
| ZGC | Java 15+ (production) | Sub-millisecond (< 1 ms) | Medium-high | Up to 16 TB | Ultra-low latency |
| Shenandoah | Java 12+ (OpenJDK) | Sub-millisecond (< 1 ms) | Medium-high | Any | Low-latency + compaction |

Heap sizing significantly impacts pause times: G1 on a 2 GB heap behaves very differently from G1 on a 32 GB heap, because G1 must evacuate proportionally more data during a Young GC as the heap grows. ZGC's sub-millisecond guarantee holds across heap sizes because its concurrent work scales with available CPU threads, not heap size.

Throughput trade-off with concurrent GCs: ZGC and Shenandoah achieve their low pauses by doing GC work concurrently, stealing CPU cycles from application threads. Under sustained allocation pressure, this can reduce throughput by 10-20% compared to Parallel GC. If your application runs batch computations on a single large instance, Parallel GC will complete the job faster in wall-clock time even with longer GC pauses.

The Generational Hypothesis and -XX:NewRatio: The JVM assumes most objects die young. For Serial and Parallel GC, the default NewRatio of 2 allocates roughly one-third of the heap to Young Gen (G1 instead sizes Young Gen dynamically between G1NewSizePercent and G1MaxNewSizePercent). If your workload allocates many large, long-lived objects (e.g., in-memory data grids), increasing Old Gen size with a higher NewRatio reduces promotion pressure. If your workload is ephemeral (web requests, microservices), a larger Young Gen means fewer Minor GCs and less premature promotion.
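The NewRatio arithmetic is worth internalising, since it decides how much of the heap absorbs ephemeral allocations. A quick sketch of the Serial/Parallel sizing rule (Old = NewRatio × Young, so Young = heap / (NewRatio + 1)):

```java
// Heap split arithmetic for -XX:NewRatio under the Serial/Parallel sizing
// rule. This is illustrative arithmetic, not a query of the running JVM.
public class NewRatioMath {
    static long youngGenBytes(long heapBytes, int newRatio) {
        // Old Gen = newRatio * Young Gen, so Young Gen is 1/(newRatio+1) of heap.
        return heapBytes / (newRatio + 1);
    }

    public static void main(String[] args) {
        long heap = 4L * 1024 * 1024 * 1024; // a 4 GB heap
        // Default NewRatio=2: one third of the heap goes to Young Gen.
        System.out.println(youngGenBytes(heap, 2) / (1024 * 1024) + " MB young");
        // NewRatio=4: a bigger Old Gen for cache-heavy, long-lived workloads.
        System.out.println(youngGenBytes(heap, 4) / (1024 * 1024) + " MB young");
    }
}
```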

โš–๏ธ The GC Throughput vs. Latency vs. Footprint Trade-Off Triangle

Every GC algorithm sits on a triangle with three vertices: throughput, latency, and footprint. You can fully optimise two of the three, but not all three simultaneously.

graph TD
    Throughput[Throughput - time not spent on GC] -->|Parallel GC wins| Batch[Batch and analytics jobs]
    Latency[Latency - max pause time] -->|ZGC and Shenandoah win| RealTime[Trading systems and gaming]
    Footprint[Footprint - GC memory overhead] -->|Serial GC wins| Embedded[Containers and edge devices]
    Throughput -->|G1 balances well| G1[G1 GC - general services]
    Latency -->|G1 is acceptable| G1

Understanding where your workload sits on this triangle drives all GC selection decisions:

  • Throughput-dominated (batch ETL, report generation, ML training on JVM): Use Parallel GC. STW pauses during off-hours batch runs are tolerable; maximum throughput is the goal.
  • Latency-dominated (payment processing, real-time APIs, messaging brokers): Use G1GC as the baseline; switch to ZGC if P99 latency from GC pauses is still outside SLA.
  • Footprint-dominated (hundreds of microservice instances in Kubernetes, edge devices): Use Serial GC with small heaps (256-512 MB) and keep services stateless so restarts are cheap.

🧭 Choosing the Right GC Algorithm and Handling Failure Modes

OutOfMemoryError: Java heap space

The most common GC failure. The heap is full and the JVM cannot allocate a new object. Causes include:

  • Memory leak: objects are being retained unintentionally (static collections, listeners not deregistered, ThreadLocal values not removed)
  • Undersized heap for the workload (-Xmx too low)
  • Sudden spike in allocation rate overwhelming GC throughput

Diagnose with: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/app/heapdump.hprof. Open the dump in Eclipse MAT or VisualVM to identify the retained object graph.
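Alongside heap dumps, the heap's used/max ratio is readable in-process via the standard MemoryMXBean; a ratio that stays pinned near 1.0 between collections is the usual precursor to this error. A minimal sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// A lightweight in-process heap-occupancy check - the ratio that, when it
// no longer drops after collections, precedes "Java heap space" errors.
public class HeapPressure {
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return (double) heap.getUsed() / heap.getMax();
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.1f%% of max%n", heapUsedFraction() * 100);
    }
}
```

Sampling this once a minute and alerting on a sustained high reading catches an approaching OOM days earlier than the heap dump does.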

OutOfMemoryError: GC overhead limit exceeded

The JVM's built-in safety valve. When GC consumes more than 98% of CPU time but recovers less than 2% of the heap across multiple consecutive cycles, the JVM gives up and throws this error rather than spinning indefinitely.

This error means the heap is severely undersized for the workload, or there is a severe memory leak. Adding more heap is a temporary fix. The real fix requires identifying why objects are not being collected.

OutOfMemoryError: Metaspace

The class metadata region is full. Classic causes:

  • Class loader leak in hot-deploy: Tomcat war redeployment that retains the old classloader's classes. Each redeploy leaks a classloader's worth of class definitions.
  • Dynamic code generation: Groovy or reflection-based frameworks generating classes at runtime without classloader cleanup.
  • Missing MaxMetaspaceSize cap: Metaspace grows into native memory without bound.

Diagnose with: -Xlog:class+load=info,class+unload=info (Java 9+; the pre-9 equivalents were -XX:+TraceClassLoading and -XX:+TraceClassUnloading) to spot classes that load but never unload. Set -XX:MaxMetaspaceSize=256m to fail fast rather than exhausting native memory.

🧪 Diagnosing a GC Problem: A Step-by-Step Production Walkthrough

This section demonstrates why the GC concepts above matter in a real scenario - not just as theory but as a diagnostic workflow you can apply the next time GC causes a production incident.

The scenario: A Spring Boot API service running on Java 17 with G1GC begins showing elevated P99 latency. Response times that were consistently under 150 ms are now occasionally spiking to 600-900 ms. The spikes happen roughly every two minutes. CPU is normal. Database query times are fine. The spike pattern is suspiciously regular.

Step 1 - Confirm it is GC-related. Enable unified GC logging and restart the service:

-Xlog:gc*:file=/var/log/app/gc.log:time,uptime,level,tags:filecount=5,filesize=20m

Open the log. You will see lines like:

[2026-04-10T02:47:31.241+0000][info][gc] GC(142) Pause Young (Normal) (G1 Evacuation Pause) 2048M->1204M(4096M) 482.341ms
[2026-04-10T02:49:44.517+0000][info][gc] GC(143) Pause Young (Normal) (G1 Evacuation Pause) 2048M->1209M(4096M) 509.112ms

The 480-510 ms Young GC pauses align perfectly with the latency spikes. G1's Young GC is stop-the-world, and these pauses are far too long for a 4 GB heap. Something is wrong with the region sizing or allocation rate.
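Pause durations like these can also be extracted programmatically when you want to correlate them with request latency. A small sketch that parses the pause time out of a unified-logging pause line (the regex assumes the line shape shown above, with the duration at the end):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts the pause duration (ms) from a unified GC log "Pause ..." line.
public class GcLogPause {
    private static final Pattern PAUSE =
            Pattern.compile("Pause \\w+.*?(\\d+(?:\\.\\d+)?)ms\\s*$");

    static double pauseMillis(String logLine) {
        Matcher m = PAUSE.matcher(logLine);
        if (!m.find()) throw new IllegalArgumentException("not a pause line");
        return Double.parseDouble(m.group(1));
    }

    public static void main(String[] args) {
        String line = "[2026-04-10T02:47:31.241+0000][info][gc] GC(142) "
                + "Pause Young (Normal) (G1 Evacuation Pause) 2048M->1204M(4096M) 482.341ms";
        System.out.println(pauseMillis(line)); // 482.341
    }
}
```

Feeding every log line through this and keeping a running histogram gives you a pause-distribution view without shipping logs to an external analyzer.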

Step 2 - Diagnose the cause. Look for promotion failure warnings:

[info][gc] GC(142) To-space exhausted
[info][gc] GC(142) Evacuation Failure

If you see evacuation failures, G1's Survivor spaces are overflowing - too many objects are being promoted per cycle. This is premature promotion. Likely cause: the heap is allocated at the default size (-Xmx defaulting to ~25% of RAM) and not pre-warmed (-Xms much smaller than -Xmx).

Step 3 - Apply targeted fixes. Based on the diagnosis:

# Fix 1: Pre-warm the heap (eliminates heap-resize Full GCs)
-Xms4g -Xmx4g

# Fix 2: Tighten the pause target (G1 will collect more frequently in smaller batches)
-XX:MaxGCPauseMillis=100

# Fix 3: Enlarge the Survivor spaces (a lower SurvivorRatio means larger Survivors relative to Eden)
-XX:SurvivorRatio=6

Step 4 - Verify improvement. Re-read the GC log after the fix. Young GC pauses should drop to 30-80 ms. If they do not, use async-profiler to identify the allocation hotspot:

./profiler.sh -e alloc -d 30 -f /tmp/alloc.html <pid>

This walkthrough shows that GC diagnosis is a skill built on knowing what each region does, recognising the log patterns for each GC type, and knowing which flag targets which root cause.

๐Ÿ› ๏ธ GC Logging, Monitoring, and Tuning with JVM Flags

Understanding GC behavior in production requires structured logging and metrics. The following JVM flags enable production-grade GC visibility - these are configuration-level options, not application code.

Java 11+ GC logging (Unified JVM Logging):

# Recommended GC logging flags for Java 11+
-Xlog:gc*:file=/var/log/app/gc.log:time,uptime,level,tags:filecount=5,filesize=20m
-Xlog:gc+heap=debug
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/app/heapdump.hprof

G1GC tuning reference - the flags you should know:

# G1GC - general-purpose production baseline
-XX:+UseG1GC
-Xms4g
-Xmx4g
-XX:MaxGCPauseMillis=200
-XX:G1HeapRegionSize=16m
-XX:G1NewSizePercent=20
-XX:G1MaxNewSizePercent=40
-XX:InitiatingHeapOccupancyPercent=45

ZGC production configuration:

# ZGC - generational mode (Java 21+) recommended
-XX:+UseZGC
-XX:+ZGenerational
-Xms8g
-Xmx8g
-XX:ConcGCThreads=4

Micrometer JVM metrics - expose GC pause times and heap usage to Prometheus:

# application.yml - Spring Boot 3.x with Micrometer
management:
  metrics:
    enable:
      jvm: true
  endpoints:
    web:
      exposure:
        include: metrics,prometheus

With this configuration, Micrometer automatically exports jvm_gc_pause_seconds, jvm_memory_used_bytes, jvm_memory_max_bytes, and jvm_gc_memory_promoted_bytes to your Prometheus scrape endpoint. Alert when jvm_gc_pause_seconds_max rises above 0.5 to catch GC degradation before users notice it.

GC analyzer tools: Use GCEasy (https://gceasy.io) or Censum (originally from jClarity) to parse GC log files and visualise pause distribution, promotion rates, and allocation velocity. These are the fastest ways to identify whether your performance problem is GC-related, and which specific event type (Young, Mixed, Full) is the culprit.

For a full deep-dive on Micrometer-based JVM observability in Spring Boot, including custom GC pause SLO alerts with Prometheus alerting rules, a companion post is planned in this series.

📚 Hard-Earned Lessons from JVM GC Production Incidents

Always set -Xms equal to -Xmx. When the JVM starts with a small heap and grows dynamically, it triggers frequent Full GC collections to resize. Fixed heap eliminates this churn and makes GC behavior predictable. This single change has resolved GC-related latency spikes at multiple companies.

Prefer G1GC as your starting point, not CMS. CMS was deprecated in Java 9 and removed in Java 14. If you are on Java 11+, any new system should start with G1GC. The fragmentation problems and "concurrent mode failure" fallbacks of CMS make it harder to tune and reason about than G1.

Never suppress GCOverheadLimitExceeded with -XX:-UseGCOverheadLimit. This flag exists to protect you from infinite GC loops. Disabling it masks a real problem (memory leak or undersized heap) and delays the crash until an even worse moment.

Humongous objects in G1 bypass Eden entirely. In G1, any object larger than 50% of the region size is classified as "humongous" and allocated directly in Old Gen. If you have high allocation rates of byte arrays or strings above 8 MB (with a 16 MB region size), you are generating Old Gen pressure with every allocation. Increase -XX:G1HeapRegionSize to push the humongous threshold higher, or profile and reduce the object sizes.
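The humongous threshold is simple arithmetic, and it is worth checking your dominant allocation sizes against it. A sketch of the rule of thumb:

```java
// G1 humongous-object rule of thumb: an allocation of at least half a region
// bypasses Eden and goes straight into dedicated humongous (Old Gen) regions.
public class HumongousCheck {
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = 8L * 1024 * 1024;   // -XX:G1HeapRegionSize=8m
        long buffer = 10L * 1024 * 1024;  // a 10 MB byte[] payload
        System.out.println("10 MB buffer humongous? " + isHumongous(buffer, region));
        // Doubling the region size to 16 MB raises the threshold to 8 MB,
        // so the same buffer would still be humongous - 32 MB regions would not.
        System.out.println("with 32 MB regions?    " + isHumongous(buffer, 32L * 1024 * 1024));
    }
}
```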

Metaspace leaks are silent and slow. Unlike heap leaks that trigger frequent GC cycles, a Metaspace leak grows quietly in native memory. You will not see GC pause increases. The only signal is a steady upward trend in native memory usage, visible via jcmd <pid> VM.native_memory or the jvm_memory_used_bytes{area="nonheap"} Micrometer metric.

GC tuning is a last resort, not a first response. Before touching JVM flags, profile your allocation rate. Tools like async-profiler's allocation profiler (-e alloc) will show you exactly which code paths are generating the most garbage. Reducing allocation rate by 30% has a bigger impact than any GC flag change.

📌 Summary & Key Takeaways: What to Remember About JVM Garbage Collection

  • Heap structure drives GC behavior. Eden → Survivor → Old Gen is the standard promotion path. Understanding object age and promotion thresholds explains why long-lived caches cause major GC pressure.
  • Minor GC is fast; Full GC is the enemy. Keep Old Gen under 70% occupancy and G1's concurrent marking will prevent Full GC events. Full GC is almost always the result of a tuning failure or memory leak, not normal GC behavior.
  • Generational hypothesis applies to your workload. Web services allocating and discarding HTTP request objects within milliseconds benefit from large Eden and frequent Minor GC. Long-running batch jobs with large working sets benefit from large Old Gen and higher tenuring thresholds.
  • G1GC is the safe default; tune from there. For most services, -XX:+UseG1GC -Xms<n>g -Xmx<n>g -XX:MaxGCPauseMillis=200 is the starting configuration. Only move to ZGC when G1 cannot meet your latency SLO.
  • Metaspace needs an explicit cap. Always set -XX:MaxMetaspaceSize=256m (or higher based on your class count) to prevent native memory exhaustion from class loader leaks.
  • Measure before tuning. Enable -Xlog:gc* logging and parse the output with GCEasy before changing any flags. Know whether your problem is pause frequency, pause duration, allocation rate, or premature promotion.
  • The throughput/latency/footprint triangle is a real constraint. ZGC's sub-millisecond pauses come at the cost of roughly 5-15% throughput overhead. Make this trade-off consciously, not by default.

๐Ÿ“ Practice Quiz: Test Your JVM GC Knowledge

  1. Your Java 17 service has -Xmx8g but you observe daily Full GC pauses of 20+ seconds. The first flag change you should make is:

    • A) Switch from G1GC to ZGC
    • B) Add -Xms8g to match -Xmx8g and prevent GC cycles triggered by heap resizing
    • C) Increase -XX:MaxGCPauseMillis to 5000 to reduce GC frequency
    • D) Add -XX:+DisableExplicitGC

  Correct Answer: B
  2. An OutOfMemoryError: GC overhead limit exceeded is thrown. What does this mean?

    • A) The heap is too large and should be reduced
    • B) GC is consuming over 98% of CPU but recovering less than 2% of heap โ€” the JVM refuses to continue spinning
    • C) The number of GC threads exceeds available CPU cores
    • D) Metaspace has been exhausted by class loading

  Correct Answer: B
  3. A new object is allocated in Eden Space. It survives 16 consecutive Minor GC cycles. Where does it go next?

    • A) It stays in Eden indefinitely until a Full GC
    • B) It is moved to Metaspace since it is long-lived
    • C) It is promoted to Old Generation after exceeding the tenuring threshold
    • D) It is copied to the Code Cache

  Correct Answer: C
  4. Your service processes financial transactions with a strict P99 SLA of 50ms. You are currently using G1GC with -XX:MaxGCPauseMillis=100 and observing occasional 80ms pauses. Which GC change is most likely to bring P99 under 50ms?

    • A) Switch to Parallel GC to increase throughput
    • B) Switch to Serial GC to eliminate thread synchronization overhead
    • C) Switch to ZGC with generational mode enabled
    • D) Increase -XX:MaxGCPauseMillis to 200 to allow G1 more time per cycle

  Correct Answer: C
  5. Which JVM memory region holds class bytecode, method metadata, and constant pool data?

    • A) Eden Space
    • B) Old Generation
    • C) Code Cache
    • D) Metaspace

  Correct Answer: D
  6. What is the primary trade-off of using ZGC over G1GC?

    • A) ZGC requires a minimum heap of 32 GB
    • B) ZGC's load barriers add 5โ€“15% throughput overhead compared to G1
    • C) ZGC does not support concurrent marking
    • D) ZGC only works on Linux and cannot run on Windows

  Correct Answer: B
  7. You notice your G1GC service is allocating many byte arrays above 10 MB. The current G1 region size is 8 MB. What problem does this cause?

    • A) The byte arrays are allocated in Metaspace instead of the heap
    • B) The byte arrays are classified as humongous objects and allocated directly in Old Generation, bypassing Eden
    • C) Minor GC frequency doubles because humongous objects fill Survivor spaces
    • D) G1GC automatically resizes regions to accommodate large objects

  Correct Answer: B
  8. What is the difference between a Major GC and a Full GC?

    • A) There is no difference โ€” Major GC and Full GC are synonyms for the same event
    • B) Major GC collects only Old Generation; Full GC collects both Young and Old Generation with stop-the-world compaction
    • C) Major GC runs concurrently; Full GC is always stop-the-world
    • D) Major GC is triggered by Eden overflow; Full GC is triggered by Metaspace overflow

  Correct Answer: B
  9. CMS GC was deprecated and eventually removed from the JVM. What was its primary architectural weakness?

    • A) CMS was single-threaded and could not take advantage of multi-core CPUs
    • B) CMS only collected Young Generation and had no mechanism for Old Generation collection
    • C) CMS did not compact the Old Generation, leading to fragmentation and eventual concurrent mode failure
    • D) CMS required at least 16 GB of heap to function correctly

  Correct Answer: C
  10. (Open-ended) You are deploying 200 microservice instances in Kubernetes, each with 512 MB heap and stateless HTTP request handling. All instances are using the G1GC default. What GC algorithm and heap configuration would you evaluate instead, and why might it reduce resource consumption across the cluster?

  Correct Answer: Serial GC with a fixed -Xms256m -Xmx512m is a strong candidate. Serial GC eliminates the GC thread pool overhead that G1GC maintains — each G1 instance typically spawns 4–8 GC threads even for small heaps. At 200 instances, that is 800–1,600 idle GC threads consuming CPU and memory. For stateless HTTP microservices with ephemeral request objects and no long-lived state, Eden-based allocation with a single-threaded collector is often sufficient. Bonus point: ZGC's generational mode in Java 21+ also performs well at small heap sizes with dramatically better pause times than Serial GC, making it worth benchmarking.
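When evaluating collector choices like the one in the last question, it helps to confirm at runtime which collector the JVM actually selected and how much work it is doing. The standard java.lang.management API exposes this per collector; a minimal sketch (the GcStats class name is illustrative):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // Each bean corresponds to one collector phase, e.g. "G1 Young
        // Generation" and "G1 Old Generation" under G1GC; the names
        // change if you launch with a different collector flag.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running the same snippet under -XX:+UseSerialGC versus the G1 default makes the thread and phase difference discussed in answer 10 directly visible.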


Written by Abstract Algorithms (@abstractalgorithms)