JVM: garbage collector - Part 2

Oak_Cassia·2024년 11월 12일
0

introduction

In this post, I will delv into more specific details and examples of GC in the JVM. First, I'll explain the main concepts of GC and then examine how various garbage collectors operate.

Main Concepts

Root Node Enumeration

This process involves searching for reference chains starting from the GC root set.

  • Examples of GC roots include global refereces and variables in the excecution context.
  • To maintain consistency, this task is performed on a snapshot of the program state.

Ordinary Object Pointer Map

An OopMap stores information about references in the object's stack and registers once loading is complete. It includes details about which instructions modify references or OopMap data. Creating an OopMap for every instruction would be highly inefficent. Instead, OopMaps are only created for safe points, which are specific points in the code where the program may run for an extended time, such as during method calls, loops, or exception handling.

There are two methods for reaching safe points:

  • Preemptive Suspension: User threads are interrupted and repeatedly checked until they reach a safe point.
  • Voluntary Suspension: Threads poll a flag during execution and voluntarily pause at the next safe point.

A polling mechanism can use a single assembly instruction. For example, preventing memory access triggers a memory protection trap, which throws an exception handled by an exception handler to temporarily suspend the thead.

Safe Region

A safe region is a state or area where:

  • No references to objects are made.
  • No references to objects are modified.
    When a thread is in a state like wating or ready, it cannot respond to interrupts. This is why the concept of a safe region is necessary. threads in a safe region do not need to be considered during GC.
    Before a thread exits the safe region, it mus:
  1. Ensure that root node enumeration has been completed
  2. Check if any GC steps are still running that require the thread to pause.
    Once thease checks are done, the thread can either resume execution or remain in a waiting state as needed.

Memory Set and Card Table

The goal of these techniques is to reduce the scope of root scans, which can be expensive when scanning references across generations.

Instead of scanning all references between generations, memory sets record only cross-generational references, allowing the GC to efficiently identify live objects.

A common appraoch to optimize this process is by reducing precision mapping records to memory blocks rather than individual objects.

  • Dividing memroy into indexed blocks and implementing a byte array to track references allows the use of simple division operations to calculate indices.

Write Barrier

If an object in another generation references a block in a differenct generation, the refernce is marked in the card table either before or after the reference occurs. While this increases the cost of updating references, it is still lower than scanning the entire generation.
False sharing can occur when physically different varibales are stored in the same cache line. Since the card table manages regions of 512 bytes(the size of card), false sharing can have a significant impact within the range of cache line size multiplied by the card size.

Tricolor Marking Algorithm

The Garbage Collector can search the object graph after root node enumeration.

  • White: Unreachable object
  • Black: Object that doesn't reference any white object
  • Gray: Reachable but the objects it references have not been checked yet
  1. Mark a gray object as black and mark all objects it references as gray
  2. Repeate until there are no gray objects left
  3. White objects are released

It is more critical to mistakenly consider a live object as dead than to consider a dead object as live. This situation can occur when a user thread adds a reference form a black object to a white object and simultaneously removes a reference form a gray object to a white object. Preventing just one of these two scenarios is sufficient to avoid this critical problem.

Incremental Update

  • If a black object adds a reference to a white object, the change is recorded
  • After the concurrent scan is complete, the recorded black objects are scanned again
  • The white objects are marked as gray

Snapshot at the Beginning

  • If a reference from a gray object to a white object is removed, the change is recorded
  • After the concurrent scan is complete, the recorded gray objects (and their references) are scanned again
  • The white objects are marked as gray

How Garbage Collectors Operate

G1

  • Allocates memory in regions
    • Large objects use multiple regions
    • Objects Larger than half the size of a region are considered huge
  • Collect the most effective regions
    • Effectivness is evaluated with greater weight given to recent data
  • Adapts collection to allocation speed
  1. Initial Marking
    • User threads are paused
    • Snapshot at the beginning
    • Marks objects directly referenced by roots
  2. Concurrent Marking
    • Analyze reachability from GC roots
    • Recursively marks obejcts
  3. Remarking
    • User threads are paused
    • Rescan objects recorded during concurrent marking
  4. Copying and Cleanup
  • User thread are paused
  • Selects the most effective regions within the target pause time
  • Copies surviving objects and deletes the regions

Shenandoah

  • Improved version of G1
  • Memory is divided into regions
  • Use huge regions for large objects
  • collect regions with effectiveness
  • Doesn't pause user threds during collection
  • Doesn't distinguish generations
  • Doesn't use memory sets, uses a connection matrix
    • n×mn \times m matrix tracks references between regions

Forwarding Pointers

  • Indirect pointers
  • Inefficient existing method
    • Sets memory protections traps on original memory during evacuation
    • Triggers a trap on access and redirects to the new object, involving a costly kernel mode transition
  • Adds a reference field at the top of the object layout (later moved to the object header)
  • Points to itself when not evacuated
  • Updates to the address upon evacuation
  • Avoids thread pausing when addresses change
  • Introduces overhead for indirect access

Read Barriers

  • Ensures correctness when user threads modify objects during the copying phase
    • Synchronizes access to forwarding pointers, allowing only one thread at a time
  • Load reference barrier: Triggers when reading writing reference-type data
  • Doesn't trigger for object comparisons, or object locks

사용자 스레드에서 읽기 장벽이 트리거 되면 Forwarding pointer를 확인 하고 객체 참조 업데이트

Stack Watermark and Concurrent Stack Processing

  • Purpose
    • The top stack frame is being used by the user thread and shouldn't be accessed by the GC
    • The stack frames below it are safe for scanning
  • At Runtime
    • At the start of the GC process, a watermark is set at the top stack frame
    • The watermark ensures that the GC does not access to the top stack frame
    • If the user thread destroys top frame, it moves the watermark down by one frame
    • When the user thread leaves a safe point, the watermark ensures safety

Operation

  1. Initial Marking
    • marks objects directly referenced by roots
    • very short pause duration
  2. Concurrent Marking
    • Analyzes reachability of objects from root
  3. Final Marking
    • Completes pending marks
    • Rescan GC root set
    • Short pause duration
  4. Concurrent Cleanup
    • Clean immediate garbage regions(regions with all dead objects)
  5. Concurrent Evacuation
    • Copies objects for reclaimable regions to other regions
    • if user thread are paused, proceeds without barriers.
    • or not(running), uses read barrieres and *forwarding pointers
    • Time proportional to the number of references in memory
  6. Initial Reference Update
    • Updates references pointing old addresses in the heap to new addresses after copying
    • User threads are paused
    • Ensure all threads complete evacuation
  7. Concurrent Reference Update
    • Perform reference updates concurrently with user threads
    • Updates references linearly based on physical memory order
  8. Final Reference Update
    • Updates references in the GC root set
  9. Reclaims regions in the reclaim set

ZGC

  • Memory is divided into regions
  • Region are categorized into large, medium, small
    • Large regions can dynamically resize, and they can become smaller than medium regions

Colored Pointer

  • 64bit system
    • Pointers don't use the full address space
    • ZGC allocates 44 bits for addressing and reserves 4 extra bits for state indicator
    • Finalizable
    • Remapped
    • Marked1
    • Marked0
  • Advantages
    • Immediate euse of regions with surviving objects upon relocation
    • Eliminates the need for reference updates to moved objects
  • Read Barriers
    • Trigger: When user threads access an object
    • Detection: Identifies the remapped state via the color bits
    • Redirection
      • Redirects the pointer to the new memory location
      • Uses forward mapping to update references seamlessly

Multi Mapping Memory

  • Maps the same heap memory to three independent virtual address
    • |(Remapped, Marked1, Marked0)| -----|
  • Once all references to an old memory location are updated
    • The original memory region is safely reclaimed
  • ZGC creates new memory regions for relocated objects by creating new cirtual memory aliases
  • Original mappings remain accessible, ensuring uninterrupted user thread access

Operation

  1. Marking Start
    • Same as G1 and Shenandoah
  2. Concurrent Marking
    • Pause
    • Update root set marking to color pointers
    • Updates the Marked0 and Marked1 bits
  3. Concurrent Relocation Preparation:
    • Identifies regions to clean up and forms a relocation set
    • Scans all regions during every collection cycle
    • Decides whether to reclaim or retain regions after copying survivors
    • Handles class unloading and weak reference processing
  4. Concurrent Relocation
    • Copies survivors from the relocation set to new memory locations
    • Records old and new object addresses in the forward table for the region
    • Determines relocation based on colored pointer reference to the relocation set
    • Implements self-healing:
    • Application threads accessing relocated objects read the forward table via memory barriers
    • Updates references to point to the new address (occurs only on the first access)
    • Colored pointers enable immediate region reuse post-copying, though forward tables remain until references are updated
  5. Concurrent Remapping
    • Updates all references to old objects within the heap to point to relocated objects
    • Deferred to the next collection cycle, leveraging marking scans for efficiency
profile
꿈꾸는 것 자체로 즐겁다.

0개의 댓글