Implementing Your Own Garbage Collector in Java

Implementing Your Own Garbage Collector in Java
Photo by Blake Wisz / Unsplash

One of the most notable features of Java is the automatic memory management which provides Java Developers with the convenience of not having to manually manage the allocations and deallocations of memory in their code. At the same time, there can be cases where a Java Developer needs to create a custom memory management system for their software application – this could be due to needing specific requirements or even constraints in the application or code.

This guide is for giving Java Developers a better idea to understanding how to design and implement their own custom Java Garbage Collector.

Understanding Java's Memory Model

The most important thing when it comes to rolling your own application software such as rolling your own garbage collector is to fully understand The Heap and the Stack. The heap will store objects – while that is happening, the stack will hold all of the variables – this includes the local and the method call information. When you are rolling your own custom memory management system, you definitely need to design the application to work within this memory model.


We recommend the following article about Understanding How to Use Cryptography in Java as a good starting point to taking on this adventure.
Understanding How to Use Cryptography in Java
I will explain the Java Cryptographic Architecture (JCA) for a better understanding of how it works. The JCA is designed to simplify the creation of protocols like encryption, hashing, digital signatures, and key generation for Java developers Now let’s take a look at how the API works for this process.

Custom Memory Allocation

When creating a custom memory allocation script, you need to remember that it is responsible for reserving memory for new objects. When creating the functions, you should strongly consider the following.

  • Choose between a fixed-size block, variable sized block, or even a combination of both of them together.
  • Ensure that the memory is correctly aligned based on the hardware and Java Virtual Machine requirements for the software application.
  • Try to consider different strategies to minimize the fragmentation such as allocation of objects that are similar or using a segregated free list.

As an example, I have designed a very simple memory allocator that simulates the management of a fixed-size memory pool. I will use two classes for this being MemoryPool and MemoryBlock in the example code. I will simulate the memory management within a predefined array to represent the memory pool.

public class MemoryPool {
    private final MemoryBlock[] blocks;
    private final boolean[] available;
    private final int blockSize;
    private final int numberOfBlocks;

    public MemoryPool(int blockSize, int numberOfBlocks) {
        this.blockSize = blockSize;
        this.numberOfBlocks = numberOfBlocks;
        this.blocks = new MemoryBlock[numberOfBlocks];
        this.available = new boolean[numberOfBlocks];

        for (int i = 0; i < numberOfBlocks; i++) {
            blocks[i] = new MemoryBlock(new byte[blockSize], blockSize);
            available[i] = true;
        }
    }

    public MemoryBlock allocate() {
        for (int i = 0; i < numberOfBlocks; i++) {
            if (available[i]) {
                available[i] = false;
                return blocks[i];
            }
        }
        return null; // No available blocks
    }

    public void deallocate(MemoryBlock block) {
        for (int i = 0; i < numberOfBlocks; i++) {
            if (blocks[i] == block) {
                available[i] = true;
                return;
            }
        }
        throw new IllegalArgumentException("The memory block is not found.");
    }
}

public class MemoryBlock {
    private final byte[] data;
    private final int size;

    public MemoryBlock(byte[] data, int size) {
        this.data = data;
        this.size = size;
    }

    public byte[] getData() {
        return data;
    }

    public int getSize() {
        return size;
    }
}

You can use this memory pool by creating an instance of MemoryPool and then allocating and deallocating memory blocks right below.

public class MemoryAllocatorDemo {
    public static void main(String[] args) {
        int blockSize = 128;  // Each block is 128 bytes
        int numberOfBlocks = 10;  // Pool has 10 blocks

        MemoryPool pool = new MemoryPool(blockSize, numberOfBlocks);

        MemoryBlock block1 = pool.allocate();
        if (block1 != null) {
            System.out.println("Allocated block of size: " + block1.getSize());
        }

        // Use the block for something...

        // Now deallocate
        pool.deallocate(block1);
        System.out.println("Block deallocated");
    }
}

Implement Reference Tracking

To manage the object lifecycles, you will need a mechanism to help track the different object references. You will be able to implement reference tracking using the reference counting or a tracing mechanism. As for reference counting, each of the objects has to maintain a counter of the number of the references to it – where as, at the same time, the memory manager scans the memory to identify the objects.

In the following example, I'll explain and demonstrate how to implement reference counting in Java, as tracing mechanisms are more aligned with the internals of garbage collection systems like those found in JVM or .NET, and are typically not manually implemented.

We'll define a class that includes a reference counter and methods to increment and decrement this counter. When the counter hits zero, the object can be considered for deallocation.

public class RefCountedObject {
    private int refCount;
    private String data;  // Example data held by this object

    public RefCountedObject(String data) {
        this.data = data;
        this.refCount = 1;  // Initially, there is one reference after creation
    }

    public synchronized void addReference() {
        refCount++;
    }

    public synchronized void removeReference() {
        refCount--;
        if (refCount == 0) {
            destroy();
        }
    }

    private void destroy() {
        System.out.println("Destroying object with data: " + data);
        this.data = null;  // Clear data or close resources
    }

    public String getData() {
        return data;
    }
}

Then, I will use the reference counted objects, you must explicitly manage references. This means calling addReference when you assign a reference to a variable, and removeReference when you're done with it.

public class ReferenceManagement {
    public static void main(String[] args) {
        RefCountedObject obj = new RefCountedObject("Example Data");

        // Simulate sharing the object:
        RefCountedObject anotherReference = obj;
        anotherReference.addReference();  // Increment since another reference is made

        useData(anotherReference);
        anotherReference.removeReference();  // Decrement after finishing use

        // Main reference is released last
        obj.removeReference();
    }

    private static void useData(RefCountedObject obj) {
        System.out.println("Using data: " + obj.getData());
    }
}

There is some limitations which are worth noting about references.

  • Circular References: Reference counting is not capable of handling circular references (where two or more objects refer to each other, creating a cycle). These objects will never reach a reference count of zero naturally, leading to memory leaks unless additional mechanisms (like weak references) are introduced.
  • Performance Overhead: Managing reference counts requires additional CPU work, which can impact performance, particularly where objects are frequently created and destroyed.
  • Complexity in Multithreaded Environments: Synchronizing reference counts in a multithreaded context (as shown with synchronized methods) can further impact performance and complexity.

Despite the limitations listed right above, reference counting is a pretty damn powerful tool to use. This is extremely useful in environments where deterministic destruction of objects is crucial – such as in the management of resources in operating systems or in native applications. This is where manual memory management is absolutely required. In managed environments like Java, reference counting can be used for managing non-memory resources, such as file handles or network connections, where the developer needs more control over when resources are released.

Garbage Collection Algorithm

When a Java Developer chooses to select a garbage collection algorithm that is going to suit the software application requirements, there are some common functions that the algorithm should include:

Mark and Sweep Algorithm

This is one of the simplest forms of garbage collection techniques. It operates in two phases different phases:

  • Mark: Traverse all object references starting from the root (e.g., local variable stacks, static fields) and mark every reachable object.
  • Sweep: Scan the memory heap and reclaim the memory of all unmarked objects.
// Assuming a hypothetical Object structure
class Object {
    boolean marked = false;
    Object[] references;
}

void mark(Object root) {
    if (root == null || root.marked) return;
    root.marked = true;
    for (Object ref : root.references) {
        mark(ref);
    }
}

void sweep(List<Object> allObjects) {
    Iterator<Object> it = allObjects.iterator();
    while (it.hasNext()) {
        Object obj = it.next();
        if (!obj.marked) {
            it.remove(); // This simulates freeing the object
        } else {
            obj.marked = false; // Reset for the next GC cycle
        }
    }
}

Mark and Compact Algorithm

This algorithm improves upon the Mark and Sweep by dramatically reducing the memory fragmentation. After marking live objects, it will compact them towards the beginning of the memory heap.

void markAndCompact(Object[] heap) {
    // Mark phase (same as before)
    mark(heap[0]);  // Assuming heap[0] is the root

    // Compact phase
    int freeIndex = 0;
    for (int i = 0; i < heap.length; i++) {
        if (heap[i].marked) {
            heap[i].marked = false; // Unmark for next GC cycle
            heap[freeIndex] = heap[i]; // Move object
            freeIndex++;
        }
    }
    // Nullify all elements after freeIndex to free memory
    for (int i = freeIndex; i < heap.length; i++) {
        heap[i] = null;
    }
}

Copying Algorithm

This method divides the heap into two halves and only uses one half at a time. During garbage collection, live objects are copied to the other half, and then the roles of the halves are switched.

void copyGC(Object[] fromSpace, Object[] toSpace) {
    int toIndex = 0;
    // Assuming a simplistic way of obtaining root references
    for (Object root : getRoots()) {
        toIndex = copy(root, toSpace, toIndex);
    }

    // Switch spaces
    Object[] temp = fromSpace;
    fromSpace = toSpace;
    toSpace = temp;
    // Reset toSpace to be empty
    Arrays.fill(toSpace, null);
}

int copy(Object obj, Object[] toSpace, int toIndex) {
    if (obj == null || contains(toSpace, obj)) return toIndex; // Avoid duplicates
    toSpace[toIndex] = obj;
    toIndex++;
    for (Object ref : obj.references) {
        toIndex = copy(ref, toSpace, toIndex);
    }
    return toIndex;
}

These code examples are simplified and do not handle many real-world concerns such as concurrent modifications, detailed root identification, or complex data structures. They will provide a conceptual base for understanding how garbage collection algorithms can be structured. In the production environment, choosing and tuning a garbage collection strategy in Java involves selecting from the available algorithms implemented in the Java Virtual Machine (JVM), such as G1, CMS, or evenZGC.

Implementing a Root Object Identification


Now we are going to implement a root object identification in Java for a custom memory management system like a garbage collector. First, we will need to establish a way to determine which objects are roots. You should remember that Roots are usually objects referenced by global variables, objects on the thread stacks, static fields, and other persistent references that are active in the software application.

Since Java itself manages memory through its own garbage collector, we will have to simulate a simplified scenario where we explicitly track root objects in a small custom framework. This will only be a sample demonstration since in practice, overriding Java's own memory management at a low level isn't straightforward without using native languages or Java Virtual Machine APIs.

import java.util.HashSet;
import java.util.Set;

public class MemoryManager {
    // Set to store root objects
    private Set<Object> roots = new HashSet<>();

    // Set to store all objects for demonstration purposes (not typically feasible in a real-world large scale application)
    private Set<Object> allObjects = new HashSet<>();

    // Register a root object
    public void registerRootObject(Object obj) {
        roots.add(obj);
        allObjects.add(obj);  // Assume all roots are also in all objects for simplification
    }

    // Unregister a root object
    public void unregisterRootObject(Object obj) {
        roots.remove(obj);
    }

    // Mark and trace from roots
    public void garbageCollect() {
        Set<Object> reachable = new HashSet<>();
        for (Object root : roots) {
            trace(root, reachable);
        }

        allObjects.retainAll(reachable);  // Simulate collection by retaining only reachable objects
    }

    // Recursive trace method
    private void trace(Object obj, Set<Object> reachable) {
        if (!reachable.contains(obj)) {
            reachable.add(obj);
            // For demonstration, pretend objects have fields that are other objects
            // In a real scenario, reflection might be needed to find these references
            // This is a placeholder for demonstration
            for (Object ref : getReferences(obj)) {
                trace(ref, reachable);
            }
        }
    }

    // Dummy method to simulate getting references from an object
    private Set<Object> getReferences(Object obj) {
        // This should be replaced with actual logic to determine object references
        return new HashSet<>();
    }
}

// Application to use the MemoryManager
public class Application {
    public static void main(String[] args) {
        MemoryManager memoryManager = new MemoryManager();
        Object root1 = new Object();
        Object root2 = new Object();
        Object nonRoot = new Object();

        memoryManager.registerRootObject(root1);
        memoryManager.registerRootObject(root2);
        memoryManager.allObjects.add(nonRoot);  // Adding an object not registered as a root

        memoryManager.garbageCollect();
    }
}

Implement a Sweeping Algorithm

To continue with our simplified example of a custom memory management system in Java, let's expand on the implementation to include a sweeping algorithm. The sweeping phase will handle reclaiming memory occupied by objects that are not marked as live.

In a typical garbage collection system, the sweeping phase comes after the marking phase. Since Java manages memory and we cannot directly free memory, our sweeping simulation will involve removing unmarked objects from our simulated set of all objects, which represents our managed memory.

Let's continue with the previous example and add a sweeping functionality to it. This will involve identifying which objects have not been marked as reachable and then simulating their removal.

import java.util.HashSet;
import java.util.Set;

public class MemoryManager {
    // Set to store root objects
    private Set<Object> roots = new HashSet<>();

    // Set to store all objects for demonstration purposes
    private Set<Object> allObjects = new HashSet<>();

    // Set to store reachable (marked) objects during GC
    private Set<Object> reachable = new HashSet<>();

    // Register a root object
    public void registerRootObject(Object obj) {
        roots.add(obj);
        allObjects.add(obj);  // Assume all roots are also in all objects for simplification
    }

    // Unregister a root object
    public void unregisterRootObject(Object obj) {
        roots.remove(obj);
    }

    // Simulate garbage collection process
    public void garbageCollect() {
        // Reset the reachable set for fresh marking
        reachable.clear();

        // Mark phase
        for (Object root : roots) {
            trace(root);
        }

        // Sweep phase
        sweep();
    }

    // Recursive trace (mark) method
    private void trace(Object obj) {
        if (!reachable.contains(obj)) {
            reachable.add(obj);
            // For demonstration, pretend objects have fields that are other objects
            // In a real scenario, reflection might be needed to find these references
            // This is a placeholder for demonstration
            for (Object ref : getReferences(obj)) {
                trace(ref);
            }
        }
    }

    // Sweep phase: remove unmarked objects
    private void sweep() {
        Set<Object> deadObjects = new HashSet<>(allObjects);
        deadObjects.removeAll(reachable);
        allObjects.removeAll(deadObjects);  // Only keep marked objects

        System.out.println("Sweeping completed. Removed " + deadObjects.size() + " objects.");
    }

    // Dummy method to simulate getting references from an object
    private Set<Object> getReferences(Object obj) {
        // This should be replaced with actual logic to determine object references
        return new HashSet<>();
    }
}

// Application to use the MemoryManager
public class Application {
    public static void main(String[] args) {
        MemoryManager memoryManager = new MemoryManager();
        Object root1 = new Object();
        Object root2 = new Object();
        Object nonRoot = new Object();

        memoryManager.registerRootObject(root1);
        memoryManager.registerRootObject(root2);
        memoryManager.allObjects.add(nonRoot);  // Adding an object not registered as a root

        memoryManager.garbageCollect();  // Should collect `nonRoot` since it's not reachable
    }
}

Implementing some Compaction

Implementing a compaction algorithm in Java is largely for learning and design over production based since Java manages its own memory and does not expose direct memory manipulation APIs to the programmer. However, for educational purposes, we can simulate a memory compaction process. The idea here is to demonstrate how you might defragment memory by moving live objects closer together in a hypothetical memory space represented by a list.

For our Java code example, we'll consider a List where the index in the list might represent memory addresses in a simplistic manner. By compacting the list, we will try to move all liveobjects – represented as not null – to the start of the list, thus mimicking the removal of fragmentation.

import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class MemoryManager {
    private List

    Optional<Object>> simulatedMemory = new ArrayList<>();

    // Register an object, simulating allocation
public void allocateObject(Object obj) {
    simulatedMemory.add(Optional.ofNullable(obj));  // Optional.empty() can represent freed memory slots
}

// Mark and Sweep + Compaction
public void garbageCollect() {
    // Mark phase (we will simulate marking by returning a set of live objects)
    Set<Object> reachable = markPhase();

    // Sweep phase
    sweepPhase(reachable);

    // Compaction phase
    compactMemory();
}

// Mark phase: Identify all reachable objects
private Set<Object> markPhase() {
    Set<Object> reachable = new HashSet<>();
    // Normally you would traverse object graph from roots, here we simulate direct reachability
    for (Optional<Object> cell : simulatedMemory) {
        cell.ifPresent(reachable::add);
    }
    return reachable;
}

// Sweep phase: Remove unreachable objects
private void sweepPhase(Set<Object> reachable) {
    for (int i = 0; i < simulatedMemory.size(); i++) {
        Optional<Object> obj = simulatedMemory.get(i);
        obj.ifPresent(o -> {
            if (!reachable.contains(o)) {
                simulatedMemory.set(i, Optional.empty());  // Marking it as empty (freed)
            }
        });
    }
}

// Compaction phase: Defragment the simulated memory
private void compactMemory() {
    int targetIndex = 0;
    for (int i = 0; i < simulatedMemory.size(); i++) {
        if (simulatedMemory.get(i).isPresent()) {
            // Move live object to the closest available position at the beginning of the list
            simulatedMemory.set(targetIndex, simulatedMemory.get(i));
            if (i != targetIndex) {
                simulatedMemory.set(i, Optional.empty());
            }
            targetIndex++;
        }
    }
    System.out.println("Memory compaction completed. Live objects are moved to the start.");
}

// Application to use the MemoryManager
public class Application {
public static void main(String[] args) {
MemoryManager memoryManager = new MemoryManager();

    // Simulating object allocation
    memoryManager.allocateObject(new Object());
    memoryManager.allocateObject(null);  // Simulate a free slot
    memoryManager.allocateObject(new Object());
    memoryManager.allocateObject(new Object());
    memoryManager.allocateObject(null);  // Simulate another free slot
    memoryManager.allocateObject(new Object());

    System.out.println("Before GC:");
    memoryManager.simulatedMemory.forEach(slot ->
        System.out.println(slot.isPresent() ? "Object" : "Free Slot"));

    // Run garbage collection including compaction
    memoryManager.garbageCollect();

    System.out.println("After GC:");
    memoryManager.simulatedMemory.forEach(slot ->
        System.out.println(slot.isPresent() ? "Object" : "Free Slot"));
}

The Conclusion

The following code example provides a simple foundation for implementing a custom garbage collector in Java, covering various aspects such as memory allocation, reference tracking, root object identification, garbage collection (mark-sweep algorithm), and compaction. It's a simplified version and would need enhancements for use in real-world scenarios, but it serves as a starting point for further development and understanding.

import java.util.*;

// Represents an object with reference counting
class RefCountedObject {
    private int refCount;
    private String data;  // Example data held by this object

    public RefCountedObject(String data) {
        this.data = data;
        this.refCount = 1;  // Initially, there is one reference after creation
    }

    public synchronized void addReference() {
        refCount++;
    }

    public synchronized void removeReference() {
        refCount--;
        if (refCount == 0) {
            destroy();
        }
    }

    private void destroy() {
        System.out.println("Destroying object with data: " + data);
        this.data = null;  // Clear data or close resources
    }

    public String getData() {
        return data;
    }
}

// Represents a custom memory pool for allocation
class MemoryPool {
    private final MemoryBlock[] blocks;
    private final boolean[] available;
    private final int blockSize;
    private final int numberOfBlocks;

    public MemoryPool(int blockSize, int numberOfBlocks) {
        this.blockSize = blockSize;
        this.numberOfBlocks = numberOfBlocks;
        this.blocks = new MemoryBlock[numberOfBlocks];
        this.available = new boolean[numberOfBlocks];

        for (int i = 0; i < numberOfBlocks; i++) {
            blocks[i] = new MemoryBlock(new byte[blockSize], blockSize);
            available[i] = true;
        }
    }

    public MemoryBlock allocate() {
        for (int i = 0; i < numberOfBlocks; i++) {
            if (available[i]) {
                available[i] = false;
                return blocks[i];
            }
        }
        return null; // No available blocks
    }

    public void deallocate(MemoryBlock block) {
        for (int i = 0; i < numberOfBlocks; i++) {
            if (blocks[i] == block) {
                available[i] = true;
                return;
            }
        }
        throw new IllegalArgumentException("The memory block is not found.");
    }
}

// Represents a block of memory
class MemoryBlock {
    private final byte[] data;
    private final int size;

    public MemoryBlock(byte[] data, int size) {
        this.data = data;
        this.size = size;
    }

    public byte[] getData() {
        return data;
    }

    public int getSize() {
        return size;
    }
}

// Represents a custom garbage collector
public class CustomGarbageCollector {
    private Set<Object> roots = new HashSet<>();
    private Set<Object> allObjects = new HashSet<>();
    private Set<Object> reachable = new HashSet<>();
    private MemoryPool memoryPool;

    public CustomGarbageCollector(int blockSize, int numberOfBlocks) {
        this.memoryPool = new MemoryPool(blockSize, numberOfBlocks);
    }

    public void registerRootObject(Object obj) {
        roots.add(obj);
        allObjects.add(obj);
    }

    public void unregisterRootObject(Object obj) {
        roots.remove(obj);
    }

    public void allocateObject() {
        MemoryBlock block = memoryPool.allocate();
        if (block != null) {
            RefCountedObject obj = new RefCountedObject("Example Data");
            allObjects.add(obj);
        } else {
            System.out.println("Memory pool is full. Cannot allocate object.");
        }
    }

    public void garbageCollect() {
        reachable.clear();
        for (Object root : roots) {
            trace(root);
        }
        sweep();
        compactMemory();
    }

    private void trace(Object obj) {
        if (!reachable.contains(obj)) {
            reachable.add(obj);
            // Simulate object references (should be replaced with actual logic)
            for (Object ref : getReferences(obj)) {
                trace(ref);
            }
        }
    }

    private void sweep() {
        Set<Object> deadObjects = new HashSet<>(allObjects);
        deadObjects.removeAll(reachable);
        allObjects.removeAll(deadObjects);
    }

    private void compactMemory() {
        // Compaction logic (simulate moving objects to the start of memory pool)
    }

    // Dummy method to simulate getting references from an object
    private Set<Object> getReferences(Object obj) {
        // Simulate object references (should be replaced with actual logic)
        return new HashSet<>();
    }

    public static void main(String[] args) {
        CustomGarbageCollector gc = new CustomGarbageCollector(128, 10);
        gc.allocateObject();
        gc.allocateObject();
        gc.registerRootObject(new Object()); // Register root object
        gc.garbageCollect(); // Perform garbage collection
    }
}

Do you like what you're reading from the CoderOasis Technology Blog? We recommend reading our Implementing RSA in Python from Scratch series next.
Implementing RSA in Python from Scratch
Please note that it is essential for me to emphasize that the code and techniques presented here are intended solely for educational purposes and should never be employed in real-world applications without careful consideration and expert guidance. At the same time, understanding the principles of RSA cryptography and exploring various

The CoderOasis Community

Did you know we have a Community Forums and Discord Server? which we invite everyone to join us? Want to discuss this article with other members of our community? Want to join a laid back place to chill and discuss topics like programming, cybersecurity, web development, and Linux? Consider joining us today!