Building the Garbage Collector

Why Not Manual Memory Management?

One of Vexel's core design goals is that writing code should feel like Python. That means you should not have to think about memory. You create a struct, put it in an array, pass it to a function, and it exists as long as you need it. When nothing refers to it anymore, it disappears.

Manual memory management was never on the table. The language is designed for game developers who want to focus on gameplay, not lifetime annotations. So the only question was: which GC strategy?

Attempt 1: Reference Counting

Reference counting is the simplest GC strategy. Every heap object carries a counter that records how many references point to it. When the count reaches zero, the object is freed.

It worked. For a while. My first test program created structs, assigned them to variables, and freed them correctly when the variables went out of scope. I felt good about it.

Then I tried to create a linked list. A struct that holds a value and a reference to the next node:

struct Node:
    value: int
    next: Node

I built a list with two nodes that pointed to each other in a cycle. When the list variable went out of scope, neither node's reference count reached zero. Both nodes leaked. My reference-counting GC had a classic cycle problem.

You can solve cycles in reference counting by adding a cycle detector, but that essentially means adding a full tracing GC on top anyway. I decided to just build the tracing GC from the start.

Mark-and-Sweep: The Concept

A mark-and-sweep garbage collector works in two phases:

Mark: Starting from all known live roots (stack variables, global variables), recursively follow every pointer and mark each reachable object as live.
Sweep: Walk every object in the heap. Any object that was not marked as live is dead and should be freed. Clear all marks for the next cycle.

The algorithm is correct by construction: if an object is reachable from a root, it is live. If it is not reachable, nothing in the program can access it, so freeing it is safe.

The Hard Part: Object Layout

The conceptual challenge with mark-and-sweep in a compiled language is that the GC needs to traverse pointers, but at runtime it only sees raw memory. How does it know which bytes in a struct are pointers versus integers?

In languages with a managed runtime (Java, Python, Go), this is handled by the runtime itself which always knows the type of every object. In a language that compiles to native code, you have to build this infrastructure yourself.

My solution was to give every heap object a header. Before the actual struct data, the compiler emits a small header that contains:

; GC object header layout
struct GCHeader {
  i8*      next_in_heap;    ; intrusive linked list of all allocations
  i8       marked;          ; 1 if reached in current mark phase
  TypeInfo* type_info;      ; pointer to static descriptor for this type
}

The TypeInfo descriptor is a statically emitted constant that the compiler generates for every struct type. It contains the number of pointer fields and their byte offsets within the struct. The GC reads this at runtime to know which fields to follow during the mark phase.

The Heap List

Every allocation is added to a global linked list: the heap list. The list is threaded through the next_in_heap pointer in each object's header.

When the sweep phase runs, it walks this list from the beginning. For each object that is not marked, it removes it from the list and frees the memory. For each object that is marked, it clears the mark bit. By the end of the sweep, the heap list contains only live objects with clear mark bits, ready for the next cycle.

When Does the GC Run?

Currently the GC runs every N allocations (N is tunable but defaults to 1024). This is a simple heuristic. More sophisticated strategies, like running when the heap exceeds a size threshold or running in a background thread, are planned for a future version.

For most game programs, which allocate heavily at startup and minimally during the game loop, this strategy works well. The GC pause is not noticeable at 60fps.

The Stack Scanner

The root set for the mark phase consists of every live variable in every active stack frame. Vexel does not try to scan the actual machine stack (which would require precise stack maps or conservative scanning). Instead, the compiler emits explicit root registration: every time a pointer-type local variable is created, it is registered with the GC root table. When it goes out of scope, it is unregistered.

This is more code to emit but means the GC mark phase is precise: it only follows real pointers, never false positives. Conservative GCs can accidentally keep dead objects alive if an integer on the stack happens to look like a valid heap address.

Does It Work?

Yes. The test suite includes programs that create thousands of cyclic structures and verify that memory usage stays bounded after the GC runs. It handles arrays of structs, structs that contain arrays, deeply nested linked structures, and all of the above in combination.

The GC was the hardest part of building Vexel by a significant margin. But getting it right changed the character of the language. You can write Vexel programs that create and discard heap objects freely, just like Python, and the compiler handles everything underneath.