
Stratified Autonomous Memory Manager

[ Undergraduate Thesis ] [ 2026 – Present ] [ Zig · Python · Node.js ]
Tags: Zig, Python, Node.js, K-Means, mmap, Systems, Memory, Research

SAMM is my undergraduate thesis: an offline, profile-guided memory allocator written in Zig. It uses K-Means clustering to predict allocation lifetimes before runtime, then pre-bakes arena selection into the binary. The goal is to reduce P99 tail latency and peak resident set size (RSS) in Node.js backend services operating under a 1 GB memory constraint.

The Problem

Modern general-purpose allocators like glibc malloc and V8's heap are tuned for the average case. Their per-allocation metadata overhead, fragmentation behavior, and (in V8's case) GC pause patterns make them unpredictable at the tail. For latency-sensitive backend services, such as request handlers that must respond in under 10 ms, P99 matters more than the mean.

The insight behind SAMM: most allocation patterns in a backend service are not random. A request context allocates objects that all die at the end of the request. A database row parser allocates short-lived scratch space. An HTTP response buffer lives for exactly one send call. If you can learn these patterns offline, you don't need a runtime GC to clean them up. Instead, you can assign each allocation site to the right arena at compile time.

How It Works

Phase 1 — Profiling

A Python profiling harness instruments the target Node.js process using heap snapshots and allocation tracing. For each allocation it observes, the harness records the size, a call stack fingerprint that identifies the allocation site, and the time-to-death, measured as wall time between allocation and the last reference drop. This produces a dataset of (site_id, size_class, lifetime_ms) triples, where site_id is derived from the call stack fingerprint and size_class from the size.
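As a rough illustration of the dataset shape, the sketch below turns raw allocation records into (site_id, size_class, lifetime_ms) triples. Everything beyond the triple itself is an assumption made for the example: the SiteRegistry class, the dense numbering of sites, the power-of-two size classes, and the fingerprint strings are hypothetical and not taken from the actual harness.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    site_id: int        # dense index of the allocation site (assumed scheme)
    size_class: int     # size rounded up to a power of two (assumed scheme)
    lifetime_ms: float  # wall time from allocation to last reference drop


class SiteRegistry:
    """Maps call stack fingerprints to dense site IDs (0, 1, 2, ...)."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}

    def site_id(self, fingerprint: str) -> int:
        # The first sighting of a fingerprint claims the next free index.
        return self._ids.setdefault(fingerprint, len(self._ids))


def size_class_of(size: int) -> int:
    """Round a byte size up to the next power of two."""
    if size <= 1:
        return 1
    return 1 << (size - 1).bit_length()


def to_sample(registry: SiteRegistry, fingerprint: str, size: int,
              alloc_ms: float, death_ms: float) -> Sample:
    """Build one triple from a raw allocation record."""
    return Sample(registry.site_id(fingerprint),
                  size_class_of(size),
                  death_ms - alloc_ms)


# Made-up records, only to show the resulting shape.
registry = SiteRegistry()
samples = [
    to_sample(registry, "parseRow<handleRequest", 24, 10.0, 10.5),
    to_sample(registry, "sendBody<handleRequest", 4096, 11.0, 13.0),
    to_sample(registry, "parseRow<handleRequest", 40, 20.0, 20.25),
]
```

Dense site IDs are chosen here so that a site ID can later index directly into a flat table; a hashed fingerprint would need an extra mapping step.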

Phase 2 — Clustering

K-Means is run over the lifetime distribution for each size class. The output is a small number of lifetime buckets, typically 3 to 5, each of which maps to a Zig arena strategy. Short-lived allocations go into a thread-local bump arena that is reset per request; medium-lived allocations go into a slab; long-lived allocations are routed to a general region arena that is collected at process idle time.

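// Arena selection logic in Zig (simplified)
// Arena, thread_bump, slab_pool, region_arena, and baked_lifetime_table
// are assumed to be declared elsewhere in the allocator module.
pub fn selectArena(site_id: u32, size: usize) *Arena {
    _ = size; // unused in this simplified version; Zig rejects unused parameters
    const bucket = baked_lifetime_table[site_id];
    // Each branch yields a pointer, so the return type is *Arena.
    return switch (bucket) {
        .short => &thread_bump,
        .medium => &slab_pool,
        .long => &region_arena,
    };
}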
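To make the clustering step concrete, here is a minimal sketch of one-dimensional K-Means (Lloyd's algorithm) over the lifetimes of a single size class. It is written in plain Python for self-containment and is not the thesis pipeline. Clustering in log10 space, the quantile-based initialization, k = 3, and the sample lifetimes are all assumptions chosen for illustration.

```python
import math


def kmeans_1d(values: list[float], k: int, iters: int = 50) -> list[float]:
    """Lloyd's algorithm on scalars; returns k centroids in ascending order."""
    data = sorted(values)
    # Deterministic init: spread the starting centroids across the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups: list[list[float]] = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid when a cluster ends up empty.
        updated = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def bucket_of(lifetime_ms: float, centroids: list[float]) -> int:
    """Index of the nearest centroid, compared in log10 space."""
    x = math.log10(max(lifetime_ms, 1e-3))
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))


BUCKET_NAMES = ["short", "medium", "long"]  # one name per arena strategy

# Made-up lifetimes in milliseconds, only to exercise the code.
lifetimes_ms = [0.4, 0.5, 0.6, 40.0, 50.0, 60.0,
                40_000.0, 50_000.0, 60_000.0]
centroids = kmeans_1d([math.log10(max(t, 1e-3)) for t in lifetimes_ms], k=3)
labels = [BUCKET_NAMES[bucket_of(t, centroids)] for t in lifetimes_ms]
```

Working in log space is a design choice for this sketch: lifetimes that span microseconds to minutes would otherwise let the longest-lived allocations dominate the distance metric.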

Phase 3 — Baking

The cluster assignments are serialized into a static lookup table and compiled into the Zig allocator module as a const. At runtime, every allocation routes through selectArena(), so the only per-allocation routing cost is a single table lookup followed by a switch on the resulting bucket.
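One way the serialization step could look is sketched below: a Python function that renders cluster assignments as Zig source text. The function name render_zig_table, the Bucket enum layout, the dense site numbering, and the choice to default unprofiled sites to the long-lived bucket are all assumptions for the example, not the project's actual generator.

```python
ALLOWED_BUCKETS = ("short", "medium", "long")


def render_zig_table(assignments: dict[int, str], default: str = "long") -> str:
    """Render dense site_id -> bucket assignments as Zig source text."""
    if default not in ALLOWED_BUCKETS:
        raise ValueError(f"unknown default bucket: {default}")
    size = max(assignments) + 1 if assignments else 0
    lines = [
        "// Generated file, do not edit.",
        "pub const Bucket = enum(u8) { short, medium, long };",
        "",
        f"pub const baked_lifetime_table = [{size}]Bucket{{",
    ]
    for site_id in range(size):
        # Sites never seen during profiling fall back to the default bucket.
        bucket = assignments.get(site_id, default)
        if bucket not in ALLOWED_BUCKETS:
            raise ValueError(f"unknown bucket for site {site_id}: {bucket}")
        lines.append(f"    .{bucket}, // site {site_id}")
    lines.append("};")
    return "\n".join(lines) + "\n"


# Made-up assignments: site 1 was never profiled, so it gets the default.
zig_source = render_zig_table({0: "short", 2: "medium"})
print(zig_source)
```

Defaulting unknown sites to the long-lived bucket is the conservative option in this sketch: an allocation that outlives a per-request bump arena would be a correctness bug, while one that sits too long in a region arena only costs memory.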

Current Status

The profiling harness and clustering pipeline are complete. The Zig arena implementation (bump + slab + region) is functional. The N-API bridge for Node.js integration is in progress — this is the part that lets a Node.js backend actually use SAMM as its allocator for specific object types.

Initial benchmarks against a synthetic request-handler workload show promising results in peak RSS. P99 latency measurements are still being collected — the evaluation setup requires careful isolation to avoid noise from scheduler jitter.
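For reference, P99 over a set of collected latency samples can be computed with the nearest-rank method, sketched below. The percentile helper and the sample latencies are illustrative assumptions and not measurements from SAMM; the made-up numbers only show how a healthy-looking mean can hide a slow tail.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p percent
    of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up request latencies in milliseconds: 98 fast requests, 2 slow ones.
latencies_ms = [2.0] * 98 + [45.0, 60.0]
mean_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)
```

In this synthetic run the mean stays close to the fast requests while the P99 lands on one of the slow outliers, which is the gap the evaluation is meant to measure.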

Why Zig

Zig's allocator interface is first-class: every standard library type that allocates takes an Allocator parameter. This makes it straightforward to swap in SAMM's arena strategies without patching the runtime. Comptime evaluation lets the baked lookup table be resolved at compile time with no runtime overhead. And std.heap.page_allocator requests pages straight from the OS (mmap on Linux); when libc is not linked, Zig issues the system call itself rather than going through a libc wrapper.


Read the companion blog post: Predicting Allocation Lifetimes With K-Means →