Performance

Memory Management & Leak Prevention

Understanding how runtimes allocate and reclaim memory, and spotting patterns that prevent reclamation.

Overview

Memory management is the process of allocating, using, and freeing memory efficiently. In garbage-collected languages (JavaScript, Ruby, Java), the GC reclaims unreachable memory automatically, but leaks occur when references persist longer than needed. Common issues: memory leaks (event listeners not removed, closures capturing large objects), excessive allocations triggering GC pauses, and working sets too large for available RAM.

Origin

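Manual memory management (malloc/free) dominated until Lisp introduced garbage collection in the late 1950s. Java (1995) brought garbage collection into the mainstream, with generational collection arriving in the HotSpot JVM a few years later; Ruby relied on a plain mark-and-sweep collector for most of its history and gained a generational GC in 2.1. V8's Orinoco project (2017-2019) introduced parallel, concurrent, and incremental collection to reduce main-thread pauses. Ruby 3.x added Ractors (experimental since 3.0) for parallel execution, with objects isolated per Ractor, and variable-width allocation (on by default since 3.2).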

Examples

Detecting and fixing memory leaks in Node.js

import { EventEmitter } from 'events';

// LEAK: listener added but never removed; grows with each call
class LeakyService extends EventEmitter {
  private data: Buffer[] = [];

  startProcessing(emitter: EventEmitter) {
    // This listener captures 'this' and is never removed
    emitter.on('data', (chunk: Buffer) => {
      this.data.push(chunk);
    });
  }
}

// FIXED: use once() or remove listener when done
class FixedService extends EventEmitter {
  startProcessing(emitter: EventEmitter, onDone: () => void): void {
    const handler = (chunk: Buffer) => {
      this.emit('chunk', chunk);
    };
    emitter.on('data', handler);
    emitter.once('end', () => {
      emitter.off('data', handler); // Remove listener to release closure
      onDone();
    });
  }
}

// Detect leaks: Node warns once more than 10 listeners are attached to one event on an emitter
// Raise the threshold only if genuinely needed; the warning is a leak indicator
EventEmitter.defaultMaxListeners = 15; // Raise if you have >10 intentional listeners

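Node.js prints a "MaxListenersExceededWarning" when more than 10 listeners are registered for the same event on a single emitter (the default threshold, adjustable per emitter with setMaxListeners()); this is a heuristic for leak detection, not a hard limit. WeakRef (ES2021) holds a reference that does not prevent GC, and FinalizationRegistry lets you register a callback that may run after an object has been collected.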
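One way to confirm that listeners are actually being removed is to check listenerCount() after each unit of work. The following is a minimal sketch, not part of the services above: the attach helper and the 'data' event payload are assumptions made for illustration.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical helper: attach a 'data' handler and return a function that detaches it.
function attach(emitter: EventEmitter, sink: string[]): () => void {
  const handler = (chunk: string): void => {
    sink.push(chunk);
  };
  emitter.on('data', handler);
  return () => {
    emitter.off('data', handler);
  };
}

const source = new EventEmitter();
const received: string[] = [];

// Simulate five processing runs that each clean up after themselves.
for (let i = 0; i < 5; i++) {
  const detach = attach(source, received);
  source.emit('data', `chunk-${i}`);
  detach();
}

// A count that returns to zero after each run suggests listeners are not accumulating.
const remaining: number = source.listenerCount('data');
console.log(`listeners still attached: ${remaining}`);
```

If the detach() call is removed from the loop, the same check reports one listener per run, which is the growth pattern the warning is designed to surface.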

Memory-efficient stream processing in Ruby

require 'csv'
require 'json'

# MEMORY-INTENSIVE: loads entire file into memory
def process_csv_bad(path)
  CSV.read(path, headers: true).each do |row|  # Entire file in RAM
    process_row(row.to_h)
  end
end

# MEMORY-EFFICIENT: processes one row at a time; memory is independent of file size
def process_csv_stream(path)
  CSV.foreach(path, headers: true) do |row|  # Yields one row
    process_row(row.to_h)
  end
end

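# For large JSON files: use a streaming JSON parser (yajl-ruby).
# The callback fires once per complete top-level document, so memory stays low for
# concatenated or newline-delimited JSON; a single huge top-level array is still
# built in memory before the callback runs.
require 'yajl'

def process_json_stream(path)
  File.open(path, 'r') do |f|
    parser = Yajl::Parser.new
    parser.on_parse_complete = method(:process_record)
    parser.parse(f)  # Reads the IO in chunks instead of slurping the whole file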
  end
end

# ObjectSpace for leak investigation (development only)
require 'objspace'
def memory_snapshot
  ObjectSpace.count_objects
  # {:T_OBJECT=>12000, :T_STRING=>85000, :T_ARRAY=>8000, ...}
end

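CSV.foreach yields one row at a time, so memory usage is proportional to the largest single row rather than to the file size. As a rough illustration, streaming a 10GB CSV file with foreach needs only a few megabytes of RAM, while CSV.read needs at least the 10GB of raw data and usually more once every field becomes a Ruby object. This distinction determines whether a task can run on a small server.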

Use Cases

  • Long-running Node.js servers where event listener leaks accumulate over hours, eventually consuming all available heap
  • Background jobs processing large files (CSV exports, image batches) where streaming prevents OOM crashes
  • React applications where components subscribe to stores, sockets, or intervals and forget to unsubscribe in cleanup functions
  • Serverless functions where cold start memory initialisation affects cost and latency; minimising the module footprint via tree-shaking reduces both
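The subscribe-and-forget pattern from the React use case can be sketched without any framework. The Store class and mount function below are hypothetical stand-ins for a real store and a component's mount/cleanup pair; the point is only that every subscription returns a function that undoes it, and that the cleanup path calls it.

```typescript
type Listener<T> = (value: T) => void;

// Hypothetical minimal store; sockets and timers follow the same shape.
class Store<T> {
  private listeners = new Set<Listener<T>>();

  // Returns an unsubscribe function so callers can always undo the subscription.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  publish(value: T): void {
    this.listeners.forEach((listener) => listener(value));
  }

  get size(): number {
    return this.listeners.size;
  }
}

// Stand-in for a component: mount() subscribes and returns the cleanup to run on unmount.
function mount(store: Store<number>, seen: number[]): () => void {
  const unsubscribe = store.subscribe((value) => {
    seen.push(value);
  });
  return unsubscribe;
}

const store = new Store<number>();
const seen: number[] = [];

const unmount = mount(store, seen);
store.publish(1); // delivered while mounted
unmount();        // cleanup: the store no longer references the callback
store.publish(2); // not delivered, and nothing is retained

console.log(seen, store.size);
```

In React the same idea is usually expressed by returning the unsubscribe function from the effect, so that the framework runs it when the component unmounts.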

When Not to Use

  • Do not over-optimise memory usage for short-lived processes (CLI scripts, one-time migrations) where simplicity matters more
  • Do not avoid all object allocation in hot paths without profiling; premature memory micro-optimisation can harm readability without measurable benefit
  • Do not null out local variables or manually break circular references in JavaScript to "help" the GC; mark-and-sweep collectors trace reachability from the roots, so unreachable cycles are reclaimed on their own. Releasing entries from long-lived structures (module-level caches, arrays on long-lived objects) is a different matter and remains necessary

Technical Notes

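  • V8's heap is split into a young generation (new space, typically a few to tens of megabytes and tunable with --max-semi-space-size) and an old generation. Objects that survive two GC scavenges are promoted to old gen. A reference retained from old gen keeps everything reachable from it alive, so one stale pointer can pin an entire object graph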
  • process.memoryUsage() in Node.js reports rss (resident set size, total process memory), heapTotal, heapUsed, external (memory used by C++ objects bound to JavaScript objects), and arrayBuffers. heapUsed that keeps climbing across GC cycles without a corresponding increase in load indicates a leak
  • Ruby's GC.stat returns a hash that includes heap_live_slots, heap_free_slots, minor_gc_count, and major_gc_count. A rising share of major GC cycles relative to minor ones, together with a growing old_objects count, suggests a high promotion rate: many objects are long-lived and surviving into the old generation
  • WeakMap and WeakSet in JavaScript hold their keys (WeakMap) or members (WeakSet) weakly, so they do not prevent GC; useful for per-object metadata caches where the metadata should be collected when the object is collected. WeakRef (ES2021) holds a weak reference to a single object
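The process.memoryUsage() note above can be turned into a simple trend check. This is a rough sketch under stated assumptions: the function names, the 10-sample window, and the 1 MiB step are illustrative choices, not recommended values, and a steady rise is only a hint that a heap snapshot is worth taking.

```typescript
// Hypothetical helper: true when every sample exceeds the previous one
// by at least minStepBytes, i.e. the heap never drops back after GC.
function isSteadilyGrowing(samples: number[], minStepBytes: number): boolean {
  if (samples.length < 2) return false;
  for (let i = 1; i < samples.length; i++) {
    if (samples[i] - samples[i - 1] < minStepBytes) return false;
  }
  return true;
}

// Take one reading; heapUsed is reported in bytes.
function sampleHeapUsed(): number {
  return process.memoryUsage().heapUsed;
}

// Hypothetical monitor: samples on an interval, keeps the last 10 readings,
// and warns when they rise steadily. Returns a function that stops it.
function startHeapMonitor(intervalMs: number): () => void {
  const readings: number[] = [];
  const timer = setInterval(() => {
    readings.push(sampleHeapUsed());
    if (readings.length > 10) readings.shift();
    if (readings.length === 10 && isSteadilyGrowing(readings, 1024 * 1024)) {
      console.warn('heapUsed rose by at least 1 MiB in each of the last 9 intervals');
    }
  }, intervalMs);
  return () => clearInterval(timer);
}

// Usage (not started here): const stop = startHeapMonitor(30000); ... stop();
```

Comparing consecutive samples rather than a single threshold avoids flagging a process that simply has a large but stable working set.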