Memory Management & Leak Prevention
Understanding how runtimes allocate and reclaim memory, and spotting patterns that prevent reclamation.
Overview
Memory management is the process of allocating, using, and freeing memory efficiently. In garbage-collected languages (JavaScript, Ruby, Java), the GC reclaims unreachable memory automatically, but leaks occur when references persist longer than needed. Common issues: memory leaks (event listeners not removed, closures capturing large objects), excessive allocations triggering GC pauses, and working sets too large for available RAM.
Origin
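Lisp introduced garbage collection in the late 1950s, while systems languages such as C continued to rely on manual management (malloc/free). Java (1995) made garbage collection mainstream, with generational collection arriving in the HotSpot VM; Ruby started with a mark-and-sweep collector and added generational collection in 2.1. V8's Orinoco GC (2017-2019) introduced concurrent and incremental collection to reduce main-thread pauses. Ruby 3.0 introduced Ractors (still experimental) for parallel execution without shared mutable state, and variable-width allocation followed in Ruby 3.1 and 3.2.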
Examples
Detecting and fixing memory leaks in Node.js
import { EventEmitter } from 'events';
// LEAK: listener added but never removed; grows with each call
class LeakyService extends EventEmitter {
  private data: Buffer[] = [];
  startProcessing(emitter: EventEmitter) {
    // This listener captures 'this' and is never removed
    emitter.on('data', (chunk: Buffer) => {
      this.data.push(chunk);
    });
  }
}
// FIXED: use once() or remove listener when done
class FixedService extends EventEmitter {
  startProcessing(emitter: EventEmitter, onDone: () => void): void {
    const handler = (chunk: Buffer) => {
      this.emit('chunk', chunk);
    };
    emitter.on('data', handler);
    emitter.once('end', () => {
      emitter.off('data', handler); // Remove listener to release closure
      onDone();
    });
  }
}
// Detect leaks: Node.js warns once more than 10 listeners are registered for one event
// Override to a higher value only if genuinely needed; the warning is a leak indicator
EventEmitter.defaultMaxListeners = 15; // Raise if you have >10 intentional listeners
Node.js emits a "MaxListenersExceededWarning" when more than 10 listeners (the default limit) are registered for a single event on one emitter; this is a heuristic for leak detection, not a hard limit. WeakRef (ES2021) holds a reference that does not prevent GC, and FinalizationRegistry runs a cleanup callback after a registered object has been collected.
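The listener-count heuristic can also be checked directly with `listenerCount()`, for example in a test. The sketch below is illustrative only: `subscribe` is a hypothetical helper, not part of the services above, and the event name and payloads are assumptions.

```typescript
import { EventEmitter } from 'events';

// Hypothetical helper: attaches a 'data' handler and returns a disposer
// that removes exactly that handler again.
function subscribe(
  source: EventEmitter,
  onChunk: (chunk: string) => void,
): () => void {
  source.on('data', onChunk);
  return () => {
    source.off('data', onChunk);
  };
}

const source = new EventEmitter();

// Leaky pattern: every call adds another listener that nothing removes.
for (let i = 0; i < 3; i++) {
  source.on('data', () => {});
}
const leakedCount = source.listenerCount('data'); // 3

// Disposable pattern: the count returns to its previous value after cleanup.
const received: string[] = [];
const dispose = subscribe(source, (chunk) => {
  received.push(chunk);
});
source.emit('data', 'a'); // recorded
dispose();
source.emit('data', 'b'); // handler already removed, not recorded
const countAfterDispose = source.listenerCount('data'); // 3 again
```

A count that keeps rising across repeated start/stop cycles is a stronger leak signal than the warning alone, because it does not depend on the configured limit.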
Memory-efficient stream processing in Ruby
require 'csv'
require 'json'
# MEMORY-INTENSIVE: loads entire file into memory
def process_csv_bad(path)
  CSV.read(path, headers: true).each do |row| # Entire file in RAM
    process_row(row.to_h)
  end
end
# MEMORY-EFFICIENT: processes one row at a time, O(1) memory
def process_csv_stream(path)
  CSV.foreach(path, headers: true) do |row| # Yields one row
    process_row(row.to_h)
  end
end
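# For large JSON files: use streaming JSON parser (yajl-ruby)
# on_parse_complete fires once per complete top-level JSON value, so this suits
# files of concatenated or newline-delimited objects; a single top-level array
# is still built in memory as one value.
# process_record is assumed to be defined elsewhere.
require 'yajl'
def process_json_stream(path)
  File.open(path, 'r') do |f|
    parser = Yajl::Parser.new
    parser.on_parse_complete = method(:process_record)
    parser.parse(f) # Reads the file in chunks rather than loading it whole
  end
end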
# ObjectSpace for leak investigation (development only)
require 'objspace'
def memory_snapshot
  ObjectSpace.count_objects
  # {:T_OBJECT=>12000, :T_STRING=>85000, :T_ARRAY=>8000, ...}
end
CSV.foreach yields one row at a time, so memory usage is proportional to the largest single row, not the file size. Processing a 10GB CSV file with foreach needs only a few megabytes of RAM, whereas CSV.read needs at least the full 10GB, and typically more because of per-object overhead. This distinction determines whether a task can run on a small server.
Use Cases
- Long-running Node.js servers where event listener leaks accumulate over hours, eventually consuming all available heap
- Background jobs processing large files (CSV exports, image batches) where streaming prevents OOM crashes
- React applications where components subscribe to stores, sockets, or intervals and forget to unsubscribe in cleanup functions
- Serverless functions where cold-start memory initialisation affects cost and latency; minimising the module footprint via tree-shaking reduces both
When Not to Use
- Do not over-optimise memory usage for short-lived processes (CLI scripts, one-time migrations) where simplicity matters more
- Do not avoid all object allocation in hot paths without profiling; premature memory micro-optimisation can harm readability without measurable benefit
- Do not null out local variables in JavaScript to "help" the GC; tracing collectors reclaim any object that is no longer reachable, including circular references. Clearing a reference only matters when it lives in a long-lived structure such as a cache or a module-level variable
Technical Notes
- V8's heap is split into a young generation (new space, typically a few megabytes, depending on version and flags such as --max-semi-space-size) and an old generation. Objects that survive two scavenges are promoted to the old generation. A reference retained by a long-lived object keeps everything reachable from it alive, so one stale reference can pin an entire object graph
- process.memoryUsage() in Node.js reports: rss (resident set size, the memory the process holds in RAM), heapTotal, heapUsed, external (memory of C++ objects bound to JavaScript objects), and arrayBuffers. heapUsed that keeps growing without a corresponding increase in load suggests a leak
- Ruby's GC.stat returns, among others, heap_live_slots, heap_free_slots, minor_gc_count, major_gc_count, and old_objects. A high ratio of major to minor GC cycles, or a steadily growing old_objects count, suggests a high promotion rate: many objects are long-lived and surviving into the old generation
- WeakMap and WeakSet in JavaScript hold keys without preventing GC; useful for per-object metadata caches where the metadata should be collected when the object is collected. WeakRef (ES2021) holds weak references to values
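The heapUsed check from the notes above could be sketched as follows. Raw samples rise and fall with every collection, so this sketch compares the lowest reading in the first half of a sample window with the lowest in the second half, treating the minimum as a rough stand-in for the post-GC floor. Both helper names are hypothetical, and the halving strategy is an assumption, not an established method.

```typescript
// Hypothetical helper: current V8 heap usage in bytes.
function sampleHeapUsed(): number {
  return process.memoryUsage().heapUsed;
}

// Hypothetical helper: how far the heap "floor" rose between the first and
// second half of a series of samples. A value near zero suggests steady
// state; a value that stays large under constant load suggests a leak.
function heapFloorGrowth(samples: number[]): number {
  if (samples.length < 2) return 0;
  const mid = Math.floor(samples.length / 2);
  const earlyFloor = Math.min(...samples.slice(0, mid));
  const lateFloor = Math.min(...samples.slice(mid));
  return lateFloor - earlyFloor;
}

// Illustrative numbers only (units are arbitrary here).
const steady = heapFloorGrowth([10, 30, 12, 11, 35, 10]); // 0
const rising = heapFloorGrowth([10, 30, 12, 40, 50, 45]); // 30

// A real reading, to show where the samples would come from.
const currentHeapUsed = sampleHeapUsed();
```

In a server, the samples would typically come from a timer that calls `sampleHeapUsed()` at a fixed interval; what growth counts as suspicious depends on the workload and has to be chosen by measurement.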