Performance

Caching Strategies

In-process caches, Redis, HTTP caching headers, and CDN edge caching: choosing the right layer.

Overview

Caching stores the result of expensive computations (database queries, API calls, rendered HTML) so subsequent requests can be served without repeating the work. The key challenge is cache invalidation: deciding when a cached value is stale. Cache levels include CDN (edge), application (Redis/Memcached), database query cache, and in-process memory. Each level has different latency, capacity, and invalidation semantics.

Origin

Hardware CPU caches appeared in mainframes in the late 1960s, and on-chip L1/L2 caches were standard in microprocessors by the late 1980s. Squid (1996) popularised HTTP proxy caching. Memcached (Brad Fitzpatrick, 2003, built for LiveJournal) was the first widely adopted distributed application cache. Redis (Salvatore Sanfilippo, 2009) added persistence, data structures, and pub/sub. Phil Karlton's quip "there are only two hard things in computer science: cache invalidation and naming things" dates to the 1990s.

Examples

Cache-aside pattern with Redis in TypeScript

import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

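// Returns the cached value for `key`, or loads it, caches it for `ttlSeconds`, and returns it.
async function cacheAside<T>(
  key: string,
  ttlSeconds: number,
  load: () => Promise<T> // named `load` to avoid shadowing the global fetch()
): Promise<T> {
  const cached = await redis.get(key);
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
  const value = await load();
  // Values must be JSON-serialisable; Date fields come back as strings after JSON.parse
  await redis.setex(key, ttlSeconds, JSON.stringify(value));
  return value;
}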

// Usage: cache user profiles for 5 minutes
// (`db` is assumed to be a Prisma client instance defined elsewhere)
async function getUserProfile(userId: string) {
  return cacheAside(
    'user:profile:' + userId,
    300,
    () => db.users.findUniqueOrThrow({
      where: { id: userId },
      include: { subscription: true, preferences: true }
    })
  );
}

// Cache invalidation: call on user profile update
async function invalidateUserCache(userId: string): Promise<void> {
  await redis.del('user:profile:' + userId);
}

Cache-aside (lazy loading) populates the cache on demand, so the first read after a miss or an expiry pays the full cost. Write-through updates the cache on every write, keeping it warm at the cost of write latency. The TTL is the safety net for invalidation failures: with a 5-minute TTL, an entry that was never invalidated is served for at most 5 minutes after it was cached.
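The write-through variant mentioned above can be sketched as follows. This is a minimal illustration, not a production implementation: the Store interface, the MemoryStore class, and the WriteThroughCache class are hypothetical names, and the in-memory maps stand in for Redis and the database so the sketch runs without external services.

```typescript
// Hypothetical minimal storage interface; a real system would back it with Redis and a database.
interface Store<T> {
  get(key: string): Promise<T | undefined>;
  set(key: string, value: T): Promise<void>;
}

// In-memory stand-in so the sketch runs without external services.
class MemoryStore<T> implements Store<T> {
  private data = new Map<string, T>();

  async get(key: string): Promise<T | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: T): Promise<void> {
    this.data.set(key, value);
  }
}

class WriteThroughCache<T> {
  private cache: Store<T>;
  private source: Store<T>;

  constructor(cache: Store<T>, source: Store<T>) {
    this.cache = cache;
    this.source = source;
  }

  // Write path: persist to the source of truth first, then refresh the cache.
  // The caller waits for both writes, which is the extra write latency.
  async write(key: string, value: T): Promise<void> {
    await this.source.set(key, value);
    await this.cache.set(key, value);
  }

  // Read path: serve from the cache, falling back to the source on a miss
  // (for example after an eviction) and repopulating the cache.
  async read(key: string): Promise<T | undefined> {
    const cached = await this.cache.get(key);
    if (cached !== undefined) {
      return cached;
    }
    const value = await this.source.get(key);
    if (value !== undefined) {
      await this.cache.set(key, value);
    }
    return value;
  }
}
```

Writing to the source before the cache means a failed database write never leaves a value in the cache that was not persisted. The two writes are still not atomic, so a TTL on the cached entry remains a sensible safety net.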

Russian doll caching with fragment cache in Rails

<%# View: app/views/orders/show.html.erb %>
<%# Outer cache: order summary, keyed by @order.cache_key_with_version %>
<% cache @order do %>
  <%= render @order %>  <%# order_id, status, created_at %>

  <%# Inner cache: each line item, cached under its own key %>
  <% @order.line_items.each do |item| %>
    <% cache item do %>
      <%= render item %>  <%# product name, qty, unit price %>
    <% end %>
  <% end %>
<% end %>

# config/initializers/cache.rb
# Note: Rails 7.1+ deprecates pool_size in favour of pool: { size: 10 }
Rails.cache = ActiveSupport::Cache::RedisCacheStore.new(
  url: ENV.fetch('REDIS_URL'),
  pool_size: 10,
  compress: true,
  compress_threshold: 1.kilobyte
)

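# ActiveRecord model: cache_key_with_version includes updated_at
# Order.find(42).cache_key_with_version
# => "orders/42-20250601123456789"
# Updating any field on the order changes updated_at, busting the outer cache

# app/models/line_item.rb
# touch: true bumps the parent order's updated_at whenever a line item changes,
# so the outer fragment is rebuilt while unchanged items are reused from their inner caches
class LineItem < ApplicationRecord
  belongs_to :order, touch: true
end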

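Russian doll caching nests cached fragments so that a change to an inner record expires every fragment that contains it, while unchanged siblings are reused. Rails' cache_key_with_version includes the model's updated_at timestamp, so any attribute change on a record gives its own fragment a new key. The outer fragment does not notice inner changes by itself: the child association must touch its parent (belongs_to :order, touch: true) so that the parent's updated_at, and therefore the outer key, changes as well. Without the touch, the outer fragment keeps serving the old inner HTML until the order itself is updated.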
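The interaction between versioned keys and touching the parent can be sketched outside Rails. The TypeScript below is an illustration under stated assumptions, not how Rails implements fragment caching: versionedKey, cachedFragment, touch, and renderOrder are hypothetical helpers, updatedAt is a plain number standing in for updated_at, and an in-memory Map plays the role of the fragment store.

```typescript
// Hypothetical record shape; updatedAt stands in for Rails' updated_at.
interface VersionedRecord {
  model: string;
  id: number;
  updatedAt: number;
}

interface LineItem extends VersionedRecord {
  name: string;
}

interface Order extends VersionedRecord {
  items: LineItem[];
}

// In-memory fragment store and a counter of how many fragments were rendered.
const fragments = new Map<string, string>();
let renderCount = 0;

// Key shaped like "orders/42-<version>": a new version means a new key.
function versionedKey(record: VersionedRecord): string {
  return `${record.model}/${record.id}-${record.updatedAt}`;
}

// Return the cached fragment for this record version, rendering it on a miss.
// Entries under old keys are never read again; a real store would evict them.
function cachedFragment(record: VersionedRecord, render: () => string): string {
  const key = versionedKey(record);
  const hit = fragments.get(key);
  if (hit !== undefined) {
    return hit;
  }
  const html = render();
  renderCount++;
  fragments.set(key, html);
  return html;
}

// Mark a child as changed and propagate the new version to its parent,
// which is the effect of `touch: true` on the association.
function touch(child: VersionedRecord, parent: VersionedRecord, version: number): void {
  child.updatedAt = version;
  parent.updatedAt = version;
}

// Outer fragment wraps one inner fragment per line item.
function renderOrder(order: Order): string {
  return cachedFragment(order, () => {
    const inner = order.items
      .map((item) => cachedFragment(item, () => `<li>${item.name}</li>`))
      .join('');
    return `<ul id="order-${order.id}">${inner}</ul>`;
  });
}
```

After one item is changed and touched, a call to renderOrder rebuilds the outer fragment and that single item, and reuses the cached HTML of every other item.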

Use Cases

1. Database query results for expensive aggregations (dashboard stats, report totals) that are read far more often than they are computed
2. External API responses (geolocation lookups, currency rates, third-party product data) where the data changes infrequently and each call has cost or latency
3. Rendered HTML fragments for content that many users see identically (category pages, popular product pages)
4. Session data stored in Redis for fast retrieval without a database round-trip on every authenticated request

When Not to Use

- Do not cache user-specific data at a shared cache key; different users receiving each other's data is a critical security vulnerability
- Do not cache data that must always reflect current state (stock availability, account balances in a transactional context)
- Do not add caching as the first performance fix without profiling; caching a query that takes 50 ms does not help if the page issues 200 queries

Technical Notes

Cache stampede (thundering herd) occurs when a popular cache key expires and many concurrent requests miss at once, all running the expensive query simultaneously. Mitigations: probabilistic early expiration, lock-based recompute (one request rebuilds, others wait), or serving the stale value while a background refresh runs (stale-while-revalidate).
Redis eviction policies (maxmemory-policy): allkeys-lru evicts the least-recently-used key when memory is full. volatile-lru evicts the least-recently-used key among those with a TTL set. noeviction rejects writes with an error when memory is full. The right policy depends on whether every key is safe to evict.
Write-through updates the cache on every write, so recently written data does not miss on its first read. Write-behind (write-back) acknowledges a write once it is in the cache and persists it to the backing store asynchronously, improving write latency but risking data loss if the cache fails before the write is flushed.
CDN caching (Cloudflare, Fastly, AWS CloudFront) serves responses from edge nodes geographically close to users. Cache-Control directives (max-age, s-maxage, stale-while-revalidate) control CDN behaviour; s-maxage applies only to shared caches such as CDNs and overrides max-age there. Surrogate-Key (Fastly) and Cache-Tag (Cloudflare) headers allow purging specific groups of cached responses.
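The lock-based recompute mitigation from the stampede note can be sketched in-process by sharing one pending promise among all concurrent callers for the same key. This is a minimal sketch under stated assumptions: singleFlight and the inFlight map are hypothetical names, and the deduplication only covers requests inside a single Node.js process. Coordinating several application servers would need a distributed lock, for example one built on Redis SET with the NX and EX options.

```typescript
// Loads currently in progress, keyed by cache key.
const inFlight = new Map<string, Promise<unknown>>();

// Run `load` at most once per key at a time; concurrent callers for the
// same key wait on the same promise instead of repeating the work.
function singleFlight<T>(key: string, load: () => Promise<T>): Promise<T> {
  const pending = inFlight.get(key);
  if (pending !== undefined) {
    return pending as Promise<T>;
  }
  // Remove the entry once the load settles (success or failure) so the
  // next miss triggers a fresh load rather than reusing an old result.
  const promise = load().finally(() => {
    inFlight.delete(key);
  });
  inFlight.set(key, promise);
  return promise;
}
```

In a cache-aside helper, the call to the loader on a miss would be wrapped as singleFlight(key, load), so a burst of misses on a hot key results in one expensive query per process. If the load fails, every waiting caller receives the same rejection.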