Rate Limiting & Brute-Force Defence
Throttling requests to protect endpoints from automated abuse.
Overview
Rate limiting restricts how many requests a client can make in a given time window, protecting APIs from abuse, denial-of-service floods, credential stuffing, and brute-force attacks. Common algorithms are fixed window, sliding window, token bucket, and leaky bucket. Rate limits should be enforced both at the edge (CDN, API gateway) and at the application layer. Redis is a common backing store for distributed rate-limit state.
Origin
Rate limiting predates web APIs; SMTP servers implemented limits in the 1990s to reduce spam. API rate limiting became mainstream with Twitter's 1.0 API (2006) and its 150-requests-per-hour limit. Token bucket and leaky bucket algorithms were described in networking literature in the 1980s. Redis-based implementations (redis-cell, Upstash rate limiting) made distributed limiting practical.
Examples
Redis-backed sliding window rate limiter in TypeScript
import { randomUUID } from 'node:crypto';
import Redis from 'ioredis';
import { Request, Response, NextFunction } from 'express';

const redis = new Redis(process.env.REDIS_URL!);

async function slidingWindowLimit(
  key: string,
  maxRequests: number,
  windowSec: number
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
  const now = Date.now();
  const windowStart = now - windowSec * 1000;
  const pipeline = redis.pipeline();
  pipeline.zremrangebyscore(key, 0, windowStart); // remove entries older than the window
  // Member must be unique: a bare timestamp would collide for requests in the same millisecond
  pipeline.zadd(key, now, `${now}:${randomUUID()}`); // add current request
  pipeline.zcard(key); // count requests in the window (rejected ones included)
  pipeline.expire(key, windowSec); // auto-cleanup of idle keys
  const results = await pipeline.exec();
  const count = (results?.[2]?.[1] as number) ?? 0; // index 2 is the ZCARD reply
  return {
    allowed: count <= maxRequests,
    remaining: Math.max(0, maxRequests - count),
    resetAt: now + windowSec * 1000, // upper bound: when the window is fully clear
  };
}

export function rateLimitMiddleware(maxRequests: number, windowSec: number) {
  return async (req: Request, res: Response, next: NextFunction) => {
    try {
      const key = 'rl:' + (req.ip ?? 'unknown') + ':' + req.path;
      const result = await slidingWindowLimit(key, maxRequests, windowSec);
      res.setHeader('X-RateLimit-Limit', maxRequests);
      res.setHeader('X-RateLimit-Remaining', result.remaining);
      res.setHeader('X-RateLimit-Reset', Math.ceil(result.resetAt / 1000));
      if (!result.allowed) {
        res.status(429).json({ error: 'Rate limit exceeded. Try again later.' });
        return;
      }
      next();
    } catch (err) {
      next(err); // pass Redis failures to the Express error handler
    }
  };
}

A sorted set (ZSET) stores one entry per request, scored by timestamp; ZREMRANGEBYSCORE prunes entries outside the window before ZCARD counts the rest. Each member carries a unique suffix so that requests arriving in the same millisecond are counted separately. All four commands are sent in a single pipeline, which reduces round trips. A sliding window is more accurate than a fixed window, which allows bursts of traffic at window boundaries.
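The boundary effect mentioned above can be made concrete with a small in-memory comparison. This is a minimal sketch, not the Redis implementation: `createFixedWindow`, `createSlidingLog`, `countAllowed`, and the `burst` timestamps are hypothetical names and values chosen for illustration, and both limiters keep state in a single process with no eviction.

```typescript
// Hypothetical in-memory limiters, for illustration only (single process, no eviction).
type Limiter = (key: string, nowMs: number) => boolean;

function createFixedWindow(maxRequests: number, windowMs: number): Limiter {
  const counts = new Map<string, { windowId: number; count: number }>();
  return (key, nowMs) => {
    const windowId = Math.floor(nowMs / windowMs);
    const entry = counts.get(key);
    if (!entry || entry.windowId !== windowId) {
      counts.set(key, { windowId, count: 1 }); // a new window resets the counter
      return maxRequests >= 1;
    }
    entry.count += 1;
    return entry.count <= maxRequests;
  };
}

function createSlidingLog(maxRequests: number, windowMs: number): Limiter {
  const log = new Map<string, number[]>();
  return (key, nowMs) => {
    // Keep only timestamps newer than nowMs - windowMs.
    const recent = (log.get(key) ?? []).filter((t) => t > nowMs - windowMs);
    const allowed = recent.length < maxRequests;
    if (allowed) recent.push(nowMs); // this sketch records accepted requests only
    log.set(key, recent);
    return allowed;
  };
}

// Ten requests straddling the one-minute boundary: five just before, five just after.
const burst = [59000, 59200, 59400, 59600, 59800, 60000, 60200, 60400, 60600, 60800];

function countAllowed(limiter: Limiter, times: number[]): number {
  return times.filter((t) => limiter('client-a', t)).length;
}

const fixedAllowed = countAllowed(createFixedWindow(5, 60000), burst);
const slidingAllowed = countAllowed(createSlidingLog(5, 60000), burst);
console.log({ fixedAllowed, slidingAllowed }); // { fixedAllowed: 10, slidingAllowed: 5 }
```

With a limit of 5 per minute, the fixed window admits all ten requests within two seconds because the counter resets at the boundary, while the sliding log admits only five.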
Rate limiting in Rails with rack-attack
# config/initializers/rack_attack.rb
class Rack::Attack
  # Use Redis for distributed counting
  Rack::Attack.cache.store = ActiveSupport::Cache::RedisCacheStore.new(
    url: ENV.fetch('REDIS_URL')
  )

  # Throttle login attempts per IP: 5 per 20 seconds
  throttle('login/ip', limit: 5, period: 20.seconds) do |req|
    req.ip if req.path == '/api/v1/auth/login' && req.post?
  end

  # Throttle login attempts per email: 10 per hour (prevent credential stuffing)
  throttle('login/email', limit: 10, period: 1.hour) do |req|
    if req.path == '/api/v1/auth/login' && req.post?
      # Normalise so that case and whitespace variants share one counter
      req.params['email'].to_s.downcase.strip.presence
    end
  end

  # Allow health checks to bypass throttling
  safelist('health-check') do |req|
    req.path == '/health'
  end

  # Custom response for throttled requests (the responder receives the request object)
  self.throttled_responder = lambda do |_request|
    [429, { 'Content-Type' => 'application/json' }, ['{"error":"Rate limit exceeded"}']]
  end
end

rack-attack (v6.7+) operates at the Rack middleware layer, before Rails routing, so rate limits apply even to malformed requests. Per-email throttling prevents credential stuffing even when attackers distribute attempts across many IPs.
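To show why the per-IP and per-email rules complement each other, the same two rules can be sketched in TypeScript. This loosely mirrors the configuration above and is not how rack-attack is implemented: `LoginRequest`, `Rule`, `hit`, and `throttledBy` are hypothetical names, the counters are simple in-memory fixed windows, and this sketch increments every matching rule on each request as a design choice of its own.

```typescript
// Hypothetical request shape and in-memory counters; a real deployment would use shared storage.
type LoginRequest = { ip: string; path: string; method: string; email?: string };
type Rule = {
  name: string;
  limit: number;
  periodMs: number;
  keyOf: (req: LoginRequest) => string | null; // null means the rule does not apply
};

const counters = new Map<string, { windowId: number; count: number }>();

// Increment the fixed-window counter for a key and return the new count.
function hit(counterKey: string, periodMs: number, nowMs: number): number {
  const windowId = Math.floor(nowMs / periodMs);
  const entry = counters.get(counterKey);
  if (!entry || entry.windowId !== windowId) {
    counters.set(counterKey, { windowId, count: 1 });
    return 1;
  }
  entry.count += 1;
  return entry.count;
}

const isLogin = (req: LoginRequest): boolean =>
  req.method === 'POST' && req.path === '/api/v1/auth/login';

const rules: Rule[] = [
  {
    name: 'login/ip',
    limit: 5,
    periodMs: 20 * 1000,
    keyOf: (req) => (isLogin(req) ? req.ip : null),
  },
  {
    name: 'login/email',
    limit: 10,
    periodMs: 60 * 60 * 1000,
    keyOf: (req) => {
      if (!isLogin(req)) return null;
      const email = (req.email ?? '').trim().toLowerCase();
      return email === '' ? null : email;
    },
  },
];

// Returns the name of the first rule that throttles the request, or null if it may proceed.
function throttledBy(req: LoginRequest, nowMs: number): string | null {
  let blocked: string | null = null;
  for (const rule of rules) {
    const discriminator = rule.keyOf(req);
    if (discriminator === null) continue;
    const count = hit(`${rule.name}:${discriminator}`, rule.periodMs, nowMs);
    if (count > rule.limit && blocked === null) blocked = rule.name;
  }
  return blocked;
}
```

An attacker who rotates IP addresses never trips `login/ip`, but the eleventh attempt against the same normalised email within the hour trips `login/email`.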
Use Cases
- Authentication endpoints, where limiting attempts per IP and per email prevents brute-force and credential stuffing attacks
- Public API endpoints, where per-API-key limits enforce fair use and prevent a single consumer from monopolising resources
- Password reset and OTP endpoints, where unlimited attempts would allow enumeration or brute-forcing of codes
- File upload and expensive computation endpoints, where rate limits prevent resource exhaustion
When Not to Use
- Do not rate limit internal service-to-service calls at the application layer; use network policy and circuit breakers instead
- Do not use IP-based rate limiting as the sole defence for authenticated endpoints; users behind a corporate NAT share an IP and would be collectively blocked
- Do not return information about remaining attempts for security-sensitive endpoints (login, password reset); this leaks information useful to automated attacks
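Two of the cautions above, the shared-IP problem and the leaking of remaining attempts, can be handled when building the limiter key and the response headers. The following is a minimal sketch under stated assumptions: `ClientInfo`, `rateLimitKey`, `rateLimitHeaders`, and the password-reset path in `SENSITIVE_ROUTES` are hypothetical, and a production system would likely hash the API key rather than embed it in a storage key.

```typescript
// Hypothetical description of the caller, filled in by earlier authentication middleware.
type ClientInfo = { ip: string; apiKey?: string; userId?: string };

// Prefer the most specific stable identity; fall back to IP only for anonymous traffic,
// so that authenticated users behind one NAT address do not share a counter.
function rateLimitKey(client: ClientInfo, route: string): string {
  if (client.apiKey) return `rl:key:${client.apiKey}:${route}`;
  if (client.userId) return `rl:user:${client.userId}:${route}`;
  return `rl:ip:${client.ip}:${route}`;
}

// Assumed list of security-sensitive routes; the second path is illustrative only.
const SENSITIVE_ROUTES = new Set<string>([
  '/api/v1/auth/login',
  '/api/v1/auth/password-reset',
]);

// Omit quota details on sensitive routes so attackers cannot pace their attempts.
function rateLimitHeaders(route: string, limit: number, remaining: number): Record<string, string> {
  if (SENSITIVE_ROUTES.has(route)) return {};
  return {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(remaining),
  };
}
```

Two users behind the same NAT address then receive separate counters once they are authenticated, and login responses carry no hint of how many attempts remain.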
Technical Notes
- Token bucket allows bursting up to the bucket capacity, then sustains a steady rate. Leaky bucket (output queue) enforces a constant outflow rate with no burst. Sliding window log is accurate but memory-intensive; sliding window counter approximates it with less memory
- A Retry-After header (defined in RFC 7231, Section 7.1.3, since superseded by RFC 9110) should accompany 429 responses (a status code defined in RFC 6585), indicating how many seconds to wait. Some HTTP clients and retry libraries honour it automatically
- Cloudflare Rate Limiting (managed, layer 7) and AWS API Gateway throttling operate at the edge before traffic reaches the origin; application-level limiting is a backstop for traffic that bypasses the CDN
- Atomic evaluation avoids pipeline race conditions: a Redis Lua script, or a module such as redis-cell (which implements GCRA as a single command), evaluates the limit in one step. The pipeline approach above is not atomic, so commands from concurrent clients can interleave between the ZCARD count and the allow decision
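The token bucket behaviour and the Retry-After value described in the notes above can be sketched together. This is an illustrative single-process version, not a distributed implementation: `createTokenBucket` and `Decision` are hypothetical names, the clock is passed in explicitly, and a real deployment would keep one bucket per client key in shared storage.

```typescript
type Decision = { allowed: boolean; retryAfterSec: number };

// Hypothetical single-bucket limiter: bursts up to `capacity`, then sustains `refillPerSec`.
function createTokenBucket(capacity: number, refillPerSec: number, startMs: number) {
  let tokens = capacity;
  let lastMs = startMs;
  return (nowMs: number): Decision => {
    // Refill in proportion to elapsed time, capped at the bucket capacity.
    const elapsedSec = Math.max(0, nowMs - lastMs) / 1000;
    tokens = Math.min(capacity, tokens + elapsedSec * refillPerSec);
    lastMs = Math.max(lastMs, nowMs);
    if (tokens >= 1) {
      tokens -= 1; // spend one token for this request
      return { allowed: true, retryAfterSec: 0 };
    }
    // Whole seconds until one token is available, suitable for a Retry-After header.
    return { allowed: false, retryAfterSec: Math.ceil((1 - tokens) / refillPerSec) };
  };
}

const take = createTokenBucket(3, 1, 0); // burst of 3, then 1 request per second
const burstResults = [take(0), take(0), take(0), take(0)].map((d) => d.allowed);
console.log(burstResults); // [ true, true, true, false ]
```

When a request is rejected, `retryAfterSec` can be sent as the Retry-After value alongside the 429 status.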