Performance

N+1 Query Problem

Identifying and eliminating the most common source of accidental database load in ORM-heavy applications.

Overview

The N+1 query problem occurs when fetching a list of N records and then executing an additional query for each record to fetch associated data, resulting in N+1 total queries. For a list of 100 orders with their customer names, a naive implementation executes 1 query for orders and 100 queries for customers. The fix is eager loading: fetching all required associations in a fixed number of queries upfront.

Origin

The problem was identified in early ORM systems (Hibernate, ActiveRecord) where lazy association loading was the default. The Bullet gem (Rails, created by flyerhzm, 2010) popularised automatic N+1 detection. Shopify's Graphql-batch (2015) solved the equivalent problem in GraphQL with the dataloader pattern. Facebook's DataLoader (2015) brought the same pattern to Node.js.

Examples

N+1 detection and fix with eager loading in Rails

# N+1 PROBLEM: 1 query for orders, then 1 query per order for customer
# Total: 1 + N queries
def orders_with_customers_bad
  orders = Order.where(status: 'confirmed').limit(50)
  orders.map do |order|
    { id: order.id, customer_name: order.customer.name } # N queries here
  end
end

# FIXED: eager load associations upfront
# Total: 2 queries (1 for orders, 1 for all customers)
def orders_with_customers_good
  orders = Order.includes(:customer).where(status: 'confirmed').limit(50)
  orders.map do |order|
    { id: order.id, customer_name: order.customer.name } # No query; loaded in memory
  end
end

# DEEPER: eager load nested associations
def orders_with_full_details
  Order
    .includes(customer: :address, line_items: :product)
    .where(status: 'confirmed')
    .limit(50)
    .map do |order|
      {
        id: order.id,
        customer: order.customer.name,
        city: order.customer.address.city,
        items: order.line_items.count
      }
    end
end

includes in ActiveRecord uses either a separate IN query (for simple associations) or a LEFT OUTER JOIN (when includes is combined with references or joins). For large association sets, joins + select is more efficient than includes which loads full models into Ruby objects.

DataLoader for N+1 prevention in GraphQL with TypeScript

import DataLoader from 'dataloader';
import { db } from '../database';

// Batch function: receives array of keys, returns array of values in the same order
async function batchLoadUsers(userIds: readonly string[]) {
  const users = await db.users.findMany({
    where: { id: { in: [...userIds] } },
  });
  const userMap = new Map(users.map(u => [u.id, u]));
  return userIds.map(id => userMap.get(id) ?? new Error('User not found: ' + id));
}

const userLoader = new DataLoader(batchLoadUsers, {
  maxBatchSize: 100,
  cache: true, // Within a single request, same ID returns cached result
});

// GraphQL resolver: no N+1 - DataLoader batches all load() calls within one tick
const resolvers = {
  Order: {
    customer: async (order: { customerId: string }) => {
      return userLoader.load(order.customerId);
    },
  },
  Query: {
    orders: async () => db.orders.findMany({ where: { status: 'confirmed' } }),
  },
};

// Requesting 50 orders with customer names:
// Without DataLoader: 51 queries
// With DataLoader: 2 queries (1 for orders, 1 batched for all customers)

DataLoader collects all load() calls within a single event loop tick and fires the batch function once. The cache ensures the same userId in one request is only fetched once. Create a fresh DataLoader per request, not as a singleton, to prevent cross-request data leakage.

Use Cases

  • 01REST API list endpoints that render related resources (orders with customers, posts with authors) are the most common N+1 source
  • 02GraphQL resolvers without DataLoader produce N+1 per field on list types by default; DataLoader is required for any GraphQL API at scale
  • 03Background jobs that iterate over records and call methods on associations
  • 04Admin interfaces (ActiveAdmin, Administrate) where association data appears in index views; these frameworks require explicit includes configuration

When Not to Use

  • //Do not eager load associations that are not used in the current query path; unnecessary includes fetch data that occupies memory and wastes query time
  • //Do not use includes for filtering (WHERE clause) on the associated table; use joins for filtering and includes for loading
  • //Do not eager load associations in recursive tree structures; use a dedicated query (WITH RECURSIVE in PostgreSQL) or a tree library (ancestry gem) instead

Technical Notes

  • The Bullet gem (Rails) detects N+1 queries, unused eager loads, and missing counter caches at runtime during development. It can raise exceptions, log warnings, or send notifications. Configure in config/environments/development.rb
  • ActiveRecord preload forces separate IN queries regardless of other join conditions; eager_load forces a LEFT OUTER JOIN. includes chooses between them based on whether references or where clauses reference the association
  • DataLoader's batching works by collecting calls within a single Promise microtask queue drain. If load() calls are separated by await, they land in separate batches; restructure code to call load() before any await when possible
  • Counter caches (counter_cache: true in ActiveRecord belongs_to) store the count of associated records on the parent record, eliminating COUNT queries for association sizes in list views