Pagination & Cursor-Based Loading
Fetching data in bounded chunks rather than loading unbounded result sets.
Overview
Pagination limits the number of records returned per request and provides a mechanism to retrieve additional pages. It is essential for any endpoint that can return unbounded result sets. The two dominant approaches are offset pagination (LIMIT/OFFSET) and cursor-based pagination (WHERE id > last_seen_id). Cursor-based pagination is superior for large, live datasets but requires more careful API design.
Origin
LIMIT/OFFSET pagination was introduced in SQL in the early 1980s. The cursor-based approach was popularised by Twitter's streaming API (2006) and Facebook's Graph API Cursor (2012). GitHub's REST API and Stripe's API both use cursor-based pagination as the default. GraphQL connections (Relay Cursor Connections specification, Lee Byron, 2015) standardised cursor pagination for GraphQL.
Examples
Cursor-based pagination in TypeScript with Prisma
interface PageInput {
cursor?: string; // base64-encoded last item ID
limit?: number;
}
interface PageResult<T> {
items: T[];
nextCursor: string | null;
hasMore: boolean;
}
async function getOrders(
customerId: string,
{ cursor, limit = 20 }: PageInput
): Promise<PageResult<Order>> {
const take = Math.min(limit, 100); // Cap at 100 to prevent abuse
const cursorId = cursor ? Buffer.from(cursor, 'base64').toString('utf8') : undefined;
const orders = await prisma.order.findMany({
where: { customerId },
take: take + 1, // Fetch one extra to detect if there is a next page
cursor: cursorId ? { id: cursorId } : undefined,
skip: cursorId ? 1 : 0, // Skip the cursor item itself
orderBy: { createdAt: 'desc' },
});
const hasMore = orders.length > take;
const items = hasMore ? orders.slice(0, take) : orders;
const lastItem = items[items.length - 1];
const nextCursor = hasMore && lastItem
? Buffer.from(lastItem.id).toString('base64')
: null;
return { items, nextCursor, hasMore };
}Fetching take + 1 items is the canonical way to detect if there is a next page without a separate COUNT query. The cursor is an opaque base64 string; clients treat it as a bookmark, not a page number. This allows the underlying implementation to change.
Offset vs cursor pagination trade-offs in Ruby/ActiveRecord
# OFFSET pagination: simple but degrades at large offsets
class Order < ApplicationRecord
scope :paginated_offset, ->(page:, per_page: 25) {
limit(per_page).offset((page - 1) * per_page)
}
end
# Problem: OFFSET 10000 LIMIT 25 scans 10,025 rows to return 25
# Cost grows linearly with page number
# Also: items can shift between page loads (insertions/deletions cause duplicates/gaps)
orders = Order.where(customer_id: 1).order(created_at: :desc)
.paginated_offset(page: 400, per_page: 25)
# EXPLAIN: Seq Scan on orders, rows=10025 before LIMIT
# CURSOR pagination: consistent O(log n) regardless of page depth
class Order < ApplicationRecord
def self.paginated_cursor(after_id: nil, limit: 25)
scope = order(id: :desc).limit(limit + 1)
scope = scope.where('id < ?', after_id) if after_id
items = scope.to_a
{
items: items.first(limit),
next_cursor: items.size > limit ? items[limit - 1].id : nil
}
end
end
# Uses: WHERE id < 12345 LIMIT 26
# With an index on id (always present for primary key), cost is O(log n)The WHERE id < cursor + LIMIT approach uses the primary key index for O(log n) access at any depth. OFFSET 10000 requires PostgreSQL to scan and discard 10,000 rows even with an index; cursor avoids this entirely.
Use Cases
- 01Any API endpoint returning lists of records to prevent unbounded response sizes and timeouts
- 02Infinite scroll UIs where cursor pagination loads the next batch seamlessly without page numbers
- 03API rate-limited consumers that process records in batches; cursors allow resuming from a checkpoint after a pause
- 04Data export jobs that paginate through millions of records without loading all into memory
When Not to Use
- //Do not use cursor pagination when users need to jump to an arbitrary page number (search results page 47); cursors are only traversable sequentially
- //Do not use OFFSET pagination for deep pages (beyond page 100) in high-volume tables; the performance degrades badly and results are inconsistent with live data
- //Do not add pagination to endpoints that return small bounded result sets (a user's 3 active sessions, settings options); it adds API complexity without benefit
Technical Notes
- Keyset pagination (WHERE (created_at, id) < (last_created_at, last_id)) handles ties in the sort column by including a secondary unique column (id) in the cursor; this is necessary when the primary sort column is not unique
- Relay Cursor Connections specification defines edges/node/cursor structure, pageInfo with hasNextPage/hasPreviousPage, and startCursor/endCursor. GraphQL clients (Apollo, Relay) have built-in support for this format
- PostgreSQL's LIMIT/OFFSET uses the same index as the WHERE clause but must skip over OFFSET rows. The Seek Method (use WHERE instead of OFFSET) consistently outperforms OFFSET at depth; this is the same principle as cursor pagination
- COUNT(*) for total pages is expensive on large tables; avoid it unless required. Facebook Graph API omits total counts entirely; GitHub REST API includes a Link header with rel="last" for the last page URL based on a heuristic
More in Performance