Test-Driven Development
Writing tests before code to drive design and build confidence in correctness incrementally.
Overview
Test-Driven Development is a development process where tests are written before the implementation code. The cycle is Red (write a failing test), Green (write the minimum code to pass), Refactor (clean up while keeping tests green). Kent Beck popularised TDD in "Test-Driven Development: By Example" (2002) and it became central to Extreme Programming (XP).
Origin
Kent Beck rediscovered TDD while programming in Smalltalk in the 1990s, inspired by an old programming guide suggesting writing test cases before code. He codified it in "Extreme Programming Explained" (1999) and "Test-Driven Development: By Example" (2002). JUnit (Beck and Erich Gamma, 1997) provided the tooling that made Java TDD practical.
Examples
Red-Green-Refactor cycle in TypeScript with Jest
// Step 1: RED - write the failing test first
describe('FizzBuzz', () => {
  it('returns "Fizz" for multiples of 3', () => {
    expect(fizzBuzz(3)).toBe('Fizz');
    expect(fizzBuzz(9)).toBe('Fizz');
  });
  it('returns "Buzz" for multiples of 5', () => {
    expect(fizzBuzz(5)).toBe('Buzz');
    expect(fizzBuzz(10)).toBe('Buzz');
  });
  it('returns "FizzBuzz" for multiples of 15', () => {
    expect(fizzBuzz(15)).toBe('FizzBuzz');
  });
  it('returns the number as a string otherwise', () => {
    expect(fizzBuzz(1)).toBe('1');
    expect(fizzBuzz(7)).toBe('7');
  });
});
// Step 2: GREEN - minimal implementation
function fizzBuzz(n: number): string {
  if (n % 15 === 0) return 'FizzBuzz';
  if (n % 3 === 0) return 'Fizz';
  if (n % 5 === 0) return 'Buzz';
  return String(n);
}
The tests define the contract before the function exists. This forces thinking about edge cases (multiples of both 3 and 5) upfront. The % 15 check must precede % 3 and % 5; TDD's tests catch a wrong ordering immediately.
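To see why the ordering matters, here is a minimal sketch of a deliberately misordered variant. The name `fizzBuzzMisordered` and the small `check` helper are hypothetical; the helper stands in for a Jest expectation so the snippet runs without a test runner.

```typescript
// Hypothetical misordered variant: the % 3 check runs before the % 15 check.
function fizzBuzzMisordered(n: number): string {
  if (n % 3 === 0) return 'Fizz';
  if (n % 5 === 0) return 'Buzz';
  // Never reached for a multiple of 15: the % 3 branch has already returned.
  if (n % 15 === 0) return 'FizzBuzz';
  return String(n);
}

// Minimal stand-in for a Jest expectation: reports pass/fail instead of throwing.
function check(actual: string, expected: string): boolean {
  return actual === expected;
}

const results = {
  multiplesOf3: check(fizzBuzzMisordered(9), 'Fizz'),
  multiplesOf5: check(fizzBuzzMisordered(10), 'Buzz'),
  multiplesOf15: check(fizzBuzzMisordered(15), 'FizzBuzz'),
};

// Only the multiples-of-15 check fails, pointing straight at the ordering bug.
console.log(results);
```

Because the multiples-of-15 test already exists when the implementation is written, the mistake shows up on the first test run rather than later in production.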
Testing behaviour not implementation in RSpec
# spec/services/password_reset_service_spec.rb
RSpec.describe PasswordResetService do
  let(:user) { create(:user, email: 'alice@example.com') }
  let(:mailer) { instance_double(ResetMailer) }
  let(:service) { described_class.new(mailer: mailer) }
  describe '#request_reset' do
    it 'generates a token and sends the email' do
      allow(mailer).to receive(:send_reset).and_return(true)
      token = service.request_reset(user.email)
      # \A and \z anchor the match to the whole string: exactly 64 hex characters
      expect(token).to match(/\A[a-f0-9]{64}\z/)
      expect(mailer).to have_received(:send_reset).with(
        email: user.email, token: token
      )
    end
    it 'returns nil for unknown emails without raising' do
      expect(service.request_reset('nobody@example.com')).to be_nil
    end
  end
end
instance_double verifies that ResetMailer actually responds to send_reset; a plain double would not catch interface drift. Testing behaviour (a token is generated and an email is sent) rather than implementation (a specific method is called internally) keeps tests resilient to refactoring.
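The same idea can be sketched in TypeScript, the language of the first example. This is an illustrative analogue, not a translation of the RSpec code: the `ResetMailer` interface, the `PasswordResetService` class, the `RecordingMailer` double and the injected `generateToken` function are all hypothetical. Typing the double against the interface plays a role loosely similar to instance_double, with the compiler rather than the test framework flagging drift.

```typescript
// Hypothetical collaborator interface. The double below must satisfy it,
// so renaming sendReset here makes the double fail to compile.
interface ResetMailer {
  sendReset(args: { email: string; token: string }): boolean;
}

// Hypothetical service under test. Known emails and the token generator
// are injected so the test stays deterministic.
class PasswordResetService {
  private readonly mailer: ResetMailer;
  private readonly knownEmails: Set<string>;
  private readonly generateToken: () => string;

  constructor(mailer: ResetMailer, knownEmails: Set<string>, generateToken: () => string) {
    this.mailer = mailer;
    this.knownEmails = knownEmails;
    this.generateToken = generateToken;
  }

  requestReset(email: string): string | null {
    if (!this.knownEmails.has(email)) return null; // unknown email: no token, no mail
    const token = this.generateToken();
    this.mailer.sendReset({ email, token });
    return token;
  }
}

// Hand-rolled test double that records calls for later assertions.
class RecordingMailer implements ResetMailer {
  calls: Array<{ email: string; token: string }> = [];

  sendReset(args: { email: string; token: string }): boolean {
    this.calls.push(args);
    return true;
  }
}

const mailer = new RecordingMailer();
const service = new PasswordResetService(
  mailer,
  new Set(['alice@example.com']),
  () => 'ab'.repeat(32), // fixed 64-character hex token, for the test only
);

// Observable behaviour: a token comes back and one email is sent with it.
const token = service.requestReset('alice@example.com');
// Unknown addresses yield null and send nothing.
const unknown = service.requestReset('nobody@example.com');
```

The assertions would then look only at `token`, `unknown` and `mailer.calls`, none of which depend on how `requestReset` is structured internally.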
Use Cases
- Library and SDK development where the public API must be nailed down before internal implementation
- Bug fixing: writing a failing test that reproduces the bug before fixing it proves the fix works and prevents regression
- Refactoring legacy code: a test suite written first provides a safety net for structural changes
- Complex business logic (pricing rules, discount calculations) where exhaustive case coverage is important and tests serve as living documentation
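The bug-fixing use case can be sketched as follows. The function names, the bug itself (a discount above 100% producing a negative price) and the clamping rule are hypothetical, chosen only to illustrate the reproduce-then-fix order.

```typescript
type DiscountFn = (price: number, percent: number) => number;

// Hypothetical buggy implementation: a discount above 100% yields a negative price.
const applyDiscountBuggy: DiscountFn = (price, percent) =>
  price - (price * percent) / 100;

// Step 1: a test that reproduces the reported bug. It is written first and
// must fail against the buggy code, which shows it really captures the defect.
function neverGoesNegative(applyDiscount: DiscountFn): boolean {
  return applyDiscount(50, 120) === 0;
}

// Step 2: the fix, clamping the discount to the 0-100 range (an assumed business rule).
const applyDiscountFixed: DiscountFn = (price, percent) => {
  const clamped = Math.min(100, Math.max(0, percent));
  return price - (price * clamped) / 100;
};

const failsBeforeFix = !neverGoesNegative(applyDiscountBuggy); // true: bug reproduced
const passesAfterFix = neverGoesNegative(applyDiscountFixed); // true: fix verified
```

Keeping the reproducing test in the suite afterwards is what guards against the same bug returning.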
When Not to Use
- Exploratory prototyping where the interface is not yet known; writing tests for a moving target wastes time and produces brittle tests
- UI/UX work where the visual design drives the implementation and pixel-level assertions are fragile
- Integrations with external APIs during initial discovery where the schema is uncertain and tests require live services
Technical Notes
- The canonical TDD rhythm is: write one failing test, implement just enough to pass, refactor. Writing multiple tests before implementing is "test-first" but not strictly TDD; the cycle granularity matters
- Test coverage (Istanbul/nyc for JavaScript, SimpleCov for Ruby) measures lines/branches executed. 100% coverage does not imply correctness; coverage is a floor, not a ceiling
- Sociable tests (testing a real object graph) vs solitary tests (isolating with test doubles) is a spectrum; Martin Fowler's "unit test" definition is intentionally broad. Over-mocking produces tests that pass despite broken integrations
- Mutation testing (Stryker for JS/TS, mutant for Ruby) alters code and checks whether tests catch the change. It measures test suite quality, not just coverage, by revealing mutants that survive, meaning code changes that no test detects
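The coverage and mutation-testing notes above can be illustrated with a hand-made mutant. This sketch does not use Stryker or any coverage tool; `isAdult`, its mutant and both suite functions are hypothetical, and the mutation is applied manually for illustration.

```typescript
type AgePredicate = (age: number) => boolean;

// Original code under test.
const isAdult: AgePredicate = (age) => age >= 18;

// Hand-written mutant: >= changed to >, the kind of edit a mutation tool makes automatically.
const isAdultMutant: AgePredicate = (age) => age > 18;

// Weak suite: executes every line of isAdult (full line coverage)
// but never probes the boundary value.
function weakSuite(fn: AgePredicate): boolean {
  return fn(30) === true && fn(5) === false;
}

// Stronger suite: adds the boundary case at exactly 18.
function strongSuite(fn: AgePredicate): boolean {
  return weakSuite(fn) && fn(18) === true;
}

const mutantSurvivesWeakSuite = weakSuite(isAdultMutant); // true: the mutant goes undetected
const mutantKilledByStrongSuite = !strongSuite(isAdultMutant); // true: the boundary test catches it
```

Both suites pass against the original `isAdult` and both reach full line coverage, yet only the stronger one notices the changed operator.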