Mutation Testing: Testing the Quality of Your Tests
Have you ever shipped a bug from a project with 100% code coverage? I have. At first I thought, "How did this pass all the tests?" Turned out the tests existed but weren't actually asserting anything meaningful.
// 100% coverage, but testing nothing
it('discount function runs', () => {
  const result = calculateDiscount(100000);
  expect(result).toBeDefined(); // just checks it's not undefined
});
This doesn't verify whether result is 90000, 50000, or 0. The function can return anything and the test passes. Coverage tools only check "was this line executed?" — so this hits 100%.
That's the fundamental limitation of code coverage. And that's exactly what mutation testing solves.
What Is Mutation Testing?
Mutation testing intentionally introduces small defects (mutants) into your source code, then checks whether your tests detect them.
The name comes from genetics, but the closest analogy is vaccination: inject a weakened virus (a mutant) and watch whether the immune system (your tests) reacts. If it reacts, you have immunity (the tests verify that behavior). If it doesn't, you're vulnerable (the tests miss this case).
Types of Mutations
Mutation tools generate various kinds of changes:
// Original code
function calculateDiscount(amount: number): number {
  if (amount >= 100000) { // (1) candidate for conditional mutation
    return amount * 0.9; // (2) candidate for arithmetic mutation
  }
  return amount;
}
Conditional Mutations
// Original: amount >= 100000
// Mutant 1: amount > 100000 (boundary shifted: >= becomes >)
// Mutant 2: amount <= 100000 (direction flipped)
// Mutant 3: amount < 100000 (negated)
// Mutant 4: true (always true)
// Mutant 5: false (always false)
Arithmetic Mutations
// Original: amount * 0.9
// Mutant 1: amount + 0.9
// Mutant 2: amount - 0.9
// Mutant 3: amount / 0.9
// Mutant 4: amount * 0.1 (constant mutation)
Return Value Mutations
// Original: return amount;
// Mutant 1: return 0;
// Mutant 2: return -amount;
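To make the idea of "generating mutants" concrete, here is a deliberately naive sketch that swaps operators in a source string. Everything in it (`OPERATOR_REPLACEMENTS`, `generateMutants`, the `Mutant` shape) is hypothetical and only for illustration. Real tools work on the parsed syntax tree rather than raw text, so treat this as a toy model, not as how any particular tool is implemented.

```typescript
// Hypothetical operator table: each operator maps to the replacements we try.
const OPERATOR_REPLACEMENTS: Array<[string, string[]]> = [
  ['>=', ['>', '<']],
  ['*', ['/']],
];

interface Mutant {
  operator: string;
  replacement: string;
  source: string; // full source text with exactly one operator swapped
}

// Produce one mutant per (operator occurrence, replacement) pair.
function generateMutants(source: string): Mutant[] {
  const mutants: Mutant[] = [];
  for (const [operator, replacements] of OPERATOR_REPLACEMENTS) {
    let index = source.indexOf(operator);
    while (index !== -1) {
      for (const replacement of replacements) {
        mutants.push({
          operator,
          replacement,
          source:
            source.slice(0, index) +
            replacement +
            source.slice(index + operator.length),
        });
      }
      index = source.indexOf(operator, index + operator.length);
    }
  }
  return mutants;
}

const originalSource =
  'if (amount >= 100000) { return amount * 0.9; } return amount;';
const generated = generateMutants(originalSource);
// generated holds 3 mutants: ">=" to ">", ">=" to "<", and "*" to "/"
```

Each mutant differs from the original in exactly one place, which is what lets a surviving mutant point at one specific untested behavior.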
The tool runs your test suite against each mutant. If at least one test fails, the mutant is killed. If every test still passes, the mutant survived.
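The killed/survived distinction can be sketched by hand without any tooling. In the snippet below, a "test suite" is modeled as a plain function that returns true when its checks pass. The names `DiscountFn`, `Suite`, `handWrittenMutants`, and `runMutants` are assumptions made up for this sketch, and the mutants are written manually rather than generated.

```typescript
type DiscountFn = (amount: number) => number;

// Original implementation from the example above.
const originalDiscount: DiscountFn = (amount) =>
  amount >= 100000 ? amount * 0.9 : amount;

// Hand-written mutants (a real tool would generate these automatically).
const handWrittenMutants: Array<{ label: string; fn: DiscountFn }> = [
  { label: '* becomes /', fn: (a) => (a >= 100000 ? a / 0.9 : a) },
  { label: '* becomes +', fn: (a) => (a >= 100000 ? a + 0.9 : a) },
  { label: 'condition always false', fn: (a) => a },
];

// A suite returns true when all of its checks pass for the given function.
type Suite = (discount: DiscountFn) => boolean;

// Mirrors expect(result).toBeDefined(): passes for any returned number.
const weakSuite: Suite = (discount) => discount(100000) !== undefined;
// Mirrors expect(result).toBe(90000): pins the exact value.
const strictSuite: Suite = (discount) => discount(100000) === 90000;

function runMutants(suite: Suite): { killed: string[]; survived: string[] } {
  const killed: string[] = [];
  const survived: string[] = [];
  for (const mutant of handWrittenMutants) {
    // A mutant is killed when the suite fails against it.
    if (suite(mutant.fn)) {
      survived.push(mutant.label);
    } else {
      killed.push(mutant.label);
    }
  }
  return { killed, survived };
}

const weakResult = runMutants(weakSuite); // all three mutants survive
const strictResult = runMutants(strictSuite); // all three mutants are killed
```

Both suites execute the same lines, so coverage is identical. Only the strict one notices when the behavior changes.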
Mutation Score
Mutation Score = (Killed Mutants / Total Mutants) × 100
A higher mutation score means your tests reliably detect changes in behavior.
| Score | Interpretation |
|---|---|
| 90%+ | Excellent test suite |
| 70–90% | Good, room for improvement |
| 50–70% | Weak, likely missing important cases |
| Below 50% | Dangerous, major gaps in test coverage |
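As a small worked version of the formula, here is a sketch of a score calculation. It assumes the convention, which as far as I know Stryker follows, that mutants ending in a timeout count as detected and mutants with no covering test count as undetected. The `MutantCounts` shape and the `mutationScore` name are made up for this example.

```typescript
interface MutantCounts {
  killed: number;
  survived: number;
  timeout?: number; // assumption: counted as detected
  noCoverage?: number; // assumption: counted as undetected
}

// Returns the mutation score as a percentage (0 to 100).
function mutationScore(counts: MutantCounts): number {
  const detected = counts.killed + (counts.timeout ?? 0);
  const undetected = counts.survived + (counts.noCoverage ?? 0);
  const total = detected + undetected;
  if (total === 0) return NaN; // no mutants were generated, so no score
  return (detected / total) * 100;
}

const simpleScore = mutationScore({ killed: 8, survived: 2 }); // 80
const withTimeout = mutationScore({ killed: 7, survived: 2, timeout: 1 }); // 80
```

If your tool reports timeouts separately, check how it folds them into the score before comparing numbers across tools.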
Setting Up Stryker.js
Stryker.js (officially StrykerJS) is a widely used mutation testing tool for JavaScript/TypeScript.
Installation
npm install --save-dev @stryker-mutator/core @stryker-mutator/vitest-runner
# For Jest:
# npm install --save-dev @stryker-mutator/core @stryker-mutator/jest-runner
Configuration
npx stryker init
Or write stryker.config.mjs manually:
// stryker.config.mjs
/** @type {import('@stryker-mutator/api/core').PartialStrykerOptions} */
const config = {
  packageManager: 'npm',
  reporters: ['html', 'clear-text', 'progress'],
  testRunner: 'vitest',
  coverageAnalysis: 'perTest',
  // Files to mutate
  mutate: [
    'src/**/*.ts',
    '!src/**/*.test.ts',
    '!src/**/*.spec.ts',
    '!src/index.ts', // exclude entry points
  ],
  // Parallel execution (tune to your CPU count)
  concurrency: 4,
  timeoutMS: 60000,
};
export default config;
Running It
npx stryker run
Reading Results
Stryker generates an HTML report in reports/mutation/. Terminal output example:
Ran 312 tests in 12 seconds (40 survived out of 179 mutants)
------------|---------|----------|------------|-----------|
File        | % Score | # Killed | # Survived | # Timeout |
------------|---------|----------|------------|-----------|
discount.ts |   66.3  |    59    |     30     |     0     |
cart.ts     |   88.9  |    79    |     10     |     1     |
------------|---------|----------|------------|-----------|
All files   |   77.7  |   138    |     40     |     1     |
------------|---------|----------|------------|-----------|
discount.ts has a mutation score of 66.3%. Thirty mutants survived — meaning 30 variations of the code go undetected by your tests.
Hands-On: Analyzing Surviving Mutants
The HTML report shows exactly which mutants survived. Example:
// Original code (discount.ts)
export function calculateDiscount(amount: number): number {
  if (amount < 0) {
    throw new Error('Amount must be 0 or greater');
  }
  if (amount >= 100000) { // Mutant: >= changed to >, and the tests still pass!
    return amount * 0.9;
  }
  return amount;
}
Surviving mutant:
// Mutant: amount >= 100000 → amount > 100000
// Why it survived: no test covers amount === 100000
This reveals a missing boundary test:
// Before (missing boundary test)
describe('calculateDiscount', () => {
  it('applies 10% discount for 150,000', () => {
    expect(calculateDiscount(150000)).toBe(135000);
  });
  it('no discount for 50,000', () => {
    expect(calculateDiscount(50000)).toBe(50000);
  });
});
// Tests the mutant revealed were missing
it('applies 10% discount at exactly 100,000 (boundary)', () => {
  expect(calculateDiscount(100000)).toBe(90000);
});
it('no discount for 99,999 (just below threshold)', () => {
  expect(calculateDiscount(99999)).toBe(99999);
});
Surviving mutants point precisely at untested cases.
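The same boundary gap can be reproduced without a test runner. This sketch models each test as an `[input, expected]` pair and checks the `>` mutant against the suite before and after the boundary cases are added. `Case`, `passesAll`, and the variable names are hypothetical helpers for this illustration only.

```typescript
type DiscountFn = (amount: number) => number;

const baselineDiscount: DiscountFn = (amount) => {
  if (amount < 0) throw new Error('Amount must be 0 or greater');
  return amount >= 100000 ? amount * 0.9 : amount;
};

// The surviving mutant: >= replaced by >.
const boundaryMutant: DiscountFn = (amount) => {
  if (amount < 0) throw new Error('Amount must be 0 or greater');
  return amount > 100000 ? amount * 0.9 : amount;
};

// Each case is an [input, expected] pair.
type Case = [number, number];
const casesBefore: Case[] = [
  [150000, 135000],
  [50000, 50000],
];
const boundaryCases: Case[] = [
  [100000, 90000], // exactly at the threshold
  [99999, 99999], // just below the threshold
];

function passesAll(fn: DiscountFn, cases: Case[]): boolean {
  return cases.every(([input, expected]) => fn(input) === expected);
}

// The mutant passes the old suite, so it survives.
const survivesOldSuite = passesAll(boundaryMutant, casesBefore);
// With the boundary cases added, the suite fails on the mutant, so it is killed.
const survivesNewSuite = passesAll(boundaryMutant, casesBefore.concat(boundaryCases));
```

Only the input 100,000 distinguishes `>=` from `>`, which is why no amount of testing at 150,000 or 50,000 can kill this mutant.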
A More Complex Example: E-Commerce Pricing
// pricing.ts
interface DiscountRule {
  minimumAmount: number;
  discountRate: number;
  maximumDiscount?: number;
}
// Ordered from highest to lowest threshold, because find() returns the first match
const DISCOUNT_RULES: DiscountRule[] = [
  { minimumAmount: 300000, discountRate: 0.15, maximumDiscount: 50000 },
  { minimumAmount: 100000, discountRate: 0.10 },
  { minimumAmount: 50000, discountRate: 0.05 },
];
export function calculateFinalPrice(
  amount: number,
  couponCode?: string
): number {
  if (amount <= 0) throw new Error('Invalid amount');
  const applicableRule = DISCOUNT_RULES.find(
    rule => amount >= rule.minimumAmount
  );
  let discountAmount = 0;
  if (applicableRule) {
    discountAmount = amount * applicableRule.discountRate;
    // Compare against undefined so that a cap of 0 is not silently ignored
    if (applicableRule.maximumDiscount !== undefined) {
      discountAmount = Math.min(discountAmount, applicableRule.maximumDiscount);
    }
  }
  if (couponCode === 'EXTRA10') {
    discountAmount += amount * 0.1;
  }
  return Math.max(0, amount - discountAmount);
}
Tests that fail mutation testing:
// These tests let many mutants survive
describe('calculateFinalPrice', () => {
  it('works normally', () => {
    expect(calculateFinalPrice(200000)).toBeDefined();
  });
  it('coupon reduces price', () => {
    const withCoupon = calculateFinalPrice(100000, 'EXTRA10');
    const withoutCoupon = calculateFinalPrice(100000);
    expect(withCoupon).toBeLessThan(withoutCoupon);
  });
});
Tests that kill the mutants:
describe('calculateFinalPrice', () => {
  describe('base discount rules', () => {
    it('320,000: 15% discount, below the 50,000 cap', () => {
      // 320,000 × 15% = 48,000 discount
      expect(calculateFinalPrice(320000)).toBe(272000);
    });
    it('400,000: discount capped at 50,000 max', () => {
      // 400,000 × 15% = 60,000 → capped at 50,000
      expect(calculateFinalPrice(400000)).toBe(350000);
    });
    it('100,000 to 299,999: 10% discount', () => {
      expect(calculateFinalPrice(200000)).toBe(180000);
    });
    it('50,000 to 99,999: 5% discount', () => {
      expect(calculateFinalPrice(80000)).toBe(76000);
    });
    it('under 50,000: no discount', () => {
      expect(calculateFinalPrice(30000)).toBe(30000);
    });
    // Boundary tests (from surviving mutants)
    it('exactly 300,000: 15% discount applies', () => {
      // 300,000 × 15% = 45,000 discount
      expect(calculateFinalPrice(300000)).toBe(255000);
    });
    it('299,999: only 10% discount', () => {
      // Fractional result: compare with a tolerance instead of exact equality
      expect(calculateFinalPrice(299999)).toBeCloseTo(269999.1, 2);
    });
  });
  describe('coupon application', () => {
    it('EXTRA10 coupon: additional 10% on top of base discount', () => {
      // 100,000: base 10% = 10,000, coupon 10% = 10,000, total = 80,000
      expect(calculateFinalPrice(100000, 'EXTRA10')).toBe(80000);
    });
    it('invalid coupon: only base discount', () => {
      expect(calculateFinalPrice(100000, 'INVALID')).toBe(90000);
    });
  });
  describe('error cases', () => {
    it('throws for amount of 0 or less', () => {
      expect(() => calculateFinalPrice(0)).toThrow('Invalid amount');
      expect(() => calculateFinalPrice(-1000)).toThrow('Invalid amount');
    });
  });
});
Performance Considerations
Mutation testing runs your test suite hundreds or thousands of times. It can be slow.
100 mutants × 30 seconds per test run = 3,000 seconds (50 minutes!)
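The back-of-the-envelope arithmetic above extends naturally to the two levers discussed next: running mutants in parallel and running only a fraction of the tests per mutant. The sketch below is a rough model under idealized assumptions (perfect parallel scaling, no startup overhead), and `estimateMinutes` plus its parameter names are invented for this example. The 0.2 fraction is an arbitrary illustrative value, not a measured one.

```typescript
interface RunEstimateInput {
  mutantCount: number;
  secondsPerTestRun: number;
  concurrency?: number; // parallel workers, defaults to 1
  fractionOfTestsPerMutant?: number; // 1 means the full suite runs for every mutant
}

// Idealized estimate: ignores startup cost and assumes perfect parallel scaling.
function estimateMinutes(input: RunEstimateInput): number {
  const concurrency = input.concurrency ?? 1;
  const fraction = input.fractionOfTestsPerMutant ?? 1;
  const totalSeconds =
    (input.mutantCount * input.secondsPerTestRun * fraction) / concurrency;
  return totalSeconds / 60;
}

// The worst case from the text: 100 mutants, full 30-second suite each, one worker.
const worstCaseMinutes = estimateMinutes({ mutantCount: 100, secondsPerTestRun: 30 });

// Hypothetical tuned run: 4 workers, each mutant only triggers 20% of the tests.
const tunedMinutes = estimateMinutes({
  mutantCount: 100,
  secondsPerTestRun: 30,
  concurrency: 4,
  fractionOfTestsPerMutant: 0.2,
});
```

The first call reproduces the 50 minutes above. The second lands at roughly two and a half minutes, which is why the strategies below matter so much in practice.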
Practical optimization strategies:
1. Use coverageAnalysis
const config = {
  coverageAnalysis: 'perTest', // tracks which tests cover which code
  // → skips tests unrelated to a mutant (major speedup)
};
2. Narrow the mutation target
const config = {
  mutate: [
    'src/lib/pricing.ts', // core business logic only
    'src/lib/validation.ts',
    // exclude UI components, config files, etc.
  ],
};
3. Incremental mode in CI
const config = {
  incremental: true,
  // Keep this file outside .stryker-tmp, which Stryker cleans up after a run
  incrementalFile: 'reports/stryker-incremental.json',
};
4. Tune concurrency
import os from 'node:os';

const config = {
  // Leave one core free for the main process
  concurrency: Math.max(1, os.cpus().length - 1),
};
Realistically, run full mutation tests before major releases or when changing core modules — not on every commit.
Rollout Strategy
Step 1: Pilot on One File
npx stryker run --mutate src/lib/pricing.ts
Share results with the team. "These cases weren't being tested" is immediately persuasive.
Step 2: Set Thresholds
const config = {
  thresholds: {
    high: 80,
    low: 60,
    break: 50, // CI fails below this
  },
};
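To show how these three numbers interact, here is a sketch of the decision they encode, as I understand it: `high` and `low` only affect how the score is colored in reports, while `break` decides whether the run fails. The `classifyScore` function, the `ScoreBand` names, and the `Thresholds` interface are assumptions for illustration, not Stryker's internal API.

```typescript
interface Thresholds {
  high: number;
  low: number;
  break: number | null; // null means the build never fails on score
}

type ScoreBand = 'high' | 'warning' | 'danger';

function classifyScore(
  score: number,
  thresholds: Thresholds
): { band: ScoreBand; failsBuild: boolean } {
  // Reporting band: at or above high is good, between low and high is a warning.
  const band: ScoreBand =
    score >= thresholds.high
      ? 'high'
      : score >= thresholds.low
        ? 'warning'
        : 'danger';
  // The build only fails when the score drops below the break threshold.
  const failsBuild = thresholds.break !== null && score < thresholds.break;
  return { band, failsBuild };
}

const configured: Thresholds = { high: 80, low: 60, break: 50 };
const overall = classifyScore(77.5, configured); // warning band, build passes
const regressed = classifyScore(45, configured); // danger band, build fails
```

Starting with a `break` value just below your current score is a gentle way to prevent regressions without blocking the team on day one.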
Step 3: CI Integration
# .github/workflows/mutation.yml
name: Mutation Tests
on:
  push:
    branches: [main]
    paths:
      - 'src/lib/**' # only on core module changes
jobs:
  mutation-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npx stryker run
      - uses: actions/upload-artifact@v4
        if: always() # upload the report even when the break threshold fails the run
        with:
          name: stryker-report
          path: reports/mutation/
Coverage vs Mutation Score
| Metric | Measures | Limitation |
|---|---|---|
| Code Coverage | % of code executed during tests | Only checks execution, not assertions |
| Mutation Score | % of mutations detected by tests | Slow and computationally expensive |
You need both. Coverage quickly surfaces untested areas. Mutation testing deeply validates whether tests are actually meaningful.
Low coverage + low mutation score → no real tests
High coverage + low mutation score → tests exist but verify nothing (most dangerous)
High coverage + high mutation score → genuinely good tests
Closing: Strengthen Your Immune System
Mutation testing is slow and expensive to compute. You don't need it everywhere. But core business logic — billing, discounts, authorization — deserves this level of scrutiny.
Your first run will probably be humbling. 100% coverage with a 40% mutation score is a gut punch. But knowing is strictly better than not knowing.
This is immune system work. Making sure mutants don't survive into production.