Mutation Testing: Finding Weak Tests
A practical guide to mutation testing in JavaScript using Stryker Mutator to find tests that pass when they should fail, improving test suite effectiveness.
Code coverage tells you what your tests execute. Mutation testing tells you what your tests actually verify. There is a significant difference.
A test suite with 100% coverage can pass while the code is fundamentally broken, if the tests execute every line but never check the results. Mutation testing measures how effective your tests are by introducing small bugs (mutants) and checking whether the tests catch them. If you change > to >= and every test still passes, those tests are not testing what you think they are testing.
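To make the > versus >= case concrete, here is a minimal hand-rolled sketch. The names `isPositive`, `isPositiveMutant`, `weakTest`, and `boundaryTest` are invented for illustration and are not part of Stryker; the mutant is written by hand rather than generated.

```javascript
// Hypothetical function under test.
function isPositive(n) {
  return n > 0;
}

// The same function with one mutation applied: > became >=.
function isPositiveMutant(n) {
  return n >= 0;
}

// Small assertion helper so the sketch needs no test framework.
function check(condition, message) {
  if (!condition) {
    throw new Error(message);
  }
}

// Runs a test function against an implementation; true means it passed.
function passes(testFn, impl) {
  try {
    testFn(impl);
    return true;
  } catch (err) {
    return false;
  }
}

// Executes every line of isPositive but never probes the boundary.
function weakTest(impl) {
  check(impl(5) === true, "5 should be positive");
  check(impl(-3) === false, "-3 should not be positive");
}

// Adds the boundary value 0, the only input where > and >= disagree.
function boundaryTest(impl) {
  weakTest(impl);
  check(impl(0) === false, "0 should not be positive");
}

console.log(passes(weakTest, isPositiveMutant));     // true: the mutant survives
console.log(passes(boundaryTest, isPositiveMutant)); // false: the mutant is killed
```

Both tests give the original function full line coverage, yet only the one with the boundary value notices the mutation.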
Prerequisites
- Node.js installed (v16+)
- A project with existing tests (Jest or Mocha)
- Understanding of code coverage concepts
How Mutation Testing Works
- Parse — Stryker reads your source code and identifies locations where mutations can be applied
- Mutate — For each location, Stryker creates a modified version (a "mutant") by changing one thing: replacing + with -, > with >=, true with false, etc.
- Test — Stryker runs your test suite against each mutant
- Score — If a test fails, the mutant is "killed" (good). If all tests pass, the mutant "survived" (bad — your tests missed a bug)
Original: if (age >= 18) { return "adult"; }
Mutant 1: if (age > 18) { return "adult"; } ← Tests catch this? Killed ✓
Mutant 2: if (age <= 18) { return "adult"; } ← Tests catch this? Killed ✓
Mutant 3: if (age >= 18) { return ""; } ← Tests catch this? Survived ✗
If Mutant 3 survives, your tests never check the return value for adults. They might check that the function runs without errors, but they do not verify the actual output.
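The three mutants above can be reproduced by hand to see why one of them slips through. This is a sketch only: the `classify` wrapper, its "minor" fallback, and both test functions are assumptions added for illustration, and a real tool generates and runs the mutants for you.

```javascript
// Original condition from the example above. The "minor" fallback is an
// assumption so that the function always returns something.
function classify(age) {
  if (age >= 18) { return "adult"; }
  return "minor";
}

// Hand-written stand-ins for the three mutants.
var mutants = {
  mutant1: function (age) { if (age > 18) { return "adult"; } return "minor"; },
  mutant2: function (age) { if (age <= 18) { return "adult"; } return "minor"; },
  mutant3: function (age) { if (age >= 18) { return ""; } return "minor"; }
};

// Checks who is NOT a minor, but never asserts the value "adult".
function looseTest(impl) {
  if (impl(17) !== "minor") { throw new Error("17 should be a minor"); }
  if (impl(18) === "minor") { throw new Error("18 should not be a minor"); }
  if (impl(19) === "minor") { throw new Error("19 should not be a minor"); }
}

// Same checks plus an assertion on the actual return value.
function strictTest(impl) {
  looseTest(impl);
  if (impl(18) !== "adult") { throw new Error("18 should be an adult"); }
}

// Runs one test against every mutant and records killed or survived.
function runAgainstMutants(testFn) {
  var report = {};
  Object.keys(mutants).forEach(function (name) {
    try {
      testFn(mutants[name]);
      report[name] = "survived";
    } catch (err) {
      report[name] = "killed";
    }
  });
  return report;
}

console.log(runAgainstMutants(looseTest));
// { mutant1: 'killed', mutant2: 'killed', mutant3: 'survived' }
console.log(runAgainstMutants(strictTest));
// { mutant1: 'killed', mutant2: 'killed', mutant3: 'killed' }
```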
Setting Up Stryker
Installation
npm install --save-dev @stryker-mutator/core @stryker-mutator/jest-runner
For Mocha:
npm install --save-dev @stryker-mutator/core @stryker-mutator/mocha-runner
Configuration
npx stryker init
Or create stryker.conf.js manually:
// stryker.conf.js
module.exports = {
  packageManager: "npm",
  reporters: ["html", "clear-text", "progress"],
  testRunner: "jest",
  jest: {
    configFile: "jest.config.js"
  },
  coverageAnalysis: "perTest",
  mutate: [
    "src/**/*.js",
    "!src/**/*.test.js",
    "!src/test/**"
  ],
  thresholds: {
    high: 80,
    low: 60,
    break: 50
  }
};
Running Mutation Tests
npx stryker run
Reading Mutation Reports
Terminal Output
All files
Mutation score: 72.5%
Mutants:
Killed: 58
Survived: 18
No coverage: 4
Timeout: 0
Runtime errors: 0
src/calculator.js
Mutation score: 87.5%
14 Killed, 2 Survived, 0 No coverage
src/validator.js
Mutation score: 60.0%
12 Killed, 8 Survived, 0 No coverage
src/userService.js
Mutation score: 72.7%
32 Killed, 8 Survived, 4 No coverage
Key terms:
- Killed — A test failed when this mutation was applied (good)
- Survived — All tests passed with this mutation (bad — test gap)
- No coverage — No test covers this code at all
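- Timeout — The test run exceeded its time limit, usually because the mutation caused an infinite loop (counts as killed)
- Mutation score — (Killed + Timeout) / (Killed + Timeout + Survived + No coverage) as a percentage. In the output above: 58 / (58 + 18 + 4) = 72.5%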
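The percentages in the sample output only reconcile if uncovered mutants count against the score and timeouts count as detected. A small sketch of that arithmetic follows; `mutationScore` is a hypothetical helper, not a Stryker API.

```javascript
// Computes the mutation score as a percentage.
// Assumption: timeouts count as detected, and uncovered mutants count as undetected.
function mutationScore(counts) {
  var detected = counts.killed + counts.timeout;
  var undetected = counts.survived + counts.noCoverage;
  var total = detected + undetected;
  if (total === 0) {
    return NaN; // No valid mutants, so no meaningful score
  }
  return (100 * detected) / total;
}

var allFiles = { killed: 58, survived: 18, noCoverage: 4, timeout: 0 };
var userService = { killed: 32, survived: 8, noCoverage: 4, timeout: 0 };

console.log(mutationScore(allFiles).toFixed(1) + "%");    // 72.5%
console.log(mutationScore(userService).toFixed(1) + "%"); // 72.7%
```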
HTML Report
The HTML report shows each mutant in context:
src/calculator.js
Line 5: if (b === 0) {
✗ Survived: ConditionalExpression — replaced b === 0 with true
✓ Killed: ConditionalExpression — replaced b === 0 with false
Line 8: return a / b;
✓ Killed: ArithmeticOperator — replaced / with *
✓ Killed: ArithmeticOperator — replaced / with +
Practical Example: Finding Weak Tests
The Code
// pricing.js
function calculatePrice(basePrice, quantity, customerType) {
  if (quantity <= 0) {
    throw new Error("Quantity must be positive");
  }
  var subtotal = basePrice * quantity;
  var discount = 0;
  if (customerType === "premium") {
    discount = 0.15;
  } else if (customerType === "wholesale") {
    discount = 0.25;
  }
  if (quantity >= 100) {
    discount = discount + 0.05;
  }
  var discountedPrice = subtotal * (1 - discount);
  var tax = discountedPrice * 0.08;
  var total = discountedPrice + tax;
  return {
    subtotal: Math.round(subtotal * 100) / 100,
    discount: Math.round(discount * 100),
    tax: Math.round(tax * 100) / 100,
    total: Math.round(total * 100) / 100
  };
}
module.exports = { calculatePrice: calculatePrice };
The Weak Tests
// pricing.test.js — tests that look good but have gaps
var pricing = require("./pricing");
describe("calculatePrice", function() {
  test("calculates price for regular customer", function() {
    var result = pricing.calculatePrice(10, 5, "regular");
    expect(result.total).toBeGreaterThan(0);
  });
  test("applies premium discount", function() {
    var result = pricing.calculatePrice(100, 1, "premium");
    expect(result.discount).toBe(15);
  });
  test("throws for zero quantity", function() {
    expect(function() {
      pricing.calculatePrice(10, 0, "regular");
    }).toThrow();
  });
  test("handles large quantities", function() {
    var result = pricing.calculatePrice(10, 200, "regular");
    expect(result).toBeDefined();
  });
});
Mutation Results
Survived mutants:
1. Line 7: replaced basePrice * quantity with basePrice + quantity
Test "calculates price for regular customer" still passes
→ The test checks total > 0, not the actual value
2. Line 14: replaced 0.25 with 0.26
No test checks wholesale discount
→ Missing test for wholesale customer type
3. Line 17: replaced >= with >
Test "handles large quantities" still passes
→ The test uses quantity=200, never tests the boundary at 100
4. Line 17: replaced discount + 0.05 with discount - 0.05
Test "handles large quantities" still passes
→ The test only checks expect(result).toBeDefined(), not the actual discount
5. Line 20: replaced 0.08 with 0.09
Test "calculates price for regular customer" still passes
→ No test verifies the tax calculation
6. Line 25: replaced subtotal with 0
Test "calculates price for regular customer" still passes
→ The subtotal field is never checked
The Improved Tests
// pricing.test.js — tests that kill all mutants
var pricing = require("./pricing");
describe("calculatePrice", function() {
  test("calculates subtotal correctly", function() {
    var result = pricing.calculatePrice(10, 5, "regular");
    expect(result.subtotal).toBe(50);
  });
  test("applies no discount for regular customers", function() {
    var result = pricing.calculatePrice(100, 1, "regular");
    expect(result.discount).toBe(0);
    expect(result.total).toBe(108); // 100 + 8% tax
  });
  test("applies 15% discount for premium customers", function() {
    var result = pricing.calculatePrice(100, 1, "premium");
    expect(result.discount).toBe(15);
    expect(result.total).toBe(91.80); // 85 + 8% tax
  });
  test("applies 25% discount for wholesale customers", function() {
    var result = pricing.calculatePrice(100, 1, "wholesale");
    expect(result.discount).toBe(25);
    expect(result.total).toBe(81); // 75 + 8% tax
  });
  test("adds 5% bulk discount at exactly 100 quantity", function() {
    var result = pricing.calculatePrice(10, 100, "regular");
    expect(result.discount).toBe(5);
    expect(result.subtotal).toBe(1000);
  });
  test("no bulk discount at 99 quantity", function() {
    var result = pricing.calculatePrice(10, 99, "regular");
    expect(result.discount).toBe(0);
  });
  test("stacks bulk and customer discounts", function() {
    var result = pricing.calculatePrice(10, 100, "premium");
    expect(result.discount).toBe(20); // 15% + 5%
  });
  test("calculates tax at 8%", function() {
    var result = pricing.calculatePrice(100, 1, "regular");
    expect(result.tax).toBe(8);
  });
  test("throws for zero quantity", function() {
    expect(function() {
      pricing.calculatePrice(10, 0, "regular");
    }).toThrow("Quantity must be positive");
  });
  test("throws for negative quantity", function() {
    expect(function() {
      pricing.calculatePrice(10, -1, "regular");
    }).toThrow("Quantity must be positive");
  });
});
After these improvements, the mutation score jumps from 45% to 100%.
Mutation Types
Stryker applies many types of mutations:
Arithmetic Operators
// Original    // Mutated
a + b          a - b
a - b          a + b
a * b          a / b
a / b          a * b
a % b          a * b
Comparison Operators
// Original    // Mutated
a > b          a >= b, a <= b
a >= b         a > b, a < b
a < b          a <= b, a >= b
a <= b         a < b, a > b
a === b        a !== b
a !== b        a === b
Logical Operators
// Original    // Mutated
a && b         a || b
a || b         a && b
!a             a
Conditional Expressions
// Original                 // Mutated
if (condition) { ... }      if (true) { ... }
if (condition) { ... }      if (false) { ... }
condition ? a : b           true ? a : b
condition ? a : b           false ? a : b
String and Array Mutations
// Original    // Mutated
"hello"        ""
""             "Stryker was here!"
[1, 2, 3]      []
Increments and Decrements
// Original    // Mutated
i++            i--
i--            i++
++i            --i
--i            ++i
Configuring Mutators
Disable specific mutation types if they generate too much noise:
// stryker.conf.js
module.exports = {
  mutator: {
    plugins: null, // Babel syntax plugins used for parsing; null keeps the defaults
    excludedMutations: [
      "StringLiteral", // Skip string mutations (often noisy)
      "ObjectLiteral"  // Skip object literal mutations
    ]
  }
};
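Excluding a mutator in the config applies to every mutated file. Recent Stryker versions also document in-source comments for switching mutators off at a single location; the sketch below assumes that `// Stryker disable next-line` syntax, so check it against the documentation for your installed version. The `logger` object and `processOrder` function are hypothetical.

```javascript
// Hypothetical stand-in for a real logging library.
var logger = {
  info: function (message) {
    return message;
  }
};

function processOrder(orderId) {
  // Stryker disable next-line StringLiteral: log text is not business logic
  logger.info("Processing order " + orderId);

  // Everything below is still mutated as usual.
  return { id: orderId, status: "processed" };
}

console.log(processOrder(42).status); // processed
```

This keeps string mutations active for the rest of the codebase while silencing the one line where a surviving mutant would not be worth a test.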
Performance Optimization
Mutation testing is slow because, in the worst case, it runs your whole test suite once per mutant. A project with 100 mutants and a 10-second test suite needs 100 × 10 s = 1,000 s, or more than 16 minutes, when mutants are run one at a time.
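A rough back-of-the-envelope estimate can be sketched as follows. `estimateMinutes` is a hypothetical helper; it assumes every mutant triggers a full suite run and that work divides evenly across workers, so treat the result as a worst-case figure that ignores startup overhead.

```javascript
// Worst-case estimate: every mutant runs the full suite.
// Assumes the work splits evenly across `concurrency` workers.
function estimateMinutes(mutantCount, suiteSeconds, concurrency) {
  var workers = concurrency || 1;
  return (mutantCount * suiteSeconds) / workers / 60;
}

console.log(estimateMinutes(100, 10, 1).toFixed(1)); // 16.7
console.log(estimateMinutes(100, 10, 4).toFixed(1)); // 4.2
```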
Limit Scope
// Only mutate changed files
module.exports = {
  mutate: [
    "src/pricing.js",
    "src/validator.js"
  ]
};
# Or via command line
npx stryker run --mutate "src/pricing.js"
Use Coverage Analysis
module.exports = {
  coverageAnalysis: "perTest"
  // "perTest": runs only the tests that cover each mutant (fastest)
  // "all": skips mutants that no test covers, but runs the full suite for every covered mutant
  // "off": no coverage analysis; runs the full suite for every mutant (slowest)
};
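The difference between the three modes comes down to which tests get run for a given mutant. The following is a conceptual sketch, not Stryker's internal implementation; the coverage map, test names, and `testsToRun` function are all made up for illustration.

```javascript
// Hypothetical coverage data: which tests executed the code behind each mutant.
var coveredBy = {
  "mutant-1": ["applies premium discount"],
  "mutant-2": ["calculates price for regular customer", "handles large quantities"],
  "mutant-3": [] // No test touches this code
};

var allTests = [
  "calculates price for regular customer",
  "applies premium discount",
  "throws for zero quantity",
  "handles large quantities"
];

// Returns the tests that would run for one mutant under each mode.
function testsToRun(mode, mutantId) {
  var covering = coveredBy[mutantId] || [];
  if (mode === "off") {
    return allTests.slice(); // No analysis: always run everything
  }
  if (covering.length === 0) {
    return []; // Reported as "No coverage" without running any test
  }
  if (mode === "all") {
    return allTests.slice(); // Covered mutant: run the full suite
  }
  return covering.slice(); // "perTest": run only the covering tests
}

console.log(testsToRun("perTest", "mutant-1").length); // 1
console.log(testsToRun("all", "mutant-1").length);     // 4
console.log(testsToRun("all", "mutant-3").length);     // 0
console.log(testsToRun("off", "mutant-3").length);     // 4
```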
Increase Concurrency
module.exports = {
  concurrency: 4 // Run 4 mutants in parallel
};
Set Timeouts
module.exports = {
  timeoutMS: 10000,   // Fixed allowance in milliseconds added to every mutant's time limit
  timeoutFactor: 1.5  // Multiplier applied to the test time measured in the initial, unmutated run
};
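The two settings combine rather than compete. As a sketch, assuming the formula given in the Stryker configuration docs (net test time × `timeoutFactor` + `timeoutMS` + measured overhead), the limit for a mutant could be computed like this; `timeoutForTestRun` is a hypothetical helper name.

```javascript
// Time limit for one mutant run, in milliseconds.
// netTimeMs: how long the relevant tests took in the initial, unmutated run.
// overheadMs: measured startup overhead; defaults to 0 in this sketch.
function timeoutForTestRun(netTimeMs, timeoutFactor, timeoutMS, overheadMs) {
  return netTimeMs * timeoutFactor + timeoutMS + (overheadMs || 0);
}

// Tests that normally take 2 seconds, with the settings shown above:
console.log(timeoutForTestRun(2000, 1.5, 10000, 0)); // 13000
```

With these values a mutant is only marked as Timeout after 13 seconds, so lowering `timeoutMS` is the quicker way to cut time lost on stuck mutants.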
Interpreting Results
High Mutation Score (80%+)
Your tests are effective. They verify behavior, not just execution. Most mutations that could mask bugs are caught.
Medium Mutation Score (60-80%)
Common causes:
- Tests check that functions run without errors but do not check return values
- Edge cases at boundaries (>= vs >) are not tested
- Some code paths have no assertions
Low Mutation Score (below 60%)
Your test suite has significant gaps. Many tests execute code without verifying results. Focus on:
- Adding assertions to existing tests
- Testing boundary conditions
- Testing negative cases and error paths
Surviving Mutants Are Not All Equal
Some surviving mutants matter more than others:
// HIGH PRIORITY — business logic mutation survived
// Original: discount = 0.15
// Mutant: discount = 0.16
// Impact: Customers are charged wrong amounts
// LOW PRIORITY — logging mutation survived
// Original: logger.info("Processing order " + orderId)
// Mutant: logger.info("")
// Impact: Log messages are less informative
Focus on killing mutants in business-critical code. Surviving mutants in logging or other developer-facing output are less concerning.
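One way to apply this triage is to sort survivors so that business-critical files come first. The sketch below works on a hand-written object shaped like the JSON mutation report (a `files` map whose entries hold a `mutants` array with `status`, `mutatorName`, and `location`); that shape is an assumption to verify against your own report, and `triageSurvivors` and `criticalPaths` are hypothetical names.

```javascript
// Hand-written sample containing only the fields this sketch reads.
var report = {
  files: {
    "src/logger.js": {
      mutants: [
        { id: "1", mutatorName: "StringLiteral", status: "Survived",
          location: { start: { line: 4, column: 15 }, end: { line: 4, column: 34 } } }
      ]
    },
    "src/pricing.js": {
      mutants: [
        { id: "2", mutatorName: "EqualityOperator", status: "Survived",
          location: { start: { line: 17, column: 7 }, end: { line: 17, column: 22 } } },
        { id: "3", mutatorName: "ArithmeticOperator", status: "Killed",
          location: { start: { line: 7, column: 18 }, end: { line: 7, column: 38 } } }
      ]
    }
  }
};

// Files where a surviving mutant likely hides a costly bug (hypothetical list).
var criticalPaths = ["src/pricing.js"];

// Collects surviving mutants and lists those in critical files first.
function triageSurvivors(report, criticalPaths) {
  var survivors = [];
  Object.keys(report.files).forEach(function (path) {
    report.files[path].mutants.forEach(function (mutant) {
      if (mutant.status === "Survived") {
        survivors.push({
          file: path,
          line: mutant.location.start.line,
          mutator: mutant.mutatorName,
          critical: criticalPaths.indexOf(path) !== -1
        });
      }
    });
  });
  survivors.sort(function (a, b) {
    return Number(b.critical) - Number(a.critical);
  });
  return survivors;
}

triageSurvivors(report, criticalPaths).forEach(function (s) {
  console.log((s.critical ? "HIGH " : "LOW  ") + s.file + ":" + s.line + " " + s.mutator);
});
// HIGH src/pricing.js:17 EqualityOperator
// LOW  src/logger.js:4 StringLiteral
```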
Integrating with CI/CD
Break the Build on Low Scores
// stryker.conf.js
module.exports = {
  thresholds: {
    high: 80, // Green in report
    low: 60,  // Yellow in report
    break: 50 // Fail the build below this score
  }
};
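How the three thresholds are read can be sketched in a few lines. This illustrates the rules described in the comments above and is not Stryker's own code; `evaluateScore` is a hypothetical name.

```javascript
// Maps a mutation score to a report color and a build decision.
function evaluateScore(score, thresholds) {
  var color;
  if (score >= thresholds.high) {
    color = "green";
  } else if (score >= thresholds.low) {
    color = "yellow";
  } else {
    color = "red";
  }
  // A missing or null break threshold never fails the build.
  var hasBreak = thresholds.break !== null && thresholds.break !== undefined;
  return {
    color: color,
    failBuild: hasBreak && score < thresholds.break
  };
}

var thresholds = { high: 80, low: 60, break: 50 };
console.log(evaluateScore(72.5, thresholds)); // { color: 'yellow', failBuild: false }
console.log(evaluateScore(45, thresholds));   // { color: 'red', failBuild: true }
```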
GitHub Actions
name: Mutation Tests
on: [pull_request]
jobs:
  mutation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm install
      - run: npx stryker run
      - name: Upload mutation report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: mutation-report
          path: reports/mutation/
Run on Changed Files Only
# Get changed source files compared to the main branch (skips deleted files and test files)
CHANGED=$(git diff --name-only --diff-filter=d main...HEAD -- ':(glob)src/**/*.js' | grep -v '\.test\.js$' | paste -sd, -)
# Run Stryker only on changed files
if [ -n "$CHANGED" ]; then
  npx stryker run --mutate "$CHANGED"
fi
Complete Working Example
// cart.js
function Cart() {
  this.items = [];
}
Cart.prototype.addItem = function(product, quantity) {
  if (!product || !product.id) {
    throw new Error("Invalid product");
  }
  if (quantity <= 0) {
    throw new Error("Quantity must be positive");
  }
  var existing = null;
  for (var i = 0; i < this.items.length; i++) {
    if (this.items[i].product.id === product.id) {
      existing = this.items[i];
      break;
    }
  }
  if (existing) {
    existing.quantity = existing.quantity + quantity;
  } else {
    this.items.push({ product: product, quantity: quantity });
  }
};
Cart.prototype.removeItem = function(productId) {
  this.items = this.items.filter(function(item) {
    return item.product.id !== productId;
  });
};
Cart.prototype.getTotal = function() {
  var total = 0;
  for (var i = 0; i < this.items.length; i++) {
    total = total + (this.items[i].product.price * this.items[i].quantity);
  }
  return Math.round(total * 100) / 100;
};
Cart.prototype.getItemCount = function() {
  var count = 0;
  for (var i = 0; i < this.items.length; i++) {
    count = count + this.items[i].quantity;
  }
  return count;
};
Cart.prototype.isEmpty = function() {
  return this.items.length === 0;
};
module.exports = Cart;
// cart.test.js — tests designed to kill all mutants
var Cart = require("./cart");
describe("Cart", function() {
  var cart;
  var apple = { id: 1, name: "Apple", price: 1.50 };
  var banana = { id: 2, name: "Banana", price: 0.75 };
  beforeEach(function() {
    cart = new Cart();
  });
  describe("addItem", function() {
    test("adds a new item", function() {
      cart.addItem(apple, 3);
      expect(cart.items.length).toBe(1);
      expect(cart.items[0].quantity).toBe(3);
      expect(cart.items[0].product.id).toBe(1);
    });
    test("increases quantity for existing item", function() {
      cart.addItem(apple, 2);
      cart.addItem(apple, 3);
      expect(cart.items.length).toBe(1);
      expect(cart.items[0].quantity).toBe(5);
    });
    test("handles multiple different items", function() {
      cart.addItem(apple, 1);
      cart.addItem(banana, 2);
      expect(cart.items.length).toBe(2);
    });
    test("throws for null product", function() {
      expect(function() { cart.addItem(null, 1); }).toThrow("Invalid product");
    });
    test("throws for product without id", function() {
      expect(function() { cart.addItem({ name: "X" }, 1); }).toThrow("Invalid product");
    });
    test("throws for zero quantity", function() {
      expect(function() { cart.addItem(apple, 0); }).toThrow("Quantity must be positive");
    });
    test("throws for negative quantity", function() {
      expect(function() { cart.addItem(apple, -1); }).toThrow("Quantity must be positive");
    });
  });
  describe("removeItem", function() {
    test("removes an existing item", function() {
      cart.addItem(apple, 1);
      cart.addItem(banana, 2);
      cart.removeItem(1);
      expect(cart.items.length).toBe(1);
      expect(cart.items[0].product.id).toBe(2);
    });
    test("does nothing when removing non-existent item", function() {
      cart.addItem(apple, 1);
      cart.removeItem(999);
      expect(cart.items.length).toBe(1);
    });
  });
  describe("getTotal", function() {
    test("returns 0 for empty cart", function() {
      expect(cart.getTotal()).toBe(0);
    });
    test("calculates total for single item", function() {
      cart.addItem(apple, 3);
      expect(cart.getTotal()).toBe(4.50);
    });
    test("calculates total for multiple items", function() {
      cart.addItem(apple, 2);
      cart.addItem(banana, 4);
      expect(cart.getTotal()).toBe(6);
    });
  });
  describe("getItemCount", function() {
    test("returns 0 for empty cart", function() {
      expect(cart.getItemCount()).toBe(0);
    });
    test("counts total quantity across items", function() {
      cart.addItem(apple, 3);
      cart.addItem(banana, 2);
      expect(cart.getItemCount()).toBe(5);
    });
  });
  describe("isEmpty", function() {
    test("returns true for empty cart", function() {
      expect(cart.isEmpty()).toBe(true);
    });
    test("returns false after adding an item", function() {
      cart.addItem(apple, 1);
      expect(cart.isEmpty()).toBe(false);
    });
    test("returns true after removing all items", function() {
      cart.addItem(apple, 1);
      cart.removeItem(1);
      expect(cart.isEmpty()).toBe(true);
    });
  });
});
Common Issues and Troubleshooting
Mutation testing takes too long
Large projects with many mutants and slow test suites compound into hours.
Fix: Start with coverageAnalysis: "perTest" to only run relevant tests per mutant. Limit mutate to specific directories or changed files. Increase concurrency to parallelize. Set reasonable timeoutMS to kill stuck mutants quickly.
Too many surviving mutants in trivial code
Logging, string messages, and display-only code generate noise.
Fix: Exclude specific mutation types with excludedMutations: ["StringLiteral"]. Focus on business logic files in your mutate configuration. Accept that not every surviving mutant needs a test.
Tests fail without any mutations applied
Your test suite already has failures, so Stryker's initial run without mutations cannot establish a green baseline.
Fix: Ensure all tests pass before running Stryker. Run npm test and fix any failures first. Stryker assumes a green test suite as the baseline.
"No tests found" for some mutants
Coverage analysis cannot map mutants to tests.
Fix: Switch from coverageAnalysis: "perTest" to "all" temporarily to verify. Ensure test files are included in the test runner configuration. Check that the file patterns in mutate match actual source files.
Best Practices
- Run mutation testing on critical code first. Start with payment processing, authentication, data validation — code where bugs are expensive. Expand to other areas once the critical code has a high mutation score.
- Use mutation testing to improve existing tests, not to write new ones from scratch. The surviving mutants report tells you exactly which assertions are missing. Add targeted tests for those specific gaps.
- Do not aim for 100% mutation score. Some surviving mutants are acceptable — string literal changes in log messages, for example. Focus on surviving mutants in conditional logic and calculations.
- Integrate gradually into CI/CD. Start with a low break threshold (50%) and increase it as your test suite improves. Breaking the build at 80% from day one will frustrate the team.
- Run on changed files in pull requests. Full project mutation testing is for scheduled runs. PR checks should only mutate the files that changed.
- Review surviving mutants as a team. Each surviving mutant is a learning opportunity about what makes a test effective. Use them in code review discussions.
- Combine with code coverage. Coverage shows what is executed. Mutation testing shows what is verified. Together they give a complete picture of test effectiveness.
- Keep mutation test runs fast. If mutation testing takes 30 minutes, developers will not run it. Optimize with per-test coverage analysis, concurrency, and scope limiting.