Mutation Testing: Finding Weak Tests

Shane

2/14/2026

14 min read

A practical guide to mutation testing in JavaScript using Stryker Mutator to find tests that pass when they should fail, improving test suite effectiveness.

code-quality testing mutation-testing stryker test-quality

Code coverage tells you what your tests execute. Mutation testing tells you what your tests actually verify. There is a significant difference.

A test suite with 100% coverage can pass while the code is fundamentally broken if the tests execute every line but never check the results. Mutation testing measures how effective your tests are by introducing small bugs and checking whether the tests catch them. If you change > to >= and every test still passes, those tests are not testing what you think they are testing.

Prerequisites

Node.js installed (v16+)
A project with existing tests (Jest or Mocha)
Understanding of code coverage concepts

How Mutation Testing Works

Parse — Stryker reads your source code and identifies locations where mutations can be applied
Mutate — For each location, Stryker creates a modified version (a "mutant") by changing one thing: replacing + with -, > with >=, true with false, etc.
Test — Stryker runs your test suite against each mutant
Score — If a test fails, the mutant is "killed" (good). If all tests pass, the mutant "survived" (bad — your tests missed a bug)

Original:  if (age >= 18) { return "adult"; }
Mutant 1:  if (age >  18) { return "adult"; }  ← Tests catch this? Killed ✓
Mutant 2:  if (age <  18) { return "adult"; }  ← Tests catch this? Killed ✓
Mutant 3:  if (age >= 18) { return "";      }  ← Tests catch this? Survived ✗

If Mutant 3 survives, your tests never check the return value for adults. They might check that the function runs without errors, but they do not verify the actual output.
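The kill/survive distinction can be reproduced by hand without any tooling. The sketch below is only an illustration and not how Stryker works internally: the "minor" branch, the three hand-written mutants, and the weakTest/strongTest helpers are assumptions added for the example.

```javascript
// Illustration only: hypothetical names, hand-written mutants.
function original(age) { return age >= 18 ? "adult" : "minor"; }

// Each mutant changes exactly one thing in the original.
var mutants = {
  "boundary (>= to >)": function (age) { return age > 18 ? "adult" : "minor"; },
  "flipped (>= to <)": function (age) { return age < 18 ? "adult" : "minor"; },
  "emptied string": function (age) { return age >= 18 ? "" : "minor"; }
};

// A "test" here is a function that throws when an expectation fails.
function weakTest(classify) {
  if (typeof classify(30) !== "string") throw new Error("expected a string");
}

function strongTest(classify) {
  if (classify(18) !== "adult") throw new Error("18 should be adult");
  if (classify(17) !== "minor") throw new Error("17 should be minor");
}

// Returns the names of the mutants that the given test fails to detect.
function survivors(test) {
  return Object.keys(mutants).filter(function (name) {
    try {
      test(mutants[name]);
      return true;  // test passed against the mutant: it survived
    } catch (e) {
      return false; // test failed against the mutant: it was killed
    }
  });
}
```

Under these assumptions, survivors(weakTest) lists all three mutants, while survivors(strongTest) is empty because the stronger test pins down the output on both sides of the boundary.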

Setting Up Stryker

Installation

npm install --save-dev @stryker-mutator/core @stryker-mutator/jest-runner

For Mocha:

npm install --save-dev @stryker-mutator/core @stryker-mutator/mocha-runner

Configuration

npx stryker init

Or create stryker.conf.js manually:

// stryker.conf.js
module.exports = {
  packageManager: "npm",
  reporters: ["html", "clear-text", "progress"],
  testRunner: "jest",
  jest: {
    configFile: "jest.config.js"
  },
  coverageAnalysis: "perTest",
  mutate: [
    "src/**/*.js",
    "!src/**/*.test.js",
    "!src/test/**"
  ],
  thresholds: {
    high: 80,
    low: 60,
    break: 50
  }
};

Running Mutation Tests

npx stryker run

Reading Mutation Reports

Terminal Output

All files
  Mutation score: 72.5%
  Mutants:
    Killed:     58
    Survived:   18
    No coverage: 4
    Timeout:     0
    Runtime errors: 0

src/calculator.js
  Mutation score: 87.5%
  14 Killed, 2 Survived, 0 No coverage

src/validator.js
  Mutation score: 60.0%
  12 Killed, 8 Survived, 0 No coverage

src/userService.js
  Mutation score: 72.7%
  32 Killed, 8 Survived, 4 No coverage

Key terms:

Killed: a test failed when this mutation was applied (good)
Survived: all tests passed with this mutation (bad, a test gap)
No coverage: no test executes this code at all
Timeout: the test run exceeded its time limit, typically because the mutation caused an infinite loop (counts as killed)
Mutation score: (Killed + Timeout) / (Killed + Timeout + Survived + No coverage), as a percentage; mutants that cause runtime errors are excluded
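The totals in the sample output are reproduced when no-coverage mutants count against the score and timeouts count as detected. A minimal sketch of that arithmetic follows; mutationScore and its field names are hypothetical and chosen only for this example.

```javascript
// Hypothetical helper: computes the score from the counts in the terminal output.
function mutationScore(counts) {
  var detected = counts.killed + counts.timeout;
  var valid = detected + counts.survived + counts.noCoverage;
  if (valid === 0) return NaN; // nothing to score
  return (100 * detected) / valid;
}

// Counts copied from the "All files" summary above.
var allFiles = { killed: 58, survived: 18, noCoverage: 4, timeout: 0 };
console.log(mutationScore(allFiles).toFixed(1) + "%"); // prints "72.5%"
```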

HTML Report

The HTML report shows each mutant in context:

src/calculator.js

Line 5:  if (b === 0) {
  ✗ Survived: ConditionalExpression — replaced b === 0 with true
  ✓ Killed:   ConditionalExpression — replaced b === 0 with false

Line 8:  return a / b;
  ✓ Killed:   ArithmeticOperator — replaced / with *
  ✓ Killed:   ArithmeticOperator — replaced / with +

Practical Example: Finding Weak Tests

The Code

// pricing.js
function calculatePrice(basePrice, quantity, customerType) {
  if (quantity <= 0) {
    throw new Error("Quantity must be positive");
  }

  var subtotal = basePrice * quantity;
  var discount = 0;

  if (customerType === "premium") {
    discount = 0.15;
  } else if (customerType === "wholesale") {
    discount = 0.25;
  }

  if (quantity >= 100) {
    discount = discount + 0.05;
  }

  var discountedPrice = subtotal * (1 - discount);
  var tax = discountedPrice * 0.08;
  var total = discountedPrice + tax;

  return {
    subtotal: Math.round(subtotal * 100) / 100,
    discount: Math.round(discount * 100),
    tax: Math.round(tax * 100) / 100,
    total: Math.round(total * 100) / 100
  };
}

module.exports = { calculatePrice: calculatePrice };

The Weak Tests

// pricing.test.js — tests that look good but have gaps
var pricing = require("./pricing");

describe("calculatePrice", function() {
  test("calculates price for regular customer", function() {
    var result = pricing.calculatePrice(10, 5, "regular");
    expect(result.total).toBeGreaterThan(0);
  });

  test("applies premium discount", function() {
    var result = pricing.calculatePrice(100, 1, "premium");
    expect(result.discount).toBe(15);
  });

  test("throws for zero quantity", function() {
    expect(function() {
      pricing.calculatePrice(10, 0, "regular");
    }).toThrow();
  });

  test("handles large quantities", function() {
    var result = pricing.calculatePrice(10, 200, "regular");
    expect(result).toBeDefined();
  });
});

Mutation Results

Survived mutants:

1. Line 7: replaced basePrice * quantity with basePrice + quantity
   Test "calculates price for regular customer" still passes
   → The test checks total > 0, not the actual value

2. Line 13: replaced 0.25 with 0.26
   No test checks wholesale discount
   → Missing test for wholesale customer type

3. Line 16: replaced >= with >
   Test "handles large quantities" still passes
   → The test uses quantity=200, never tests the boundary at 100

4. Line 17: replaced discount + 0.05 with discount - 0.05
   Test "handles large quantities" still passes
   → The test only asserts expect(result).toBeDefined(), not the actual discount

5. Line 21: replaced 0.08 with 0.09
   Test "calculates price for regular customer" still passes
   → No test verifies the tax calculation

6. Line 25: replaced subtotal with 0
   Test "calculates price for regular customer" still passes
   → The subtotal field is never checked

The Improved Tests

// pricing.test.js — tests that kill all mutants
var pricing = require("./pricing");

describe("calculatePrice", function() {
  test("calculates subtotal correctly", function() {
    var result = pricing.calculatePrice(10, 5, "regular");
    expect(result.subtotal).toBe(50);
  });

  test("applies no discount for regular customers", function() {
    var result = pricing.calculatePrice(100, 1, "regular");
    expect(result.discount).toBe(0);
    expect(result.total).toBe(108); // 100 + 8% tax
  });

  test("applies 15% discount for premium customers", function() {
    var result = pricing.calculatePrice(100, 1, "premium");
    expect(result.discount).toBe(15);
    expect(result.total).toBe(91.80); // 85 + 8% tax
  });

  test("applies 25% discount for wholesale customers", function() {
    var result = pricing.calculatePrice(100, 1, "wholesale");
    expect(result.discount).toBe(25);
    expect(result.total).toBe(81); // 75 + 8% tax
  });

  test("adds 5% bulk discount at exactly 100 quantity", function() {
    var result = pricing.calculatePrice(10, 100, "regular");
    expect(result.discount).toBe(5);
    expect(result.subtotal).toBe(1000);
  });

  test("no bulk discount at 99 quantity", function() {
    var result = pricing.calculatePrice(10, 99, "regular");
    expect(result.discount).toBe(0);
  });

  test("stacks bulk and customer discounts", function() {
    var result = pricing.calculatePrice(10, 100, "premium");
    expect(result.discount).toBe(20); // 15% + 5%
  });

  test("calculates tax at 8%", function() {
    var result = pricing.calculatePrice(100, 1, "regular");
    expect(result.tax).toBe(8);
  });

  test("throws for zero quantity", function() {
    expect(function() {
      pricing.calculatePrice(10, 0, "regular");
    }).toThrow("Quantity must be positive");
  });

  test("throws for negative quantity", function() {
    expect(function() {
      pricing.calculatePrice(10, -1, "regular");
    }).toThrow("Quantity must be positive");
  });
});

After these improvements, the mutation score jumps from 45% to 100%.

Mutation Types

Stryker applies many types of mutations:

Arithmetic Operators

// Original         // Mutated
a + b               a - b
a - b               a + b
a * b               a / b
a / b               a * b
a % b               a * b

Comparison Operators

// Original         // Mutated
a > b               a >= b, a <= b
a >= b              a > b, a < b
a < b               a <= b, a >= b
a <= b              a < b, a > b
a === b             a !== b
a !== b             a === b

Logical Operators

// Original         // Mutated
a && b              a || b
a || b              a && b
!a                  a

Conditional Expressions

// Original                      // Mutated
if (condition) { ... }           if (true) { ... }
if (condition) { ... }           if (false) { ... }
condition ? a : b                true ? a : b
condition ? a : b                false ? a : b

String and Array Mutations

// Original                      // Mutated
"hello"                          ""
""                               "Stryker was here!"
[1, 2, 3]                        []

Increments and Decrements

// Original         // Mutated
i++                 i--
i--                 i++
++i                 --i
--i                 ++i

Configuring Mutators

Disable specific mutation types if they generate too much noise:

// stryker.conf.js
module.exports = {
  mutator: {
    plugins: null, // Babel parser plugins; null keeps Stryker's defaults
    excludedMutations: [
      "StringLiteral",     // Skip string mutations (often noisy)
      "ObjectLiteral"      // Skip object literal mutations
    ]
  }
};
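Exclusions in the config apply project-wide. Stryker also documents disable comments for switching mutators off at a single location; treat the exact comment syntax below as something to confirm against the documentation for your installed version. The processOrder function and the injected logger are hypothetical.

```javascript
// Hypothetical order-processing function; the logger is injected so the sketch is self-contained.
function processOrder(order, logger) {
  // Stryker disable next-line StringLiteral: log text is not behavior under test
  logger.info("Processing order " + order.id);
  return order.quantity * order.unitPrice;
}
```

The calculation on the last line stays open to mutation; only the string on the logging line is skipped.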

Performance Optimization

Mutation testing is slow because, without optimization, it runs your test suite once per mutant. A project with 100 mutants and a 10-second test suite takes 16+ minutes (100 × 10 s = 1,000 s) when mutants run one at a time.
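A back-of-the-envelope estimate makes the cost visible before a run. This sketch assumes every mutant triggers a full suite run and that work splits evenly across workers; estimateMinutes is a hypothetical helper, and real runs are usually shorter thanks to coverage analysis.

```javascript
// Rough upper bound: ignores coverage analysis, start-up cost, and early exits.
function estimateMinutes(mutantCount, suiteSeconds, concurrency) {
  var workers = concurrency || 1;
  return (mutantCount * suiteSeconds) / workers / 60;
}

console.log(estimateMinutes(100, 10).toFixed(1));    // prints "16.7"
console.log(estimateMinutes(100, 10, 4).toFixed(1)); // prints "4.2"
```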

Limit Scope

// Only mutate changed files
module.exports = {
  mutate: [
    "src/pricing.js",
    "src/validator.js"
  ]
};

# Or via command line
npx stryker run --mutate "src/pricing.js"

Use Coverage Analysis

module.exports = {
  coverageAnalysis: "perTest"
  // "perTest": runs only the tests that cover each mutant (fastest)
  // "all": skips mutants that no test covers, runs the whole suite for the rest
  // "off": no analysis, runs the whole suite for every mutant (slowest)
};
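The three modes differ in which tests run for a given mutant. The toy model below is a simplification under stated assumptions, not Stryker's implementation; the test names, mutant ids, and coverage map are made up.

```javascript
// Hypothetical data: which tests execute the code of each mutant.
var allTests = ["subtotal", "premium discount", "zero quantity"];
var coverage = {
  m1: ["subtotal", "premium discount"],
  m2: ["zero quantity"],
  m3: [] // no test reaches this code
};

// Returns the tests that would run for one mutant under each mode.
function testsToRun(mode, mutantId) {
  var covering = coverage[mutantId] || [];
  if (mode === "off") return allTests.slice();     // no analysis: always run everything
  if (mode !== "all" && mode !== "perTest") {
    throw new Error("Unknown coverageAnalysis mode: " + mode);
  }
  if (covering.length === 0) return [];            // reported as "No coverage", nothing to run
  if (mode === "all") return allTests.slice();     // covered: run the whole suite
  return covering.slice();                         // perTest: run only the covering tests
}
```

In this model, mutant m2 costs one test under "perTest" and three under "all" or "off", which is where the speed-up comes from.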

Increase Concurrency

module.exports = {
  concurrency: 4  // Run 4 mutants in parallel
};

Set Timeouts

module.exports = {
  timeoutMS: 10000,      // Fixed allowance in ms added to every mutant's time limit
  timeoutFactor: 1.5     // Multiplier on the measured test run time; both settings are combined
};
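The two settings work together rather than as alternatives: the limit for a mutant is the measured run time multiplied by timeoutFactor, plus timeoutMS, plus any overhead Stryker measured. A sketch of that formula, with mutantTimeoutMs as a hypothetical helper name:

```javascript
// Sketch of the formula: netTime * timeoutFactor + timeoutMS (+ overhead).
function mutantTimeoutMs(netTimeMs, options, overheadMs) {
  return netTimeMs * options.timeoutFactor + options.timeoutMS + (overheadMs || 0);
}

// A test run that normally takes 2 seconds gets 13 seconds before it counts as a timeout.
console.log(mutantTimeoutMs(2000, { timeoutMS: 10000, timeoutFactor: 1.5 })); // prints 13000
```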

Interpreting Results

High Mutation Score (80%+)

Your tests are effective. They verify behavior, not just execution, and they catch most of the injected faults.

Medium Mutation Score (60-80%)

Common causes:

Tests check that functions run without errors but do not check return values
Edge cases at boundaries (>= vs >) are not tested
Some code paths have no assertions

Low Mutation Score (below 60%)

Your test suite has significant gaps. Many tests execute code without verifying results. Focus on:

Adding assertions to existing tests
Testing boundary conditions
Testing negative cases and error paths

Surviving Mutants Are Not All Equal

Some surviving mutants matter more than others:

// HIGH PRIORITY — business logic mutation survived
// Original: discount = 0.15
// Mutant:   discount = 0.16
// Impact: Customers are charged wrong amounts

// LOW PRIORITY — logging mutation survived
// Original: logger.info("Processing order " + orderId)
// Mutant:   logger.info("")
// Impact: Log messages are less informative

Focus on killing mutants in business-critical code. Surviving mutants in logging or other developer-facing output are less concerning.
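Triage can be scripted when the "json" reporter is enabled. The sketch below assumes a small subset of that report's shape (files, mutants, status, mutatorName, location); the sample data, the criticalFiles list, and triageSurvivors are hypothetical.

```javascript
// Assumed subset of the JSON report structure, with made-up sample data.
var sampleReport = {
  files: {
    "src/logger.js": { mutants: [
      { id: "3", mutatorName: "StringLiteral", status: "Survived", location: { start: { line: 3, column: 15 } } }
    ] },
    "src/pricing.js": { mutants: [
      { id: "1", mutatorName: "EqualityOperator", status: "Survived", location: { start: { line: 16, column: 7 } } },
      { id: "2", mutatorName: "ArithmeticOperator", status: "Killed", location: { start: { line: 7, column: 18 } } }
    ] }
  }
};

// Hypothetical rule: survivors in these files are reviewed first.
var criticalFiles = ["src/pricing.js"];

function triageSurvivors(report) {
  var result = [];
  Object.keys(report.files).forEach(function (file) {
    report.files[file].mutants.forEach(function (mutant) {
      if (mutant.status !== "Survived") return;
      result.push({
        file: file,
        line: mutant.location.start.line,
        mutator: mutant.mutatorName,
        priority: criticalFiles.indexOf(file) !== -1 ? "high" : "low"
      });
    });
  });
  // High-priority survivors first.
  return result.sort(function (a, b) {
    return (a.priority === "high" ? 0 : 1) - (b.priority === "high" ? 0 : 1);
  });
}
```

With the sample data, the pricing survivor is listed ahead of the logging one.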

Integrating with CI/CD

Break the Build on Low Scores

// stryker.conf.js
module.exports = {
  thresholds: {
    high: 80,    // Green in report
    low: 60,     // Yellow in report
    break: 50    // Fail the build below this score
  }
};
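How the three thresholds are applied to a score can be sketched as a small function. This is an illustration of the rules in the comments above, not Stryker's code; evaluateScore is a hypothetical name.

```javascript
// Maps a mutation score to a report color and a pass/fail decision.
function evaluateScore(score, thresholds) {
  var color = score >= thresholds.high ? "green"
    : score >= thresholds.low ? "yellow"
    : "red";
  // The build fails only when a break threshold is set and the score is below it.
  var failsBuild = typeof thresholds.break === "number" && score < thresholds.break;
  return { color: color, failsBuild: failsBuild };
}

var thresholds = { high: 80, low: 60, break: 50 };
console.log(evaluateScore(72.5, thresholds)); // { color: 'yellow', failsBuild: false }
console.log(evaluateScore(45, thresholds));   // { color: 'red', failsBuild: true }
```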

GitHub Actions

name: Mutation Tests
on: [pull_request]

jobs:
  mutation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm install
      - run: npx stryker run
      - name: Upload mutation report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: mutation-report
          path: reports/mutation/

Run on Changed Files Only

# Get changed source files compared to main (deleted files excluded)
CHANGED=$(git diff --name-only --diff-filter=d main...HEAD -- src \
  | grep '\.js$' | grep -v '\.test\.js$' | paste -sd, -)

# Run Stryker only on changed files
if [ -n "$CHANGED" ]; then
  npx stryker run --mutate "$CHANGED"
fi
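The same filtering can live in a Node script when the CI job is already JavaScript-based. This is a sketch under the assumption that the list of changed paths has been obtained elsewhere; buildMutateArg is a hypothetical helper, not part of Stryker.

```javascript
// Hypothetical helper: turn a list of changed paths into a value for --mutate.
function buildMutateArg(changedFiles) {
  return changedFiles
    .filter(function (file) { return /^src\/.*\.js$/.test(file); })  // source files only
    .filter(function (file) { return !/\.test\.js$/.test(file); })   // skip test files
    .join(",");
}

console.log(buildMutateArg(["src/pricing.js", "src/pricing.test.js", "README.md"]));
// prints "src/pricing.js"
```

An empty result means nothing relevant changed, so the Stryker step can be skipped.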

Complete Working Example

// cart.js
function Cart() {
  this.items = [];
}

Cart.prototype.addItem = function(product, quantity) {
  if (!product || !product.id) {
    throw new Error("Invalid product");
  }

  if (quantity <= 0) {
    throw new Error("Quantity must be positive");
  }

  var existing = null;
  for (var i = 0; i < this.items.length; i++) {
    if (this.items[i].product.id === product.id) {
      existing = this.items[i];
      break;
    }
  }

  if (existing) {
    existing.quantity = existing.quantity + quantity;
  } else {
    this.items.push({ product: product, quantity: quantity });
  }
};

Cart.prototype.removeItem = function(productId) {
  this.items = this.items.filter(function(item) {
    return item.product.id !== productId;
  });
};

Cart.prototype.getTotal = function() {
  var total = 0;
  for (var i = 0; i < this.items.length; i++) {
    total = total + (this.items[i].product.price * this.items[i].quantity);
  }
  return Math.round(total * 100) / 100;
};

Cart.prototype.getItemCount = function() {
  var count = 0;
  for (var i = 0; i < this.items.length; i++) {
    count = count + this.items[i].quantity;
  }
  return count;
};

Cart.prototype.isEmpty = function() {
  return this.items.length === 0;
};

module.exports = Cart;

// cart.test.js — tests designed to kill all mutants
var Cart = require("./cart");

describe("Cart", function() {
  var cart;
  var apple = { id: 1, name: "Apple", price: 1.50 };
  var banana = { id: 2, name: "Banana", price: 0.75 };

  beforeEach(function() {
    cart = new Cart();
  });

  describe("addItem", function() {
    test("adds a new item", function() {
      cart.addItem(apple, 3);
      expect(cart.items.length).toBe(1);
      expect(cart.items[0].quantity).toBe(3);
      expect(cart.items[0].product.id).toBe(1);
    });

    test("increases quantity for existing item", function() {
      cart.addItem(apple, 2);
      cart.addItem(apple, 3);
      expect(cart.items.length).toBe(1);
      expect(cart.items[0].quantity).toBe(5);
    });

    test("handles multiple different items", function() {
      cart.addItem(apple, 1);
      cart.addItem(banana, 2);
      expect(cart.items.length).toBe(2);
    });

    test("throws for null product", function() {
      expect(function() { cart.addItem(null, 1); }).toThrow("Invalid product");
    });

    test("throws for product without id", function() {
      expect(function() { cart.addItem({ name: "X" }, 1); }).toThrow("Invalid product");
    });

    test("throws for zero quantity", function() {
      expect(function() { cart.addItem(apple, 0); }).toThrow("Quantity must be positive");
    });

    test("throws for negative quantity", function() {
      expect(function() { cart.addItem(apple, -1); }).toThrow("Quantity must be positive");
    });
  });

  describe("removeItem", function() {
    test("removes an existing item", function() {
      cart.addItem(apple, 1);
      cart.addItem(banana, 2);
      cart.removeItem(1);
      expect(cart.items.length).toBe(1);
      expect(cart.items[0].product.id).toBe(2);
    });

    test("does nothing when removing non-existent item", function() {
      cart.addItem(apple, 1);
      cart.removeItem(999);
      expect(cart.items.length).toBe(1);
    });
  });

  describe("getTotal", function() {
    test("returns 0 for empty cart", function() {
      expect(cart.getTotal()).toBe(0);
    });

    test("calculates total for single item", function() {
      cart.addItem(apple, 3);
      expect(cart.getTotal()).toBe(4.50);
    });

    test("calculates total for multiple items", function() {
      cart.addItem(apple, 2);
      cart.addItem(banana, 4);
      expect(cart.getTotal()).toBe(6);
    });
  });

  describe("getItemCount", function() {
    test("returns 0 for empty cart", function() {
      expect(cart.getItemCount()).toBe(0);
    });

    test("counts total quantity across items", function() {
      cart.addItem(apple, 3);
      cart.addItem(banana, 2);
      expect(cart.getItemCount()).toBe(5);
    });
  });

  describe("isEmpty", function() {
    test("returns true for empty cart", function() {
      expect(cart.isEmpty()).toBe(true);
    });

    test("returns false after adding an item", function() {
      cart.addItem(apple, 1);
      expect(cart.isEmpty()).toBe(false);
    });

    test("returns true after removing all items", function() {
      cart.addItem(apple, 1);
      cart.removeItem(1);
      expect(cart.isEmpty()).toBe(true);
    });
  });
});

Common Issues and Troubleshooting

Mutation testing takes too long

Large projects with many mutants and slow test suites compound into hours:

Fix: Start with coverageAnalysis: "perTest" to only run relevant tests per mutant. Limit mutate to specific directories or changed files. Increase concurrency to parallelize. Set reasonable timeoutMS to kill stuck mutants quickly.

Too many surviving mutants in trivial code

Logging, string messages, and display-only code generate noise:

Fix: Exclude specific mutation types with excludedMutations: ["StringLiteral"]. Focus on business logic files in your mutate configuration. Accept that not every surviving mutant needs a test.

Tests fail without any mutations applied

Stryker first runs the unmodified test suite (the initial dry run) and stops if any test fails:

Fix: Ensure all tests pass before running Stryker. Run npm test and fix any failures first. Stryker assumes a green test suite as the baseline.

"No tests found" for some mutants

Coverage analysis cannot map mutants to tests:

Fix: Switch from coverageAnalysis: "perTest" to "off" temporarily to rule out a coverage-mapping problem. Ensure test files are included in the test runner configuration. Check that the file patterns in mutate match actual source files.

Best Practices

Run mutation testing on critical code first. Start with payment processing, authentication, data validation — code where bugs are expensive. Expand to other areas once the critical code has a high mutation score.
Use mutation testing to improve existing tests, not to write new ones from scratch. The surviving mutants report tells you exactly which assertions are missing. Add targeted tests for those specific gaps.
Do not aim for 100% mutation score. Some surviving mutants are acceptable — string literal changes in log messages, for example. Focus on surviving mutants in conditional logic and calculations.
Integrate gradually into CI/CD. Start with a low break threshold (50%) and increase it as your test suite improves. Breaking the build at 80% from day one will frustrate the team.
Run on changed files in pull requests. Full project mutation testing is for scheduled runs. PR checks should only mutate the files that changed.
Review surviving mutants as a team. Each surviving mutant is a learning opportunity about what makes a test effective. Use them in code review discussions.
Combine with code coverage. Coverage shows what is executed. Mutation testing shows what is verified. Together they give a complete picture of test effectiveness.
Keep mutation test runs fast. If mutation testing takes 30 minutes, developers will not run it. Optimize with per-test coverage analysis, concurrency, and scope limiting.
