# QA Advisor — Test Quality Reference

*Part of the QA Advisor skill: https://wavect.io/.well-known/agent-skills/qa-advisor/SKILL.md*

Test doubles, assertion quality, TDD schools, property-based and contract testing, BDD, and snapshot testing.

## Test Double Taxonomy — Are You Using the Right Tool?

The taxonomy of test doubles (defined by Gerard Meszaros and popularized by
Martin Fowler) is one of the most misunderstood topics in automated testing.
Using the wrong double is not a style issue; it is a correctness issue. The
wrong double can make a test pass even when the real system would fail.

### The Five Types

**Dummy**
An object passed only to satisfy a parameter signature. The code path under test never calls it.
```typescript
// Bad: using a real Logger just to satisfy a constructor parameter
const service = new OrderService(new Logger(), paymentGateway);

// Good: dummy — type compatibility with no behavior
const dummyLogger = {} as Logger;
const service = new OrderService(dummyLogger, paymentGateway);
```

**Stub**
Returns a pre-configured answer to a specific call. Has no logic, no verification.
Use when: the test needs to control what a dependency returns.
```typescript
const paymentStub = { charge: async () => ({ success: true }) };
```

**Spy**
A real or partial object that also records how it was called. Assertions happen
after the fact by checking the recorded interactions.
```typescript
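// jest.spyOn records calls but still invokes the real send();
// chain .mockImplementation(...) if no real email should go out
const emailSpy = jest.spyOn(emailService, 'send');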
await orderService.complete(order);
expect(emailSpy).toHaveBeenCalledWith(order.userEmail, expect.any(String));
```

**Mock**
Pre-programmed with expectations before the code under test runs. Verification
is built into the double: `verify()` fails the test if an expected call did not
happen, and strict mock libraries also fail on unexpected calls as they occur.
A spy only records calls and leaves the assertions to the test.
Use when: the interaction pattern itself IS the thing being tested.
```typescript
import sinon from 'sinon';
// Sinon mock; assumes `messageQueue` is the instance injected into orderService
const mockQueue = sinon.mock(messageQueue);
mockQueue.expects('enqueue').once().withArgs({ type: 'ORDER_CREATED' });
await orderService.complete(order);
mockQueue.verify(); // fails if enqueue wasn't called exactly once
```

**Fake**
A real, working implementation that takes shortcuts inappropriate for production.
The canonical example is an in-memory database, an in-memory message queue,
or an in-memory file system.
```typescript
class FakeUserRepository implements UserRepository {
  private store = new Map<string, User>();
  async findById(id: string) { return this.store.get(id); }
  async save(user: User) { this.store.set(user.id, user); }
}
```

Fakes are underused and often better than mocks for testing code that does
complex data access patterns — they let you test sequences (create → update →
find) without mocking each step individually.
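
A minimal sketch of the sequence-testing point. It assumes a simplified, synchronous `UserRepository`; the `User` shape, `InMemoryUserRepository`, and `renameUser` are illustrative names, not part of any real codebase:

```typescript
// Hypothetical domain types, simplified to synchronous calls for brevity.
interface User { id: string; name: string; }

interface UserRepository {
  findById(id: string): User | undefined;
  save(user: User): void;
}

// Fake: a working in-memory implementation, unsuitable for production.
class InMemoryUserRepository implements UserRepository {
  private store = new Map<string, User>();
  findById(id: string): User | undefined {
    const user = this.store.get(id);
    return user ? { ...user } : undefined; // copy so callers cannot mutate the store
  }
  save(user: User): void { this.store.set(user.id, { ...user }); }
}

// Code under test: reads, modifies, and writes back through the repository.
function renameUser(repo: UserRepository, id: string, name: string): void {
  const user = repo.findById(id);
  if (!user) throw new Error(`user ${id} not found`);
  repo.save({ ...user, name });
}

// create -> update -> find, with no per-call stubbing
const repo = new InMemoryUserRepository();
repo.save({ id: 'u1', name: 'Alice' });
renameUser(repo, 'u1', 'Alicia');
const renamed = repo.findById('u1'); // { id: 'u1', name: 'Alicia' }
```

With stubs, each of the three calls would need its own pre-configured return value; the fake keeps them consistent automatically.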

### The Critical Anti-Pattern: Mocking What You Own

**Never mock your own domain objects or internal services.** If you mock the
thing you are testing to make it easier to test, you are no longer testing it.

```typescript
// WRONG — mocking internal service to test the service that uses it
jest.mock('./orderService'); // replaces every export of your own module with an auto-mock
// What are you actually testing? Nothing about orderService's real behavior.

// RIGHT — use a fake or real instance; mock only the external boundary
const fakePaymentGateway = new FakePaymentGateway();
const orderService = new OrderService(fakePaymentGateway);
```

**The mock boundary rule:** Mock (or stub) only at system boundaries — HTTP
clients, databases, file systems, queues, clocks, external APIs. Never mock
modules that your own code owns. If your code owns it, test it with the real
implementation or a fake.
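
The clock is the boundary that is easiest to overlook. A minimal sketch, assuming a hypothetical `Clock` interface and an invented 15-minute reservation rule, of injecting time so that only the boundary is replaced:

```typescript
// Boundary interface: the only thing the domain code knows about time.
interface Clock { now(): Date; }

// Production adapter wraps the real system clock.
const systemClock: Clock = { now: () => new Date() };

// Stub for tests: always returns the same instant.
function fixedClock(iso: string): Clock {
  const instant = new Date(iso);
  return { now: () => new Date(instant.getTime()) };
}

// Hypothetical domain rule: a reservation expires 15 minutes after creation.
const RESERVATION_TTL_MS = 15 * 60 * 1000;

function isExpired(createdAt: Date, clock: Clock): boolean {
  return clock.now().getTime() - createdAt.getTime() > RESERVATION_TTL_MS;
}

const createdAt = new Date('2024-01-01T12:00:00Z');
const justInTime = isExpired(createdAt, fixedClock('2024-01-01T12:15:00Z')); // false
const tooLate = isExpired(createdAt, fixedClock('2024-01-01T12:15:01Z'));    // true
```

`isExpired` itself runs unmodified in the test; only the source of "now" is swapped.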

### Builder Pattern for Test Fixtures

Repeated construction of test objects with slight variations is a major
source of test suite maintenance burden. The builder pattern removes most of it.

```typescript
// Anti-pattern: copy-paste construction everywhere
const order = { id: '1', user: { id: 'u1', email: 'test@test.com' },
  items: [{ sku: 'A', qty: 1, price: 10 }], status: 'PENDING' };

// Correct: builder with sensible defaults + override methods
class OrderBuilder {
  private data = {
    id: 'order-1',
    user: { id: 'user-1', email: 'test@example.com' },
    items: [{ sku: 'SKU-A', qty: 1, price: 1000 }],
    status: 'PENDING' as OrderStatus,
  };

  withStatus(status: OrderStatus): this { this.data.status = status; return this; }
  withItems(items: OrderItem[]): this { this.data.items = items; return this; }
  withUser(user: Partial<User>): this { this.data.user = { ...this.data.user, ...user }; return this; }
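  // Copy nested values too, so objects from separate build() calls share no state
  build(): Order {
    return { ...this.data, user: { ...this.data.user }, items: this.data.items.map((i) => ({ ...i })) };
  }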
}

// In tests:
const order = new OrderBuilder().withStatus('COMPLETED').build();
```

---

## Test Quality — Evaluating Assertions

Test coverage is a trailing indicator. The leading indicator is assertion quality.

### The Assertion Spectrum

| Assertion quality | Example | Risk |
|---|---|---|
| **No assertion** | `it('runs without error', () => { fn(); })` | Near-zero value: passes whenever `fn` does not throw, however wrong the result |
| **Existence check** | `expect(result).toBeDefined()` | Weak — undefined is almost never the only wrong answer |
| **Type check** | `expect(typeof result).toBe('string')` | Weak — still passes with wrong strings |
| **Shape check** | `expect(result).toHaveProperty('id')` | Moderate — misses wrong values |
| **Exact value** | `expect(result.total).toBe(1099)` | Strong |
| **Behavioral sequence** | Assert state before, trigger, assert state after | Strongest |

The most common test quality failure is asserting presence when value should
be asserted, and asserting value when behavior should be asserted.
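
To make the spectrum concrete, here is a small sketch with a deliberately buggy function. `buggyTotal` and `LineItem` are invented for illustration; each entry in `verdicts` records whether that style of assertion would pass:

```typescript
interface LineItem { priceCents: number; qty: number; }

// Deliberately buggy: ignores quantity.
function buggyTotal(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.priceCents, 0);
}

const cart: LineItem[] = [{ priceCents: 500, qty: 2 }, { priceCents: 99, qty: 1 }];
const total = buggyTotal(cart); // 599; the correct answer would be 1099

// Each entry reports whether that style of assertion would pass.
const verdicts = {
  existence: total !== undefined,   // true: passes despite the bug
  type: typeof total === 'number',  // true: passes despite the bug
  exactValue: total === 1099,       // false: the only check that exposes the bug
};
```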

### Red-Flag Patterns to Explicitly Call Out

**Asserting the input:**
```typescript
// WRONG: `name` is what you passed in, so this says nothing about what
// createUser is supposed to do beyond echoing its input
const user = await createUser({ name: 'Alice' });
expect(user.name).toBe('Alice'); // passes for any implementation that returns its input
```

**Asserting mocks instead of outcomes:**
```typescript
// WRONG — you are testing that you called your mock, not that the system works
expect(mockDatabase.save).toHaveBeenCalled(); // proves nothing about real behavior
// RIGHT — assert the state change is observable
const found = await repo.findById(savedUser.id);
expect(found).toEqual(expect.objectContaining({ email: savedUser.email }));
```

**Testing implementation instead of contract:**
```typescript
// WRONG — if you rename the private method, this test breaks even if behavior is unchanged
expect(service['_calculateDiscount']).toHaveBeenCalled();
// RIGHT — test the observable outcome
expect(invoice.totalAfterDiscount).toBe(90);
```

**The false negative test:** A test that can never fail is not a test. Run
mutation testing (Stryker, mutmut, PIT) to verify your tests would catch real
bugs. As a rough heuristic, a mutation survival rate above 30% indicates
significant coverage theater, whatever the coverage number says.
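
What a mutation tool automates can be sketched by hand. The example below assumes an invented `isAdult`-style check and models a test suite as a function. It shows one mutant surviving a weak suite and being killed by a boundary test:

```typescript
type AgeCheck = (age: number) => boolean;

const original: AgeCheck = (age) => age >= 18;
// A typical mutant: the boundary operator `>=` replaced with `>`.
const mutant: AgeCheck = (age) => age > 18;

// A test suite modelled as a function: returns true if every check passes.
type Suite = (isAdult: AgeCheck) => boolean;

const weakSuite: Suite = (isAdult) =>
  isAdult(30) === true && isAdult(5) === false;

const boundarySuite: Suite = (isAdult) =>
  weakSuite(isAdult) && isAdult(18) === true && isAdult(17) === false;

// A mutant is "killed" when the suite passes on the original but fails on the mutant.
function kills(suite: Suite, impl: AgeCheck, mutated: AgeCheck): boolean {
  return suite(impl) && !suite(mutated);
}

const weakKills = kills(weakSuite, original, mutant);         // false: mutant survives
const boundaryKills = kills(boundarySuite, original, mutant); // true: mutant is killed
```

Both suites reach 100% line coverage of `original`; only the survival of the mutant reveals the difference.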

---

## London School vs. Chicago School of TDD

These are two legitimate and incompatible schools. Knowing which one the
codebase is following (or accidentally mixing) is essential for coherent advice.

### Chicago School (Inside-Out / Classical TDD)

- Write the test first, implement to pass, refactor
- Prefer real implementations; use test doubles only for slow or external dependencies
- Focus: correct behavior of real objects
- Output: tests that survive refactoring
- Risk: slow tests when real implementations are heavy; harder to achieve isolation

### London School (Outside-In / Mockist TDD)

- Design interfaces first via mocks; write implementations to satisfy mock contracts
- Mock all collaborators, even internal ones
- Focus: correct collaboration between objects; emergence of good design
- Output: fast, isolated tests; explicit dependency contracts
- Risk: tests are coupled to implementation structure; heavy refactors break tests even when behavior is correct
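
The difference is easiest to see on one behavior tested both ways. A minimal sketch, assuming a hypothetical `Inventory` collaborator and `OrderPlacer` service, with a hand-rolled recording double standing in for a mocking library:

```typescript
// Hypothetical collaborator and service, used only for illustration.
interface Inventory {
  reserve(sku: string, qty: number): void;
  available(sku: string): number;
}

class OrderPlacer {
  private readonly inventory: Inventory;
  constructor(inventory: Inventory) { this.inventory = inventory; }
  place(sku: string, qty: number): void { this.inventory.reserve(sku, qty); }
}

// Chicago style: a working fake, then assert on the resulting state.
class InMemoryInventory implements Inventory {
  private stock = new Map<string, number>([['SKU-A', 10]]);
  reserve(sku: string, qty: number): void {
    this.stock.set(sku, this.available(sku) - qty);
  }
  available(sku: string): number { return this.stock.get(sku) ?? 0; }
}

const fakeInventory = new InMemoryInventory();
new OrderPlacer(fakeInventory).place('SKU-A', 3);
const stateCheck = fakeInventory.available('SKU-A') === 7;

// London style: a recording double, then assert on the interaction.
const calls: Array<[string, number]> = [];
const recordingInventory: Inventory = {
  reserve: (sku, qty) => { calls.push([sku, qty]); },
  available: () => 0,
};
new OrderPlacer(recordingInventory).place('SKU-A', 3);
const interactionCheck =
  calls.length === 1 && calls[0][0] === 'SKU-A' && calls[0][1] === 3;
```

If `OrderPlacer` were refactored to reach the same stock level through a different method, `stateCheck` would still hold while `interactionCheck` would need rewriting.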

**How to detect which school is being used (often unintentionally):**
- Count the mock-to-assertion ratio. London School codebases tend to show a high ratio (3:1 or more is a rough signal, not a fixed threshold).
- Look at whether tests verify calls (`toHaveBeenCalledWith`) or outcomes (`expect(result)`).
- Look at how many tests break when a private method is renamed.

**The mixing anti-pattern:** Many codebases accidentally combine both schools —
using mocks for internal services (London) and real databases (Chicago). This
creates tests that are slow AND brittle. Pick a school, apply it consistently,
and document the choice.

---

## Property-Based Testing — Finding Edges You Cannot Imagine

Unit tests verify examples you thought of. Property-based tests verify
invariants across thousands of randomly generated inputs. The canonical
finding: "I didn't know that input was possible."

**Frameworks:** QuickCheck (Haskell), Hypothesis (Python), fast-check
(TypeScript/JavaScript), jqwik (Java), ScalaCheck (Scala).

**The three property categories:**

1. **Invariants** — properties that must always hold
```python
# Hypothesis (Python)
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_sort_is_idempotent(lst):
    assert sorted(sorted(lst)) == sorted(lst)

@given(st.lists(st.integers()))
def test_sort_preserves_length(lst):
    assert len(sorted(lst)) == len(lst)
```

2. **Round-trip properties** — encode → decode must reproduce original
```typescript
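// fast-check (TypeScript), run inside a Jest test;
// serialize/deserialize are the functions under test
import fc from 'fast-check';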
fc.assert(fc.property(fc.record({
  id: fc.uuid(),
  amount: fc.integer({ min: 0, max: 1_000_000 }),
  currency: fc.constantFrom('EUR', 'USD', 'GBP'),
}), (order) => {
  const decoded = deserialize(serialize(order));
  expect(decoded).toEqual(order);
}));
```

3. **Oracle properties** — compare against a known-correct reference implementation
```python
@given(st.lists(st.integers(), min_size=1))
def test_custom_max_matches_builtin(lst):
    assert custom_max(lst) == max(lst)
```

**When to add property-based tests:**
- Parsing, serialization, encoding/decoding functions
- Mathematical or financial calculations
- Sort, filter, aggregation functions
- Any function with non-trivial edge cases on numeric ranges
- Protocol implementations

Property-based tests have found bugs in TLS implementations, database query
engines, and distributed consensus algorithms. If the codebase has none, it is
likely missing an entire class of edge-case bugs.
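
To show the mechanism without a library, here is a hand-rolled sketch of a property check: a seeded generator, a loop, and a returned counterexample. All names (`makeRng`, `genIntArray`, `findCounterexample`) are invented for illustration. Real projects should use fast-check or an equivalent, which also shrink counterexamples to a minimal case:

```typescript
// Small deterministic PRNG (linear congruential generator) so failures are reproducible.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 0x100000000; // in [0, 1)
  };
}

// Generator: an array of 0..20 integers in [-1000, 1000].
function genIntArray(rng: () => number): number[] {
  const length = Math.floor(rng() * 21);
  return Array.from({ length }, () => Math.floor(rng() * 2001) - 1000);
}

// Runs the property on `runs` generated inputs; returns the first counterexample, if any.
function findCounterexample(
  property: (input: number[]) => boolean,
  runs = 1000,
  seed = 42,
): number[] | undefined {
  const rng = makeRng(seed);
  for (let i = 0; i < runs; i++) {
    const input = genIntArray(rng);
    if (!property(input)) return input;
  }
  return undefined;
}

const numericSort = (xs: number[]) => [...xs].sort((a, b) => a - b);
const defaultSort = (xs: number[]) => [...xs].sort(); // bug: compares elements as strings

const isAscending = (xs: number[]) => xs.every((x, i) => i === 0 || xs[i - 1] <= x);

// Invariant: the output of a sort is in ascending order.
const okResult = findCounterexample((xs) => isAscending(numericSort(xs)));  // undefined
const bugResult = findCounterexample((xs) => isAscending(defaultSort(xs))); // a failing input
```

An example-based test with `[1, 2, 3]` passes for both sort functions; the generated inputs are what expose the string comparison.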

---

## Contract Testing — Preventing Silent API Breakage

In microservices and API-first systems, integration tests are often too slow and
too fragile. Contract testing solves this by verifying that a producer's API
matches what each consumer expects — without requiring both to run simultaneously.

**Pact (most common contract testing framework):**

Consumer writes a test that defines what it expects from the provider:
```javascript
// Consumer test (e.g., frontend calling /api/orders/:id)
const { Matchers } = require('@pact-foundation/pact');
const { like, term } = Matchers;

provider.addInteraction({
  state: 'order 42 exists',
  uponReceiving: 'a request for order 42',
  withRequest: { method: 'GET', path: '/api/orders/42' },
  willRespondWith: {
    status: 200,
    body: {
      id: like('42'),
      total: like(1099),
      status: term({ generate: 'PENDING', matcher: 'PENDING|COMPLETED|CANCELLED' }),
    },
  },
});
```

The provider then replays the consumer's contract against its real
implementation and verifies compliance. A breaking change in the provider fails
the provider's verification run, before deployment, instead of silently
breaking the consumer.
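
The verification step can be sketched without Pact. The contract format below is invented for illustration and is not Pact's file format or API. It only mirrors the idea of type matchers (`like`) and pattern matchers (`term`) being checked against a provider response:

```typescript
// A consumer expectation: field name -> rule. Hypothetical format, not Pact's.
type Rule =
  | { kind: 'type'; example: string | number } // like(): match by type only
  | { kind: 'regex'; pattern: RegExp };         // term(): match by pattern

type Contract = Record<string, Rule>;

const orderContract: Contract = {
  id: { kind: 'type', example: '42' },
  total: { kind: 'type', example: 1099 },
  status: { kind: 'regex', pattern: /^(PENDING|COMPLETED|CANCELLED)$/ },
};

// Provider-side verification: returns the names of the violated fields.
function verify(contract: Contract, response: Record<string, unknown>): string[] {
  return Object.entries(contract)
    .filter(([field, rule]) => {
      const actual = response[field];
      if (rule.kind === 'type') return typeof actual !== typeof rule.example;
      return typeof actual !== 'string' || !rule.pattern.test(actual);
    })
    .map(([field]) => field);
}

// Extra fields and different values are fine; only the expected shape matters.
const compatible = verify(orderContract, { id: '7', total: 250, status: 'COMPLETED', extra: true });
// -> []

// id changed to a number and status renamed to state: both break the consumer.
const breaking = verify(orderContract, { id: 7, total: 250, state: 'COMPLETED' });
// -> ['id', 'status']
```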

**Audit questions for contract testing:**
- Does the codebase have any API between services? If yes and there are no
  contract tests, every provider change is a potential silent consumer break.
- Are the contracts stored in a Pact Broker or equivalent (PactFlow)?
- Are provider contract tests part of the CI pipeline on every PR?
- Is there a `can-i-deploy` check that queries the Pact Broker before release?

---

## BDD — Behavior-Driven Development

BDD (Given-When-Then) is not primarily a testing syntax — it is a
communication protocol between business and engineering. Tests that use
technical implementation language instead of business domain language signal
that requirements translation is happening inside the test, which is late
and expensive.

### The Given-When-Then Structure

```gherkin
# Cucumber / Gherkin (any language)
Feature: Order payment processing
  Scenario: Successful payment for in-stock order
    Given an order with 2 units of SKU-WIDGET at €49.99 each
    And the customer has a valid payment method on file
    When the customer completes checkout
    Then the order status should be CONFIRMED
    And an email confirmation should be sent to the customer
    And inventory for SKU-WIDGET should be reduced by 2
```
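
A library-free sketch of how those steps can map one-to-one onto domain-language functions. `CheckoutWorld` and its in-memory checkout logic are hypothetical stand-ins; a real suite would bind the steps with a tool such as Cucumber and call the actual checkout service:

```typescript
type OrderStatus = 'PENDING' | 'CONFIRMED';

// Hypothetical test "world": holds scenario state and exposes one method per step.
class CheckoutWorld {
  stock = new Map<string, number>([['SKU-WIDGET', 10]]);
  emails: string[] = [];
  order = { sku: '', qty: 0, unitPriceCents: 0, status: 'PENDING' as OrderStatus };
  hasValidPayment = false;

  // Given an order with <qty> units of <sku> at <price> each
  anOrderWith(qty: number, sku: string, unitPriceCents: number): this {
    this.order = { sku, qty, unitPriceCents, status: 'PENDING' };
    return this;
  }

  // And the customer has a valid payment method on file
  aValidPaymentMethodOnFile(): this { this.hasValidPayment = true; return this; }

  // When the customer completes checkout
  // (stand-in logic; a real suite would invoke the checkout service here)
  theCustomerCompletesCheckout(): this {
    if (!this.hasValidPayment) return this;
    const { sku, qty } = this.order;
    this.stock.set(sku, (this.stock.get(sku) ?? 0) - qty);
    this.order.status = 'CONFIRMED';
    this.emails.push('order-confirmation');
    return this;
  }
}

const world = new CheckoutWorld()
  .anOrderWith(2, 'SKU-WIDGET', 4999)
  .aValidPaymentMethodOnFile()
  .theCustomerCompletesCheckout();

// Then: one observable outcome per "Then/And" line of the scenario
const outcome = {
  status: world.order.status,                 // 'CONFIRMED'
  emailsSent: world.emails.length,            // 1
  widgetStock: world.stock.get('SKU-WIDGET'), // 8
};
```

The step names carry no technical vocabulary (no HTTP verbs, table names, or mock setup), which is what lets a non-engineer read a failure.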

**The BDD audit questions:**
- Are acceptance tests written in business language or technical language?
- Can a non-engineer read a failing test and understand what broke?
- Are the scenarios mapping to real user stories, or to implementation branches?

**When BDD is the wrong tool:** BDD adds ceremony. Use it for high-value
flows where business stakeholders need to verify behavior. Do not use it for
low-level algorithmic tests: that is using Gherkin as a test syntax, not as a
communication protocol.

---

## Snapshot Testing — When It Helps and When It Lies

Snapshot testing (Jest `.toMatchSnapshot()`, Storybook visual regression) records
current output and fails when output changes. This sounds like a safety net but
is often a trap.

**When snapshot testing is appropriate:**
- Visual regression testing on UI components where pixel-level change is meaningful
- Serialized output that is complex and rarely intentionally changed
- API response shapes where a change in structure (not values) would be a bug

**When snapshot testing creates false confidence:**

```typescript
// Dangerous snapshot test
it('renders checkout page', () => {
  const { container } = render(<CheckoutPage />);
  expect(container).toMatchSnapshot(); // 300-line HTML blob
});
```

This test fails on every intentional UI change, training developers to run
`jest --updateSnapshot` reflexively. Once that habit forms, the test is no
longer a safety net — it is a noise generator. It also passes on wrong values
as long as the wrong value is consistent.

**The correct decision framework:**

| Condition | Use snapshot? |
|---|---|
| Testing visual pixel accuracy | Yes (visual regression tools) |
| Testing component renders without crashing | No — use `expect(screen.getByRole('button')).toBeInTheDocument()` |
| Testing serialized config output with known shape | Yes, but require that snapshot diffs are reviewed before they are committed |
| Testing API response with dynamic values (dates, IDs) | No — extract and assert specific fields |
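
For the last row, a minimal sketch of "extract and assert specific fields". The `OrderResponse` shape and `createOrderResponse` are hypothetical; the point is exact assertions on stable fields and format assertions on generated ones:

```typescript
// Hypothetical API response containing generated values.
interface OrderResponse {
  id: string;        // generated per request
  createdAt: string; // ISO timestamp, different on every run
  status: string;
  totalCents: number;
}

function createOrderResponse(totalCents: number): OrderResponse {
  return {
    id: `ord_${Date.now().toString(36)}`,
    createdAt: new Date().toISOString(),
    status: 'PENDING',
    totalCents,
  };
}

const response = createOrderResponse(1099);

// Stable fields: assert exact values.
const stableFieldsOk = response.status === 'PENDING' && response.totalCents === 1099;

// Dynamic fields: assert the format, not the value.
const idOk = /^ord_[a-z0-9]+$/.test(response.id);
const createdAtOk = !Number.isNaN(Date.parse(response.createdAt));
```

A snapshot of this response would fail on every run because of `id` and `createdAt`, which pushes developers toward updating it without reading the diff.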