testing

When tests fail, the goal is NEVER to tweak code blindly until green. Every failure is one of:

  1. A BUG in the code — code doesn’t do what it should
  2. A GAP in the code — code is missing something the test expects
  3. A BUG in the test — test itself is wrong or outdated
  4. An ENVIRONMENTAL issue — missing env var, wrong config, external dependency

Diagnose which one it is, then fix the ROOT CAUSE.

  1. READ the failing test. Understand what it expects and why. Read the ENTIRE test file.
  2. READ the code under test line by line. Trace the actual execution path for the failing case.
  3. READ imports and dependencies. Check shared state, utilities, side effects.
  4. DIAGNOSE the category. Is it a code bug, code gap, test bug, or environmental issue?
  5. VERIFY framework behavior. If unclear, search official docs for the test framework.
  6. APPLY the root-cause fix. Fix the actual problem, not the symptom.
  7. RE-RUN and verify. Run the specific test, then the FULL suite to catch regressions.

Maximum 3 diagnostic cycles per failure. If still stuck after 3 attempts, escalate to the user with your findings.
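
The cap on diagnostic cycles can be sketched as a small loop. This is only an illustration: `diagnoseFailure`, `CycleOutcome`, and `DiagnosisResult` are hypothetical names that do not come from this guide, and `runCycle` stands in for one pass through the seven steps.

```typescript
// The four failure categories named above.
type FailureCategory = "code-bug" | "code-gap" | "test-bug" | "environment";

// Outcome of one diagnostic cycle (hypothetical shape, for illustration only).
type CycleOutcome =
  | { fixed: true; category: FailureCategory; rootCause: string }
  | { fixed: false; finding: string };

type DiagnosisResult =
  | { status: "fixed"; attempts: number; category: FailureCategory; rootCause: string }
  | { status: "escalate"; attempts: number; findings: string[] };

const MAX_DIAGNOSTIC_CYCLES = 3;

// Runs up to three cycles and keeps every finding, so an escalation
// hands over everything learned so far instead of a bare "still failing".
function diagnoseFailure(runCycle: (attempt: number) => CycleOutcome): DiagnosisResult {
  const findings: string[] = [];
  for (let attempt = 1; attempt <= MAX_DIAGNOSTIC_CYCLES; attempt++) {
    const outcome = runCycle(attempt);
    if (outcome.fixed) {
      return {
        status: "fixed",
        attempts: attempt,
        category: outcome.category,
        rootCause: outcome.rootCause,
      };
    }
    findings.push(outcome.finding);
  }
  return { status: "escalate", attempts: MAX_DIAGNOSTIC_CYCLES, findings };
}
```

The point of the shape is that a failed cycle must still produce a finding, which is what makes the escalation useful to the user.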

| Task Type | Test Requirement |
| --- | --- |
| Bug fix | Regression test proving the bug is fixed |
| New feature | Unit tests for core logic + integration for API/UI |
| Refactor | Existing tests must still pass (no behavior change) |
| API endpoint | Request/response validation, error cases, auth checks |
| Schema/content change | Build validation passes |

  • Every new feature or bugfix requires tests where applicable.
  • Run the full test suite before committing. Fix all failures.
  • NEVER skip, disable, or delete tests to make a commit pass.
  • NEVER use test.skip() or test.todo() without a tracked TODO explaining why.
  • Test edge cases: empty inputs, null/undefined, boundary values, error paths.
  • Snapshot tests are a last resort. Prefer explicit assertions.
  • Mock external services in unit tests.
  • Use describe blocks to group related tests logically.

| Anti-Pattern | Why It’s Wrong | Do Instead |
| --- | --- | --- |
| Changing code randomly until tests pass | Hides real bug, creates new ones | Follow 7-step diagnostic process |
| Deleting a failing test | Removes safety net | Fix the root cause |
| Adding // @ts-ignore to pass type tests | Masks type errors | Fix the type issue |
| Testing implementation details | Breaks on refactor | Test behavior and outcomes |
| No assertions in test | False confidence | Every test must assert something |
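
A minimal sketch of tests that follow these guidelines: grouped in a describe block, asserting on behavior with explicit assertions, and covering empty input, a boundary, and an error path. `paginate` is a hypothetical function invented for the example. The sketch uses Node's built-in `node:test` runner so that it runs without extra dependencies; the same structure should carry over to Jest or Vitest.

```typescript
import { describe, it } from "node:test";
import { deepStrictEqual, throws } from "node:assert/strict";

// Hypothetical function under test: returns one page of items (pages start at 1).
function paginate<T>(items: readonly T[], page: number, pageSize: number): T[] {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError(`page must be a positive integer, got ${page}`);
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError(`pageSize must be a positive integer, got ${pageSize}`);
  }
  const start = (page - 1) * pageSize;
  return items.slice(start, start + pageSize);
}

describe("paginate", () => {
  it("returns an empty page for empty input", () => {
    deepStrictEqual(paginate([], 1, 10), []);
  });

  it("returns a partial final page at the boundary", () => {
    deepStrictEqual(paginate([1, 2, 3], 2, 2), [3]);
  });

  it("returns an empty page past the end", () => {
    deepStrictEqual(paginate([1, 2, 3], 3, 2), []);
  });

  it("rejects a non-positive page number (error path)", () => {
    throws(() => paginate([1], 0, 10), RangeError);
  });
});
```

Each test checks what `paginate` returns or throws, not how it computes the result, so the tests should survive a refactor of the slicing logic.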

This rule activates when the project has a frontend stack, detected from stack.json, package.json dependencies (react, next, astro, svelte, vue), or file patterns (src/components/, *.tsx, *.jsx).

Pure backend/CLI projects: Skip this section entirely.
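
The detection described above could look roughly like the following. This is a sketch under assumptions: `hasFrontendStack` and `PackageJson` are hypothetical names, checking `devDependencies` as well as `dependencies` is a guess at what "deps" covers, and stack.json is left out because its format is not described here.

```typescript
// Minimal slice of package.json that matters for detection.
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Dependency names and file patterns listed in the rule text.
const FRONTEND_DEPS = ["react", "next", "astro", "svelte", "vue"];
const FRONTEND_FILE_PATTERN = /(^|\/)src\/components\/|\.(tsx|jsx)$/;

// Returns true when either signal points to a frontend stack.
function hasFrontendStack(pkg: PackageJson, filePaths: readonly string[]): boolean {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  const hasFrontendDep = FRONTEND_DEPS.some((name) => name in deps);
  const hasFrontendFile = filePaths.some((path) => FRONTEND_FILE_PATTERN.test(path));
  return hasFrontendDep || hasFrontendFile;
}

// A backend-only project is skipped; a component file alone is enough to activate.
const backendOnly = hasFrontendStack({ dependencies: { express: "^4.19.0" } }, ["src/server.ts"]);
const hasComponents = hasFrontendStack({}, ["src/components/Button.tsx"]);
```

Here `backendOnly` is `false` and `hasComponents` is `true`.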

When frontend work is detected, the TDD cycle becomes:

RED → GREEN → VISUAL → REFACTOR
  • RED — failing functional test (behavior, not appearance)
  • GREEN — implementation passes functional test
  • VISUAL — capture/verify visual baseline
    • Playwright toHaveScreenshot() for automated regression
    • Cross-browser: Chromium + Firefox + WebKit (minimum)
    • Viewports: mobile (375px), tablet (768px), desktop (1280px)
    • Deterministic: animations: 'disabled', fonts loaded, time frozen
    • mask option for dynamic elements (timestamps, avatars, ads)
  • REFACTOR — clean up with both functional + visual tests as safety net
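
One way to cover the browser and viewport matrix from the VISUAL step is to generate one Playwright project per combination. The sketch below is an assumption about how that could be organized, not a prescribed config: `buildVisualProjects` and `VisualProject` are hypothetical names, and the viewport heights are placeholders because only widths are given above.

```typescript
// Browser engines and viewport widths come from the VISUAL step above.
// The heights are assumptions; only widths are specified.
const BROWSERS = ["chromium", "firefox", "webkit"] as const;
const VIEWPORTS = [
  { label: "mobile", width: 375, height: 667 },
  { label: "tablet", width: 768, height: 1024 },
  { label: "desktop", width: 1280, height: 720 },
] as const;

interface VisualProject {
  name: string;
  use: {
    browserName: (typeof BROWSERS)[number];
    viewport: { width: number; height: number };
  };
}

// Builds one project per browser/viewport pair (3 x 3 = 9 projects).
function buildVisualProjects(): VisualProject[] {
  const projects: VisualProject[] = [];
  for (const browserName of BROWSERS) {
    for (const { label, width, height } of VIEWPORTS) {
      projects.push({
        name: `${browserName}-${label}`,
        use: { browserName, viewport: { width, height } },
      });
    }
  }
  return projects;
}

// Assumed wiring in playwright.config.ts (requires @playwright/test):
//   export default defineConfig({
//     projects: buildVisualProjects(),
//     expect: { toHaveScreenshot: { animations: "disabled" } },
//   });
```

Playwright stores snapshots per project, so each of the nine combinations would keep its own baseline.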

| Scenario | Action |
| --- | --- |
| Intentional visual change | npx playwright test --update-snapshots |
| Unintentional visual diff | Treat as RED: it’s a regression, fix it |
| New component (no baseline) | First run creates baseline, commit snapshots |

| Context | Scope |
| --- | --- |
| Local development | Selective: only changed components’ visual tests |
| CI pipeline | Full visual suite across all browsers + viewports |
| Pre-commit | Affected visual tests only |
| Pre-push | Full visual suite |

When claude --chrome is available:

  • Use for live design verification during development
  • Complementary to automated tests, not a replacement
  • Good for subjective quality checks automated tests can’t catch

Before capturing screenshots:

  • animations: 'disabled' in Playwright config
  • Fonts loaded (page.waitForLoadState('networkidle') or font-face check)
  • Time frozen (page.clock.setFixedTime() for timestamps)
  • Dynamic content masked (mask: [page.locator('.avatar')])
  • Viewport set explicitly (page.setViewportSize())
  • Color scheme set (page.emulateMedia({ colorScheme: 'light' }))
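
The checklist above can be bundled into one helper that runs before each capture. This is a sketch, not a definitive implementation: `stabilizeForScreenshot`, `PageLike`, and `StabilizeOptions` are hypothetical names. `PageLike` is a local structural subset of Playwright's `Page`, declared so the example compiles without the dependency; a real `Page` should satisfy it. The `document.fonts.ready` expression is one assumed way to do the font-face check.

```typescript
// Hypothetical structural subset of Playwright's Page.
interface PageLike {
  setViewportSize(size: { width: number; height: number }): Promise<void>;
  emulateMedia(options: { colorScheme: "light" | "dark" }): Promise<void>;
  clock: { setFixedTime(time: Date): Promise<void> };
  goto(url: string): Promise<unknown>;
  evaluate(expression: string): Promise<unknown>;
}

interface StabilizeOptions {
  url: string;
  viewport: { width: number; height: number };
  frozenTime: Date;
  colorScheme?: "light" | "dark";
}

// Applies the pre-capture checklist in a fixed order. Viewport, color scheme,
// and clock are set before navigation so the page renders with them from the start.
async function stabilizeForScreenshot(page: PageLike, options: StabilizeOptions): Promise<void> {
  await page.setViewportSize(options.viewport);
  await page.emulateMedia({ colorScheme: options.colorScheme ?? "light" });
  await page.clock.setFixedTime(options.frozenTime);
  await page.goto(options.url);
  // Assumed font-face check: resolves once the document's fonts have loaded.
  await page.evaluate("document.fonts.ready.then(() => true)");
}

// Assumed usage inside a Playwright spec ("/pricing" is a placeholder route):
//   await stabilizeForScreenshot(page, {
//     url: "/pricing",
//     viewport: { width: 1280, height: 720 },
//     frozenTime: new Date("2024-01-01T00:00:00Z"),
//   });
//   await expect(page).toHaveScreenshot({
//     animations: "disabled",
//     mask: [page.locator(".avatar")],
//   });
```

Animations and masking stay on the `toHaveScreenshot()` call (or in the config) because they are screenshot options, not page state.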

| Skill | How it uses this rule |
| --- | --- |
| test-driven-development | Adds VISUAL step after GREEN when frontend detected |
| verification-before-completion | Adds visual regression check to completion gate |
| playwright | Viewport presets + cross-browser config templates |