Tests that pass immediately: the two valid TDD inversions
Standard TDD discipline (Layer 2 superpowers convention fromtest-driven-development): write the test, watch it fail (the RED step), then write the code to make it pass (the GREEN step). The skill’s iron law: “Tests passing immediately prove nothing.”
This is correct in most cases. Karpathy-wiki v2.2 surfaced two specific cases where the inversion (write the test, watch it pass on the first run) is not just valid but is the right discipline. This page names them, gives the v2.2 evidence, and tells you how to recognize when you are in one of these cases versus when you are skipping TDD.
Source: LESSONS 5; REVIEWER “What the analyzer got right” #6 (TDD-inversion is genuinely new).
Case A: regression-pin
The test pins existing correct behavior so a future refactor cannot silently break it. Karpathy-wiki Task 59 example (test-archive-strips-processing.sh):
The helper wiki_capture_archive at scripts/wiki-capture.sh:47 already strips the .processing suffix when archiving a capture file. Audit Finding 03 surfaced 4 legacy archive files that still had .md.processing suffixes from older code paths. The test exists not to drive the implementation (the implementation is already correct) but to prevent a future refactor from breaking the contract again.
The test:
- The audit found a class of bugs (legacy artifacts, drift, broken state).
- The current implementation is correct.
- A future refactor is plausible (the module is still active code, not frozen).
Case B: mechanism-rehearsal
The test rehearses a SKILL.md snippet because the actual ingester is aclaude -p subprocess executing prose (the snippet is the contract; the LLM is the implementation).
Karpathy-wiki Task 56 example (test-index-threshold-fires.sh):
The actual ingester is a detached claude -p process that reads SKILL.md and runs the bash literally. The test cannot run the ingester (it would cost an LLM call). Instead, the test rehearses the mechanism in pure bash, asserting that the bash produces the right artifact.
The post-0e0f815 test goes further: it copies the SKILL.md snippet verbatim into the test, so any change to the SKILL.md snippet must be mirrored to the test (and vice versa). This is closer to a contract test than a feature test.
- The actual mechanism is an LLM call (a subprocess with
claude -por equivalent). - The mechanism’s contract is expressible as code (bash, python).
- Running the LLM call is too expensive for routine testing.
docs/05-authoring/prose-discipline.md.
When the inversion is invalid
Most of the time, “tests passing immediately” is a TDD violation. The test author has written the implementation first (or the test author and the implementer are the same person, the implementer wrote first and the test author rationalized). The test passes because it tests the existing code, not the contract. The TDD skill is right about this case. Two specific failure shapes:- Test would have driven a new feature’s design. If the test is for a feature that did not exist yet, write the test first. Watch it fail. Then write the feature. The “RED” step is the design feedback; skipping it skips the feedback.
- Test exists to catch a bug. Bug-fix tests must fail before the fix lands. If the test passes immediately, you have not proven the test would have caught the bug. The test is decorative; the fix may or may not work.
How to tell which case you are in
Ask yourself, in order:- Does the implementation already exist and is it correct? If yes, you are in Case A (regression-pin) or Case B (mechanism-rehearsal). Pick the one that fits.
- Does the implementation not yet exist? You are in canonical TDD. Watch the test fail before writing the implementation.
- Is there a bug being fixed? You are in canonical bug-fix TDD. Write the test, watch it fail (the test catches the bug), apply the fix, watch the test pass.
Plan-shape implications
If you are writing a plan that includes a test step, mark “test passes immediately” tasks explicitly:Step 2: Run the test to verify it passes immediately. (This test is a regression pin / mechanism rehearsal; see [link]. The implementation already exists; the test prevents future regression / verifies the contract.)Without this annotation, the implementer will think they are violating TDD and may rewrite the test to fail first (defeating the regression-pin or mechanism-rehearsal purpose). Source:
LESSONS 2.0 (writing-plans gap; “TDD-doesn’t-fit gating”).
Sources
LESSONS5 (the two valid TDD inversions; Tasks 56 and 59).REVIEWER“What the analyzer got right” #6 (TDD-inversion is genuinely new; not in superpowers test-driven-development or writing-skills RED-GREEN-REFACTOR).
docs/06-testing/red-green-for-prose.md.