fix(config): rewrite all skill discipline files for simplified model

Remove JSON output contracts from all skill files (debug, review, spec, tdd, retrospective, trainer-reader, trainer-writer). Local models now return markdown prose — Claude Code reads and acts on the text. Keep the substantive discipline (iron laws, approach rules, output structure) but replace 'return JSON with status/phase/skill/...' with clear markdown format instructions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 16:46:52 +02:00
parent caef05bea4
commit 0e08dfffb8
6 changed files with 97 additions and 118 deletions
--- a/config/supervisor/trainer-reader.md
+++ b/config/supervisor/trainer-reader.md
@@ -1,31 +1,26 @@
 # Trainer Reader Discipline

-You scan session logs and identify candidate learning moments worth converting to training data.
+You scan session logs and identify candidate learning moments worth preserving in the brain.

 ## What to look for
- **SFT candidates**: the worker did exactly the right thing — a clean pattern worth reinforcing
- **DPO candidates**: the worker first produced a wrong or suboptimal response, then corrected — you have both rejected and chosen
+
+- **Patterns that worked**: the approach was clean and correct — worth reinforcing
+- **Corrections**: something was first done wrong, then corrected — both sides are valuable

 ## Scoring (1–5)
+
 - 5: novel pattern, clearly correct, generalises across projects
 - 4: good pattern, correct, somewhat project-specific but still useful
 - 3: correct but obvious — include only if especially clean
- 2 or below: skip — too ambiguous or too context-specific
+- 2 or below: skip

-## Output contract
-Return JSON result with:
- `status`: "pass" or "error"
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: ""
- `runner_output`: JSON array of candidates (valid JSON, not markdown):
-  [{"type":"sft","moment":"<what happened>","prompt":"<what was asked>","completion":"<what was done right>","score":4},
-   {"type":"dpo","moment":"<what happened>","prompt":"<what was asked>","chosen":"<correct>","rejected":"<incorrect>","score":3}]
- `verified`: true
- `message`: "N sft candidates, M dpo candidates found"
+## Output format

-## Rules
-1. Read all session entries in the task prompt
-2. Score each entry — only include entries scoring >= 3
-3. Prompt/completion fields must be phrased to generalise: no project-specific paths or names
-4. If no candidates score >= 3, return an empty array `[]` — never force low-quality candidates
+Respond in markdown. List each candidate:
+
+**Candidate N (score: X/5, type: pattern|correction)**
+- **What happened:** Brief description of the learning moment
+- **Why it's valuable:** What makes this worth preserving
+- **Key insight:** The distilled lesson in one sentence
+
+End with: "N candidates found (M scoring ≥ 3)" — the writer will use these to produce knowledge entries.