Reader agent scans session logs for SFT/DPO candidates; writer receives reader output and formats+writes training pairs to brain/training-data/. Adds trainer-reader.md and trainer-writer.md discipline prompts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Trainer Writer Discipline

You receive candidate learning moments from the reader and write clean SFT/DPO training pairs.

## Quality gate (apply before writing)
- SFT: phrase the prompt so it could come from any project, not just this one
- DPO: chosen and rejected must be clearly distinguishable; skip the pair if someone reading it could not tell which is better
- Never include project-specific paths, variable names, or identifiers in any pair
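
A minimal sketch of how part of this gate could be automated. The regular expressions and the function names `looks_project_specific` and `passes_quality_gate` are assumptions made for illustration, not part of this prompt. The check only catches obvious paths and snake_case names, plus identical chosen/rejected text; deciding whether a prompt is general enough, or whether chosen is clearly better than rejected, still needs judgment.

```python
import re

# Assumed heuristics for "project-specific" text (illustrative only):
# a path with at least two slash-separated segments, or a snake_case name.
PATH_RE = re.compile(r"(?:[\w.-]+/){2,}[\w.-]+")
SNAKE_RE = re.compile(r"\b[a-z]+(?:_[a-z0-9]+)+\b")


def looks_project_specific(text: str) -> bool:
    """Return True if the text contains something shaped like a path or identifier."""
    return bool(PATH_RE.search(text) or SNAKE_RE.search(text))


def passes_quality_gate(pair: dict) -> bool:
    """Cheap mechanical pre-check for one SFT or DPO pair."""
    # No field may contain path-like or identifier-like text.
    if any(looks_project_specific(str(value)) for value in pair.values()):
        return False
    if "chosen" in pair:
        # DPO: identical chosen/rejected can never be distinguishable.
        return pair["chosen"].strip() != pair.get("rejected", "").strip()
    # SFT: both fields must be non-empty.
    return bool(pair.get("prompt", "").strip() and pair.get("completion", "").strip())
```

A pair that survives this pre-check would still be reviewed against the three bullets above before it is written.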

## Output contract
Return a JSON result with:
- `status`: "pass" (pairs were written, or all were skipped by the quality gate) or "error" (the candidates JSON was malformed)
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: path of the last file written (empty if nothing passed the quality gate)
- `runner_output`: "N SFT pairs written to brain/training-data/sft/, M DPO pairs to brain/training-data/dpo/" or "0 pairs passed quality gate"
- `verified`: true if files were written; false if nothing passed
- `message`: "N sft + M dpo pairs for session <id>" or "no pairs passed quality gate"
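
As an illustration, the "pass" result could be assembled as below. The helper name `build_result` and its parameters are hypothetical. The "error" case is left out because the contract does not say which values the other fields take when the candidates JSON is malformed.

```python
import json


def build_result(n_sft: int, n_dpo: int, session_id: str, last_path: str = "") -> dict:
    """Build the "pass" result object from the counts of pairs written."""
    wrote = (n_sft + n_dpo) > 0
    if wrote:
        runner_output = (
            f"{n_sft} SFT pairs written to brain/training-data/sft/, "
            f"{n_dpo} DPO pairs to brain/training-data/dpo/"
        )
        message = f"{n_sft} sft + {n_dpo} dpo pairs for session {session_id}"
    else:
        runner_output = "0 pairs passed quality gate"
        message = "no pairs passed quality gate"
    return {
        "status": "pass",
        "phase": "trainer",
        "skill": "trainer",
        "file_path": last_path if wrote else "",
        "runner_output": runner_output,
        "verified": wrote,
        "message": message,
    }


if __name__ == "__main__":
    # Example: nothing passed the gate for a made-up session id.
    print(json.dumps(build_result(0, 0, "example-session"), indent=2))
```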

## File format
JSONL: one JSON object per line.

SFT: `{"prompt": "...", "completion": "..."}`
DPO: `{"prompt": "...", "chosen": "...", "rejected": "..."}`

Write SFT to: `<brain_dir>/training-data/sft/<session_id>.jsonl`
Write DPO to: `<brain_dir>/training-data/dpo/<session_id>.jsonl`

Append to existing files if they exist (don't overwrite).
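
A short sketch of the append behaviour, assuming a hypothetical helper `append_pairs` whose `kind` argument is either "sft" or "dpo". Opening the file in append mode creates it when it is missing and leaves existing lines untouched.

```python
import json
from pathlib import Path


def append_pairs(brain_dir: str, kind: str, session_id: str, pairs: list) -> Path:
    """Append pairs to <brain_dir>/training-data/<kind>/<session_id>.jsonl."""
    path = Path(brain_dir) / "training-data" / kind / f"{session_id}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" appends; json.dumps emits no raw newlines, so each
    # object stays on a single line as JSONL requires.
    with path.open("a", encoding="utf-8") as fh:
        for pair in pairs:
            fh.write(json.dumps(pair, ensure_ascii=False) + "\n")
    return path
```

Calling it twice for the same session adds lines to the same file instead of replacing them.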

## Rules
1. Parse the `reader_candidates` JSON from the task prompt; if it is malformed, return status "error"
2. For each candidate: apply the quality gate
3. Write passing SFT candidates to the SFT JSONL file and passing DPO candidates to the DPO JSONL file
4. If nothing passes, return status "pass" with verified: false and message "no pairs passed quality gate"
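
The parse-and-route steps in the rules might look like the sketch below. It assumes, hypothetically, that `reader_candidates` is a JSON array of objects and that a DPO candidate is one carrying a `chosen` key; the prompt does not define the reader's schema. `partition_candidates` and the `gate` callable are illustrative names.

```python
import json


def partition_candidates(reader_candidates: str, gate) -> tuple:
    """Parse the candidates JSON and split gate-passing items into (sft, dpo)."""
    try:
        candidates = json.loads(reader_candidates)
    except json.JSONDecodeError as exc:
        # The caller maps this to status "error".
        raise ValueError("candidates JSON was malformed") from exc
    if not isinstance(candidates, list):
        # Assumption: anything other than a JSON array counts as malformed.
        raise ValueError("candidates JSON was malformed")

    sft, dpo = [], []
    for cand in candidates:
        if not gate(cand):
            continue  # skipped by the quality gate
        # Assumed routing rule: a "chosen" key marks a DPO candidate.
        (dpo if "chosen" in cand else sft).append(cand)
    return sft, dpo
```

If both returned lists are empty, the writer would return status "pass" with verified: false, as rule 4 describes.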