Files
hyperguild/docs/superpowers/plans/2026-04-19-hyperguild-phase2.md
Mathias Bergqvist c9310b1079
All checks were successful
cd / Build and deploy (push) Successful in 9s
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
fix(ingestion): always append .md extension to written filenames
brain_write with a custom filename omitted the .md extension, causing
search to skip the file (search.go filters on HasSuffix .md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 19:23:07 +02:00

1872 lines
55 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Hyperguild Phase 2 Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add four new MCP skills (review, debug, spec, trainer) to the hyperguild supervisor, with automatic session history injection into all multi-phase skill workers.
**Architecture:** The supervisor reads prior session entries and injects them into each worker's task prompt when `session_id` is provided — the orchestrator no longer re-summarises history. The `trainer` skill runs a two-step sub-agent chain: a reader agent identifies learning moments from the session log, then a writer agent formats them as SFT/DPO pairs and writes to `brain/training-data/`. All other new skills are single-worker.
**Tech Stack:** Go 1.26, `internal/session` (JSONL log), `internal/exec` (claude subprocess executor), `internal/registry` (MCP tool registry), `config/supervisor/*.md` (discipline files), `config/models.yaml` (model routing)
---
## File Map
### New files
| File | Responsibility |
|------|---------------|
| `internal/session/history.go` | `FormatHistory(entries, excludePhase)` — formats prior session entries as a prompt block |
| `internal/session/history_test.go` | Unit tests for FormatHistory |
| `internal/skills/review/skill.go` | review skill: Config, Skill, New, Name, Tools |
| `internal/skills/review/handlers.go` | `Handle`, `handleReview` — single-phase code review worker |
| `internal/skills/review/handlers_test.go` | review handler unit tests |
| `internal/skills/debug/skill.go` | debug skill |
| `internal/skills/debug/handlers.go` | `handleDebug` — hypothesis generation worker |
| `internal/skills/debug/handlers_test.go` | debug handler unit tests |
| `internal/skills/spec/skill.go` | spec skill |
| `internal/skills/spec/handlers.go` | `handleSpec` — spec writing worker |
| `internal/skills/spec/handlers_test.go` | spec handler unit tests |
| `internal/skills/trainer/skill.go` | trainer skill: Config with ReaderPrompt + WriterPrompt + BrainDir |
| `internal/skills/trainer/handlers.go` | `handleTrain` — calls ExecutorFn twice: reader then writer |
| `internal/skills/trainer/handlers_test.go` | trainer handler unit tests (verifies two-call chain) |
| `config/supervisor/review.md` | Review worker discipline file |
| `config/supervisor/debug.md` | Debug worker discipline file |
| `config/supervisor/spec.md` | Spec worker discipline file |
| `config/supervisor/trainer-reader.md` | Trainer reader agent discipline |
| `config/supervisor/trainer-writer.md` | Trainer writer agent discipline |
### Modified files
| File | Change |
|------|--------|
| `internal/exec/result.go` | Add review/debug/spec/trainer to `validPhases` and Schema enum |
| `internal/skills/tdd/skill.go` | Add `SessionsDir string` to Config; add `session_id` to green/refactor tool schemas |
| `internal/skills/tdd/handlers.go` | Add `session_id` to `greenArgs` and `refactorArgs`; inject history when `session_id` non-empty |
| `internal/skills/tdd/handlers_test.go` | Add tests for history injection in green and refactor |
| `cmd/supervisor/main.go` | Pass `SessionsDir` to tdd.Config; read discipline files and register 4 new skills |
| `config/models.yaml` | Add `trainer` model entry |
---
## Task 1: Session history utility
**Files:**
- Create: `internal/session/history.go`
- Create: `internal/session/history_test.go`
- [ ] **Step 1: Write the failing test**
```go
// internal/session/history_test.go
package session_test
import (
"testing"
"time"
"github.com/mathiasbq/supervisor/internal/session"
"github.com/stretchr/testify/assert"
)
func TestFormatHistoryEmpty(t *testing.T) {
result := session.FormatHistory(nil, "")
assert.Equal(t, "", result)
}
func TestFormatHistoryFormatsEntries(t *testing.T) {
entries := []session.Entry{
{
Skill: "tdd", Phase: "red", FinalStatus: "pass",
FilePath: "internal/foo/foo_test.go",
Message: "wrote failing test for Foo",
Timestamp: time.Now(),
},
}
result := session.FormatHistory(entries, "")
assert.Contains(t, result, "## Session history")
assert.Contains(t, result, "Phase: red")
assert.Contains(t, result, "wrote failing test for Foo")
assert.Contains(t, result, "internal/foo/foo_test.go")
}
func TestFormatHistoryExcludesCurrentPhase(t *testing.T) {
entries := []session.Entry{
{Skill: "tdd", Phase: "red", Message: "red done", FinalStatus: "pass"},
{Skill: "tdd", Phase: "green", Message: "green done", FinalStatus: "pass"},
}
result := session.FormatHistory(entries, "green")
assert.Contains(t, result, "red done")
assert.NotContains(t, result, "green done")
}
```
- [ ] **Step 2: Run test to confirm it fails**
```bash
cd /path/to/supervisor
go test ./internal/session/... -run TestFormatHistory -v
```
Expected: `FAIL``session.FormatHistory undefined`
- [ ] **Step 3: Implement FormatHistory**
```go
// internal/session/history.go
package session
import (
"fmt"
"strings"
)
// FormatHistory formats prior session entries as a structured block for
// injection into a worker task prompt. Entries matching excludePhase are
// omitted (pass the current phase to avoid circular injection).
func FormatHistory(entries []Entry, excludePhase string) string {
var filtered []Entry
for _, e := range entries {
if e.Phase != excludePhase {
filtered = append(filtered, e)
}
}
if len(filtered) == 0 {
return ""
}
var b strings.Builder
b.WriteString("## Session history\n\n")
for _, e := range filtered {
fmt.Fprintf(&b, "### Phase: %s\n", e.Phase)
fmt.Fprintf(&b, "- Skill: %s\n", e.Skill)
fmt.Fprintf(&b, "- Status: %s\n", e.FinalStatus)
if e.FilePath != "" {
fmt.Fprintf(&b, "- File: %s\n", e.FilePath)
}
if e.Message != "" {
fmt.Fprintf(&b, "- Summary: %s\n", e.Message)
}
b.WriteString("\n")
}
return b.String()
}
```
- [ ] **Step 4: Run tests to confirm they pass**
```bash
go test ./internal/session/... -v
```
Expected: all `TestFormatHistory*` tests PASS
- [ ] **Step 5: Commit**
```bash
git add internal/session/history.go internal/session/history_test.go
git commit -m "feat(session): add FormatHistory for worker context injection"
```
---
## Task 2: Fix Schema enum and validPhases
**Files:**
- Modify: `internal/exec/result.go`
The Schema's `phase` enum currently only lists `["red","green","refactor"]`. Workers using any other phase name get forced to pick from that list (the retrospective worker was returning `"refactor"` as a result). Fix by removing the enum constraint from the schema (let it be any string) and rely solely on `validPhases` for server-side validation.
- [ ] **Step 1: Write the failing test**
```go
// Add to internal/exec/result_test.go (check existing file first for test helpers)
func TestValidateAcceptsAllPhases(t *testing.T) {
phases := []string{"red", "green", "refactor", "retrospective", "review", "debug", "spec", "trainer"}
for _, phase := range phases {
r := exec.Result{Status: "pass", Phase: phase, Skill: "test", ModelUsed: "self", Message: "ok"}
assert.NoError(t, r.Validate(), "phase %q should be valid", phase)
}
}
```
- [ ] **Step 2: Run test to confirm it fails**
```bash
go test ./internal/exec/... -run TestValidateAcceptsAllPhases -v
```
Expected: FAIL — `"review"`, `"debug"`, `"spec"`, `"trainer"` fail validation
- [ ] **Step 3: Update result.go**
In `internal/exec/result.go`, make these two changes:
**Change 1** — update `validPhases`:
```go
var validPhases = map[string]bool{
"red": true,
"green": true,
"refactor": true,
"retrospective": true,
"review": true,
"debug": true,
"spec": true,
"trainer": true,
}
```
**Change 2** — remove the enum constraint from the Schema `phase` property (replace the `"phase"` line):
```go
// Before:
"phase": {"type": "string", "enum": ["red","green","refactor"]},
// After:
"phase": {"type": "string"},
```
Also fix the error message in `Validate()`:
```go
// Before:
errs = append(errs, "phase must be red|green|refactor, got: "+r.Phase)
// After:
errs = append(errs, "phase must be one of red|green|refactor|retrospective|review|debug|spec|trainer, got: "+r.Phase)
```
- [ ] **Step 4: Run tests to confirm they pass**
```bash
go test ./internal/exec/... -v
```
Expected: all tests PASS, including `TestValidateAcceptsAllPhases`
- [ ] **Step 5: Commit**
```bash
git add internal/exec/result.go
git commit -m "fix(exec): expand validPhases and remove schema enum constraint for phase"
```
---
## Task 3: Session history injection in TDD green and refactor
**Files:**
- Modify: `internal/skills/tdd/skill.go`
- Modify: `internal/skills/tdd/handlers.go`
- Modify: `internal/skills/tdd/handlers_test.go`
- Modify: `cmd/supervisor/main.go`
- [ ] **Step 1: Write the failing tests**
Add to `internal/skills/tdd/handlers_test.go`:
```go
func TestTDDGreenInjectsSessionHistory(t *testing.T) {
sessDir := t.TempDir()
require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{
SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass",
FilePath: "internal/foo/foo_test.go",
Message: "wrote failing test for Foo",
}))
var capturedPrompt string
fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
capturedPrompt = req.TaskPrompt
return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil
}
sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: sessDir})
_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go","test_cmd":"go test ./...","session_id":"sess-1"}`,
))
require.NoError(t, err)
assert.Contains(t, capturedPrompt, "## Session history")
assert.Contains(t, capturedPrompt, "wrote failing test for Foo")
}
func TestTDDGreenNoHistoryWhenSessionIDEmpty(t *testing.T) {
var capturedPrompt string
fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
capturedPrompt = req.TaskPrompt
return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil
}
sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go"}`,
))
require.NoError(t, err)
assert.NotContains(t, capturedPrompt, "## Session history")
}
```
You will need these imports in the test file:
```go
import (
"github.com/mathiasbq/supervisor/internal/session"
iexec "github.com/mathiasbq/supervisor/internal/exec"
)
```
- [ ] **Step 2: Run tests to confirm they fail**
```bash
go test ./internal/skills/tdd/... -run TestTDDGreen -v
```
Expected: FAIL — `tdd.Config` has no `SessionsDir` field, `session_id` not in args
- [ ] **Step 3: Add SessionsDir to Config and session_id to tool schemas**
In `internal/skills/tdd/skill.go`, add `SessionsDir` to Config:
```go
type Config struct {
SystemPrompt string
SkillPrompt string
ExecutorFn ExecutorFn
DefaultModel string
SessionsDir string // optional: path to brain/sessions/ for history injection
}
```
In `Tools()`, add `"session_id"` as an optional property to `tdd_green` and `tdd_refactor` input schemas:
```go
// tdd_green InputSchema:
InputSchema: schema(
[]string{"project_root", "test_path"},
map[string]any{
"project_root": strProp,
"test_path": strProp,
"model": strProp,
"test_cmd": strProp,
"session_id": strProp,
},
),
// tdd_refactor InputSchema (same addition):
InputSchema: schema(
[]string{"project_root", "test_path", "impl_path"},
map[string]any{
"project_root": strProp,
"test_path": strProp,
"impl_path": strProp,
"model": strProp,
"test_cmd": strProp,
"session_id": strProp,
},
),
```
- [ ] **Step 4: Update greenArgs, refactorArgs, and inject history**
In `internal/skills/tdd/handlers.go`, add imports and update structs + handlers:
```go
import (
// existing imports...
"github.com/mathiasbq/supervisor/internal/session"
)
type greenArgs struct {
ProjectRoot string `json:"project_root"`
TestPath string `json:"test_path"`
Model string `json:"model"`
TestCmd string `json:"test_cmd"`
SessionID string `json:"session_id"`
}
func (s *Skill) handleGreen(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
var args greenArgs
if err := json.Unmarshal(raw, &args); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if args.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if args.TestPath == "" {
return nil, fmt.Errorf("test_path is required")
}
task := fmt.Sprintf(
"phase: green\nproject_root: %s\ntest_path: %s\nmodel: %s\ntest_cmd: %s",
args.ProjectRoot, args.TestPath, s.resolveModel(args.Model), args.TestCmd,
)
task = s.prependHistory(args.SessionID, "green", task)
return s.execute(ctx, task)
}
type refactorArgs struct {
ProjectRoot string `json:"project_root"`
TestPath string `json:"test_path"`
ImplPath string `json:"impl_path"`
Model string `json:"model"`
TestCmd string `json:"test_cmd"`
SessionID string `json:"session_id"`
}
func (s *Skill) handleRefactor(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
var args refactorArgs
if err := json.Unmarshal(raw, &args); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if args.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if args.TestPath == "" {
return nil, fmt.Errorf("test_path is required")
}
if args.ImplPath == "" {
return nil, fmt.Errorf("impl_path is required")
}
task := fmt.Sprintf(
"phase: refactor\nproject_root: %s\ntest_path: %s\nimpl_path: %s\nmodel: %s\ntest_cmd: %s",
args.ProjectRoot, args.TestPath, args.ImplPath, s.resolveModel(args.Model), args.TestCmd,
)
task = s.prependHistory(args.SessionID, "refactor", task)
return s.execute(ctx, task)
}
// prependHistory reads the session log and prepends prior phase entries to the task prompt.
// If sessionID is empty or SessionsDir is not configured, task is returned unchanged.
func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
if sessionID == "" || s.cfg.SessionsDir == "" {
return task
}
entries, err := session.Read(s.cfg.SessionsDir, sessionID)
if err != nil || len(entries) == 0 {
return task
}
history := session.FormatHistory(entries, currentPhase)
if history == "" {
return task
}
return history + "\n---\n\n" + task
}
```
- [ ] **Step 5: Update main.go to pass SessionsDir to tdd.Config**
In `cmd/supervisor/main.go`, find the `tdd.New(tdd.Config{...})` call and add `SessionsDir`:
```go
reg.Register(tdd.New(tdd.Config{
SystemPrompt: string(systemPrompt),
SkillPrompt: string(tddPrompt),
DefaultModel: models.Resolve("tdd", ""),
ExecutorFn: executor.Run,
SessionsDir: cfg.SessionsDir, // ← add this line
}))
```
- [ ] **Step 6: Run all tests**
```bash
go test ./... -race -count=1
```
Expected: all tests PASS
- [ ] **Step 7: Commit**
```bash
git add internal/skills/tdd/skill.go internal/skills/tdd/handlers.go internal/skills/tdd/handlers_test.go cmd/supervisor/main.go
git commit -m "feat(tdd): inject session history into green and refactor worker prompts"
```
---
## Task 4: review skill
**Files:**
- Create: `internal/skills/review/skill.go`
- Create: `internal/skills/review/handlers.go`
- Create: `internal/skills/review/handlers_test.go`
- Create: `config/supervisor/review.md`
- Modify: `cmd/supervisor/main.go`
- [ ] **Step 1: Write the failing test**
```go
// internal/skills/review/handlers_test.go
package review_test
import (
"context"
"encoding/json"
"testing"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/skills/review"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestReviewToolRegistered(t *testing.T) {
sk := review.New(review.Config{SkillPrompt: "review rules"})
names := make([]string, 0)
for _, tool := range sk.Tools() {
names = append(names, tool.Name)
}
assert.Contains(t, names, "review")
}
func TestReviewRequiresProjectRoot(t *testing.T) {
sk := review.New(review.Config{SkillPrompt: "r"})
_, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"files":["main.go"]}`))
assert.ErrorContains(t, err, "project_root")
}
func TestReviewRequiresFiles(t *testing.T) {
sk := review.New(review.Config{SkillPrompt: "r"})
_, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"project_root":"/tmp"}`))
assert.ErrorContains(t, err, "files")
}
func TestReviewCallsExecutor(t *testing.T) {
called := false
var capturedTask string
fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
called = true
capturedTask = req.TaskPrompt
return iexec.Result{
Status: "pass", Phase: "review", Skill: "review",
Verified: true, ModelUsed: "self", Message: "2 warnings found",
}, nil
}
sk := review.New(review.Config{SkillPrompt: "review rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
out, err := sk.Handle(context.Background(), "review", json.RawMessage(
`{"project_root":"/tmp/proj","files":["internal/foo/foo.go"],"context":"PR: add Foo helper"}`,
))
require.NoError(t, err)
assert.True(t, called)
assert.Contains(t, capturedTask, "internal/foo/foo.go")
assert.Contains(t, capturedTask, "PR: add Foo helper")
var result iexec.Result
require.NoError(t, json.Unmarshal(out, &result))
assert.Equal(t, "pass", result.Status)
assert.Equal(t, "review", result.Phase)
}
var _ = require.New
```
- [ ] **Step 2: Run test to confirm it fails**
```bash
go test ./internal/skills/review/... -v
```
Expected: FAIL — package `review` does not exist
- [ ] **Step 3: Create skill.go**
```go
// internal/skills/review/skill.go
package review
import (
"context"
"encoding/json"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/registry"
)
type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)
type Config struct {
SkillPrompt string
DefaultModel string
ExecutorFn ExecutorFn
SessionsDir string
}
type Skill struct{ cfg Config }
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
func (s *Skill) Name() string { return "review" }
func (s *Skill) Tools() []registry.ToolDef {
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
return b
}
return []registry.ToolDef{
{
Name: "review",
Description: "Perform a structured code review of the specified files. Returns findings with severity levels.",
InputSchema: schema(
[]string{"project_root", "files"},
map[string]any{
"project_root": map[string]any{"type": "string"},
"files": map[string]any{"type": "array", "items": map[string]any{"type": "string"}},
"context": map[string]any{"type": "string"},
"model": map[string]any{"type": "string"},
"session_id": map[string]any{"type": "string"},
},
),
},
}
}
```
- [ ] **Step 4: Create handlers.go**
```go
// internal/skills/review/handlers.go
package review
import (
"context"
"encoding/json"
"fmt"
"strings"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/session"
)
type reviewArgs struct {
ProjectRoot string `json:"project_root"`
Files []string `json:"files"`
Context string `json:"context"`
Model string `json:"model"`
SessionID string `json:"session_id"`
}
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
if tool != "review" {
return nil, fmt.Errorf("unknown tool: %s", tool)
}
var a reviewArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if len(a.Files) == 0 {
return nil, fmt.Errorf("files is required")
}
model := a.Model
if model == "" {
model = s.cfg.DefaultModel
}
task := fmt.Sprintf(
"phase: review\nproject_root: %s\nfiles: %s\ncontext: %s\nmodel: %s",
a.ProjectRoot, strings.Join(a.Files, ", "), a.Context, model,
)
task = s.prependHistory(a.SessionID, "review", task)
if s.cfg.ExecutorFn == nil {
return nil, fmt.Errorf("no executor configured")
}
result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
SkillPrompt: s.cfg.SkillPrompt,
TaskPrompt: task,
Model: model,
Tools: "Read,Bash",
})
if err != nil {
return nil, err
}
b, err := json.Marshal(result)
if err != nil {
return nil, fmt.Errorf("marshal result: %w", err)
}
return b, nil
}
func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
if sessionID == "" || s.cfg.SessionsDir == "" {
return task
}
entries, err := session.Read(s.cfg.SessionsDir, sessionID)
if err != nil || len(entries) == 0 {
return task
}
history := session.FormatHistory(entries, currentPhase)
if history == "" {
return task
}
return history + "\n---\n\n" + task
}
```
- [ ] **Step 5: Create config/supervisor/review.md**
```markdown
# Code Review Discipline
You are a disciplined code reviewer. Read files carefully before commenting.
## Iron laws
1. Never approve security vulnerabilities: command injection, SQL injection, credential exposure, path traversal, unchecked input at system boundaries
2. Never approve silently swallowed errors — `err != nil` without wrapping or handling is always wrong
3. Never approve missing validation at system boundaries (user input, external APIs, file reads)
## Output contract
Return JSON result with:
- `status`: "pass" if no blocking issues; "fail" if any iron law is violated
- `phase`: "review"
- `skill`: "review"
- `file_path`: first file reviewed
- `runner_output`: full review formatted as:
```
CRITICAL: <issue> at <file>:<line>
WARNING: <issue> at <file>:<line>
SUGGESTION: <issue> at <file>:<line>
```
- `verified`: true if you read all specified files; false if any were missing or unreadable
- `message`: "N critical, M warnings, K suggestions" or "clean: <which iron law checks passed and why>"
## Rules
1. Read every file listed before writing feedback
2. Check iron laws first — any violation is CRITICAL and sets status to "fail"
3. Then check: correctness, test coverage for new code, Go style conventions
4. Never rubber-stamp — if nothing is wrong, explain specifically which iron law checks you ran and why they passed
5. Line references are required for every finding — "roughly around the middle" is not acceptable
```
- [ ] **Step 6: Wire into main.go**
In `cmd/supervisor/main.go`, add after existing `os.ReadFile` calls for discipline files:
```go
reviewPrompt, err := os.ReadFile(cfg.ConfigDir + "/review.md")
if err != nil {
logger.Error("read review.md", "path", cfg.ConfigDir+"/review.md", "err", err)
os.Exit(1)
}
```
Add import: `"github.com/mathiasbq/supervisor/internal/skills/review"`
Register after existing `reg.Register` calls:
```go
reg.Register(review.New(review.Config{
SkillPrompt: string(reviewPrompt),
DefaultModel: models.Resolve("review", ""),
ExecutorFn: executor.Run,
SessionsDir: cfg.SessionsDir,
}))
```
- [ ] **Step 7: Run all tests**
```bash
go test ./... -race -count=1
```
Expected: all tests PASS
- [ ] **Step 8: Commit**
```bash
git add internal/skills/review/ config/supervisor/review.md cmd/supervisor/main.go
git commit -m "feat(review): add code review MCP skill with session history injection"
```
---
## Task 5: debug skill
**Files:**
- Create: `internal/skills/debug/skill.go`
- Create: `internal/skills/debug/handlers.go`
- Create: `internal/skills/debug/handlers_test.go`
- Create: `config/supervisor/debug.md`
- Modify: `cmd/supervisor/main.go`
- [ ] **Step 1: Write the failing test**
```go
// internal/skills/debug/handlers_test.go
package debug_test
import (
"context"
"encoding/json"
"testing"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/skills/debug"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestDebugToolRegistered(t *testing.T) {
sk := debug.New(debug.Config{SkillPrompt: "debug rules"})
names := make([]string, 0)
for _, tool := range sk.Tools() {
names = append(names, tool.Name)
}
assert.Contains(t, names, "debug")
}
func TestDebugRequiresProjectRoot(t *testing.T) {
sk := debug.New(debug.Config{SkillPrompt: "d"})
_, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"error":"panic: nil pointer"}`))
assert.ErrorContains(t, err, "project_root")
}
func TestDebugRequiresError(t *testing.T) {
sk := debug.New(debug.Config{SkillPrompt: "d"})
_, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"project_root":"/tmp"}`))
assert.ErrorContains(t, err, "error")
}
func TestDebugCallsExecutor(t *testing.T) {
called := false
var capturedTask string
fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
called = true
capturedTask = req.TaskPrompt
return iexec.Result{
Status: "pass", Phase: "debug", Skill: "debug",
RunnerOutput: "HYPOTHESIS 1 (likelihood: high): nil map access\nVERIFY: go test ./... → expected: panic line reference",
Verified: false, ModelUsed: "self", Message: "3 hypotheses for: panic nil pointer at foo.go:42",
}, nil
}
sk := debug.New(debug.Config{SkillPrompt: "debug rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
out, err := sk.Handle(context.Background(), "debug", json.RawMessage(
`{"project_root":"/tmp/proj","error":"panic: nil pointer dereference at foo.go:42","context":"occurs on startup"}`,
))
require.NoError(t, err)
assert.True(t, called)
assert.Contains(t, capturedTask, "panic: nil pointer dereference")
assert.Contains(t, capturedTask, "occurs on startup")
var result iexec.Result
require.NoError(t, json.Unmarshal(out, &result))
assert.Equal(t, "debug", result.Phase)
}
var _ = require.New
```
- [ ] **Step 2: Run test to confirm it fails**
```bash
go test ./internal/skills/debug/... -v
```
Expected: FAIL — package does not exist
- [ ] **Step 3: Create skill.go**
```go
// internal/skills/debug/skill.go
package debug
import (
"context"
"encoding/json"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/registry"
)
type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)
type Config struct {
SkillPrompt string
DefaultModel string
ExecutorFn ExecutorFn
SessionsDir string
}
type Skill struct{ cfg Config }
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
func (s *Skill) Name() string { return "debug" }
func (s *Skill) Tools() []registry.ToolDef {
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
return b
}
str := map[string]any{"type": "string"}
return []registry.ToolDef{
{
Name: "debug",
Description: "Analyse an error and return 3-5 hypotheses ordered by likelihood, each with a concrete verification step.",
InputSchema: schema(
[]string{"project_root", "error"},
map[string]any{
"project_root": str,
"error": str,
"context": str,
"model": str,
"session_id": str,
},
),
},
}
}
```
- [ ] **Step 4: Create handlers.go**
```go
// internal/skills/debug/handlers.go
package debug
import (
"context"
"encoding/json"
"fmt"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/session"
)
type debugArgs struct {
ProjectRoot string `json:"project_root"`
Error string `json:"error"`
Context string `json:"context"`
Model string `json:"model"`
SessionID string `json:"session_id"`
}
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
if tool != "debug" {
return nil, fmt.Errorf("unknown tool: %s", tool)
}
var a debugArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if a.Error == "" {
return nil, fmt.Errorf("error is required")
}
model := a.Model
if model == "" {
model = s.cfg.DefaultModel
}
task := fmt.Sprintf(
"phase: debug\nproject_root: %s\nerror: %s\ncontext: %s\nmodel: %s",
a.ProjectRoot, a.Error, a.Context, model,
)
task = s.prependHistory(a.SessionID, "debug", task)
if s.cfg.ExecutorFn == nil {
return nil, fmt.Errorf("no executor configured")
}
result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
SkillPrompt: s.cfg.SkillPrompt,
TaskPrompt: task,
Model: model,
Tools: "Read,Bash",
})
if err != nil {
return nil, err
}
b, err := json.Marshal(result)
if err != nil {
return nil, fmt.Errorf("marshal result: %w", err)
}
return b, nil
}
func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
if sessionID == "" || s.cfg.SessionsDir == "" {
return task
}
entries, err := session.Read(s.cfg.SessionsDir, sessionID)
if err != nil || len(entries) == 0 {
return task
}
history := session.FormatHistory(entries, currentPhase)
if history == "" {
return task
}
return history + "\n---\n\n" + task
}
```
- [ ] **Step 5: Create config/supervisor/debug.md**
```markdown
# Debug Discipline
You are a systematic debugger. Form hypotheses before suggesting fixes.
## Iron laws
1. Never suggest "try X and see what happens" — every hypothesis must have a specific expected outcome if correct
2. Generate exactly 3-5 hypotheses, ordered by likelihood (most likely first)
3. Never fix the bug — diagnose only; the caller decides what to do with the hypotheses
## Output contract
Return JSON result with:
- `status`: "pass" (hypotheses generated) or "error" (error too ambiguous to analyse)
- `phase`: "debug"
- `skill`: "debug"
- `file_path`: the most relevant file to the error (read it)
- `runner_output`: your hypotheses, formatted as:
```
HYPOTHESIS 1 (likelihood: high): <mechanism>
VERIFY: <exact command or file to check> → expected if correct: <specific output>
HYPOTHESIS 2 (likelihood: medium): <mechanism>
VERIFY: <exact command or file to check> → expected if correct: <specific output>
```
- `verified`: false — verification is the caller's job
- `message`: "N hypotheses for: <one-line error summary>"
## Rules
1. Read the error and any context files provided before forming hypotheses
2. Identify the failure mode first — what actually went wrong, not just what the error says
3. For each hypothesis: name the mechanism, explain why it would produce this exact error, give a concrete verification command with expected output
4. If the error is clearly a typo or trivial mistake, still form 3 hypotheses — surface the most likely cause as #1
```
- [ ] **Step 6: Wire into main.go**
Add file read:
```go
debugPrompt, err := os.ReadFile(cfg.ConfigDir + "/debug.md")
if err != nil {
logger.Error("read debug.md", "path", cfg.ConfigDir+"/debug.md", "err", err)
os.Exit(1)
}
```
Add import: `"github.com/mathiasbq/supervisor/internal/skills/debug"`
Register skill:
```go
reg.Register(debug.New(debug.Config{
SkillPrompt: string(debugPrompt),
DefaultModel: models.Resolve("debug", ""),
ExecutorFn: executor.Run,
SessionsDir: cfg.SessionsDir,
}))
```
- [ ] **Step 7: Run all tests**
```bash
go test ./... -race -count=1
```
Expected: all tests PASS
- [ ] **Step 8: Commit**
```bash
git add internal/skills/debug/ config/supervisor/debug.md cmd/supervisor/main.go
git commit -m "feat(debug): add debug MCP skill with hypothesis generation"
```
---
## Task 6: spec skill
**Files:**
- Create: `internal/skills/spec/skill.go`
- Create: `internal/skills/spec/handlers.go`
- Create: `internal/skills/spec/handlers_test.go`
- Create: `config/supervisor/spec.md`
- Modify: `cmd/supervisor/main.go`
- [ ] **Step 1: Write the failing test**
```go
// internal/skills/spec/handlers_test.go
package spec_test
import (
"context"
"encoding/json"
"testing"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/skills/spec"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestSpecToolRegistered(t *testing.T) {
sk := spec.New(spec.Config{SkillPrompt: "spec rules"})
names := make([]string, 0)
for _, tool := range sk.Tools() {
names = append(names, tool.Name)
}
assert.Contains(t, names, "spec")
}
func TestSpecRequiresProjectRoot(t *testing.T) {
sk := spec.New(spec.Config{SkillPrompt: "s"})
_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"requirements":"add login"}`))
assert.ErrorContains(t, err, "project_root")
}
func TestSpecRequiresRequirements(t *testing.T) {
sk := spec.New(spec.Config{SkillPrompt: "s"})
_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"project_root":"/tmp"}`))
assert.ErrorContains(t, err, "requirements")
}
func TestSpecCallsExecutor(t *testing.T) {
called := false
var capturedTask string
fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
called = true
capturedTask = req.TaskPrompt
return iexec.Result{
Status: "pass", Phase: "spec", Skill: "spec",
FilePath: "/tmp/proj/docs/login-spec.md",
Verified: true, ModelUsed: "self", Message: "spec written: login feature",
}, nil
}
sk := spec.New(spec.Config{SkillPrompt: "spec rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
out, err := sk.Handle(context.Background(), "spec", json.RawMessage(
`{"project_root":"/tmp/proj","requirements":"add OAuth2 login","output_path":"docs/login-spec.md"}`,
))
require.NoError(t, err)
assert.True(t, called)
assert.Contains(t, capturedTask, "OAuth2 login")
assert.Contains(t, capturedTask, "docs/login-spec.md")
var result iexec.Result
require.NoError(t, json.Unmarshal(out, &result))
assert.Equal(t, "spec", result.Phase)
}
var _ = require.New
```
- [ ] **Step 2: Run test to confirm it fails**
```bash
go test ./internal/skills/spec/... -v
```
Expected: FAIL — package does not exist
- [ ] **Step 3: Create skill.go**
```go
// internal/skills/spec/skill.go
package spec
import (
"context"
"encoding/json"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/registry"
)
type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)
type Config struct {
SkillPrompt string
DefaultModel string
ExecutorFn ExecutorFn
SessionsDir string
}
type Skill struct{ cfg Config }
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
func (s *Skill) Name() string { return "spec" }
func (s *Skill) Tools() []registry.ToolDef {
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
return b
}
str := map[string]any{"type": "string"}
return []registry.ToolDef{
{
Name: "spec",
Description: "Generate a structured implementation spec from requirements. Writes the spec to output_path in the project.",
InputSchema: schema(
[]string{"project_root", "requirements"},
map[string]any{
"project_root": str,
"requirements": str,
"output_path": str,
"context": str,
"model": str,
"session_id": str,
},
),
},
}
}
```
- [ ] **Step 4: Create handlers.go**
```go
// internal/skills/spec/handlers.go
package spec
import (
"context"
"encoding/json"
"fmt"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/session"
)
type specArgs struct {
ProjectRoot string `json:"project_root"`
Requirements string `json:"requirements"`
OutputPath string `json:"output_path"`
Context string `json:"context"`
Model string `json:"model"`
SessionID string `json:"session_id"`
}
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
if tool != "spec" {
return nil, fmt.Errorf("unknown tool: %s", tool)
}
var a specArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if a.Requirements == "" {
return nil, fmt.Errorf("requirements is required")
}
outputPath := a.OutputPath
if outputPath == "" {
outputPath = "docs/spec.md"
}
model := a.Model
if model == "" {
model = s.cfg.DefaultModel
}
task := fmt.Sprintf(
"phase: spec\nproject_root: %s\nrequirements: %s\noutput_path: %s\ncontext: %s\nmodel: %s",
a.ProjectRoot, a.Requirements, outputPath, a.Context, model,
)
task = s.prependHistory(a.SessionID, "spec", task)
if s.cfg.ExecutorFn == nil {
return nil, fmt.Errorf("no executor configured")
}
result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
SkillPrompt: s.cfg.SkillPrompt,
TaskPrompt: task,
Model: model,
Tools: "Read,Write",
})
if err != nil {
return nil, err
}
b, err := json.Marshal(result)
if err != nil {
return nil, fmt.Errorf("marshal result: %w", err)
}
return b, nil
}
func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
if sessionID == "" || s.cfg.SessionsDir == "" {
return task
}
entries, err := session.Read(s.cfg.SessionsDir, sessionID)
if err != nil || len(entries) == 0 {
return task
}
history := session.FormatHistory(entries, currentPhase)
if history == "" {
return task
}
return history + "\n---\n\n" + task
}
```
- [ ] **Step 5: Create config/supervisor/spec.md**
```markdown
# Spec Writing Discipline
You write structured implementation specs. Nothing is left ambiguous.
## Iron laws
1. Success criteria must be measurable — "the system is fast" is banned; "p99 < 200ms under 100 RPS" is valid
2. Always include an explicit "Out of scope" section — if you don't draw the boundary, the developer will guess wrong
3. Every technical decision in the approach must have a rationale
## Output contract
Return JSON result with:
- `status`: "pass" (spec written) or "error" (requirements too ambiguous to spec without more input)
- `phase`: "spec"
- `skill`: "spec"
- `file_path`: the output_path where the spec was written (absolute path)
- `runner_output`: ""
- `verified`: true if the file was written successfully
- `message`: "spec written: <one-line summary of what was specced>"
## Spec structure
Write the spec as markdown to the output_path:
```markdown
# [Feature] Spec
## Problem statement
[What problem does this solve? For whom? Why now?]
## Success criteria
- [ ] [Criterion 1 — measurable and verifiable]
- [ ] [Criterion 2 — measurable and verifiable]
## Constraints
[Non-negotiable requirements the solution must satisfy]
## Out of scope
[What we are explicitly NOT doing in this iteration]
## Technical approach
[Architecture decisions, key components, rationale for each choice]
## Risks
[What could go wrong, and how we'd mitigate it]
```
If the requirements are too vague to produce measurable success criteria, return status "error" with a message listing the specific questions that need answers.
```
- [ ] **Step 6: Wire into main.go**
Add file read:
```go
specPrompt, err := os.ReadFile(cfg.ConfigDir + "/spec.md")
if err != nil {
logger.Error("read spec.md", "path", cfg.ConfigDir+"/spec.md", "err", err)
os.Exit(1)
}
```
Add import: `"github.com/mathiasbq/supervisor/internal/skills/spec"`
Register skill:
```go
reg.Register(spec.New(spec.Config{
SkillPrompt: string(specPrompt),
DefaultModel: models.Resolve("spec", ""),
ExecutorFn: executor.Run,
SessionsDir: cfg.SessionsDir,
}))
```
- [ ] **Step 7: Run all tests**
```bash
go test ./... -race -count=1
```
Expected: all tests PASS
- [ ] **Step 8: Commit**
```bash
git add internal/skills/spec/ config/supervisor/spec.md cmd/supervisor/main.go
git commit -m "feat(spec): add spec writing MCP skill"
```
---
## Task 7: trainer skill
**Files:**
- Create: `internal/skills/trainer/skill.go`
- Create: `internal/skills/trainer/handlers.go`
- Create: `internal/skills/trainer/handlers_test.go`
- Create: `config/supervisor/trainer-reader.md`
- Create: `config/supervisor/trainer-writer.md`
- Modify: `cmd/supervisor/main.go`
- Modify: `config/models.yaml`
- [ ] **Step 1: Write the failing test**
```go
// internal/skills/trainer/handlers_test.go
package trainer_test
import (
"context"
"encoding/json"
"testing"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/session"
"github.com/mathiasbq/supervisor/internal/skills/trainer"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestTrainerToolRegistered(t *testing.T) {
sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"})
names := make([]string, 0)
for _, tool := range sk.Tools() {
names = append(names, tool.Name)
}
assert.Contains(t, names, "trainer")
}
func TestTrainerRequiresSessionID(t *testing.T) {
sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"})
_, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{}`))
assert.ErrorContains(t, err, "session_id")
}
func TestTrainerCallsReaderThenWriter(t *testing.T) {
sessDir := t.TempDir()
require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{
SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass",
Message: "wrote failing test", FilePath: "internal/foo/foo_test.go",
}))
callCount := 0
var readerTask, writerTask string
fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
callCount++
if callCount == 1 {
// reader call
readerTask = req.TaskPrompt
return iexec.Result{
Status: "pass", Phase: "trainer", Skill: "trainer",
RunnerOutput: `[{"type":"sft","moment":"first-pass clean TDD","score":4}]`,
Verified: true, ModelUsed: "self", Message: "1 sft candidate found",
}, nil
}
// writer call
writerTask = req.TaskPrompt
return iexec.Result{
Status: "pass", Phase: "trainer", Skill: "trainer",
FilePath: sessDir + "/training-data/sft/sess-1.jsonl",
Verified: true, ModelUsed: "self", Message: "1 sft pair written",
}, nil
}
sk := trainer.New(trainer.Config{
ReaderPrompt: "reader rules",
WriterPrompt: "writer rules",
ExecutorFn: fakeFn,
SessionsDir: sessDir,
BrainDir: t.TempDir(),
})
out, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{"session_id":"sess-1"}`))
require.NoError(t, err)
assert.Equal(t, 2, callCount, "executor must be called exactly twice: reader then writer")
assert.Contains(t, readerTask, "role: reader")
assert.Contains(t, readerTask, "sess-1")
assert.Contains(t, readerTask, "wrote failing test") // session history in reader prompt
assert.Contains(t, writerTask, "role: writer")
assert.Contains(t, writerTask, "sft candidate") // reader output passed to writer
var result iexec.Result
require.NoError(t, json.Unmarshal(out, &result))
assert.Equal(t, "trainer", result.Phase)
assert.Equal(t, "pass", result.Status)
}
var _ = require.New
```
- [ ] **Step 2: Run test to confirm it fails**
```bash
go test ./internal/skills/trainer/... -v
```
Expected: FAIL — package does not exist
- [ ] **Step 3: Create skill.go**
```go
// internal/skills/trainer/skill.go
package trainer
import (
"context"
"encoding/json"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/registry"
)
type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)
type Config struct {
ReaderPrompt string
WriterPrompt string
DefaultModel string
ExecutorFn ExecutorFn
SessionsDir string
BrainDir string // root of brain/ directory; writer writes to BrainDir/training-data/
}
type Skill struct{ cfg Config }
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
func (s *Skill) Name() string { return "trainer" }
func (s *Skill) Tools() []registry.ToolDef {
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
return b
}
return []registry.ToolDef{
{
Name: "trainer",
Description: "Extract SFT and DPO training pairs from a session log. Runs a reader→writer chain: reader identifies learning moments, writer formats and writes pairs to brain/training-data/.",
InputSchema: schema(
[]string{"session_id"},
map[string]any{
"session_id": map[string]any{"type": "string"},
"model": map[string]any{"type": "string"},
},
),
},
}
}
```
- [ ] **Step 4: Create handlers.go**
```go
// internal/skills/trainer/handlers.go
package trainer
import (
"context"
"encoding/json"
"fmt"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/session"
)
type trainArgs struct {
SessionID string `json:"session_id"`
Model string `json:"model"`
}
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
if tool != "trainer" {
return nil, fmt.Errorf("unknown tool: %s", tool)
}
var a trainArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.SessionID == "" {
return nil, fmt.Errorf("session_id is required")
}
if s.cfg.ExecutorFn == nil {
return nil, fmt.Errorf("no executor configured")
}
model := a.Model
if model == "" {
model = s.cfg.DefaultModel
}
entries, err := session.Read(s.cfg.SessionsDir, a.SessionID)
if err != nil {
return nil, fmt.Errorf("read session log: %w", err)
}
// ── Step 1: Reader agent ─────────────────────────────────────────────────
history := session.FormatHistory(entries, "")
readerTask := fmt.Sprintf(
"role: reader\nsession_id: %s\nbrain_dir: %s\n\n%s",
a.SessionID, s.cfg.BrainDir, history,
)
readerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{
SkillPrompt: s.cfg.ReaderPrompt,
TaskPrompt: readerTask,
Model: model,
Tools: "Read",
})
if err != nil {
return nil, fmt.Errorf("reader agent: %w", err)
}
// ── Step 2: Writer agent (receives reader candidates) ────────────────────
writerTask := fmt.Sprintf(
"role: writer\nsession_id: %s\nbrain_dir: %s\n\nreader_candidates:\n%s",
a.SessionID, s.cfg.BrainDir, readerResult.RunnerOutput,
)
writerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{
SkillPrompt: s.cfg.WriterPrompt,
TaskPrompt: writerTask,
Model: model,
Tools: "Read,Write",
})
if err != nil {
return nil, fmt.Errorf("writer agent: %w", err)
}
b, err := json.Marshal(writerResult)
if err != nil {
return nil, fmt.Errorf("marshal result: %w", err)
}
return b, nil
}
```
- [ ] **Step 5: Create config/supervisor/trainer-reader.md**
```markdown
# Trainer Reader Discipline
You scan session logs and identify candidate learning moments worth converting to training data.
## What to look for
- **SFT candidates**: the worker did exactly the right thing — a clean pattern worth reinforcing
- **DPO candidates**: the worker first produced a wrong or suboptimal response, then corrected — you have both rejected and chosen
## Scoring (15)
- 5: novel pattern, clearly correct, generalises across projects
- 4: good pattern, correct, somewhat project-specific but still useful
- 3: correct but obvious — include only if especially clean
- 2 or below: skip — too ambiguous or too context-specific
## Output contract
Return JSON result with:
- `status`: "pass" or "error"
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: ""
- `runner_output`: JSON array of candidates (valid JSON, not markdown):
[{"type":"sft","moment":"<what happened>","prompt":"<what was asked>","completion":"<what was done right>","score":4},
{"type":"dpo","moment":"<what happened>","prompt":"<what was asked>","chosen":"<correct>","rejected":"<incorrect>","score":3}]
- `verified`: true
- `message`: "N sft candidates, M dpo candidates found"
## Rules
1. Read all session entries in the task prompt
2. Score each entry — only include entries scoring >= 3
3. Prompt/completion fields must be phrased to generalise: no project-specific paths or names
4. If no candidates score >= 3, return an empty array `[]` — never force low-quality candidates
```
- [ ] **Step 6: Create config/supervisor/trainer-writer.md**
```markdown
# Trainer Writer Discipline
You receive candidate learning moments from the reader and write clean SFT/DPO training pairs.
## Quality gate (apply before writing)
- SFT: prompt must be phrased so it could come from any project, not just this one
- DPO: chosen and rejected must be clearly distinguishable — skip if a reader can't tell which is better
- Never include project-specific paths, variable names, or identifiers in any pair
## Output contract
Return JSON result with:
- `status`: "pass" (pairs written or skipped due to quality) or "error" (candidates JSON was malformed)
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: path of the last file written (empty if nothing passed quality gate)
- `runner_output`: "N SFT pairs written to brain/training-data/sft/, M DPO pairs to brain/training-data/dpo/" or "0 pairs passed quality gate"
- `verified`: true if files were written; false if nothing passed
- `message`: "N sft + M dpo pairs for session <id>" or "no pairs passed quality gate"
## File format
JSONL — one JSON object per line.
SFT: `{"prompt": "...", "completion": "..."}`
DPO: `{"prompt": "...", "chosen": "...", "rejected": "..."}`
Write SFT to: `<brain_dir>/training-data/sft/<session_id>.jsonl`
Write DPO to: `<brain_dir>/training-data/dpo/<session_id>.jsonl`
Append to existing files if they exist (don't overwrite).
## Rules
1. Parse the `reader_candidates` JSON from the task prompt
2. For each candidate: apply quality gate
3. Write passing SFT candidates to sft JSONL, DPO candidates to dpo JSONL
4. If nothing passes, return status "pass" with verified: false and message "no pairs passed quality gate"
```
- [ ] **Step 7: Wire into main.go**
Add file reads:
```go
trainerReaderPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-reader.md")
if err != nil {
logger.Error("read trainer-reader.md", "path", cfg.ConfigDir+"/trainer-reader.md", "err", err)
os.Exit(1)
}
trainerWriterPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-writer.md")
if err != nil {
logger.Error("read trainer-writer.md", "path", cfg.ConfigDir+"/trainer-writer.md", "err", err)
os.Exit(1)
}
```
Add import: `"github.com/mathiasbq/supervisor/internal/skills/trainer"`
Register skill:
```go
reg.Register(trainer.New(trainer.Config{
ReaderPrompt: string(trainerReaderPrompt),
WriterPrompt: string(trainerWriterPrompt),
DefaultModel: models.Resolve("trainer", ""),
ExecutorFn: executor.Run,
SessionsDir: cfg.SessionsDir,
BrainDir: cfg.BrainDir,
}))
```
- [ ] **Step 8: Update config/models.yaml**
Add `trainer` entry following existing format:
```yaml
skills:
tdd: ollama/qwen3-coder-30b-tuned
review: ollama/devstral-tuned
debug: ollama/deepseek-r1-tuned
retrospective: ollama/qwen3-coder-30b-tuned
spec: ollama/qwen3-coder-30b-tuned
trainer: ollama/qwen3-coder-30b-tuned
```
- [ ] **Step 9: Run all tests**
```bash
go test ./... -race -count=1
```
Expected: all tests PASS, including `TestTrainerCallsReaderThenWriter`
- [ ] **Step 10: Commit**
```bash
git add internal/skills/trainer/ config/supervisor/trainer-reader.md config/supervisor/trainer-writer.md config/models.yaml cmd/supervisor/main.go
git commit -m "feat(trainer): add trainer MCP skill with reader→writer sub-agent chain"
```
---
## Task 8: Integration smoke test and CI push
**Files:** none new — validates the full system
- [ ] **Step 1: Run task check (full quality gate)**
```bash
task check
```
Expected: lint, test, and vet all pass for both modules
- [ ] **Step 2: Verify all 12 MCP tools are registered**
With servers running (`task start` in another terminal):
```bash
curl -s -X POST http://localhost:3200/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
| python3 -c "import json,sys; tools=json.load(sys.stdin)['result']['tools']; [print(t['name']) for t in tools]"
```
Expected output (12 tools):
```
tdd_red
tdd_green
tdd_refactor
brain_query
brain_write
tier
session_log
retrospective
review
debug
spec
trainer
```
- [ ] **Step 3: Smoke test each new skill**
```bash
# review
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"review","arguments":{"project_root":"/tmp","files":["nonexistent.go"],"context":"test"}}}'
# debug
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"debug","arguments":{"project_root":"/tmp","error":"test error"}}}'
# spec
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"spec","arguments":{"project_root":"/tmp","requirements":"add login button"}}}'
# trainer (requires a session log entry)
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"trainer","arguments":{"session_id":"2026-04-17-validate-hyperguild"}}}'
```
Each should return a valid JSON-RPC result (not `-32000` error). The actual worker call will take longer — `--max-time 10` just tests that the MCP dispatch layer works.
- [ ] **Step 4: Push and verify CI**
```bash
git push origin main
```
Watch the CI run complete on Gitea. Check for:
- `check` job: green (lint + test + vet)
- `mirror` job: green (pushed to GitHub)
```bash
curl -s "https://gitea.d-ma.be/api/v1/repos/mathias/hyperguild/actions/runs?limit=1" \
-H "Authorization: token $GITEA_TOKEN" | python3 -c "
import json,sys
d=json.load(sys.stdin)
for r in d.get('workflow_runs',[]):
print(f'#{r[\"id\"]} {r[\"status\"]:12} {r.get(\"conclusion\",\"\"):8} {r[\"display_title\"][:50]}')
"
```
- [ ] **Step 5: Tag v0.2.0**
```bash
task tag version=v0.2.0
```
---
## Self-review
**Spec coverage check:**
- Session history injection into tdd_green and tdd_refactor → Task 3 ✓
- All new skills accept `session_id` for history injection → Tasks 47 ✓
- Trainer reader→writer chain → Task 7 ✓
- Schema enum fixed (was causing retrospective to return wrong phase) → Task 2 ✓
- Phase 2 skills registered in main.go → Tasks 47 each include main.go wiring ✓
- CI passes → Task 8 ✓
**Placeholder scan:** None found — all steps include complete code.
**Type consistency:**
- `ExecutorFn` is defined in each skill package as `func(ctx context.Context, req iexec.Request) (iexec.Result, error)` — consistent across tdd, review, debug, spec, trainer ✓
- `Config.SessionsDir` present in all new skills ✓
- `trainer.Config.BrainDir` used in handlers.go Task 7 writer task prompt ✓
- `session.FormatHistory` signature `(entries []Entry, excludePhase string) string` used consistently ✓
- `prependHistory` method is defined identically in review, debug, spec handlers — this is intentional duplication (YAGNI: not enough skills to justify extracting a shared mixin) ✓
**Note on tasks 47:** These are independent of each other and can be executed in parallel by separate subagents after Tasks 13 are complete.