# Hyperguild Phase 2 Implementation Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Add four new MCP skills (review, debug, spec, trainer) to the hyperguild supervisor, with automatic session history injection into all multi-phase skill workers. **Architecture:** The supervisor reads prior session entries and injects them into each worker's task prompt when `session_id` is provided — the orchestrator no longer re-summarises history. The `trainer` skill runs a two-step sub-agent chain: a reader agent identifies learning moments from the session log, then a writer agent formats them as SFT/DPO pairs and writes to `brain/training-data/`. All other new skills are single-worker. **Tech Stack:** Go 1.26, `internal/session` (JSONL log), `internal/exec` (claude subprocess executor), `internal/registry` (MCP tool registry), `config/supervisor/*.md` (discipline files), `config/models.yaml` (model routing) --- ## File Map ### New files | File | Responsibility | |------|---------------| | `internal/session/history.go` | `FormatHistory(entries, excludePhase)` — formats prior session entries as a prompt block | | `internal/session/history_test.go` | Unit tests for FormatHistory | | `internal/skills/review/skill.go` | review skill: Config, Skill, New, Name, Tools | | `internal/skills/review/handlers.go` | `Handle`, `handleReview` — single-phase code review worker | | `internal/skills/review/handlers_test.go` | review handler unit tests | | `internal/skills/debug/skill.go` | debug skill | | `internal/skills/debug/handlers.go` | `handleDebug` — hypothesis generation worker | | `internal/skills/debug/handlers_test.go` | debug handler unit tests | | `internal/skills/spec/skill.go` | spec skill | | `internal/skills/spec/handlers.go` | `handleSpec` — spec writing worker | | `internal/skills/spec/handlers_test.go` | spec handler unit tests | | `internal/skills/trainer/skill.go` | trainer skill: Config with ReaderPrompt + WriterPrompt + BrainDir | | `internal/skills/trainer/handlers.go` | `handleTrain` — calls ExecutorFn twice: reader then writer | | `internal/skills/trainer/handlers_test.go` | trainer handler unit tests (verifies two-call chain) | | `config/supervisor/review.md` | Review worker discipline file | | `config/supervisor/debug.md` | Debug worker discipline file | | `config/supervisor/spec.md` | Spec worker discipline file | | `config/supervisor/trainer-reader.md` | Trainer reader agent discipline | | `config/supervisor/trainer-writer.md` | Trainer writer agent discipline | ### Modified files | File | Change | |------|--------| | `internal/exec/result.go` | Add review/debug/spec/trainer to `validPhases` and Schema enum | | `internal/skills/tdd/skill.go` | Add `SessionsDir string` to Config; add `session_id` to green/refactor tool schemas | | `internal/skills/tdd/handlers.go` | Add `session_id` to `greenArgs` and `refactorArgs`; inject history when `session_id` non-empty | | `internal/skills/tdd/handlers_test.go` | Add tests for history injection in green and refactor | | `cmd/supervisor/main.go` | Pass `SessionsDir` to tdd.Config; read discipline files and register 4 new skills | | `config/models.yaml` | Add `trainer` model entry | --- ## Task 1: Session history utility **Files:** - Create: `internal/session/history.go` - Create: `internal/session/history_test.go` - [ ] **Step 1: Write the failing test** ```go // internal/session/history_test.go package session_test import ( "testing" "time" "github.com/mathiasbq/supervisor/internal/session" "github.com/stretchr/testify/assert" ) func TestFormatHistoryEmpty(t *testing.T) { result := session.FormatHistory(nil, "") assert.Equal(t, "", result) } func TestFormatHistoryFormatsEntries(t *testing.T) { entries := []session.Entry{ { Skill: "tdd", Phase: "red", FinalStatus: "pass", FilePath: "internal/foo/foo_test.go", Message: "wrote failing test for Foo", Timestamp: time.Now(), }, } result := session.FormatHistory(entries, "") assert.Contains(t, result, "## Session history") assert.Contains(t, result, "Phase: red") assert.Contains(t, result, "wrote failing test for Foo") assert.Contains(t, result, "internal/foo/foo_test.go") } func TestFormatHistoryExcludesCurrentPhase(t *testing.T) { entries := []session.Entry{ {Skill: "tdd", Phase: "red", Message: "red done", FinalStatus: "pass"}, {Skill: "tdd", Phase: "green", Message: "green done", FinalStatus: "pass"}, } result := session.FormatHistory(entries, "green") assert.Contains(t, result, "red done") assert.NotContains(t, result, "green done") } ``` - [ ] **Step 2: Run test to confirm it fails** ```bash cd /path/to/supervisor go test ./internal/session/... -run TestFormatHistory -v ``` Expected: `FAIL` — `session.FormatHistory undefined` - [ ] **Step 3: Implement FormatHistory** ```go // internal/session/history.go package session import ( "fmt" "strings" ) // FormatHistory formats prior session entries as a structured block for // injection into a worker task prompt. Entries matching excludePhase are // omitted (pass the current phase to avoid circular injection). func FormatHistory(entries []Entry, excludePhase string) string { var filtered []Entry for _, e := range entries { if e.Phase != excludePhase { filtered = append(filtered, e) } } if len(filtered) == 0 { return "" } var b strings.Builder b.WriteString("## Session history\n\n") for _, e := range filtered { fmt.Fprintf(&b, "### Phase: %s\n", e.Phase) fmt.Fprintf(&b, "- Skill: %s\n", e.Skill) fmt.Fprintf(&b, "- Status: %s\n", e.FinalStatus) if e.FilePath != "" { fmt.Fprintf(&b, "- File: %s\n", e.FilePath) } if e.Message != "" { fmt.Fprintf(&b, "- Summary: %s\n", e.Message) } b.WriteString("\n") } return b.String() } ``` - [ ] **Step 4: Run tests to confirm they pass** ```bash go test ./internal/session/... -v ``` Expected: all `TestFormatHistory*` tests PASS - [ ] **Step 5: Commit** ```bash git add internal/session/history.go internal/session/history_test.go git commit -m "feat(session): add FormatHistory for worker context injection" ``` --- ## Task 2: Fix Schema enum and validPhases **Files:** - Modify: `internal/exec/result.go` The Schema's `phase` enum currently only lists `["red","green","refactor"]`. Workers using any other phase name get forced to pick from that list (the retrospective worker was returning `"refactor"` as a result). Fix by removing the enum constraint from the schema (let it be any string) and rely solely on `validPhases` for server-side validation. - [ ] **Step 1: Write the failing test** ```go // Add to internal/exec/result_test.go (check existing file first for test helpers) func TestValidateAcceptsAllPhases(t *testing.T) { phases := []string{"red", "green", "refactor", "retrospective", "review", "debug", "spec", "trainer"} for _, phase := range phases { r := exec.Result{Status: "pass", Phase: phase, Skill: "test", ModelUsed: "self", Message: "ok"} assert.NoError(t, r.Validate(), "phase %q should be valid", phase) } } ``` - [ ] **Step 2: Run test to confirm it fails** ```bash go test ./internal/exec/... -run TestValidateAcceptsAllPhases -v ``` Expected: FAIL — `"review"`, `"debug"`, `"spec"`, `"trainer"` fail validation - [ ] **Step 3: Update result.go** In `internal/exec/result.go`, make these two changes: **Change 1** — update `validPhases`: ```go var validPhases = map[string]bool{ "red": true, "green": true, "refactor": true, "retrospective": true, "review": true, "debug": true, "spec": true, "trainer": true, } ``` **Change 2** — remove the enum constraint from the Schema `phase` property (replace the `"phase"` line): ```go // Before: "phase": {"type": "string", "enum": ["red","green","refactor"]}, // After: "phase": {"type": "string"}, ``` Also fix the error message in `Validate()`: ```go // Before: errs = append(errs, "phase must be red|green|refactor, got: "+r.Phase) // After: errs = append(errs, "phase must be one of red|green|refactor|retrospective|review|debug|spec|trainer, got: "+r.Phase) ``` - [ ] **Step 4: Run tests to confirm they pass** ```bash go test ./internal/exec/... -v ``` Expected: all tests PASS, including `TestValidateAcceptsAllPhases` - [ ] **Step 5: Commit** ```bash git add internal/exec/result.go git commit -m "fix(exec): expand validPhases and remove schema enum constraint for phase" ``` --- ## Task 3: Session history injection in TDD green and refactor **Files:** - Modify: `internal/skills/tdd/skill.go` - Modify: `internal/skills/tdd/handlers.go` - Modify: `internal/skills/tdd/handlers_test.go` - Modify: `cmd/supervisor/main.go` - [ ] **Step 1: Write the failing tests** Add to `internal/skills/tdd/handlers_test.go`: ```go func TestTDDGreenInjectsSessionHistory(t *testing.T) { sessDir := t.TempDir() require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{ SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass", FilePath: "internal/foo/foo_test.go", Message: "wrote failing test for Foo", })) var capturedPrompt string fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) { capturedPrompt = req.TaskPrompt return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil } sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: sessDir}) _, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage( `{"project_root":"/tmp","test_path":"internal/foo/foo_test.go","test_cmd":"go test ./...","session_id":"sess-1"}`, )) require.NoError(t, err) assert.Contains(t, capturedPrompt, "## Session history") assert.Contains(t, capturedPrompt, "wrote failing test for Foo") } func TestTDDGreenNoHistoryWhenSessionIDEmpty(t *testing.T) { var capturedPrompt string fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) { capturedPrompt = req.TaskPrompt return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil } sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: t.TempDir()}) _, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage( `{"project_root":"/tmp","test_path":"internal/foo/foo_test.go"}`, )) require.NoError(t, err) assert.NotContains(t, capturedPrompt, "## Session history") } ``` You will need these imports in the test file: ```go import ( "github.com/mathiasbq/supervisor/internal/session" iexec "github.com/mathiasbq/supervisor/internal/exec" ) ``` - [ ] **Step 2: Run tests to confirm they fail** ```bash go test ./internal/skills/tdd/... -run TestTDDGreen -v ``` Expected: FAIL — `tdd.Config` has no `SessionsDir` field, `session_id` not in args - [ ] **Step 3: Add SessionsDir to Config and session_id to tool schemas** In `internal/skills/tdd/skill.go`, add `SessionsDir` to Config: ```go type Config struct { SystemPrompt string SkillPrompt string ExecutorFn ExecutorFn DefaultModel string SessionsDir string // optional: path to brain/sessions/ for history injection } ``` In `Tools()`, add `"session_id"` as an optional property to `tdd_green` and `tdd_refactor` input schemas: ```go // tdd_green InputSchema: InputSchema: schema( []string{"project_root", "test_path"}, map[string]any{ "project_root": strProp, "test_path": strProp, "model": strProp, "test_cmd": strProp, "session_id": strProp, }, ), // tdd_refactor InputSchema (same addition): InputSchema: schema( []string{"project_root", "test_path", "impl_path"}, map[string]any{ "project_root": strProp, "test_path": strProp, "impl_path": strProp, "model": strProp, "test_cmd": strProp, "session_id": strProp, }, ), ``` - [ ] **Step 4: Update greenArgs, refactorArgs, and inject history** In `internal/skills/tdd/handlers.go`, add imports and update structs + handlers: ```go import ( // existing imports... "github.com/mathiasbq/supervisor/internal/session" ) type greenArgs struct { ProjectRoot string `json:"project_root"` TestPath string `json:"test_path"` Model string `json:"model"` TestCmd string `json:"test_cmd"` SessionID string `json:"session_id"` } func (s *Skill) handleGreen(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) { var args greenArgs if err := json.Unmarshal(raw, &args); err != nil { return nil, fmt.Errorf("parse args: %w", err) } if args.ProjectRoot == "" { return nil, fmt.Errorf("project_root is required") } if args.TestPath == "" { return nil, fmt.Errorf("test_path is required") } task := fmt.Sprintf( "phase: green\nproject_root: %s\ntest_path: %s\nmodel: %s\ntest_cmd: %s", args.ProjectRoot, args.TestPath, s.resolveModel(args.Model), args.TestCmd, ) task = s.prependHistory(args.SessionID, "green", task) return s.execute(ctx, task) } type refactorArgs struct { ProjectRoot string `json:"project_root"` TestPath string `json:"test_path"` ImplPath string `json:"impl_path"` Model string `json:"model"` TestCmd string `json:"test_cmd"` SessionID string `json:"session_id"` } func (s *Skill) handleRefactor(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) { var args refactorArgs if err := json.Unmarshal(raw, &args); err != nil { return nil, fmt.Errorf("parse args: %w", err) } if args.ProjectRoot == "" { return nil, fmt.Errorf("project_root is required") } if args.TestPath == "" { return nil, fmt.Errorf("test_path is required") } if args.ImplPath == "" { return nil, fmt.Errorf("impl_path is required") } task := fmt.Sprintf( "phase: refactor\nproject_root: %s\ntest_path: %s\nimpl_path: %s\nmodel: %s\ntest_cmd: %s", args.ProjectRoot, args.TestPath, args.ImplPath, s.resolveModel(args.Model), args.TestCmd, ) task = s.prependHistory(args.SessionID, "refactor", task) return s.execute(ctx, task) } // prependHistory reads the session log and prepends prior phase entries to the task prompt. // If sessionID is empty or SessionsDir is not configured, task is returned unchanged. func (s *Skill) prependHistory(sessionID, currentPhase, task string) string { if sessionID == "" || s.cfg.SessionsDir == "" { return task } entries, err := session.Read(s.cfg.SessionsDir, sessionID) if err != nil || len(entries) == 0 { return task } history := session.FormatHistory(entries, currentPhase) if history == "" { return task } return history + "\n---\n\n" + task } ``` - [ ] **Step 5: Update main.go to pass SessionsDir to tdd.Config** In `cmd/supervisor/main.go`, find the `tdd.New(tdd.Config{...})` call and add `SessionsDir`: ```go reg.Register(tdd.New(tdd.Config{ SystemPrompt: string(systemPrompt), SkillPrompt: string(tddPrompt), DefaultModel: models.Resolve("tdd", ""), ExecutorFn: executor.Run, SessionsDir: cfg.SessionsDir, // ← add this line })) ``` - [ ] **Step 6: Run all tests** ```bash go test ./... -race -count=1 ``` Expected: all tests PASS - [ ] **Step 7: Commit** ```bash git add internal/skills/tdd/skill.go internal/skills/tdd/handlers.go internal/skills/tdd/handlers_test.go cmd/supervisor/main.go git commit -m "feat(tdd): inject session history into green and refactor worker prompts" ``` --- ## Task 4: review skill **Files:** - Create: `internal/skills/review/skill.go` - Create: `internal/skills/review/handlers.go` - Create: `internal/skills/review/handlers_test.go` - Create: `config/supervisor/review.md` - Modify: `cmd/supervisor/main.go` - [ ] **Step 1: Write the failing test** ```go // internal/skills/review/handlers_test.go package review_test import ( "context" "encoding/json" "testing" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/skills/review" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" ) func TestReviewToolRegistered(t *testing.T) { sk := review.New(review.Config{SkillPrompt: "review rules"}) names := make([]string, 0) for _, tool := range sk.Tools() { names = append(names, tool.Name) } assert.Contains(t, names, "review") } func TestReviewRequiresProjectRoot(t *testing.T) { sk := review.New(review.Config{SkillPrompt: "r"}) _, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"files":["main.go"]}`)) assert.ErrorContains(t, err, "project_root") } func TestReviewRequiresFiles(t *testing.T) { sk := review.New(review.Config{SkillPrompt: "r"}) _, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"project_root":"/tmp"}`)) assert.ErrorContains(t, err, "files") } func TestReviewCallsExecutor(t *testing.T) { called := false var capturedTask string fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) { called = true capturedTask = req.TaskPrompt return iexec.Result{ Status: "pass", Phase: "review", Skill: "review", Verified: true, ModelUsed: "self", Message: "2 warnings found", }, nil } sk := review.New(review.Config{SkillPrompt: "review rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()}) out, err := sk.Handle(context.Background(), "review", json.RawMessage( `{"project_root":"/tmp/proj","files":["internal/foo/foo.go"],"context":"PR: add Foo helper"}`, )) require.NoError(t, err) assert.True(t, called) assert.Contains(t, capturedTask, "internal/foo/foo.go") assert.Contains(t, capturedTask, "PR: add Foo helper") var result iexec.Result require.NoError(t, json.Unmarshal(out, &result)) assert.Equal(t, "pass", result.Status) assert.Equal(t, "review", result.Phase) } var _ = require.New ``` - [ ] **Step 2: Run test to confirm it fails** ```bash go test ./internal/skills/review/... -v ``` Expected: FAIL — package `review` does not exist - [ ] **Step 3: Create skill.go** ```go // internal/skills/review/skill.go package review import ( "context" "encoding/json" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/registry" ) type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error) type Config struct { SkillPrompt string DefaultModel string ExecutorFn ExecutorFn SessionsDir string } type Skill struct{ cfg Config } func New(cfg Config) *Skill { return &Skill{cfg: cfg} } func (s *Skill) Name() string { return "review" } func (s *Skill) Tools() []registry.ToolDef { schema := func(required []string, props map[string]any) json.RawMessage { b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props}) return b } return []registry.ToolDef{ { Name: "review", Description: "Perform a structured code review of the specified files. Returns findings with severity levels.", InputSchema: schema( []string{"project_root", "files"}, map[string]any{ "project_root": map[string]any{"type": "string"}, "files": map[string]any{"type": "array", "items": map[string]any{"type": "string"}}, "context": map[string]any{"type": "string"}, "model": map[string]any{"type": "string"}, "session_id": map[string]any{"type": "string"}, }, ), }, } } ``` - [ ] **Step 4: Create handlers.go** ```go // internal/skills/review/handlers.go package review import ( "context" "encoding/json" "fmt" "strings" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/session" ) type reviewArgs struct { ProjectRoot string `json:"project_root"` Files []string `json:"files"` Context string `json:"context"` Model string `json:"model"` SessionID string `json:"session_id"` } func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) { if tool != "review" { return nil, fmt.Errorf("unknown tool: %s", tool) } var a reviewArgs if err := json.Unmarshal(args, &a); err != nil { return nil, fmt.Errorf("parse args: %w", err) } if a.ProjectRoot == "" { return nil, fmt.Errorf("project_root is required") } if len(a.Files) == 0 { return nil, fmt.Errorf("files is required") } model := a.Model if model == "" { model = s.cfg.DefaultModel } task := fmt.Sprintf( "phase: review\nproject_root: %s\nfiles: %s\ncontext: %s\nmodel: %s", a.ProjectRoot, strings.Join(a.Files, ", "), a.Context, model, ) task = s.prependHistory(a.SessionID, "review", task) if s.cfg.ExecutorFn == nil { return nil, fmt.Errorf("no executor configured") } result, err := s.cfg.ExecutorFn(ctx, iexec.Request{ SkillPrompt: s.cfg.SkillPrompt, TaskPrompt: task, Model: model, Tools: "Read,Bash", }) if err != nil { return nil, err } b, err := json.Marshal(result) if err != nil { return nil, fmt.Errorf("marshal result: %w", err) } return b, nil } func (s *Skill) prependHistory(sessionID, currentPhase, task string) string { if sessionID == "" || s.cfg.SessionsDir == "" { return task } entries, err := session.Read(s.cfg.SessionsDir, sessionID) if err != nil || len(entries) == 0 { return task } history := session.FormatHistory(entries, currentPhase) if history == "" { return task } return history + "\n---\n\n" + task } ``` - [ ] **Step 5: Create config/supervisor/review.md** ```markdown # Code Review Discipline You are a disciplined code reviewer. Read files carefully before commenting. ## Iron laws 1. Never approve security vulnerabilities: command injection, SQL injection, credential exposure, path traversal, unchecked input at system boundaries 2. Never approve silently swallowed errors — `err != nil` without wrapping or handling is always wrong 3. Never approve missing validation at system boundaries (user input, external APIs, file reads) ## Output contract Return JSON result with: - `status`: "pass" if no blocking issues; "fail" if any iron law is violated - `phase`: "review" - `skill`: "review" - `file_path`: first file reviewed - `runner_output`: full review formatted as: ``` CRITICAL: at : WARNING: at : SUGGESTION: at : ``` - `verified`: true if you read all specified files; false if any were missing or unreadable - `message`: "N critical, M warnings, K suggestions" or "clean: " ## Rules 1. Read every file listed before writing feedback 2. Check iron laws first — any violation is CRITICAL and sets status to "fail" 3. Then check: correctness, test coverage for new code, Go style conventions 4. Never rubber-stamp — if nothing is wrong, explain specifically which iron law checks you ran and why they passed 5. Line references are required for every finding — "roughly around the middle" is not acceptable ``` - [ ] **Step 6: Wire into main.go** In `cmd/supervisor/main.go`, add after existing `os.ReadFile` calls for discipline files: ```go reviewPrompt, err := os.ReadFile(cfg.ConfigDir + "/review.md") if err != nil { logger.Error("read review.md", "path", cfg.ConfigDir+"/review.md", "err", err) os.Exit(1) } ``` Add import: `"github.com/mathiasbq/supervisor/internal/skills/review"` Register after existing `reg.Register` calls: ```go reg.Register(review.New(review.Config{ SkillPrompt: string(reviewPrompt), DefaultModel: models.Resolve("review", ""), ExecutorFn: executor.Run, SessionsDir: cfg.SessionsDir, })) ``` - [ ] **Step 7: Run all tests** ```bash go test ./... -race -count=1 ``` Expected: all tests PASS - [ ] **Step 8: Commit** ```bash git add internal/skills/review/ config/supervisor/review.md cmd/supervisor/main.go git commit -m "feat(review): add code review MCP skill with session history injection" ``` --- ## Task 5: debug skill **Files:** - Create: `internal/skills/debug/skill.go` - Create: `internal/skills/debug/handlers.go` - Create: `internal/skills/debug/handlers_test.go` - Create: `config/supervisor/debug.md` - Modify: `cmd/supervisor/main.go` - [ ] **Step 1: Write the failing test** ```go // internal/skills/debug/handlers_test.go package debug_test import ( "context" "encoding/json" "testing" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/skills/debug" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" ) func TestDebugToolRegistered(t *testing.T) { sk := debug.New(debug.Config{SkillPrompt: "debug rules"}) names := make([]string, 0) for _, tool := range sk.Tools() { names = append(names, tool.Name) } assert.Contains(t, names, "debug") } func TestDebugRequiresProjectRoot(t *testing.T) { sk := debug.New(debug.Config{SkillPrompt: "d"}) _, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"error":"panic: nil pointer"}`)) assert.ErrorContains(t, err, "project_root") } func TestDebugRequiresError(t *testing.T) { sk := debug.New(debug.Config{SkillPrompt: "d"}) _, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"project_root":"/tmp"}`)) assert.ErrorContains(t, err, "error") } func TestDebugCallsExecutor(t *testing.T) { called := false var capturedTask string fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) { called = true capturedTask = req.TaskPrompt return iexec.Result{ Status: "pass", Phase: "debug", Skill: "debug", RunnerOutput: "HYPOTHESIS 1 (likelihood: high): nil map access\nVERIFY: go test ./... → expected: panic line reference", Verified: false, ModelUsed: "self", Message: "3 hypotheses for: panic nil pointer at foo.go:42", }, nil } sk := debug.New(debug.Config{SkillPrompt: "debug rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()}) out, err := sk.Handle(context.Background(), "debug", json.RawMessage( `{"project_root":"/tmp/proj","error":"panic: nil pointer dereference at foo.go:42","context":"occurs on startup"}`, )) require.NoError(t, err) assert.True(t, called) assert.Contains(t, capturedTask, "panic: nil pointer dereference") assert.Contains(t, capturedTask, "occurs on startup") var result iexec.Result require.NoError(t, json.Unmarshal(out, &result)) assert.Equal(t, "debug", result.Phase) } var _ = require.New ``` - [ ] **Step 2: Run test to confirm it fails** ```bash go test ./internal/skills/debug/... -v ``` Expected: FAIL — package does not exist - [ ] **Step 3: Create skill.go** ```go // internal/skills/debug/skill.go package debug import ( "context" "encoding/json" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/registry" ) type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error) type Config struct { SkillPrompt string DefaultModel string ExecutorFn ExecutorFn SessionsDir string } type Skill struct{ cfg Config } func New(cfg Config) *Skill { return &Skill{cfg: cfg} } func (s *Skill) Name() string { return "debug" } func (s *Skill) Tools() []registry.ToolDef { schema := func(required []string, props map[string]any) json.RawMessage { b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props}) return b } str := map[string]any{"type": "string"} return []registry.ToolDef{ { Name: "debug", Description: "Analyse an error and return 3-5 hypotheses ordered by likelihood, each with a concrete verification step.", InputSchema: schema( []string{"project_root", "error"}, map[string]any{ "project_root": str, "error": str, "context": str, "model": str, "session_id": str, }, ), }, } } ``` - [ ] **Step 4: Create handlers.go** ```go // internal/skills/debug/handlers.go package debug import ( "context" "encoding/json" "fmt" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/session" ) type debugArgs struct { ProjectRoot string `json:"project_root"` Error string `json:"error"` Context string `json:"context"` Model string `json:"model"` SessionID string `json:"session_id"` } func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) { if tool != "debug" { return nil, fmt.Errorf("unknown tool: %s", tool) } var a debugArgs if err := json.Unmarshal(args, &a); err != nil { return nil, fmt.Errorf("parse args: %w", err) } if a.ProjectRoot == "" { return nil, fmt.Errorf("project_root is required") } if a.Error == "" { return nil, fmt.Errorf("error is required") } model := a.Model if model == "" { model = s.cfg.DefaultModel } task := fmt.Sprintf( "phase: debug\nproject_root: %s\nerror: %s\ncontext: %s\nmodel: %s", a.ProjectRoot, a.Error, a.Context, model, ) task = s.prependHistory(a.SessionID, "debug", task) if s.cfg.ExecutorFn == nil { return nil, fmt.Errorf("no executor configured") } result, err := s.cfg.ExecutorFn(ctx, iexec.Request{ SkillPrompt: s.cfg.SkillPrompt, TaskPrompt: task, Model: model, Tools: "Read,Bash", }) if err != nil { return nil, err } b, err := json.Marshal(result) if err != nil { return nil, fmt.Errorf("marshal result: %w", err) } return b, nil } func (s *Skill) prependHistory(sessionID, currentPhase, task string) string { if sessionID == "" || s.cfg.SessionsDir == "" { return task } entries, err := session.Read(s.cfg.SessionsDir, sessionID) if err != nil || len(entries) == 0 { return task } history := session.FormatHistory(entries, currentPhase) if history == "" { return task } return history + "\n---\n\n" + task } ``` - [ ] **Step 5: Create config/supervisor/debug.md** ```markdown # Debug Discipline You are a systematic debugger. Form hypotheses before suggesting fixes. ## Iron laws 1. Never suggest "try X and see what happens" — every hypothesis must have a specific expected outcome if correct 2. Generate exactly 3-5 hypotheses, ordered by likelihood (most likely first) 3. Never fix the bug — diagnose only; the caller decides what to do with the hypotheses ## Output contract Return JSON result with: - `status`: "pass" (hypotheses generated) or "error" (error too ambiguous to analyse) - `phase`: "debug" - `skill`: "debug" - `file_path`: the most relevant file to the error (read it) - `runner_output`: your hypotheses, formatted as: ``` HYPOTHESIS 1 (likelihood: high): VERIFY: → expected if correct: HYPOTHESIS 2 (likelihood: medium): VERIFY: → expected if correct: ``` - `verified`: false — verification is the caller's job - `message`: "N hypotheses for: " ## Rules 1. Read the error and any context files provided before forming hypotheses 2. Identify the failure mode first — what actually went wrong, not just what the error says 3. For each hypothesis: name the mechanism, explain why it would produce this exact error, give a concrete verification command with expected output 4. If the error is clearly a typo or trivial mistake, still form 3 hypotheses — surface the most likely cause as #1 ``` - [ ] **Step 6: Wire into main.go** Add file read: ```go debugPrompt, err := os.ReadFile(cfg.ConfigDir + "/debug.md") if err != nil { logger.Error("read debug.md", "path", cfg.ConfigDir+"/debug.md", "err", err) os.Exit(1) } ``` Add import: `"github.com/mathiasbq/supervisor/internal/skills/debug"` Register skill: ```go reg.Register(debug.New(debug.Config{ SkillPrompt: string(debugPrompt), DefaultModel: models.Resolve("debug", ""), ExecutorFn: executor.Run, SessionsDir: cfg.SessionsDir, })) ``` - [ ] **Step 7: Run all tests** ```bash go test ./... -race -count=1 ``` Expected: all tests PASS - [ ] **Step 8: Commit** ```bash git add internal/skills/debug/ config/supervisor/debug.md cmd/supervisor/main.go git commit -m "feat(debug): add debug MCP skill with hypothesis generation" ``` --- ## Task 6: spec skill **Files:** - Create: `internal/skills/spec/skill.go` - Create: `internal/skills/spec/handlers.go` - Create: `internal/skills/spec/handlers_test.go` - Create: `config/supervisor/spec.md` - Modify: `cmd/supervisor/main.go` - [ ] **Step 1: Write the failing test** ```go // internal/skills/spec/handlers_test.go package spec_test import ( "context" "encoding/json" "testing" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/skills/spec" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" ) func TestSpecToolRegistered(t *testing.T) { sk := spec.New(spec.Config{SkillPrompt: "spec rules"}) names := make([]string, 0) for _, tool := range sk.Tools() { names = append(names, tool.Name) } assert.Contains(t, names, "spec") } func TestSpecRequiresProjectRoot(t *testing.T) { sk := spec.New(spec.Config{SkillPrompt: "s"}) _, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"requirements":"add login"}`)) assert.ErrorContains(t, err, "project_root") } func TestSpecRequiresRequirements(t *testing.T) { sk := spec.New(spec.Config{SkillPrompt: "s"}) _, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"project_root":"/tmp"}`)) assert.ErrorContains(t, err, "requirements") } func TestSpecCallsExecutor(t *testing.T) { called := false var capturedTask string fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) { called = true capturedTask = req.TaskPrompt return iexec.Result{ Status: "pass", Phase: "spec", Skill: "spec", FilePath: "/tmp/proj/docs/login-spec.md", Verified: true, ModelUsed: "self", Message: "spec written: login feature", }, nil } sk := spec.New(spec.Config{SkillPrompt: "spec rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()}) out, err := sk.Handle(context.Background(), "spec", json.RawMessage( `{"project_root":"/tmp/proj","requirements":"add OAuth2 login","output_path":"docs/login-spec.md"}`, )) require.NoError(t, err) assert.True(t, called) assert.Contains(t, capturedTask, "OAuth2 login") assert.Contains(t, capturedTask, "docs/login-spec.md") var result iexec.Result require.NoError(t, json.Unmarshal(out, &result)) assert.Equal(t, "spec", result.Phase) } var _ = require.New ``` - [ ] **Step 2: Run test to confirm it fails** ```bash go test ./internal/skills/spec/... -v ``` Expected: FAIL — package does not exist - [ ] **Step 3: Create skill.go** ```go // internal/skills/spec/skill.go package spec import ( "context" "encoding/json" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/registry" ) type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error) type Config struct { SkillPrompt string DefaultModel string ExecutorFn ExecutorFn SessionsDir string } type Skill struct{ cfg Config } func New(cfg Config) *Skill { return &Skill{cfg: cfg} } func (s *Skill) Name() string { return "spec" } func (s *Skill) Tools() []registry.ToolDef { schema := func(required []string, props map[string]any) json.RawMessage { b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props}) return b } str := map[string]any{"type": "string"} return []registry.ToolDef{ { Name: "spec", Description: "Generate a structured implementation spec from requirements. Writes the spec to output_path in the project.", InputSchema: schema( []string{"project_root", "requirements"}, map[string]any{ "project_root": str, "requirements": str, "output_path": str, "context": str, "model": str, "session_id": str, }, ), }, } } ``` - [ ] **Step 4: Create handlers.go** ```go // internal/skills/spec/handlers.go package spec import ( "context" "encoding/json" "fmt" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/session" ) type specArgs struct { ProjectRoot string `json:"project_root"` Requirements string `json:"requirements"` OutputPath string `json:"output_path"` Context string `json:"context"` Model string `json:"model"` SessionID string `json:"session_id"` } func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) { if tool != "spec" { return nil, fmt.Errorf("unknown tool: %s", tool) } var a specArgs if err := json.Unmarshal(args, &a); err != nil { return nil, fmt.Errorf("parse args: %w", err) } if a.ProjectRoot == "" { return nil, fmt.Errorf("project_root is required") } if a.Requirements == "" { return nil, fmt.Errorf("requirements is required") } outputPath := a.OutputPath if outputPath == "" { outputPath = "docs/spec.md" } model := a.Model if model == "" { model = s.cfg.DefaultModel } task := fmt.Sprintf( "phase: spec\nproject_root: %s\nrequirements: %s\noutput_path: %s\ncontext: %s\nmodel: %s", a.ProjectRoot, a.Requirements, outputPath, a.Context, model, ) task = s.prependHistory(a.SessionID, "spec", task) if s.cfg.ExecutorFn == nil { return nil, fmt.Errorf("no executor configured") } result, err := s.cfg.ExecutorFn(ctx, iexec.Request{ SkillPrompt: s.cfg.SkillPrompt, TaskPrompt: task, Model: model, Tools: "Read,Write", }) if err != nil { return nil, err } b, err := json.Marshal(result) if err != nil { return nil, fmt.Errorf("marshal result: %w", err) } return b, nil } func (s *Skill) prependHistory(sessionID, currentPhase, task string) string { if sessionID == "" || s.cfg.SessionsDir == "" { return task } entries, err := session.Read(s.cfg.SessionsDir, sessionID) if err != nil || len(entries) == 0 { return task } history := session.FormatHistory(entries, currentPhase) if history == "" { return task } return history + "\n---\n\n" + task } ``` - [ ] **Step 5: Create config/supervisor/spec.md** ```markdown # Spec Writing Discipline You write structured implementation specs. Nothing is left ambiguous. ## Iron laws 1. Success criteria must be measurable — "the system is fast" is banned; "p99 < 200ms under 100 RPS" is valid 2. Always include an explicit "Out of scope" section — if you don't draw the boundary, the developer will guess wrong 3. Every technical decision in the approach must have a rationale ## Output contract Return JSON result with: - `status`: "pass" (spec written) or "error" (requirements too ambiguous to spec without more input) - `phase`: "spec" - `skill`: "spec" - `file_path`: the output_path where the spec was written (absolute path) - `runner_output`: "" - `verified`: true if the file was written successfully - `message`: "spec written: " ## Spec structure Write the spec as markdown to the output_path: ```markdown # [Feature] Spec ## Problem statement [What problem does this solve? For whom? Why now?] ## Success criteria - [ ] [Criterion 1 — measurable and verifiable] - [ ] [Criterion 2 — measurable and verifiable] ## Constraints [Non-negotiable requirements the solution must satisfy] ## Out of scope [What we are explicitly NOT doing in this iteration] ## Technical approach [Architecture decisions, key components, rationale for each choice] ## Risks [What could go wrong, and how we'd mitigate it] ``` If the requirements are too vague to produce measurable success criteria, return status "error" with a message listing the specific questions that need answers. ``` - [ ] **Step 6: Wire into main.go** Add file read: ```go specPrompt, err := os.ReadFile(cfg.ConfigDir + "/spec.md") if err != nil { logger.Error("read spec.md", "path", cfg.ConfigDir+"/spec.md", "err", err) os.Exit(1) } ``` Add import: `"github.com/mathiasbq/supervisor/internal/skills/spec"` Register skill: ```go reg.Register(spec.New(spec.Config{ SkillPrompt: string(specPrompt), DefaultModel: models.Resolve("spec", ""), ExecutorFn: executor.Run, SessionsDir: cfg.SessionsDir, })) ``` - [ ] **Step 7: Run all tests** ```bash go test ./... -race -count=1 ``` Expected: all tests PASS - [ ] **Step 8: Commit** ```bash git add internal/skills/spec/ config/supervisor/spec.md cmd/supervisor/main.go git commit -m "feat(spec): add spec writing MCP skill" ``` --- ## Task 7: trainer skill **Files:** - Create: `internal/skills/trainer/skill.go` - Create: `internal/skills/trainer/handlers.go` - Create: `internal/skills/trainer/handlers_test.go` - Create: `config/supervisor/trainer-reader.md` - Create: `config/supervisor/trainer-writer.md` - Modify: `cmd/supervisor/main.go` - Modify: `config/models.yaml` - [ ] **Step 1: Write the failing test** ```go // internal/skills/trainer/handlers_test.go package trainer_test import ( "context" "encoding/json" "testing" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/session" "github.com/mathiasbq/supervisor/internal/skills/trainer" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" ) func TestTrainerToolRegistered(t *testing.T) { sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"}) names := make([]string, 0) for _, tool := range sk.Tools() { names = append(names, tool.Name) } assert.Contains(t, names, "trainer") } func TestTrainerRequiresSessionID(t *testing.T) { sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"}) _, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{}`)) assert.ErrorContains(t, err, "session_id") } func TestTrainerCallsReaderThenWriter(t *testing.T) { sessDir := t.TempDir() require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{ SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass", Message: "wrote failing test", FilePath: "internal/foo/foo_test.go", })) callCount := 0 var readerTask, writerTask string fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) { callCount++ if callCount == 1 { // reader call readerTask = req.TaskPrompt return iexec.Result{ Status: "pass", Phase: "trainer", Skill: "trainer", RunnerOutput: `[{"type":"sft","moment":"first-pass clean TDD","score":4}]`, Verified: true, ModelUsed: "self", Message: "1 sft candidate found", }, nil } // writer call writerTask = req.TaskPrompt return iexec.Result{ Status: "pass", Phase: "trainer", Skill: "trainer", FilePath: sessDir + "/training-data/sft/sess-1.jsonl", Verified: true, ModelUsed: "self", Message: "1 sft pair written", }, nil } sk := trainer.New(trainer.Config{ ReaderPrompt: "reader rules", WriterPrompt: "writer rules", ExecutorFn: fakeFn, SessionsDir: sessDir, BrainDir: t.TempDir(), }) out, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{"session_id":"sess-1"}`)) require.NoError(t, err) assert.Equal(t, 2, callCount, "executor must be called exactly twice: reader then writer") assert.Contains(t, readerTask, "role: reader") assert.Contains(t, readerTask, "sess-1") assert.Contains(t, readerTask, "wrote failing test") // session history in reader prompt assert.Contains(t, writerTask, "role: writer") assert.Contains(t, writerTask, "sft candidate") // reader output passed to writer var result iexec.Result require.NoError(t, json.Unmarshal(out, &result)) assert.Equal(t, "trainer", result.Phase) assert.Equal(t, "pass", result.Status) } var _ = require.New ``` - [ ] **Step 2: Run test to confirm it fails** ```bash go test ./internal/skills/trainer/... -v ``` Expected: FAIL — package does not exist - [ ] **Step 3: Create skill.go** ```go // internal/skills/trainer/skill.go package trainer import ( "context" "encoding/json" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/registry" ) type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error) type Config struct { ReaderPrompt string WriterPrompt string DefaultModel string ExecutorFn ExecutorFn SessionsDir string BrainDir string // root of brain/ directory; writer writes to BrainDir/training-data/ } type Skill struct{ cfg Config } func New(cfg Config) *Skill { return &Skill{cfg: cfg} } func (s *Skill) Name() string { return "trainer" } func (s *Skill) Tools() []registry.ToolDef { schema := func(required []string, props map[string]any) json.RawMessage { b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props}) return b } return []registry.ToolDef{ { Name: "trainer", Description: "Extract SFT and DPO training pairs from a session log. Runs a reader→writer chain: reader identifies learning moments, writer formats and writes pairs to brain/training-data/.", InputSchema: schema( []string{"session_id"}, map[string]any{ "session_id": map[string]any{"type": "string"}, "model": map[string]any{"type": "string"}, }, ), }, } } ``` - [ ] **Step 4: Create handlers.go** ```go // internal/skills/trainer/handlers.go package trainer import ( "context" "encoding/json" "fmt" iexec "github.com/mathiasbq/supervisor/internal/exec" "github.com/mathiasbq/supervisor/internal/session" ) type trainArgs struct { SessionID string `json:"session_id"` Model string `json:"model"` } func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) { if tool != "trainer" { return nil, fmt.Errorf("unknown tool: %s", tool) } var a trainArgs if err := json.Unmarshal(args, &a); err != nil { return nil, fmt.Errorf("parse args: %w", err) } if a.SessionID == "" { return nil, fmt.Errorf("session_id is required") } if s.cfg.ExecutorFn == nil { return nil, fmt.Errorf("no executor configured") } model := a.Model if model == "" { model = s.cfg.DefaultModel } entries, err := session.Read(s.cfg.SessionsDir, a.SessionID) if err != nil { return nil, fmt.Errorf("read session log: %w", err) } // ── Step 1: Reader agent ───────────────────────────────────────────────── history := session.FormatHistory(entries, "") readerTask := fmt.Sprintf( "role: reader\nsession_id: %s\nbrain_dir: %s\n\n%s", a.SessionID, s.cfg.BrainDir, history, ) readerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{ SkillPrompt: s.cfg.ReaderPrompt, TaskPrompt: readerTask, Model: model, Tools: "Read", }) if err != nil { return nil, fmt.Errorf("reader agent: %w", err) } // ── Step 2: Writer agent (receives reader candidates) ──────────────────── writerTask := fmt.Sprintf( "role: writer\nsession_id: %s\nbrain_dir: %s\n\nreader_candidates:\n%s", a.SessionID, s.cfg.BrainDir, readerResult.RunnerOutput, ) writerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{ SkillPrompt: s.cfg.WriterPrompt, TaskPrompt: writerTask, Model: model, Tools: "Read,Write", }) if err != nil { return nil, fmt.Errorf("writer agent: %w", err) } b, err := json.Marshal(writerResult) if err != nil { return nil, fmt.Errorf("marshal result: %w", err) } return b, nil } ``` - [ ] **Step 5: Create config/supervisor/trainer-reader.md** ```markdown # Trainer Reader Discipline You scan session logs and identify candidate learning moments worth converting to training data. ## What to look for - **SFT candidates**: the worker did exactly the right thing — a clean pattern worth reinforcing - **DPO candidates**: the worker first produced a wrong or suboptimal response, then corrected — you have both rejected and chosen ## Scoring (1–5) - 5: novel pattern, clearly correct, generalises across projects - 4: good pattern, correct, somewhat project-specific but still useful - 3: correct but obvious — include only if especially clean - 2 or below: skip — too ambiguous or too context-specific ## Output contract Return JSON result with: - `status`: "pass" or "error" - `phase`: "trainer" - `skill`: "trainer" - `file_path`: "" - `runner_output`: JSON array of candidates (valid JSON, not markdown): [{"type":"sft","moment":"","prompt":"","completion":"","score":4}, {"type":"dpo","moment":"","prompt":"","chosen":"","rejected":"","score":3}] - `verified`: true - `message`: "N sft candidates, M dpo candidates found" ## Rules 1. Read all session entries in the task prompt 2. Score each entry — only include entries scoring >= 3 3. Prompt/completion fields must be phrased to generalise: no project-specific paths or names 4. If no candidates score >= 3, return an empty array `[]` — never force low-quality candidates ``` - [ ] **Step 6: Create config/supervisor/trainer-writer.md** ```markdown # Trainer Writer Discipline You receive candidate learning moments from the reader and write clean SFT/DPO training pairs. ## Quality gate (apply before writing) - SFT: prompt must be phrased so it could come from any project, not just this one - DPO: chosen and rejected must be clearly distinguishable — skip if a reader can't tell which is better - Never include project-specific paths, variable names, or identifiers in any pair ## Output contract Return JSON result with: - `status`: "pass" (pairs written or skipped due to quality) or "error" (candidates JSON was malformed) - `phase`: "trainer" - `skill`: "trainer" - `file_path`: path of the last file written (empty if nothing passed quality gate) - `runner_output`: "N SFT pairs written to brain/training-data/sft/, M DPO pairs to brain/training-data/dpo/" or "0 pairs passed quality gate" - `verified`: true if files were written; false if nothing passed - `message`: "N sft + M dpo pairs for session " or "no pairs passed quality gate" ## File format JSONL — one JSON object per line. SFT: `{"prompt": "...", "completion": "..."}` DPO: `{"prompt": "...", "chosen": "...", "rejected": "..."}` Write SFT to: `/training-data/sft/.jsonl` Write DPO to: `/training-data/dpo/.jsonl` Append to existing files if they exist (don't overwrite). ## Rules 1. Parse the `reader_candidates` JSON from the task prompt 2. For each candidate: apply quality gate 3. Write passing SFT candidates to sft JSONL, DPO candidates to dpo JSONL 4. If nothing passes, return status "pass" with verified: false and message "no pairs passed quality gate" ``` - [ ] **Step 7: Wire into main.go** Add file reads: ```go trainerReaderPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-reader.md") if err != nil { logger.Error("read trainer-reader.md", "path", cfg.ConfigDir+"/trainer-reader.md", "err", err) os.Exit(1) } trainerWriterPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-writer.md") if err != nil { logger.Error("read trainer-writer.md", "path", cfg.ConfigDir+"/trainer-writer.md", "err", err) os.Exit(1) } ``` Add import: `"github.com/mathiasbq/supervisor/internal/skills/trainer"` Register skill: ```go reg.Register(trainer.New(trainer.Config{ ReaderPrompt: string(trainerReaderPrompt), WriterPrompt: string(trainerWriterPrompt), DefaultModel: models.Resolve("trainer", ""), ExecutorFn: executor.Run, SessionsDir: cfg.SessionsDir, BrainDir: cfg.BrainDir, })) ``` - [ ] **Step 8: Update config/models.yaml** Add `trainer` entry following existing format: ```yaml skills: tdd: ollama/qwen3-coder-30b-tuned review: ollama/devstral-tuned debug: ollama/deepseek-r1-tuned retrospective: ollama/qwen3-coder-30b-tuned spec: ollama/qwen3-coder-30b-tuned trainer: ollama/qwen3-coder-30b-tuned ``` - [ ] **Step 9: Run all tests** ```bash go test ./... -race -count=1 ``` Expected: all tests PASS, including `TestTrainerCallsReaderThenWriter` - [ ] **Step 10: Commit** ```bash git add internal/skills/trainer/ config/supervisor/trainer-reader.md config/supervisor/trainer-writer.md config/models.yaml cmd/supervisor/main.go git commit -m "feat(trainer): add trainer MCP skill with reader→writer sub-agent chain" ``` --- ## Task 8: Integration smoke test and CI push **Files:** none new — validates the full system - [ ] **Step 1: Run task check (full quality gate)** ```bash task check ``` Expected: lint, test, and vet all pass for both modules - [ ] **Step 2: Verify all 12 MCP tools are registered** With servers running (`task start` in another terminal): ```bash curl -s -X POST http://localhost:3200/mcp \ -H "Content-Type: application/json" \ -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \ | python3 -c "import json,sys; tools=json.load(sys.stdin)['result']['tools']; [print(t['name']) for t in tools]" ``` Expected output (12 tools): ``` tdd_red tdd_green tdd_refactor brain_query brain_write tier session_log retrospective review debug spec trainer ``` - [ ] **Step 3: Smoke test each new skill** ```bash # review curl -s --max-time 10 -X POST http://localhost:3200/mcp \ -H "Content-Type: application/json" \ -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"review","arguments":{"project_root":"/tmp","files":["nonexistent.go"],"context":"test"}}}' # debug curl -s --max-time 10 -X POST http://localhost:3200/mcp \ -H "Content-Type: application/json" \ -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"debug","arguments":{"project_root":"/tmp","error":"test error"}}}' # spec curl -s --max-time 10 -X POST http://localhost:3200/mcp \ -H "Content-Type: application/json" \ -d '{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"spec","arguments":{"project_root":"/tmp","requirements":"add login button"}}}' # trainer (requires a session log entry) curl -s --max-time 10 -X POST http://localhost:3200/mcp \ -H "Content-Type: application/json" \ -d '{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"trainer","arguments":{"session_id":"2026-04-17-validate-hyperguild"}}}' ``` Each should return a valid JSON-RPC result (not `-32000` error). The actual worker call will take longer — `--max-time 10` just tests that the MCP dispatch layer works. - [ ] **Step 4: Push and verify CI** ```bash git push origin main ``` Watch the CI run complete on Gitea. Check for: - `check` job: green (lint + test + vet) - `mirror` job: green (pushed to GitHub) ```bash curl -s "https://gitea.d-ma.be/api/v1/repos/mathias/hyperguild/actions/runs?limit=1" \ -H "Authorization: token $GITEA_TOKEN" | python3 -c " import json,sys d=json.load(sys.stdin) for r in d.get('workflow_runs',[]): print(f'#{r[\"id\"]} {r[\"status\"]:12} {r.get(\"conclusion\",\"\"):8} {r[\"display_title\"][:50]}') " ``` - [ ] **Step 5: Tag v0.2.0** ```bash task tag version=v0.2.0 ``` --- ## Self-review **Spec coverage check:** - Session history injection into tdd_green and tdd_refactor → Task 3 ✓ - All new skills accept `session_id` for history injection → Tasks 4–7 ✓ - Trainer reader→writer chain → Task 7 ✓ - Schema enum fixed (was causing retrospective to return wrong phase) → Task 2 ✓ - Phase 2 skills registered in main.go → Tasks 4–7 each include main.go wiring ✓ - CI passes → Task 8 ✓ **Placeholder scan:** None found — all steps include complete code. **Type consistency:** - `ExecutorFn` is defined in each skill package as `func(ctx context.Context, req iexec.Request) (iexec.Result, error)` — consistent across tdd, review, debug, spec, trainer ✓ - `Config.SessionsDir` present in all new skills ✓ - `trainer.Config.BrainDir` used in handlers.go Task 7 writer task prompt ✓ - `session.FormatHistory` signature `(entries []Entry, excludePhase string) string` used consistently ✓ - `prependHistory` method is defined identically in review, debug, spec handlers — this is intentional duplication (YAGNI: not enough skills to justify extracting a shared mixin) ✓ **Note on tasks 4–7:** These are independent of each other and can be executed in parallel by separate subagents after Tasks 1–3 are complete.