hyperguild/docs/superpowers/plans/2026-04-19-hyperguild-phase2.md

# Hyperguild Phase 2 Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add four new MCP skills (review, debug, spec, trainer) to the hyperguild supervisor, with automatic session history injection into all multi-phase skill workers.

**Architecture:** The supervisor reads prior session entries and injects them into each worker's task prompt when `session_id` is provided — the orchestrator no longer re-summarises history. The `trainer` skill runs a two-step sub-agent chain: a reader agent identifies learning moments from the session log, then a writer agent formats them as SFT/DPO pairs and writes to `brain/training-data/`. All other new skills are single-worker.

**Tech Stack:** Go 1.26, `internal/session` (JSONL log), `internal/exec` (claude subprocess executor), `internal/registry` (MCP tool registry), `config/supervisor/*.md` (discipline files), `config/models.yaml` (model routing)

---

## File Map

### New files
| File | Responsibility |
|------|---------------|
| `internal/session/history.go` | `FormatHistory(entries, excludePhase)` — formats prior session entries as a prompt block |
| `internal/session/history_test.go` | Unit tests for FormatHistory |
| `internal/skills/review/skill.go` | review skill: Config, Skill, New, Name, Tools |
| `internal/skills/review/handlers.go` | `Handle`, `handleReview` — single-phase code review worker |
| `internal/skills/review/handlers_test.go` | review handler unit tests |
| `internal/skills/debug/skill.go` | debug skill |
| `internal/skills/debug/handlers.go` | `handleDebug` — hypothesis generation worker |
| `internal/skills/debug/handlers_test.go` | debug handler unit tests |
| `internal/skills/spec/skill.go` | spec skill |
| `internal/skills/spec/handlers.go` | `handleSpec` — spec writing worker |
| `internal/skills/spec/handlers_test.go` | spec handler unit tests |
| `internal/skills/trainer/skill.go` | trainer skill: Config with ReaderPrompt + WriterPrompt + BrainDir |
| `internal/skills/trainer/handlers.go` | `handleTrain` — calls ExecutorFn twice: reader then writer |
| `internal/skills/trainer/handlers_test.go` | trainer handler unit tests (verifies two-call chain) |
| `config/supervisor/review.md` | Review worker discipline file |
| `config/supervisor/debug.md` | Debug worker discipline file |
| `config/supervisor/spec.md` | Spec worker discipline file |
| `config/supervisor/trainer-reader.md` | Trainer reader agent discipline |
| `config/supervisor/trainer-writer.md` | Trainer writer agent discipline |

### Modified files
| File | Change |
|------|--------|
| `internal/exec/result.go` | Add review/debug/spec/trainer to `validPhases` and Schema enum |
| `internal/skills/tdd/skill.go` | Add `SessionsDir string` to Config; add `session_id` to green/refactor tool schemas |
| `internal/skills/tdd/handlers.go` | Add `session_id` to `greenArgs` and `refactorArgs`; inject history when `session_id` non-empty |
| `internal/skills/tdd/handlers_test.go` | Add tests for history injection in green and refactor |
| `cmd/supervisor/main.go` | Pass `SessionsDir` to tdd.Config; read discipline files and register 4 new skills |
| `config/models.yaml` | Add `trainer` model entry |

---

## Task 1: Session history utility

**Files:**
- Create: `internal/session/history.go`
- Create: `internal/session/history_test.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/session/history_test.go
package session_test

import (
	"testing"
	"time"

	"github.com/mathiasbq/supervisor/internal/session"
	"github.com/stretchr/testify/assert"
)

func TestFormatHistoryEmpty(t *testing.T) {
	result := session.FormatHistory(nil, "")
	assert.Equal(t, "", result)
}

func TestFormatHistoryFormatsEntries(t *testing.T) {
	entries := []session.Entry{
		{
			Skill: "tdd", Phase: "red", FinalStatus: "pass",
			FilePath: "internal/foo/foo_test.go",
			Message:  "wrote failing test for Foo",
			Timestamp: time.Now(),
		},
	}
	result := session.FormatHistory(entries, "")
	assert.Contains(t, result, "## Session history")
	assert.Contains(t, result, "Phase: red")
	assert.Contains(t, result, "wrote failing test for Foo")
	assert.Contains(t, result, "internal/foo/foo_test.go")
}

func TestFormatHistoryExcludesCurrentPhase(t *testing.T) {
	entries := []session.Entry{
		{Skill: "tdd", Phase: "red", Message: "red done", FinalStatus: "pass"},
		{Skill: "tdd", Phase: "green", Message: "green done", FinalStatus: "pass"},
	}
	result := session.FormatHistory(entries, "green")
	assert.Contains(t, result, "red done")
	assert.NotContains(t, result, "green done")
}
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
cd /path/to/supervisor
go test ./internal/session/... -run TestFormatHistory -v
```
Expected: `FAIL` — `session.FormatHistory undefined`

- [ ] **Step 3: Implement FormatHistory**

```go
// internal/session/history.go
package session

import (
	"fmt"
	"strings"
)

// FormatHistory formats prior session entries as a structured block for
// injection into a worker task prompt. Entries matching excludePhase are
// omitted (pass the current phase to avoid circular injection).
func FormatHistory(entries []Entry, excludePhase string) string {
	var filtered []Entry
	for _, e := range entries {
		if e.Phase != excludePhase {
			filtered = append(filtered, e)
		}
	}
	if len(filtered) == 0 {
		return ""
	}

	var b strings.Builder
	b.WriteString("## Session history\n\n")
	for _, e := range filtered {
		fmt.Fprintf(&b, "### Phase: %s\n", e.Phase)
		fmt.Fprintf(&b, "- Skill: %s\n", e.Skill)
		fmt.Fprintf(&b, "- Status: %s\n", e.FinalStatus)
		if e.FilePath != "" {
			fmt.Fprintf(&b, "- File: %s\n", e.FilePath)
		}
		if e.Message != "" {
			fmt.Fprintf(&b, "- Summary: %s\n", e.Message)
		}
		b.WriteString("\n")
	}
	return b.String()
}
```

- [ ] **Step 4: Run tests to confirm they pass**

```bash
go test ./internal/session/... -v
```
Expected: all `TestFormatHistory*` tests PASS

- [ ] **Step 5: Commit**

```bash
git add internal/session/history.go internal/session/history_test.go
git commit -m "feat(session): add FormatHistory for worker context injection"
```

---

## Task 2: Fix Schema enum and validPhases

**Files:**
- Modify: `internal/exec/result.go`

The Schema's `phase` enum currently only lists `["red","green","refactor"]`. Workers using any other phase name get forced to pick from that list (the retrospective worker was returning `"refactor"` as a result). Fix by removing the enum constraint from the schema (let it be any string) and rely solely on `validPhases` for server-side validation.

- [ ] **Step 1: Write the failing test**

```go
// Add to internal/exec/result_test.go (check existing file first for test helpers)
func TestValidateAcceptsAllPhases(t *testing.T) {
	phases := []string{"red", "green", "refactor", "retrospective", "review", "debug", "spec", "trainer"}
	for _, phase := range phases {
		r := exec.Result{Status: "pass", Phase: phase, Skill: "test", ModelUsed: "self", Message: "ok"}
		assert.NoError(t, r.Validate(), "phase %q should be valid", phase)
	}
}
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/exec/... -run TestValidateAcceptsAllPhases -v
```
Expected: FAIL — `"review"`, `"debug"`, `"spec"`, `"trainer"` fail validation

- [ ] **Step 3: Update result.go**

In `internal/exec/result.go`, make these two changes:

**Change 1** — update `validPhases`:
```go
var validPhases = map[string]bool{
	"red":           true,
	"green":         true,
	"refactor":      true,
	"retrospective": true,
	"review":        true,
	"debug":         true,
	"spec":          true,
	"trainer":       true,
}
```

**Change 2** — remove the enum constraint from the Schema `phase` property (replace the `"phase"` line):
```go
// Before:
"phase":         {"type": "string", "enum": ["red","green","refactor"]},

// After:
"phase":         {"type": "string"},
```

Also fix the error message in `Validate()`:
```go
// Before:
errs = append(errs, "phase must be red|green|refactor, got: "+r.Phase)

// After:
errs = append(errs, "phase must be one of red|green|refactor|retrospective|review|debug|spec|trainer, got: "+r.Phase)
```

- [ ] **Step 4: Run tests to confirm they pass**

```bash
go test ./internal/exec/... -v
```
Expected: all tests PASS, including `TestValidateAcceptsAllPhases`

- [ ] **Step 5: Commit**

```bash
git add internal/exec/result.go
git commit -m "fix(exec): expand validPhases and remove schema enum constraint for phase"
```

---

## Task 3: Session history injection in TDD green and refactor

**Files:**
- Modify: `internal/skills/tdd/skill.go`
- Modify: `internal/skills/tdd/handlers.go`
- Modify: `internal/skills/tdd/handlers_test.go`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing tests**

Add to `internal/skills/tdd/handlers_test.go`:

```go
func TestTDDGreenInjectsSessionHistory(t *testing.T) {
	sessDir := t.TempDir()
	require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{
		SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass",
		FilePath: "internal/foo/foo_test.go",
		Message:  "wrote failing test for Foo",
	}))

	var capturedPrompt string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		capturedPrompt = req.TaskPrompt
		return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil
	}

	sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: sessDir})
	_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
		`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go","test_cmd":"go test ./...","session_id":"sess-1"}`,
	))
	require.NoError(t, err)
	assert.Contains(t, capturedPrompt, "## Session history")
	assert.Contains(t, capturedPrompt, "wrote failing test for Foo")
}

func TestTDDGreenNoHistoryWhenSessionIDEmpty(t *testing.T) {
	var capturedPrompt string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		capturedPrompt = req.TaskPrompt
		return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil
	}

	sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
		`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go"}`,
	))
	require.NoError(t, err)
	assert.NotContains(t, capturedPrompt, "## Session history")
}
```

You will need these imports in the test file:
```go
import (
    "github.com/mathiasbq/supervisor/internal/session"
    iexec "github.com/mathiasbq/supervisor/internal/exec"
)
```

- [ ] **Step 2: Run tests to confirm they fail**

```bash
go test ./internal/skills/tdd/... -run TestTDDGreen -v
```
Expected: FAIL — `tdd.Config` has no `SessionsDir` field, `session_id` not in args

- [ ] **Step 3: Add SessionsDir to Config and session_id to tool schemas**

In `internal/skills/tdd/skill.go`, add `SessionsDir` to Config:
```go
type Config struct {
	SystemPrompt string
	SkillPrompt  string
	ExecutorFn   ExecutorFn
	DefaultModel string
	SessionsDir  string // optional: path to brain/sessions/ for history injection
}
```

In `Tools()`, add `"session_id"` as an optional property to `tdd_green` and `tdd_refactor` input schemas:
```go
// tdd_green InputSchema:
InputSchema: schema(
    []string{"project_root", "test_path"},
    map[string]any{
        "project_root": strProp,
        "test_path":    strProp,
        "model":        strProp,
        "test_cmd":     strProp,
        "session_id":   strProp,
    },
),
// tdd_refactor InputSchema (same addition):
InputSchema: schema(
    []string{"project_root", "test_path", "impl_path"},
    map[string]any{
        "project_root": strProp,
        "test_path":    strProp,
        "impl_path":    strProp,
        "model":        strProp,
        "test_cmd":     strProp,
        "session_id":   strProp,
    },
),
```

- [ ] **Step 4: Update greenArgs, refactorArgs, and inject history**

In `internal/skills/tdd/handlers.go`, add imports and update structs + handlers:

```go
import (
    // existing imports...
    "github.com/mathiasbq/supervisor/internal/session"
)

type greenArgs struct {
	ProjectRoot string `json:"project_root"`
	TestPath    string `json:"test_path"`
	Model       string `json:"model"`
	TestCmd     string `json:"test_cmd"`
	SessionID   string `json:"session_id"`
}

func (s *Skill) handleGreen(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
	var args greenArgs
	if err := json.Unmarshal(raw, &args); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if args.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if args.TestPath == "" {
		return nil, fmt.Errorf("test_path is required")
	}

	task := fmt.Sprintf(
		"phase: green\nproject_root: %s\ntest_path: %s\nmodel: %s\ntest_cmd: %s",
		args.ProjectRoot, args.TestPath, s.resolveModel(args.Model), args.TestCmd,
	)
	task = s.prependHistory(args.SessionID, "green", task)
	return s.execute(ctx, task)
}

type refactorArgs struct {
	ProjectRoot string `json:"project_root"`
	TestPath    string `json:"test_path"`
	ImplPath    string `json:"impl_path"`
	Model       string `json:"model"`
	TestCmd     string `json:"test_cmd"`
	SessionID   string `json:"session_id"`
}

func (s *Skill) handleRefactor(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
	var args refactorArgs
	if err := json.Unmarshal(raw, &args); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if args.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if args.TestPath == "" {
		return nil, fmt.Errorf("test_path is required")
	}
	if args.ImplPath == "" {
		return nil, fmt.Errorf("impl_path is required")
	}

	task := fmt.Sprintf(
		"phase: refactor\nproject_root: %s\ntest_path: %s\nimpl_path: %s\nmodel: %s\ntest_cmd: %s",
		args.ProjectRoot, args.TestPath, args.ImplPath, s.resolveModel(args.Model), args.TestCmd,
	)
	task = s.prependHistory(args.SessionID, "refactor", task)
	return s.execute(ctx, task)
}

// prependHistory reads the session log and prepends prior phase entries to the task prompt.
// If sessionID is empty or SessionsDir is not configured, task is returned unchanged.
func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

- [ ] **Step 5: Update main.go to pass SessionsDir to tdd.Config**

In `cmd/supervisor/main.go`, find the `tdd.New(tdd.Config{...})` call and add `SessionsDir`:
```go
reg.Register(tdd.New(tdd.Config{
	SystemPrompt: string(systemPrompt),
	SkillPrompt:  string(tddPrompt),
	DefaultModel: models.Resolve("tdd", ""),
	ExecutorFn:   executor.Run,
	SessionsDir:  cfg.SessionsDir,  // ← add this line
}))
```

- [ ] **Step 6: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 7: Commit**

```bash
git add internal/skills/tdd/skill.go internal/skills/tdd/handlers.go internal/skills/tdd/handlers_test.go cmd/supervisor/main.go
git commit -m "feat(tdd): inject session history into green and refactor worker prompts"
```

---

## Task 4: review skill

**Files:**
- Create: `internal/skills/review/skill.go`
- Create: `internal/skills/review/handlers.go`
- Create: `internal/skills/review/handlers_test.go`
- Create: `config/supervisor/review.md`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/review/handlers_test.go
package review_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestReviewToolRegistered(t *testing.T) {
	sk := review.New(review.Config{SkillPrompt: "review rules"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "review")
}

func TestReviewRequiresProjectRoot(t *testing.T) {
	sk := review.New(review.Config{SkillPrompt: "r"})
	_, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"files":["main.go"]}`))
	assert.ErrorContains(t, err, "project_root")
}

func TestReviewRequiresFiles(t *testing.T) {
	sk := review.New(review.Config{SkillPrompt: "r"})
	_, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"project_root":"/tmp"}`))
	assert.ErrorContains(t, err, "files")
}

func TestReviewCallsExecutor(t *testing.T) {
	called := false
	var capturedTask string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		called = true
		capturedTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "review", Skill: "review",
			Verified: true, ModelUsed: "self", Message: "2 warnings found",
		}, nil
	}

	sk := review.New(review.Config{SkillPrompt: "review rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	out, err := sk.Handle(context.Background(), "review", json.RawMessage(
		`{"project_root":"/tmp/proj","files":["internal/foo/foo.go"],"context":"PR: add Foo helper"}`,
	))
	require.NoError(t, err)
	assert.True(t, called)
	assert.Contains(t, capturedTask, "internal/foo/foo.go")
	assert.Contains(t, capturedTask, "PR: add Foo helper")

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "pass", result.Status)
	assert.Equal(t, "review", result.Phase)
}

var _ = require.New
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/review/... -v
```
Expected: FAIL — package `review` does not exist

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/review/skill.go
package review

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	SkillPrompt  string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "review" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	return []registry.ToolDef{
		{
			Name:        "review",
			Description: "Perform a structured code review of the specified files. Returns findings with severity levels.",
			InputSchema: schema(
				[]string{"project_root", "files"},
				map[string]any{
					"project_root": map[string]any{"type": "string"},
					"files":        map[string]any{"type": "array", "items": map[string]any{"type": "string"}},
					"context":      map[string]any{"type": "string"},
					"model":        map[string]any{"type": "string"},
					"session_id":   map[string]any{"type": "string"},
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/review/handlers.go
package review

import (
	"context"
	"encoding/json"
	"fmt"
	"strings"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type reviewArgs struct {
	ProjectRoot string   `json:"project_root"`
	Files       []string `json:"files"`
	Context     string   `json:"context"`
	Model       string   `json:"model"`
	SessionID   string   `json:"session_id"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "review" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a reviewArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if len(a.Files) == 0 {
		return nil, fmt.Errorf("files is required")
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	task := fmt.Sprintf(
		"phase: review\nproject_root: %s\nfiles: %s\ncontext: %s\nmodel: %s",
		a.ProjectRoot, strings.Join(a.Files, ", "), a.Context, model,
	)
	task = s.prependHistory(a.SessionID, "review", task)

	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}
	result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.SkillPrompt,
		TaskPrompt:  task,
		Model:       model,
		Tools:       "Read,Bash",
	})
	if err != nil {
		return nil, err
	}
	b, err := json.Marshal(result)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}

func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

- [ ] **Step 5: Create config/supervisor/review.md**

```markdown
# Code Review Discipline

You are a disciplined code reviewer. Read files carefully before commenting.

## Iron laws
1. Never approve security vulnerabilities: command injection, SQL injection, credential exposure, path traversal, unchecked input at system boundaries
2. Never approve silently swallowed errors — `err != nil` without wrapping or handling is always wrong
3. Never approve missing validation at system boundaries (user input, external APIs, file reads)

## Output contract
Return JSON result with:
- `status`: "pass" if no blocking issues; "fail" if any iron law is violated
- `phase`: "review"
- `skill`: "review"
- `file_path`: first file reviewed
- `runner_output`: full review formatted as:
  ```
  CRITICAL: <issue> at <file>:<line>
  WARNING: <issue> at <file>:<line>
  SUGGESTION: <issue> at <file>:<line>
  ```
- `verified`: true if you read all specified files; false if any were missing or unreadable
- `message`: "N critical, M warnings, K suggestions" or "clean: <which iron law checks passed and why>"

## Rules
1. Read every file listed before writing feedback
2. Check iron laws first — any violation is CRITICAL and sets status to "fail"
3. Then check: correctness, test coverage for new code, Go style conventions
4. Never rubber-stamp — if nothing is wrong, explain specifically which iron law checks you ran and why they passed
5. Line references are required for every finding — "roughly around the middle" is not acceptable
```

- [ ] **Step 6: Wire into main.go**

In `cmd/supervisor/main.go`, add after existing `os.ReadFile` calls for discipline files:
```go
reviewPrompt, err := os.ReadFile(cfg.ConfigDir + "/review.md")
if err != nil {
    logger.Error("read review.md", "path", cfg.ConfigDir+"/review.md", "err", err)
    os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/review"`

Register after existing `reg.Register` calls:
```go
reg.Register(review.New(review.Config{
    SkillPrompt:  string(reviewPrompt),
    DefaultModel: models.Resolve("review", ""),
    ExecutorFn:   executor.Run,
    SessionsDir:  cfg.SessionsDir,
}))
```

- [ ] **Step 7: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 8: Commit**

```bash
git add internal/skills/review/ config/supervisor/review.md cmd/supervisor/main.go
git commit -m "feat(review): add code review MCP skill with session history injection"
```

---

## Task 5: debug skill

**Files:**
- Create: `internal/skills/debug/skill.go`
- Create: `internal/skills/debug/handlers.go`
- Create: `internal/skills/debug/handlers_test.go`
- Create: `config/supervisor/debug.md`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/debug/handlers_test.go
package debug_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestDebugToolRegistered(t *testing.T) {
	sk := debug.New(debug.Config{SkillPrompt: "debug rules"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "debug")
}

func TestDebugRequiresProjectRoot(t *testing.T) {
	sk := debug.New(debug.Config{SkillPrompt: "d"})
	_, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"error":"panic: nil pointer"}`))
	assert.ErrorContains(t, err, "project_root")
}

func TestDebugRequiresError(t *testing.T) {
	sk := debug.New(debug.Config{SkillPrompt: "d"})
	_, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"project_root":"/tmp"}`))
	assert.ErrorContains(t, err, "error")
}

func TestDebugCallsExecutor(t *testing.T) {
	called := false
	var capturedTask string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		called = true
		capturedTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "debug", Skill: "debug",
			RunnerOutput: "HYPOTHESIS 1 (likelihood: high): nil map access\nVERIFY: go test ./... → expected: panic line reference",
			Verified:     false, ModelUsed: "self", Message: "3 hypotheses for: panic nil pointer at foo.go:42",
		}, nil
	}

	sk := debug.New(debug.Config{SkillPrompt: "debug rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	out, err := sk.Handle(context.Background(), "debug", json.RawMessage(
		`{"project_root":"/tmp/proj","error":"panic: nil pointer dereference at foo.go:42","context":"occurs on startup"}`,
	))
	require.NoError(t, err)
	assert.True(t, called)
	assert.Contains(t, capturedTask, "panic: nil pointer dereference")
	assert.Contains(t, capturedTask, "occurs on startup")

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "debug", result.Phase)
}

var _ = require.New
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/debug/... -v
```
Expected: FAIL — package does not exist

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/debug/skill.go
package debug

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	SkillPrompt  string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "debug" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	str := map[string]any{"type": "string"}
	return []registry.ToolDef{
		{
			Name:        "debug",
			Description: "Analyse an error and return 3-5 hypotheses ordered by likelihood, each with a concrete verification step.",
			InputSchema: schema(
				[]string{"project_root", "error"},
				map[string]any{
					"project_root": str,
					"error":        str,
					"context":      str,
					"model":        str,
					"session_id":   str,
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/debug/handlers.go
package debug

import (
	"context"
	"encoding/json"
	"fmt"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type debugArgs struct {
	ProjectRoot string `json:"project_root"`
	Error       string `json:"error"`
	Context     string `json:"context"`
	Model       string `json:"model"`
	SessionID   string `json:"session_id"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "debug" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a debugArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if a.Error == "" {
		return nil, fmt.Errorf("error is required")
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	task := fmt.Sprintf(
		"phase: debug\nproject_root: %s\nerror: %s\ncontext: %s\nmodel: %s",
		a.ProjectRoot, a.Error, a.Context, model,
	)
	task = s.prependHistory(a.SessionID, "debug", task)

	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}
	result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.SkillPrompt,
		TaskPrompt:  task,
		Model:       model,
		Tools:       "Read,Bash",
	})
	if err != nil {
		return nil, err
	}
	b, err := json.Marshal(result)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}

func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

- [ ] **Step 5: Create config/supervisor/debug.md**

```markdown
# Debug Discipline

You are a systematic debugger. Form hypotheses before suggesting fixes.

## Iron laws
1. Never suggest "try X and see what happens" — every hypothesis must have a specific expected outcome if correct
2. Generate exactly 3-5 hypotheses, ordered by likelihood (most likely first)
3. Never fix the bug — diagnose only; the caller decides what to do with the hypotheses

## Output contract
Return JSON result with:
- `status`: "pass" (hypotheses generated) or "error" (error too ambiguous to analyse)
- `phase`: "debug"
- `skill`: "debug"
- `file_path`: the most relevant file to the error (read it)
- `runner_output`: your hypotheses, formatted as:
  ```
  HYPOTHESIS 1 (likelihood: high): <mechanism>
  VERIFY: <exact command or file to check> → expected if correct: <specific output>

  HYPOTHESIS 2 (likelihood: medium): <mechanism>
  VERIFY: <exact command or file to check> → expected if correct: <specific output>
  ```
- `verified`: false — verification is the caller's job
- `message`: "N hypotheses for: <one-line error summary>"

## Rules
1. Read the error and any context files provided before forming hypotheses
2. Identify the failure mode first — what actually went wrong, not just what the error says
3. For each hypothesis: name the mechanism, explain why it would produce this exact error, give a concrete verification command with expected output
4. If the error is clearly a typo or trivial mistake, still form 3 hypotheses — surface the most likely cause as #1
```

- [ ] **Step 6: Wire into main.go**

Add file read:
```go
debugPrompt, err := os.ReadFile(cfg.ConfigDir + "/debug.md")
if err != nil {
    logger.Error("read debug.md", "path", cfg.ConfigDir+"/debug.md", "err", err)
    os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/debug"`

Register skill:
```go
reg.Register(debug.New(debug.Config{
    SkillPrompt:  string(debugPrompt),
    DefaultModel: models.Resolve("debug", ""),
    ExecutorFn:   executor.Run,
    SessionsDir:  cfg.SessionsDir,
}))
```

- [ ] **Step 7: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 8: Commit**

```bash
git add internal/skills/debug/ config/supervisor/debug.md cmd/supervisor/main.go
git commit -m "feat(debug): add debug MCP skill with hypothesis generation"
```

---

## Task 6: spec skill

**Files:**
- Create: `internal/skills/spec/skill.go`
- Create: `internal/skills/spec/handlers.go`
- Create: `internal/skills/spec/handlers_test.go`
- Create: `config/supervisor/spec.md`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/spec/handlers_test.go
package spec_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/skills/spec"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestSpecToolRegistered(t *testing.T) {
	sk := spec.New(spec.Config{SkillPrompt: "spec rules"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "spec")
}

func TestSpecRequiresProjectRoot(t *testing.T) {
	sk := spec.New(spec.Config{SkillPrompt: "s"})
	_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"requirements":"add login"}`))
	assert.ErrorContains(t, err, "project_root")
}

func TestSpecRequiresRequirements(t *testing.T) {
	sk := spec.New(spec.Config{SkillPrompt: "s"})
	_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"project_root":"/tmp"}`))
	assert.ErrorContains(t, err, "requirements")
}

func TestSpecCallsExecutor(t *testing.T) {
	called := false
	var capturedTask string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		called = true
		capturedTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "spec", Skill: "spec",
			FilePath: "/tmp/proj/docs/login-spec.md",
			Verified: true, ModelUsed: "self", Message: "spec written: login feature",
		}, nil
	}

	sk := spec.New(spec.Config{SkillPrompt: "spec rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	out, err := sk.Handle(context.Background(), "spec", json.RawMessage(
		`{"project_root":"/tmp/proj","requirements":"add OAuth2 login","output_path":"docs/login-spec.md"}`,
	))
	require.NoError(t, err)
	assert.True(t, called)
	assert.Contains(t, capturedTask, "OAuth2 login")
	assert.Contains(t, capturedTask, "docs/login-spec.md")

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "spec", result.Phase)
}

var _ = require.New
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/spec/... -v
```
Expected: FAIL — package does not exist

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/spec/skill.go
package spec

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	SkillPrompt  string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "spec" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	str := map[string]any{"type": "string"}
	return []registry.ToolDef{
		{
			Name:        "spec",
			Description: "Generate a structured implementation spec from requirements. Writes the spec to output_path in the project.",
			InputSchema: schema(
				[]string{"project_root", "requirements"},
				map[string]any{
					"project_root": str,
					"requirements": str,
					"output_path":  str,
					"context":      str,
					"model":        str,
					"session_id":   str,
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/spec/handlers.go
package spec

import (
	"context"
	"encoding/json"
	"fmt"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type specArgs struct {
	ProjectRoot  string `json:"project_root"`
	Requirements string `json:"requirements"`
	OutputPath   string `json:"output_path"`
	Context      string `json:"context"`
	Model        string `json:"model"`
	SessionID    string `json:"session_id"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "spec" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a specArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if a.Requirements == "" {
		return nil, fmt.Errorf("requirements is required")
	}
	outputPath := a.OutputPath
	if outputPath == "" {
		outputPath = "docs/spec.md"
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	task := fmt.Sprintf(
		"phase: spec\nproject_root: %s\nrequirements: %s\noutput_path: %s\ncontext: %s\nmodel: %s",
		a.ProjectRoot, a.Requirements, outputPath, a.Context, model,
	)
	task = s.prependHistory(a.SessionID, "spec", task)

	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}
	result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.SkillPrompt,
		TaskPrompt:  task,
		Model:       model,
		Tools:       "Read,Write",
	})
	if err != nil {
		return nil, err
	}
	b, err := json.Marshal(result)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}

func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

- [ ] **Step 5: Create config/supervisor/spec.md**

```markdown
# Spec Writing Discipline

You write structured implementation specs. Nothing is left ambiguous.

## Iron laws
1. Success criteria must be measurable — "the system is fast" is banned; "p99 < 200ms under 100 RPS" is valid
2. Always include an explicit "Out of scope" section — if you don't draw the boundary, the developer will guess wrong
3. Every technical decision in the approach must have a rationale

## Output contract
Return JSON result with:
- `status`: "pass" (spec written) or "error" (requirements too ambiguous to spec without more input)
- `phase`: "spec"
- `skill`: "spec"
- `file_path`: the output_path where the spec was written (absolute path)
- `runner_output`: ""
- `verified`: true if the file was written successfully
- `message`: "spec written: <one-line summary of what was specced>"

## Spec structure
Write the spec as markdown to the output_path:

```markdown
# [Feature] Spec

## Problem statement
[What problem does this solve? For whom? Why now?]

## Success criteria
- [ ] [Criterion 1 — measurable and verifiable]
- [ ] [Criterion 2 — measurable and verifiable]

## Constraints
[Non-negotiable requirements the solution must satisfy]

## Out of scope
[What we are explicitly NOT doing in this iteration]

## Technical approach
[Architecture decisions, key components, rationale for each choice]

## Risks
[What could go wrong, and how we'd mitigate it]
```

If the requirements are too vague to produce measurable success criteria, return status "error" with a message listing the specific questions that need answers.
```

- [ ] **Step 6: Wire into main.go**

Add file read:
```go
specPrompt, err := os.ReadFile(cfg.ConfigDir + "/spec.md")
if err != nil {
    logger.Error("read spec.md", "path", cfg.ConfigDir+"/spec.md", "err", err)
    os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/spec"`

Register skill:
```go
reg.Register(spec.New(spec.Config{
    SkillPrompt:  string(specPrompt),
    DefaultModel: models.Resolve("spec", ""),
    ExecutorFn:   executor.Run,
    SessionsDir:  cfg.SessionsDir,
}))
```

- [ ] **Step 7: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 8: Commit**

```bash
git add internal/skills/spec/ config/supervisor/spec.md cmd/supervisor/main.go
git commit -m "feat(spec): add spec writing MCP skill"
```

---

## Task 7: trainer skill

**Files:**
- Create: `internal/skills/trainer/skill.go`
- Create: `internal/skills/trainer/handlers.go`
- Create: `internal/skills/trainer/handlers_test.go`
- Create: `config/supervisor/trainer-reader.md`
- Create: `config/supervisor/trainer-writer.md`
- Modify: `cmd/supervisor/main.go`
- Modify: `config/models.yaml`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/trainer/handlers_test.go
package trainer_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestTrainerToolRegistered(t *testing.T) {
	sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "trainer")
}

func TestTrainerRequiresSessionID(t *testing.T) {
	sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"})
	_, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{}`))
	assert.ErrorContains(t, err, "session_id")
}

func TestTrainerCallsReaderThenWriter(t *testing.T) {
	sessDir := t.TempDir()
	require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{
		SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass",
		Message: "wrote failing test", FilePath: "internal/foo/foo_test.go",
	}))

	callCount := 0
	var readerTask, writerTask string

	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		callCount++
		if callCount == 1 {
			// reader call
			readerTask = req.TaskPrompt
			return iexec.Result{
				Status: "pass", Phase: "trainer", Skill: "trainer",
				RunnerOutput: `[{"type":"sft","moment":"first-pass clean TDD","score":4}]`,
				Verified: true, ModelUsed: "self", Message: "1 sft candidate found",
			}, nil
		}
		// writer call
		writerTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "trainer", Skill: "trainer",
			FilePath: sessDir + "/training-data/sft/sess-1.jsonl",
			Verified: true, ModelUsed: "self", Message: "1 sft pair written",
		}, nil
	}

	sk := trainer.New(trainer.Config{
		ReaderPrompt: "reader rules",
		WriterPrompt: "writer rules",
		ExecutorFn:   fakeFn,
		SessionsDir:  sessDir,
		BrainDir:     t.TempDir(),
	})
	out, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{"session_id":"sess-1"}`))
	require.NoError(t, err)

	assert.Equal(t, 2, callCount, "executor must be called exactly twice: reader then writer")
	assert.Contains(t, readerTask, "role: reader")
	assert.Contains(t, readerTask, "sess-1")
	assert.Contains(t, readerTask, "wrote failing test") // session history in reader prompt
	assert.Contains(t, writerTask, "role: writer")
	assert.Contains(t, writerTask, "sft candidate") // reader output passed to writer

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "trainer", result.Phase)
	assert.Equal(t, "pass", result.Status)
}

var _ = require.New
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/trainer/... -v
```
Expected: FAIL — package does not exist

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/trainer/skill.go
package trainer

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	ReaderPrompt string
	WriterPrompt string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
	BrainDir     string // root of brain/ directory; writer writes to BrainDir/training-data/
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "trainer" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	return []registry.ToolDef{
		{
			Name:        "trainer",
			Description: "Extract SFT and DPO training pairs from a session log. Runs a reader→writer chain: reader identifies learning moments, writer formats and writes pairs to brain/training-data/.",
			InputSchema: schema(
				[]string{"session_id"},
				map[string]any{
					"session_id": map[string]any{"type": "string"},
					"model":      map[string]any{"type": "string"},
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/trainer/handlers.go
package trainer

import (
	"context"
	"encoding/json"
	"fmt"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type trainArgs struct {
	SessionID string `json:"session_id"`
	Model     string `json:"model"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "trainer" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a trainArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.SessionID == "" {
		return nil, fmt.Errorf("session_id is required")
	}
	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	entries, err := session.Read(s.cfg.SessionsDir, a.SessionID)
	if err != nil {
		return nil, fmt.Errorf("read session log: %w", err)
	}

	// ── Step 1: Reader agent ─────────────────────────────────────────────────
	history := session.FormatHistory(entries, "")
	readerTask := fmt.Sprintf(
		"role: reader\nsession_id: %s\nbrain_dir: %s\n\n%s",
		a.SessionID, s.cfg.BrainDir, history,
	)
	readerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.ReaderPrompt,
		TaskPrompt:  readerTask,
		Model:       model,
		Tools:       "Read",
	})
	if err != nil {
		return nil, fmt.Errorf("reader agent: %w", err)
	}

	// ── Step 2: Writer agent (receives reader candidates) ────────────────────
	writerTask := fmt.Sprintf(
		"role: writer\nsession_id: %s\nbrain_dir: %s\n\nreader_candidates:\n%s",
		a.SessionID, s.cfg.BrainDir, readerResult.RunnerOutput,
	)
	writerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.WriterPrompt,
		TaskPrompt:  writerTask,
		Model:       model,
		Tools:       "Read,Write",
	})
	if err != nil {
		return nil, fmt.Errorf("writer agent: %w", err)
	}

	b, err := json.Marshal(writerResult)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}
```

- [ ] **Step 5: Create config/supervisor/trainer-reader.md**

```markdown
# Trainer Reader Discipline

You scan session logs and identify candidate learning moments worth converting to training data.

## What to look for
- **SFT candidates**: the worker did exactly the right thing — a clean pattern worth reinforcing
- **DPO candidates**: the worker first produced a wrong or suboptimal response, then corrected — you have both rejected and chosen

## Scoring (1–5)
- 5: novel pattern, clearly correct, generalises across projects
- 4: good pattern, correct, somewhat project-specific but still useful
- 3: correct but obvious — include only if especially clean
- 2 or below: skip — too ambiguous or too context-specific

## Output contract
Return JSON result with:
- `status`: "pass" or "error"
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: ""
- `runner_output`: JSON array of candidates (valid JSON, not markdown):
  [{"type":"sft","moment":"<what happened>","prompt":"<what was asked>","completion":"<what was done right>","score":4},
   {"type":"dpo","moment":"<what happened>","prompt":"<what was asked>","chosen":"<correct>","rejected":"<incorrect>","score":3}]
- `verified`: true
- `message`: "N sft candidates, M dpo candidates found"

## Rules
1. Read all session entries in the task prompt
2. Score each entry — only include entries scoring >= 3
3. Prompt/completion fields must be phrased to generalise: no project-specific paths or names
4. If no candidates score >= 3, return an empty array `[]` — never force low-quality candidates
```

- [ ] **Step 6: Create config/supervisor/trainer-writer.md**

```markdown
# Trainer Writer Discipline

You receive candidate learning moments from the reader and write clean SFT/DPO training pairs.

## Quality gate (apply before writing)
- SFT: prompt must be phrased so it could come from any project, not just this one
- DPO: chosen and rejected must be clearly distinguishable — skip if a reader can't tell which is better
- Never include project-specific paths, variable names, or identifiers in any pair

## Output contract
Return JSON result with:
- `status`: "pass" (pairs written or skipped due to quality) or "error" (candidates JSON was malformed)
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: path of the last file written (empty if nothing passed quality gate)
- `runner_output`: "N SFT pairs written to brain/training-data/sft/, M DPO pairs to brain/training-data/dpo/" or "0 pairs passed quality gate"
- `verified`: true if files were written; false if nothing passed
- `message`: "N sft + M dpo pairs for session <id>" or "no pairs passed quality gate"

## File format
JSONL — one JSON object per line.

SFT: `{"prompt": "...", "completion": "..."}`
DPO: `{"prompt": "...", "chosen": "...", "rejected": "..."}`

Write SFT to: `<brain_dir>/training-data/sft/<session_id>.jsonl`
Write DPO to: `<brain_dir>/training-data/dpo/<session_id>.jsonl`

Append to existing files if they exist (don't overwrite).

## Rules
1. Parse the `reader_candidates` JSON from the task prompt
2. For each candidate: apply quality gate
3. Write passing SFT candidates to sft JSONL, DPO candidates to dpo JSONL
4. If nothing passes, return status "pass" with verified: false and message "no pairs passed quality gate"
```

- [ ] **Step 7: Wire into main.go**

Add file reads:
```go
trainerReaderPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-reader.md")
if err != nil {
    logger.Error("read trainer-reader.md", "path", cfg.ConfigDir+"/trainer-reader.md", "err", err)
    os.Exit(1)
}
trainerWriterPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-writer.md")
if err != nil {
    logger.Error("read trainer-writer.md", "path", cfg.ConfigDir+"/trainer-writer.md", "err", err)
    os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/trainer"`

Register skill:
```go
reg.Register(trainer.New(trainer.Config{
    ReaderPrompt: string(trainerReaderPrompt),
    WriterPrompt: string(trainerWriterPrompt),
    DefaultModel: models.Resolve("trainer", ""),
    ExecutorFn:   executor.Run,
    SessionsDir:  cfg.SessionsDir,
    BrainDir:     cfg.BrainDir,
}))
```

- [ ] **Step 8: Update config/models.yaml**

Add `trainer` entry following existing format:
```yaml
skills:
  tdd:           ollama/qwen3-coder-30b-tuned
  review:        ollama/devstral-tuned
  debug:         ollama/deepseek-r1-tuned
  retrospective: ollama/qwen3-coder-30b-tuned
  spec:          ollama/qwen3-coder-30b-tuned
  trainer:       ollama/qwen3-coder-30b-tuned
```

- [ ] **Step 9: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS, including `TestTrainerCallsReaderThenWriter`

- [ ] **Step 10: Commit**

```bash
git add internal/skills/trainer/ config/supervisor/trainer-reader.md config/supervisor/trainer-writer.md config/models.yaml cmd/supervisor/main.go
git commit -m "feat(trainer): add trainer MCP skill with reader→writer sub-agent chain"
```

---

## Task 8: Integration smoke test and CI push

**Files:** none new — validates the full system

- [ ] **Step 1: Run task check (full quality gate)**

```bash
task check
```
Expected: lint, test, and vet all pass for both modules

- [ ] **Step 2: Verify all 12 MCP tools are registered**

With servers running (`task start` in another terminal):
```bash
curl -s -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
  | python3 -c "import json,sys; tools=json.load(sys.stdin)['result']['tools']; [print(t['name']) for t in tools]"
```

Expected output (12 tools):
```
tdd_red
tdd_green
tdd_refactor
brain_query
brain_write
tier
session_log
retrospective
review
debug
spec
trainer
```

- [ ] **Step 3: Smoke test each new skill**

```bash
# review
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"review","arguments":{"project_root":"/tmp","files":["nonexistent.go"],"context":"test"}}}'

# debug
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"debug","arguments":{"project_root":"/tmp","error":"test error"}}}'

# spec
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"spec","arguments":{"project_root":"/tmp","requirements":"add login button"}}}'

# trainer (requires a session log entry)
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"trainer","arguments":{"session_id":"2026-04-17-validate-hyperguild"}}}'
```

Each should return a valid JSON-RPC result (not `-32000` error). The actual worker call will take longer — `--max-time 10` just tests that the MCP dispatch layer works.

- [ ] **Step 4: Push and verify CI**

```bash
git push origin main
```

Watch the CI run complete on Gitea. Check for:
- `check` job: green (lint + test + vet)
- `mirror` job: green (pushed to GitHub)

```bash
curl -s "https://gitea.d-ma.be/api/v1/repos/mathias/hyperguild/actions/runs?limit=1" \
  -H "Authorization: token $GITEA_TOKEN" | python3 -c "
import json,sys
d=json.load(sys.stdin)
for r in d.get('workflow_runs',[]):
    print(f'#{r[\"id\"]} {r[\"status\"]:12} {r.get(\"conclusion\",\"\"):8} {r[\"display_title\"][:50]}')
"
```

- [ ] **Step 5: Tag v0.2.0**

```bash
task tag version=v0.2.0
```

---

## Self-review

**Spec coverage check:**
- Session history injection into tdd_green and tdd_refactor → Task 3 ✓
- All new skills accept `session_id` for history injection → Tasks 4–7 ✓
- Trainer reader→writer chain → Task 7 ✓
- Schema enum fixed (was causing retrospective to return wrong phase) → Task 2 ✓
- Phase 2 skills registered in main.go → Tasks 4–7 each include main.go wiring ✓
- CI passes → Task 8 ✓

**Placeholder scan:** None found — all steps include complete code.

**Type consistency:**
- `ExecutorFn` is defined in each skill package as `func(ctx context.Context, req iexec.Request) (iexec.Result, error)` — consistent across tdd, review, debug, spec, trainer ✓
- `Config.SessionsDir` present in all new skills ✓
- `trainer.Config.BrainDir` used in handlers.go Task 7 writer task prompt ✓
- `session.FormatHistory` signature `(entries []Entry, excludePhase string) string` used consistently ✓
- `prependHistory` method is defined identically in review, debug, spec handlers — this is intentional duplication (YAGNI: not enough skills to justify extracting a shared mixin) ✓

**Note on tasks 4–7:** These are independent of each other and can be executed in parallel by separate subagents after Tasks 1–3 are complete.