14 TDD-shaped tasks across two worktrees: hyperguild for code
(internal/routing package, cmd/routing binary, Dockerfile, CD
workflow, mode template, smoke test, docs) and infra for the
k3s manifests (deployment, service, nodeport, SOPS-encrypted
secret). Plan 7 amendment baked in: internal/skills/{review,
debug,retrospective,trainer} survive Plan 6 — Plan 7 only
deletes tdd, spec, and the supervisor binary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Mode 2 Routing Pod Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Ship a thin policy pod at `koala:30310` that routes the four cost-routable skill calls (`code_review`, `debug`, `retrospective`, `trainer`) to a LiteLLM-proxied local or Claude model based on per-skill pass rate. Replaces the unconditional supervisor-runs-locally behavior in client-local mode.

**Architecture:** New Go binary at `cmd/routing/`, reusing `internal/skills/{review,debug,retrospective,trainer}/`, `internal/exec/litellm.go`, `internal/registry`, and `internal/mcp` (bearer-auth handler from `f49850d`). A new `internal/routing` package adds (a) a pure-function decision policy, (b) a TTL-cached pass-rate fetcher, (c) a session-log decision logger, and (d) a router that wraps a `CompleteFunc` so the existing skill packages stay routing-oblivious. Deployed via Flux at NodePort `:30310` alongside the supervisor and ingestion pods.

**Tech Stack:** Go 1.26 stdlib (`net/http`, `crypto/sha256`, `encoding/json`, `time`, `sync`); existing `testify` for tests; SOPS-encrypted Secret in the infra repo; gitea CI buildctl→skopeo; Flux Kustomize reconciliation.

---

## Plan 6 of 7 — Hyperguild Skill Migration

Plans 1–5 merged. Plan 6 is the substantive routing-pod plan; Plan 7 (supervisor retirement) follows.

**Spec:** `docs/superpowers/specs/2026-05-04-mode-2-routing-pod-design.md` (committed `51e0123`).

### Two worktrees

- **Hyperguild worktree:** `~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod/` on branch `feat/mode-2-routing-pod`. Contains the Go code, Dockerfile addition, CD workflow update, mode-template update, README, and smoke test.
- **Infra worktree:** `~/Documents/local-dev/AI/infra/.worktrees/mode-2-routing-pod/` on branch `feat/routing-pod-manifests`. Contains the k3s manifests for the new pod plus the SOPS-encrypted Secret.

Each task names its worktree in a **Worktree:** line above its "Files" list. Implementer subagents must `cd` into the named worktree before any read/edit/git operation. Plan paths describe the post-merge canonical state (per `2026-05-03-plan-canonical-dispatch-ephemeral` brain entry); dispatch prompts add the worktree translation.

### Verification convention

Per task, the implementer runs `task check` (lint + test + vet + drift + govulncheck), not just `go test ./...`. CI's lint gate caught a Plan-1 errcheck regression that local tests missed (per `feedback_per_task_verification` memory). Append `//nolint:errcheck` to any `fmt.Fprint*` call that writes to stdout/stderr and ignores its return value. To ignore the error from a deferred `resp.Body.Close()`, use `defer func() { _ = resp.Body.Close() }()`.

### Status taxonomy for implementer subagents

- `DONE` — task completed, all checks green, verification commands ran clean.
- `DONE_WITH_CONCERNS` — task completed, but the implementer noticed a plan bug, an environmental anomaly, or related code that looks suspicious. Controller decides: doc-patch, follow-up commit, or accept and roll on (per `2026-05-03-done-with-concerns-vs-blocked` brain entry).
- `BLOCKED` — implementer cannot complete the assigned work. Controller re-dispatches with more context.
- `NEEDS_CONTEXT` — implementer needs information not in the dispatch (rare; usually a doc bug).

### Code-reviewer expectations

The reviewer agent surfaces candidate improvements; the controller filters. Per `2026-05-03-code-reviewer-output-as-candidates`, reject reviewer suggestions that add helpers for single-use sites, abstractions for hypothetical futures, or stylistic refactors that diverge from the plan's heredocs. Apply genuine bugs and security findings; defer the rest.

### Flux operational note

The auth rollout (commit `afe9a08` in infra) demonstrated that Flux server-side-applies the `routing` Deployment every ~30s and strips any `kubectl rollout restart` annotation, deleting the new ReplicaSet's pod. To force a pod restart on a Flux-managed deployment, use `kubectl -n <ns> delete pod -l app=<name>` — the existing ReplicaSet recreates without an annotation Flux can revert.

### Plan 7 amendment baked in

`internal/skills/{review,debug,retrospective,trainer}/` are reused by the routing pod and **must not be deleted in Plan 7**. Plan 7 deletes only `internal/skills/{tdd,spec}/`, the supervisor binary, the supervisor manifests, and frees NodePort `:30320`. The implementer of Plan 7 must read this paragraph and the matching note in the spec before deleting anything.

## File Structure

### Hyperguild worktree

| Path | Action | Responsibility |
|---|---|---|
| `internal/config/routing.go` | create | `RoutingConfig` typed struct, `LoadRouting()` env parser |
| `internal/config/routing_test.go` | create | Defaults + env-override tests |
| `internal/routing/policy.go` | create | `Decision` enum, `Policy.Decide(passRate, hash) Decision` |
| `internal/routing/policy_test.go` | create | Table-driven coverage of all four rules |
| `internal/routing/hash.go` | create | `CanonicalHash(system, user) uint64` (SHA-256 prefix) |
| `internal/routing/hash_test.go` | create | Determinism + low-bit distribution sanity |
| `internal/routing/passrate.go` | create | `Fetcher` with TTL cache, calls `GET /pass-rate` |
| `internal/routing/passrate_test.go` | create | `httptest.Server`; cache hit/miss, error path |
| `internal/routing/log.go` | create | `Logger.LogDecision(...)` posts to brain MCP `session_log` |
| `internal/routing/log_test.go` | create | `httptest.Server` capture + body shape assertion |
| `internal/routing/router.go` | create | `Router.Run(...)` wraps fetcher + policy + logger + LiteLLM |
| `internal/routing/router_test.go` | create | Mocked fetcher/logger/litellm; route + fail-open paths |
| `internal/routing/snapshot_test.go` | create | Asserts routing pod's `tools/list` byte-equals captured snapshot |
| `internal/routing/testdata/tools_list.snapshot.json` | create | Snapshot from current supervisor advertisement |
| `cmd/routing/main.go` | create | Wires Config → LiteLLM → Router → Skills → Registry → MCP server |
| `cmd/routing/main_test.go` | create | Integration test with fakes for LiteLLM + brain |
| `cmd/hyperguild/mode.go:74-87` | modify | `modeClientLocal` adds `headers: X-Hyperguild-Mode`, removes `_routing_pending` |
| `cmd/hyperguild/mode_test.go` | modify | Updated assertion for the new shape |
| `cmd/hyperguild/README.md` | modify | Drop "not deployed yet" note; document the header |
| `Dockerfile.routing` | create | Builds `cmd/routing`, bakes `config/`, runs as non-root, no claude CLI |
| `.gitea/workflows/cd.yml` | modify | Build + push routing image; sed `routing/deployment.yaml` in infra |
| `Taskfile.yml` | modify | Add `smoke:routing` task |
| `scripts/smoke-routing.sh` | create | Boots binary, hits each tool, asserts brain has `_routing` entries |
| `README.md` | modify | Mode 2 + new env vars + routing pod URL |
| `.context/PROJECT.md` | modify | Document `koala:30310/mcp` + the four routed skills |

### Infra worktree

| Path | Action | Responsibility |
|---|---|---|
| `k3s/apps/routing/namespace.yaml` | create | Namespace `routing` |
| `k3s/apps/routing/deployment.yaml` | create | One-replica Deployment, koala nodeSelector, image from gitea registry |
| `k3s/apps/routing/service.yaml` | create | ClusterIP `routing` on port 3210 |
| `k3s/apps/routing/nodeport.yaml` | create | NodePort 30310 → service 3210 |
| `k3s/apps/routing/secrets.enc.yaml` | create | SOPS-encrypted `LITELLM_API_KEY` + optional `ROUTING_MCP_TOKEN` |
| `k3s/apps/routing/kustomization.yaml` | create | Bundles the above |
| `k3s/apps/kustomization.yaml` | modify | Add `routing` to the apps list |

---

## Task 1: `RoutingConfig` struct + env parser

**Worktree:** hyperguild

Typed config struct for the routing pod. New struct (not appended to `Config`) because the routing pod's surface differs from the supervisor's; merging would force every routing field onto the supervisor and vice versa.

**Files:**
- Create: `internal/config/routing.go`
- Create: `internal/config/routing_test.go`

- [ ] **Step 1: Write the failing test**

Create `internal/config/routing_test.go`:

```go
package config_test

import (
	"testing"

	"github.com/mathiasbq/supervisor/internal/config"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestLoadRoutingDefaults(t *testing.T) {
	for _, k := range []string{
		"ROUTING_PORT", "ROUTING_MCP_TOKEN", "LITELLM_BASE_URL", "LITELLM_API_KEY",
		"BRAIN_URL", "HYPERGUILD_LOCAL_MODEL", "HYPERGUILD_CLAUDE_MODEL",
		"HYPERGUILD_ROUTE_LOCAL_FLOOR", "HYPERGUILD_ROUTE_LOCAL_CEIL",
		"HYPERGUILD_PASS_RATE_TTL_SECONDS",
	} {
		t.Setenv(k, "")
	}

	cfg, err := config.LoadRouting()
	require.NoError(t, err)
	assert.Equal(t, "3210", cfg.Port)
	assert.Equal(t, "", cfg.MCPAuthToken)
	assert.Equal(t, "http://piguard:4000", cfg.LiteLLMBaseURL)
	assert.Equal(t, "http://ingestion.supervisor:3300", cfg.BrainURL)
	assert.Equal(t, "qwen35", cfg.LocalModel)
	assert.Equal(t, "claude-sonnet-4-6", cfg.ClaudeModel)
	assert.InDelta(t, 0.90, cfg.RouteLocalFloor, 1e-9)
	assert.InDelta(t, 0.70, cfg.RouteLocalCeil, 1e-9)
	assert.Equal(t, 60, cfg.PassRateTTLSeconds)
}

func TestLoadRoutingFromEnv(t *testing.T) {
	t.Setenv("ROUTING_PORT", "3250")
	t.Setenv("ROUTING_MCP_TOKEN", "tok-xyz")
	t.Setenv("LITELLM_BASE_URL", "http://localhost:4000")
	t.Setenv("LITELLM_API_KEY", "lk")
	t.Setenv("BRAIN_URL", "http://localhost:3300")
	t.Setenv("HYPERGUILD_LOCAL_MODEL", "qwen2-7b")
	t.Setenv("HYPERGUILD_CLAUDE_MODEL", "claude-opus-4-7")
	t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "0.85")
	t.Setenv("HYPERGUILD_ROUTE_LOCAL_CEIL", "0.65")
	t.Setenv("HYPERGUILD_PASS_RATE_TTL_SECONDS", "30")

	cfg, err := config.LoadRouting()
	require.NoError(t, err)
	assert.Equal(t, "3250", cfg.Port)
	assert.Equal(t, "tok-xyz", cfg.MCPAuthToken)
	assert.Equal(t, "http://localhost:4000", cfg.LiteLLMBaseURL)
	assert.Equal(t, "lk", cfg.LiteLLMAPIKey)
	assert.Equal(t, "http://localhost:3300", cfg.BrainURL)
	assert.Equal(t, "qwen2-7b", cfg.LocalModel)
	assert.Equal(t, "claude-opus-4-7", cfg.ClaudeModel)
	assert.InDelta(t, 0.85, cfg.RouteLocalFloor, 1e-9)
	assert.InDelta(t, 0.65, cfg.RouteLocalCeil, 1e-9)
	assert.Equal(t, 30, cfg.PassRateTTLSeconds)
}

func TestLoadRoutingRejectsBadFloat(t *testing.T) {
	t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "not-a-number")
	_, err := config.LoadRouting()
	require.Error(t, err)
	assert.Contains(t, err.Error(), "HYPERGUILD_ROUTE_LOCAL_FLOOR")
}
```

- [ ] **Step 2: Run the test to confirm it fails**

```bash
cd ~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod
go test ./internal/config/... -run TestLoadRouting -v
```

Expected: FAIL — `undefined: config.LoadRouting` (the test never names `config.RoutingConfig` directly, so only the function is reported).

- [ ] **Step 3: Write the implementation**

Create `internal/config/routing.go`:

```go
package config

import (
	"fmt"
	"os"
	"strconv"
)

// RoutingConfig holds the runtime configuration for the routing pod.
// Separate from Config because the routing pod's surface differs from the supervisor's.
type RoutingConfig struct {
	Port               string  // ROUTING_PORT, default 3210
	MCPAuthToken       string  // ROUTING_MCP_TOKEN, optional bearer token
	LiteLLMBaseURL     string  // LITELLM_BASE_URL, default http://piguard:4000
	LiteLLMAPIKey      string  // LITELLM_API_KEY
	BrainURL           string  // BRAIN_URL, default http://ingestion.supervisor:3300
	LocalModel         string  // HYPERGUILD_LOCAL_MODEL, default qwen35
	ClaudeModel        string  // HYPERGUILD_CLAUDE_MODEL, default claude-sonnet-4-6
	RouteLocalFloor    float64 // HYPERGUILD_ROUTE_LOCAL_FLOOR, default 0.90
	RouteLocalCeil     float64 // HYPERGUILD_ROUTE_LOCAL_CEIL, default 0.70
	PassRateTTLSeconds int     // HYPERGUILD_PASS_RATE_TTL_SECONDS, default 60
}

func LoadRouting() (RoutingConfig, error) {
	cfg := RoutingConfig{
		Port:           envOr("ROUTING_PORT", "3210"),
		MCPAuthToken:   os.Getenv("ROUTING_MCP_TOKEN"),
		LiteLLMBaseURL: envOr("LITELLM_BASE_URL", "http://piguard:4000"),
		LiteLLMAPIKey:  os.Getenv("LITELLM_API_KEY"),
		BrainURL:       envOr("BRAIN_URL", "http://ingestion.supervisor:3300"),
		LocalModel:     envOr("HYPERGUILD_LOCAL_MODEL", "qwen35"),
		ClaudeModel:    envOr("HYPERGUILD_CLAUDE_MODEL", "claude-sonnet-4-6"),
	}

	floor, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_FLOOR", 0.90)
	if err != nil {
		return RoutingConfig{}, err
	}
	cfg.RouteLocalFloor = floor

	ceil, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_CEIL", 0.70)
	if err != nil {
		return RoutingConfig{}, err
	}
	cfg.RouteLocalCeil = ceil

	ttl, err := parseIntEnv("HYPERGUILD_PASS_RATE_TTL_SECONDS", 60)
	if err != nil {
		return RoutingConfig{}, err
	}
	cfg.PassRateTTLSeconds = ttl

	return cfg, nil
}

func parseFloatEnv(key string, def float64) (float64, error) {
	v := os.Getenv(key)
	if v == "" {
		return def, nil
	}
	f, err := strconv.ParseFloat(v, 64)
	if err != nil {
		return 0, fmt.Errorf("config: %s: %w", key, err)
	}
	return f, nil
}

func parseIntEnv(key string, def int) (int, error) {
	v := os.Getenv(key)
	if v == "" {
		return def, nil
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		return 0, fmt.Errorf("config: %s: %w", key, err)
	}
	return n, nil
}
```

- [ ] **Step 4: Run the test to confirm it passes**

```bash
go test ./internal/config/... -run TestLoadRouting -v
```

Expected: PASS — three tests green (`TestLoadRoutingDefaults`, `TestLoadRoutingFromEnv`, `TestLoadRoutingRejectsBadFloat`).

- [ ] **Step 5: Run `task check`**

```bash
task check 2>&1 | tail -20
```

Expected: lint clean, test green, vet clean, no drift, govulncheck clean.

- [ ] **Step 6: Commit**

```bash
git add internal/config/routing.go internal/config/routing_test.go
git commit -m "feat(routing): RoutingConfig + LoadRouting"
```

---

## Task 2: Decision policy

**Worktree:** hyperguild

Pure-function policy with no I/O. Decision rules in priority order: null → local; ≥floor → local; <ceil → claude; otherwise sample-band hash split.

**Files:**
- Create: `internal/routing/policy.go`
- Create: `internal/routing/policy_test.go`

- [ ] **Step 1: Write the failing test**

Create `internal/routing/policy_test.go`:

```go
package routing_test

import (
	"testing"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
)

func ptr(f float64) *float64 { return &f }

func TestPolicyDecide(t *testing.T) {
	p := routing.Policy{Floor: 0.9, Ceil: 0.7}

	cases := []struct {
		name     string
		passRate *float64
		hash     uint64
		want     routing.Decision
	}{
		{"null pass rate → local", nil, 0, routing.DecideLocal},
		{"null pass rate, hash irrelevant → local", nil, 0xDEADBEEF, routing.DecideLocal},
		{"at floor → local", ptr(0.9), 0, routing.DecideLocal},
		{"above floor → local", ptr(0.95), 0, routing.DecideLocal},
		{"below ceil → claude", ptr(0.5), 0, routing.DecideClaude},
		{"at ceil → sample-band even-hash → local", ptr(0.7), 0, routing.DecideLocal},
		{"sample band, even hash → local", ptr(0.8), 2, routing.DecideLocal},
		{"sample band, odd hash → claude", ptr(0.8), 3, routing.DecideClaude},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			assert.Equal(t, tc.want, p.Decide(tc.passRate, tc.hash))
		})
	}
}
```

- [ ] **Step 2: Run the test**

```bash
go test ./internal/routing/... -run TestPolicyDecide -v
```

Expected: FAIL — package `internal/routing` does not exist.

- [ ] **Step 3: Write the implementation**

Create `internal/routing/policy.go`:

```go
package routing

// Decision is the route picked for a single skill call.
type Decision int

const (
	DecideLocal Decision = iota
	DecideClaude
)

func (d Decision) String() string {
	if d == DecideLocal {
		return "local"
	}
	return "claude"
}

// Policy holds the floor/ceil thresholds for routing decisions.
//
// Rules (in order):
//
//  1. passRate == nil → DecideLocal (default-to-local for cost-routable skills)
//  2. *passRate >= Floor → DecideLocal (trust local)
//  3. *passRate < Ceil → DecideClaude (don't trust local)
//  4. otherwise (sample band) → requestHash low bit picks: 0=local, 1=claude
type Policy struct {
	Floor float64
	Ceil  float64
}

// Decide returns the routing decision for a single call.
// requestHash is consulted only when passRate is in the sample band [Ceil, Floor).
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision {
	if passRate == nil {
		return DecideLocal
	}
	if *passRate >= p.Floor {
		return DecideLocal
	}
	if *passRate < p.Ceil {
		return DecideClaude
	}
	if requestHash&1 == 0 {
		return DecideLocal
	}
	return DecideClaude
}
```

- [ ] **Step 4: Run the test to confirm it passes**

```bash
go test ./internal/routing/... -run TestPolicyDecide -v
```

Expected: PASS — eight subtests green.

- [ ] **Step 5: Run `task check`**

```bash
task check 2>&1 | tail -10
```

- [ ] **Step 6: Commit**

```bash
git add internal/routing/policy.go internal/routing/policy_test.go
git commit -m "feat(routing): decision policy"
```

---

## Task 3: Canonical request hash

**Worktree:** hyperguild

SHA-256-based hash of `(system, user)` for deterministic sample-band routing. Same prompt pair → same decision across calls.

**Files:**
- Create: `internal/routing/hash.go`
- Create: `internal/routing/hash_test.go`

- [ ] **Step 1: Write the failing test**

Create `internal/routing/hash_test.go`:

```go
package routing_test

import (
	"testing"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
)

func TestCanonicalHashDeterministic(t *testing.T) {
	a := routing.CanonicalHash("system one", "user one")
	b := routing.CanonicalHash("system one", "user one")
	assert.Equal(t, a, b, "same inputs must produce same hash")
}

func TestCanonicalHashDistinguishesInputs(t *testing.T) {
	cases := [][2]string{
		{"sys", "user"},
		{"sys", "user2"},
		{"sys2", "user"},
		{"", "system\x00user"}, // separator collision attempt
		{"system\x00user", ""},
	}
	seen := make(map[uint64]bool)
	for _, c := range cases {
		h := routing.CanonicalHash(c[0], c[1])
		assert.False(t, seen[h], "collision on %v", c)
		seen[h] = true
	}
}

func TestCanonicalHashLowBitDistribution(t *testing.T) {
	// Sanity check: across 1000 distinct inputs, low-bit split is roughly even.
	zeros, ones := 0, 0
	for i := 0; i < 1000; i++ {
		h := routing.CanonicalHash("sys", string(rune('a'+(i%26)))+string(rune(i)))
		if h&1 == 0 {
			zeros++
		} else {
			ones++
		}
	}
	// Allow ±150 around 500/500 (15 percentage points of the 1000 samples).
	// Tighter would be flaky on real data.
	assert.InDelta(t, 500, zeros, 150)
	assert.InDelta(t, 500, ones, 150)
}
```

- [ ] **Step 2: Run the test**

```bash
go test ./internal/routing/... -run TestCanonicalHash -v
```

Expected: FAIL — `undefined: routing.CanonicalHash`.

- [ ] **Step 3: Write the implementation**

Create `internal/routing/hash.go`:

```go
package routing

import (
	"crypto/sha256"
	"encoding/binary"
)

// CanonicalHash returns a deterministic 64-bit hash of (system, user).
// Used to make sample-band routing decisions reproducible: identical input
// strings produce the same hash on every call, independent of process state.
//
// Inputs are joined with a 0x00 byte separator before hashing — distinguishes
// (system="ab", user="cd") from (system="abcd", user="").
func CanonicalHash(system, user string) uint64 {
	h := sha256.New()
	h.Write([]byte(system))
	h.Write([]byte{0})
	h.Write([]byte(user))
	sum := h.Sum(nil)
	return binary.BigEndian.Uint64(sum[:8])
}
```
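
To see how the hash feeds the sample band from Task 2, the sketch below condenses `CanonicalHash` and rule 4 of the policy into one standalone program. `route` is a hypothetical helper for illustration only; in the plan the real composition happens in the router.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// canonicalHash mirrors CanonicalHash: SHA-256 over system, 0x00, user,
// truncated to the first 8 bytes.
func canonicalHash(system, user string) uint64 {
	h := sha256.New()
	h.Write([]byte(system))
	h.Write([]byte{0})
	h.Write([]byte(user))
	return binary.BigEndian.Uint64(h.Sum(nil)[:8])
}

// route applies only the sample-band rule (HYPOTHETICAL helper):
// even low bit → local, odd low bit → claude.
func route(system, user string) string {
	if canonicalHash(system, user)&1 == 0 {
		return "local"
	}
	return "claude"
}

func main() {
	first := route("You are a reviewer.", "diff --git a/x b/x")
	second := route("You are a reviewer.", "diff --git a/x b/x")
	// Identical prompts always land on the same side of the split.
	fmt.Println(first == second)
}
```

Because the decision depends only on the prompt bytes, a retried call in the sample band is routed the same way as the original, which keeps pass-rate comparisons between the two models stable.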

- [ ] **Step 4: Run tests + `task check`**

```bash
go test ./internal/routing/... -run TestCanonicalHash -v
task check 2>&1 | tail -10
```

Expected: PASS, all checks green.

- [ ] **Step 5: Commit**

```bash
git add internal/routing/hash.go internal/routing/hash_test.go
git commit -m "feat(routing): canonical request hash"
```

---

## Task 4: Pass-rate fetcher with TTL cache

**Worktree:** hyperguild

HTTP client that calls `GET ${BrainURL}/pass-rate?skill=X&window=7d`, caches the response (`*float64`, possibly nil) for `TTL`. On error, returns `(nil, err)` so the dispatch wrapper falls through to default-to-local.

**Files:**
- Create: `internal/routing/passrate.go`
- Create: `internal/routing/passrate_test.go`

- [ ] **Step 1: Write the failing test**

Create `internal/routing/passrate_test.go`:

```go
package routing_test

import (
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"testing"
	"time"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestFetcherGetReturnsPassRate(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		assert.Equal(t, http.MethodGet, r.Method)
		assert.Equal(t, "/pass-rate", r.URL.Path)
		assert.Equal(t, "tdd", r.URL.Query().Get("skill"))
		assert.Equal(t, "7d", r.URL.Query().Get("window"))
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]any{"skill": "tdd", "pass_rate": 0.94})
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	pr, err := f.Get(context.Background(), "tdd")
	require.NoError(t, err)
	require.NotNil(t, pr)
	assert.InDelta(t, 0.94, *pr, 1e-9)
}

func TestFetcherGetReturnsNilWhenNoData(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_ = json.NewEncoder(w).Encode(map[string]any{"skill": "novel", "pass_rate": nil})
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	pr, err := f.Get(context.Background(), "novel")
	require.NoError(t, err)
	assert.Nil(t, pr)
}

func TestFetcherCachesWithinTTL(t *testing.T) {
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		atomic.AddInt32(&calls, 1)
		_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.5})
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	for i := 0; i < 5; i++ {
		_, err := f.Get(context.Background(), "tdd")
		require.NoError(t, err)
	}
	assert.Equal(t, int32(1), atomic.LoadInt32(&calls), "should hit upstream once and serve four times from cache")
}

func TestFetcherSurfacesUpstreamError(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "boom", http.StatusInternalServerError)
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	pr, err := f.Get(context.Background(), "tdd")
	require.Error(t, err)
	assert.Nil(t, pr)
}
```

- [ ] **Step 2: Run the test**

```bash
go test ./internal/routing/... -run TestFetcher -v
```

Expected: FAIL — `undefined: routing.NewFetcher`.

- [ ] **Step 3: Write the implementation**

Create `internal/routing/passrate.go`:

```go
package routing

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"sync"
	"time"
)

// Fetcher reads /pass-rate from the brain pod with a per-skill TTL cache.
type Fetcher struct {
	BaseURL string
	Window  string
	TTL     time.Duration
	HTTP    *http.Client

	mu    sync.Mutex
	cache map[string]cachedRate
}

type cachedRate struct {
	value *float64
	at    time.Time
}

type passRateResponse struct {
	PassRate *float64 `json:"pass_rate"`
}

// NewFetcher returns a Fetcher that calls baseURL + /pass-rate with the
// given window string. If ttl is zero, defaults to 60 seconds. The HTTP
// client uses a 1-second total timeout.
func NewFetcher(baseURL, window string, ttl time.Duration) *Fetcher {
	if ttl == 0 {
		ttl = 60 * time.Second
	}
	return &Fetcher{
		BaseURL: baseURL,
		Window:  window,
		TTL:     ttl,
		HTTP:    &http.Client{Timeout: time.Second},
		cache:   make(map[string]cachedRate),
	}
}

// Get returns the pass rate for the named skill, or nil if no data exists,
// or an error if the brain is unreachable. Caches successful results.
func (f *Fetcher) Get(ctx context.Context, skill string) (*float64, error) {
	f.mu.Lock()
	if c, ok := f.cache[skill]; ok && time.Since(c.at) < f.TTL {
		v := c.value
		f.mu.Unlock()
		return v, nil
	}
	f.mu.Unlock()

	u := fmt.Sprintf("%s/pass-rate?skill=%s&window=%s",
		f.BaseURL, url.QueryEscape(skill), url.QueryEscape(f.Window))
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return nil, fmt.Errorf("passrate: build request: %w", err)
	}
	resp, err := f.HTTP.Do(req)
	if err != nil {
		return nil, fmt.Errorf("passrate: request: %w", err)
	}
	defer func() { _ = resp.Body.Close() }()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("passrate: server returned status %d", resp.StatusCode)
	}

	var body passRateResponse
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, fmt.Errorf("passrate: decode: %w", err)
	}

	f.mu.Lock()
	f.cache[skill] = cachedRate{value: body.PassRate, at: time.Now()}
	f.mu.Unlock()

	return body.PassRate, nil
}
```

- [ ] **Step 4: Run tests + `task check`**

```bash
go test ./internal/routing/... -run TestFetcher -v
task check 2>&1 | tail -10
```

- [ ] **Step 5: Commit**

```bash
git add internal/routing/passrate.go internal/routing/passrate_test.go
git commit -m "feat(routing): pass-rate fetcher with TTL cache"
```

---

## Task 5: Decision logger

**Worktree:** hyperguild

Posts a `session_log` MCP call to the brain pod's `/mcp` endpoint after every routing decision. Best-effort: returns errors but the caller does not block real work on them.

**Files:**
- Create: `internal/routing/log.go`
- Create: `internal/routing/log_test.go`

- [ ] **Step 1: Write the failing test**

Create `internal/routing/log_test.go`:

```go
package routing_test

import (
	"context"
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"testing"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestLoggerLogDecision(t *testing.T) {
	var captured map[string]any
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		assert.Equal(t, http.MethodPost, r.Method)
		assert.Equal(t, "/mcp", r.URL.Path)
		body, _ := io.ReadAll(r.Body)
		// assert, not require: FailNow must not be called from the handler goroutine.
		assert.NoError(t, json.Unmarshal(body, &captured))
		_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{"content": []map[string]any{{"type": "text", "text": "ok"}}}})
	}))
	defer srv.Close()

	l := routing.NewLogger(srv.URL)
	err := l.LogDecision(context.Background(), routing.LogEntry{
		SessionID:   "sess-1",
		Skill:       "code_review",
		Decision:    "local",
		Message:     "model=qwen35, pass_rate=0.94",
		ProjectRoot: "/home/x/proj",
		DurationMs:  1234,
		Failed:      false,
	})
	require.NoError(t, err)

	params := captured["params"].(map[string]any)
	assert.Equal(t, "tools/call", captured["method"])
	assert.Equal(t, "session_log", params["name"])

	args := params["arguments"].(map[string]any)
	assert.Equal(t, "_routing", args["skill"])
	assert.Equal(t, "decide", args["phase"])
	assert.Equal(t, "skip", args["final_status"])
	assert.Contains(t, args["message"].(string), "code_review: local")
	assert.Equal(t, "sess-1", args["session_id"])
	assert.Equal(t, "/home/x/proj", args["project_root"])
	assert.Equal(t, float64(1234), args["duration_ms"])
}

func TestLoggerLogFailure(t *testing.T) {
	var captured map[string]any
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		_ = json.Unmarshal(body, &captured)
		_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
	}))
	defer srv.Close()

	l := routing.NewLogger(srv.URL)
	err := l.LogDecision(context.Background(), routing.LogEntry{
		SessionID: "s", Skill: "debug", Decision: "local", Message: "litellm down", Failed: true,
	})
	require.NoError(t, err)

	args := captured["params"].(map[string]any)["arguments"].(map[string]any)
	assert.Equal(t, "fail", args["final_status"])
}

func TestLoggerSurfacesUpstreamError(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "down", http.StatusBadGateway)
	}))
	defer srv.Close()

	l := routing.NewLogger(srv.URL)
	err := l.LogDecision(context.Background(), routing.LogEntry{Skill: "x", SessionID: "y", Decision: "local"})
	require.Error(t, err)
}
```
|
||
|
||
- [ ] **Step 2: Run the test**
|
||
|
||
```bash
|
||
go test ./internal/routing/... -run TestLogger -v
|
||
```
|
||
|
||
Expected: FAIL — `undefined: routing.NewLogger`.
|
||
|
||
- [ ] **Step 3: Write the implementation**
|
||
|
||
Create `internal/routing/log.go`:
|
||
|
||
```go
|
||
package routing
|
||
|
||
import (
|
||
"bytes"
|
||
"context"
|
||
"encoding/json"
|
||
"fmt"
|
||
"net/http"
|
||
"time"
|
||
)
|
||
|
||
// LogEntry describes a single routing decision to log via the brain MCP.
|
||
type LogEntry struct {
|
||
SessionID string
|
||
Skill string // the original skill the call routed (e.g., "code_review")
|
||
Decision string // "local" or "claude" or "claude_fallback"
|
||
Message string // free-form, e.g. "model=qwen35, pass_rate=0.94"
|
||
ProjectRoot string
|
||
DurationMs int64
|
||
Failed bool // true → final_status: "fail"; false → "skip"
|
||
}
|
||
|
||
// Logger posts session_log entries to a brain MCP at BrainURL + /mcp.
|
||
type Logger struct {
|
||
BrainURL string
|
||
HTTP *http.Client
|
||
}
|
||
|
||
// NewLogger creates a Logger with a 2-second HTTP timeout.
|
||
func NewLogger(brainURL string) *Logger {
|
||
return &Logger{
|
||
BrainURL: brainURL,
|
||
HTTP: &http.Client{Timeout: 2 * time.Second},
|
||
}
|
||
}
|
||
|
||
// LogDecision posts a session_log MCP call. Errors are returned but the caller
|
||
// MUST NOT block real work on them — logging is best-effort.
|
||
func (l *Logger) LogDecision(ctx context.Context, e LogEntry) error {
|
||
status := "skip"
|
||
if e.Failed {
|
||
status = "fail"
|
||
}
|
||
payload := map[string]any{
|
||
"jsonrpc": "2.0",
|
||
"id": 1,
|
||
"method": "tools/call",
|
||
"params": map[string]any{
|
||
"name": "session_log",
|
||
"arguments": map[string]any{
|
||
"session_id": e.SessionID,
|
||
"skill": "_routing",
|
||
"phase": "decide",
|
||
"final_status": status,
|
||
"message": fmt.Sprintf("%s: %s — %s", e.Skill, e.Decision, e.Message),
|
||
"duration_ms": e.DurationMs,
|
||
"project_root": e.ProjectRoot,
|
||
},
|
||
},
|
||
}
|
||
body, err := json.Marshal(payload)
|
||
if err != nil {
|
||
return fmt.Errorf("log: marshal: %w", err)
|
||
}
|
||
req, err := http.NewRequestWithContext(ctx, http.MethodPost, l.BrainURL+"/mcp", bytes.NewReader(body))
|
||
if err != nil {
|
||
return fmt.Errorf("log: build request: %w", err)
|
||
}
|
||
req.Header.Set("Content-Type", "application/json")
|
||
resp, err := l.HTTP.Do(req)
|
||
if err != nil {
|
||
return fmt.Errorf("log: request: %w", err)
|
||
}
|
||
defer func() { _ = resp.Body.Close() }()
|
||
if resp.StatusCode != http.StatusOK {
|
||
return fmt.Errorf("log: server returned status %d", resp.StatusCode)
|
||
}
|
||
return nil
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 4: Run tests + `task check`**
|
||
|
||
```bash
|
||
go test ./internal/routing/... -run TestLogger -v
|
||
task check 2>&1 | tail -10
|
||
```
|
||
|
||
- [ ] **Step 5: Commit**
|
||
|
||
```bash
|
||
git add internal/routing/log.go internal/routing/log_test.go
|
||
git commit -m "feat(routing): decision logger via brain MCP session_log"
|
||
```
|
||
|
||
---
|
||
|
||
## Task 6: Router (dispatch wrapper)
|
||
|
||
**Worktree:** hyperguild
|
||
|
||
Composes Fetcher + Policy + Logger + a `CompleteFunc`. Each of the four skill packages receives a closure around `Router.Run` as its `CompleteFunc`. On a local-route error, the router fails open by retrying once on the Claude model.
|
||
|
||
**Files:**
|
||
- Create: `internal/routing/router.go`
|
||
- Create: `internal/routing/router_test.go`
|
||
|
||
- [ ] **Step 1: Write the failing test**
|
||
|
||
Create `internal/routing/router_test.go`:
|
||
|
||
```go
|
||
package routing_test
|
||
|
||
import (
|
||
"context"
|
||
"encoding/json"
|
||
"errors"
|
||
"net/http"
|
||
"net/http/httptest"
|
||
"sync"
|
||
"testing"
|
||
"time"
|
||
|
||
"github.com/mathiasbq/supervisor/internal/routing"
|
||
"github.com/stretchr/testify/assert"
|
||
"github.com/stretchr/testify/require"
|
||
)
|
||
|
||
type fakeLLM struct {
|
||
mu sync.Mutex
|
||
calls []struct{ Model, System, User string }
|
||
resp string
|
||
err error
|
||
errOn string // if non-empty, only the named model errors
|
||
}
|
||
|
||
func (f *fakeLLM) Complete(_ context.Context, model, system, user string) (string, int64, error) {
|
||
f.mu.Lock()
|
||
defer f.mu.Unlock()
|
||
f.calls = append(f.calls, struct{ Model, System, User string }{model, system, user})
|
||
if f.errOn == model {
|
||
return "", 0, f.err
|
||
}
|
||
if f.err != nil && f.errOn == "" {
|
||
return "", 0, f.err
|
||
}
|
||
return f.resp, 100, nil
|
||
}
|
||
|
||
func newRouter(t *testing.T, llm *fakeLLM, passRate float64) (*routing.Router, *httptest.Server, *httptest.Server) {
|
||
t.Helper()
|
||
brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||
switch r.URL.Path {
|
||
case "/pass-rate":
|
||
_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": passRate})
|
||
case "/mcp":
|
||
_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
|
||
}
|
||
}))
|
||
t.Cleanup(brain.Close)
|
||
|
||
r := &routing.Router{
|
||
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
|
||
Logger: routing.NewLogger(brain.URL),
|
||
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
|
||
LocalModel: "qwen35",
|
||
ClaudeModel: "claude-sonnet-4-6",
|
||
Complete: llm.Complete,
|
||
}
|
||
return r, brain, brain
|
||
}
|
||
|
||
func TestRouterRoutesLocalAtHighPassRate(t *testing.T) {
|
||
llm := &fakeLLM{resp: "ok"}
|
||
r, _, _ := newRouter(t, llm, 0.95)
|
||
|
||
out, _, err := r.Run(context.Background(), routing.RunInput{
|
||
Skill: "code_review", System: "sys", User: "user", SessionID: "s1", ProjectRoot: "/p",
|
||
})
|
||
require.NoError(t, err)
|
||
assert.Equal(t, "ok", out)
|
||
|
||
llm.mu.Lock()
|
||
defer llm.mu.Unlock()
|
||
require.Len(t, llm.calls, 1)
|
||
assert.Equal(t, "qwen35", llm.calls[0].Model)
|
||
}
|
||
|
||
func TestRouterRoutesClaudeAtLowPassRate(t *testing.T) {
|
||
llm := &fakeLLM{resp: "ok"}
|
||
r, _, _ := newRouter(t, llm, 0.3)
|
||
|
||
_, _, err := r.Run(context.Background(), routing.RunInput{
|
||
Skill: "code_review", System: "sys", User: "user", SessionID: "s2",
|
||
})
|
||
require.NoError(t, err)
|
||
|
||
llm.mu.Lock()
|
||
defer llm.mu.Unlock()
|
||
require.Len(t, llm.calls, 1)
|
||
assert.Equal(t, "claude-sonnet-4-6", llm.calls[0].Model)
|
||
}
|
||
|
||
func TestRouterFailsOpenLocalErrorToClaude(t *testing.T) {
|
||
llm := &fakeLLM{resp: "ok-after-fallback", err: errors.New("local boom"), errOn: "qwen35"}
|
||
r, _, _ := newRouter(t, llm, 0.95) // would route local
|
||
|
||
out, _, err := r.Run(context.Background(), routing.RunInput{
|
||
Skill: "code_review", System: "sys", User: "user", SessionID: "s3",
|
||
})
|
||
require.NoError(t, err)
|
||
assert.Equal(t, "ok-after-fallback", out)
|
||
|
||
llm.mu.Lock()
|
||
defer llm.mu.Unlock()
|
||
require.Len(t, llm.calls, 2)
|
||
assert.Equal(t, "qwen35", llm.calls[0].Model)
|
||
assert.Equal(t, "claude-sonnet-4-6", llm.calls[1].Model)
|
||
}
|
||
|
||
func TestRouterDefaultsToLocalWhenBrainUnreachable(t *testing.T) {
|
||
// Brain returns 500 → fetcher errors → router treats pass rate as nil → local.
|
||
brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||
http.Error(w, "down", http.StatusInternalServerError)
|
||
}))
|
||
defer brain.Close()
|
||
|
||
llm := &fakeLLM{resp: "ok"}
|
||
r := &routing.Router{
|
||
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
|
||
Logger: routing.NewLogger(brain.URL),
|
||
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
|
||
LocalModel: "qwen35",
|
||
ClaudeModel: "claude-sonnet-4-6",
|
||
Complete: llm.Complete,
|
||
}
|
||
|
||
_, _, err := r.Run(context.Background(), routing.RunInput{
|
||
Skill: "code_review", System: "sys", User: "user", SessionID: "s4",
|
||
})
|
||
require.NoError(t, err)
|
||
|
||
llm.mu.Lock()
|
||
defer llm.mu.Unlock()
|
||
require.Len(t, llm.calls, 1)
|
||
assert.Equal(t, "qwen35", llm.calls[0].Model)
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 2: Run the test**
|
||
|
||
```bash
|
||
go test ./internal/routing/... -run TestRouter -v
|
||
```
|
||
|
||
Expected: FAIL — `undefined: routing.Router`, `undefined: routing.RunInput`.
|
||
|
||
- [ ] **Step 3: Write the implementation**
|
||
|
||
Create `internal/routing/router.go`:
|
||
|
||
```go
|
||
package routing
|
||
|
||
import (
|
||
"context"
|
||
"fmt"
|
||
"log/slog"
|
||
)
|
||
|
||
// CompleteFunc matches the signature used by every skill package's Config.
|
||
type CompleteFunc func(ctx context.Context, model, system, user string) (string, int64, error)
|
||
|
||
// RunInput captures the per-call inputs the dispatch wrapper needs.
|
||
type RunInput struct {
|
||
Skill string
|
||
System string
|
||
User string
|
||
SessionID string
|
||
ProjectRoot string
|
||
}
|
||
|
||
// Router composes a pass-rate fetcher, a decision policy, a session logger,
// and a LiteLLM client. Skill packages receive a CompleteFunc closure that
// calls Router.Run.
|
||
type Router struct {
|
||
Fetcher *Fetcher
|
||
Logger *Logger
|
||
Policy Policy
|
||
LocalModel string
|
||
ClaudeModel string
|
||
Complete CompleteFunc
|
||
}
|
||
|
||
// Run executes one skill call: decides local vs claude, calls LiteLLM, logs the
// decision. On local-side error, fails open by retrying once on the Claude model.
|
||
func (r *Router) Run(ctx context.Context, in RunInput) (string, int64, error) {
|
||
pr, ferr := r.Fetcher.Get(ctx, in.Skill)
|
||
if ferr != nil {
|
||
slog.Warn("router: pass-rate unreachable, defaulting to local", "skill", in.Skill, "err", ferr)
|
||
pr = nil
|
||
}
|
||
hash := CanonicalHash(in.System, in.User)
|
||
decision := r.Policy.Decide(pr, hash)
|
||
|
||
model := r.ClaudeModel
|
||
if decision == DecideLocal {
|
||
model = r.LocalModel
|
||
}
|
||
|
||
out, ms, err := r.Complete(ctx, model, in.System, in.User)
|
||
_ = r.Logger.LogDecision(ctx, LogEntry{
|
||
SessionID: in.SessionID,
|
||
Skill: in.Skill,
|
||
Decision: decision.String(),
|
||
Message: fmt.Sprintf("model=%s, pass_rate=%s", model, formatPassRate(pr)),
|
||
ProjectRoot: in.ProjectRoot,
|
||
DurationMs: ms,
|
||
Failed: err != nil,
|
||
})
|
||
|
||
if err != nil && decision == DecideLocal {
|
||
slog.Warn("router: local failed, falling open to claude", "skill", in.Skill, "err", err)
|
||
out, ms, err = r.Complete(ctx, r.ClaudeModel, in.System, in.User)
|
||
_ = r.Logger.LogDecision(ctx, LogEntry{
|
||
SessionID: in.SessionID,
|
||
Skill: in.Skill,
|
||
Decision: "claude_fallback",
|
||
Message: fmt.Sprintf("model=%s, after-local-error", r.ClaudeModel),
|
||
ProjectRoot: in.ProjectRoot,
|
||
DurationMs: ms,
|
||
Failed: err != nil,
|
||
})
|
||
}
|
||
return out, ms, err
|
||
}
|
||
|
||
func formatPassRate(pr *float64) string {
|
||
if pr == nil {
|
||
return "null"
|
||
}
|
||
return fmt.Sprintf("%.2f", *pr)
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 4: Run tests + `task check`**
|
||
|
||
```bash
|
||
go test ./internal/routing/... -run TestRouter -v
|
||
task check 2>&1 | tail -10
|
||
```
|
||
|
||
- [ ] **Step 5: Commit**
|
||
|
||
```bash
|
||
git add internal/routing/router.go internal/routing/router_test.go
|
||
git commit -m "feat(routing): router dispatch wrapper"
|
||
```
|
||
|
||
---
|
||
|
||
## Task 7: Snapshot test for tool-schema parity
|
||
|
||
**Worktree:** hyperguild
|
||
|
||
Capture the supervisor's current advertisement of the four routed skills (`code_review`, `debug`, `retrospective`, `trainer`) into a JSON snapshot file. Add a test that builds a registry with the same four skill packages and asserts that its tool definitions equal the snapshot after JSON normalization (whitespace and object key order are ignored). This pins the schema contract, so a change in any skill package fails the routing pod's test loudly.
|
||
|
||
**Files:**
|
||
- Create: `internal/routing/testdata/tools_list.snapshot.json`
|
||
- Create: `internal/routing/snapshot_test.go`
|
||
|
||
- [ ] **Step 1: Capture the supervisor's current advertisement**
|
||
|
||
```bash
|
||
cd ~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod
|
||
mkdir -p internal/routing/testdata
|
||
go run ./cmd/supervisor &
|
||
SUPERVISOR_PID=$!
|
||
sleep 2
|
||
curl -sS -X POST http://localhost:3200/mcp \
|
||
-H 'Content-Type: application/json' \
|
||
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
|
||
| jq '.result.tools | map(select(.name == "code_review" or .name == "debug" or .name == "retrospective" or .name == "trainer")) | sort_by(.name)' \
|
||
> internal/routing/testdata/tools_list.snapshot.json
|
||
kill $SUPERVISOR_PID
|
||
wait $SUPERVISOR_PID 2>/dev/null
|
||
```
|
||
|
||
If the supervisor binary requires extra env vars to start, set them inline:
|
||
|
||
```bash
|
||
SUPERVISOR_CONFIG_DIR=./config/supervisor go run ./cmd/supervisor &
|
||
```
|
||
|
||
Inspect the file:
|
||
|
||
```bash
|
||
jq 'length' internal/routing/testdata/tools_list.snapshot.json
|
||
```
|
||
|
||
Expected: `4`.
|
||
|
||
- [ ] **Step 2: Write the failing test**
|
||
|
||
Create `internal/routing/snapshot_test.go`:
|
||
|
||
```go
|
||
package routing_test
|
||
|
||
import (
	"context"
	"encoding/json"
	"os"
	"sort"
	"testing"

	"github.com/mathiasbq/supervisor/internal/registry"
	"github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/mathiasbq/supervisor/internal/skills/retrospective"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestToolsListMatchesSupervisorSnapshot pins the four routed skills' tool
// definitions to the supervisor's current advertisement. If a skill package
// changes its schema, this test fails loudly so the snapshot can be updated
// in lockstep with the consumer.
func TestToolsListMatchesSupervisorSnapshot(t *testing.T) {
	// The stub is never invoked: the test only inspects tool definitions.
	complete := func(_ context.Context, _, _, _ string) (string, int64, error) {
		return "", 0, nil
	}
|
||
|
||
reg := registry.New()
|
||
reg.Register(review.New(review.Config{
|
||
SkillPrompt: "stub",
|
||
DefaultModel: "stub",
|
||
CompleteFunc: complete,
|
||
}))
|
||
reg.Register(debug.New(debug.Config{
|
||
SkillPrompt: "stub",
|
||
DefaultModel: "stub",
|
||
CompleteFunc: complete,
|
||
}))
|
||
reg.Register(retrospective.New(retrospective.Config{
|
||
SkillPrompt: "stub",
|
||
DefaultModel: "stub",
|
||
CompleteFunc: complete,
|
||
}))
|
||
reg.Register(trainer.New(trainer.Config{
|
||
ReaderPrompt: "stub",
|
||
WriterPrompt: "stub",
|
||
DefaultModel: "stub",
|
||
CompleteFunc: complete,
|
||
}))
|
||
|
||
tools := reg.Tools()
|
||
// Filter to the four routed skills only (registry may expose additional tools).
|
||
wanted := map[string]bool{"code_review": true, "debug": true, "retrospective": true, "trainer": true}
|
||
var routed []registry.ToolDef
|
||
for _, td := range tools {
|
||
if wanted[td.Name] {
|
||
routed = append(routed, td)
|
||
}
|
||
}
|
||
sort.Slice(routed, func(i, j int) bool { return routed[i].Name < routed[j].Name })
|
||
|
||
got, err := json.MarshalIndent(routed, "", " ")
|
||
require.NoError(t, err)
|
||
|
||
want, err := os.ReadFile("testdata/tools_list.snapshot.json")
|
||
require.NoError(t, err)
|
||
|
||
// Normalize both via re-encode so whitespace differences don't dominate.
|
||
var gotV, wantV any
|
||
require.NoError(t, json.Unmarshal(got, &gotV))
|
||
require.NoError(t, json.Unmarshal(want, &wantV))
|
||
|
||
gotN, _ := json.MarshalIndent(gotV, "", " ")
|
||
wantN, _ := json.MarshalIndent(wantV, "", " ")
|
||
|
||
assert.Equal(t, string(wantN), string(gotN),
|
||
"tool advertisement drifted from supervisor snapshot — update testdata/tools_list.snapshot.json deliberately if the schema change is intentional")
|
||
}
|
||
```
|
||
|
||
If the actual skill tool name is `review` rather than `code_review` (or vice versa), discover by inspecting `internal/skills/review/skill.go`'s `Tools()` and adjust both the snapshot capture filter and the test's `wanted` map. Use the discovered name throughout the rest of the plan.
|
||
|
||
- [ ] **Step 3: Run the test**
|
||
|
||
```bash
|
||
go test ./internal/routing/... -run TestToolsListMatchesSupervisorSnapshot -v
|
||
```
|
||
|
||
Expected: PASS — the snapshot was captured from the same registry the test exercises. If FAIL, the captured names differ from the wanted map; reconcile names per the note above.
|
||
|
||
- [ ] **Step 4: `task check`**
|
||
|
||
```bash
|
||
task check 2>&1 | tail -10
|
||
```
|
||
|
||
- [ ] **Step 5: Commit**
|
||
|
||
```bash
|
||
git add internal/routing/snapshot_test.go internal/routing/testdata/tools_list.snapshot.json
|
||
git commit -m "test(routing): pin tool-schema parity with supervisor"
|
||
```
|
||
|
||
---
|
||
|
||
## Task 8: `cmd/routing/main.go` wiring
|
||
|
||
**Worktree:** hyperguild
|
||
|
||
Compose the binary: load config, build LiteLLM client, build Fetcher/Logger/Router, register the four skills, mount on the existing `internal/mcp` server with bearer auth.
|
||
|
||
**Files:**
|
||
- Create: `cmd/routing/main.go`
|
||
- Create: `cmd/routing/main_test.go`
|
||
|
||
- [ ] **Step 1: Write the integration test first**
|
||
|
||
Create `cmd/routing/main_test.go`:
|
||
|
||
```go
|
||
package main_test
|
||
|
||
import (
	"context"
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"os/exec"
	"strings"
	"sync/atomic"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestRoutingPodEndToEnd boots the binary against fake LiteLLM + brain servers,
// calls tools/list and one tools/call, and verifies the brain saw both a
// /pass-rate lookup and a session_log POST.
func TestRoutingPodEndToEnd(t *testing.T) {
	if testing.Short() {
		t.Skip("end-to-end binary boot")
	}

	// Handlers run on the test server's goroutines, so count hits atomically.
	var brainHits atomic.Int32
	llm := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		_ = json.NewEncoder(w).Encode(map[string]any{
			"choices": []map[string]any{{"message": map[string]any{"role": "assistant", "content": "stub"}}},
		})
	}))
	defer llm.Close()

	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/pass-rate":
			brainHits.Add(1)
			_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.95})
		case "/mcp":
			brainHits.Add(1)
			_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
		}
	}))
	defer brain.Close()

	bin := buildRouting(t)
	cmd := exec.Command(bin)
	// go test runs in the package directory (cmd/routing), so the repo-root
	// config directory is two levels up.
	cmd.Env = []string{
		"ROUTING_PORT=33310",
		"LITELLM_BASE_URL=" + llm.URL,
		"LITELLM_API_KEY=stub",
		"BRAIN_URL=" + brain.URL,
		"SUPERVISOR_CONFIG_DIR=../../config/supervisor",
		"PATH=" + os.Getenv("PATH"),
	}
	require.NoError(t, cmd.Start())
	t.Cleanup(func() { _ = cmd.Process.Kill() })

	require.NoError(t, waitForPort(t, "127.0.0.1:33310", 5*time.Second))

	resp := mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":1,"method":"tools/list"}`)
	assert.Contains(t, resp, "code_review")

	resp = mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"code_review","arguments":{"project_root":"/tmp","files":["README.md"]}}}`)
	_ = resp // shape varies by skill; we only need a 200

	// The router logs inline, so both hits should already be recorded;
	// poll briefly as a safety margin.
	deadline := time.Now().Add(2 * time.Second)
	for time.Now().Before(deadline) && brainHits.Load() < 2 {
		time.Sleep(50 * time.Millisecond)
	}
	assert.GreaterOrEqual(t, brainHits.Load(), int32(2), "expected at least one /pass-rate hit and one /mcp session_log hit")
}
```

Add helpers in the same file:

```go
// buildRouting compiles the package under test into a temp dir. The build
// target is "." because go test runs in the package directory.
func buildRouting(t *testing.T) string {
	t.Helper()
	bin := t.TempDir() + "/routing"
	out, err := exec.Command("go", "build", "-o", bin, ".").CombinedOutput()
	require.NoError(t, err, "build failed: %s", out)
	return bin
}

// waitForPort polls /healthz until the server answers or dur elapses.
func waitForPort(t *testing.T, addr string, dur time.Duration) error {
	t.Helper()
	deadline := time.Now().Add(dur)
	for time.Now().Before(deadline) {
		resp, err := http.Get("http://" + addr + "/healthz")
		if err == nil {
			_ = resp.Body.Close()
			return nil // any HTTP response means the listener is up
		}
		time.Sleep(50 * time.Millisecond)
	}
	return context.DeadlineExceeded
}

// mcpCall POSTs a JSON-RPC body, requires a 200, and returns the raw response.
func mcpCall(t *testing.T, url, body string) string {
	t.Helper()
	r, err := http.Post(url, "application/json", strings.NewReader(body))
	require.NoError(t, err)
	defer func() { _ = r.Body.Close() }()
	require.Equal(t, http.StatusOK, r.StatusCode)
	b, err := io.ReadAll(r.Body)
	require.NoError(t, err)
	return string(b)
}
|
||
```
|
||
|
||
- [ ] **Step 2: Run the test**
|
||
|
||
```bash
|
||
go test ./cmd/routing/... -v
|
||
```
|
||
|
||
Expected: FAIL — `cmd/routing/main.go` doesn't exist.
|
||
|
||
- [ ] **Step 3: Write the binary**
|
||
|
||
Create `cmd/routing/main.go`:
|
||
|
||
```go
|
||
// cmd/routing/main.go
|
||
package main
|
||
|
||
import (
|
||
"context"
|
||
"log/slog"
|
||
"net/http"
|
||
"os"
|
||
"time"
|
||
|
||
"github.com/mathiasbq/supervisor/internal/config"
|
||
iexec "github.com/mathiasbq/supervisor/internal/exec"
|
||
"github.com/mathiasbq/supervisor/internal/mcp"
|
||
"github.com/mathiasbq/supervisor/internal/registry"
|
||
"github.com/mathiasbq/supervisor/internal/routing"
|
||
"github.com/mathiasbq/supervisor/internal/skills/debug"
|
||
"github.com/mathiasbq/supervisor/internal/skills/retrospective"
|
||
"github.com/mathiasbq/supervisor/internal/skills/review"
|
||
"github.com/mathiasbq/supervisor/internal/skills/trainer"
|
||
)
|
||
|
||
func main() {
|
||
logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
|
||
slog.SetDefault(logger)
|
||
|
||
cfg, err := config.LoadRouting()
|
||
if err != nil {
|
||
logger.Error("config load failed", "err", err)
|
||
os.Exit(1)
|
||
}
|
||
|
||
// Load prompts from config dir (same files the supervisor uses).
|
||
configDir := envOr("SUPERVISOR_CONFIG_DIR", "/app/config/supervisor")
|
||
mustRead := func(path string) string {
|
||
b, err := os.ReadFile(configDir + "/" + path)
|
||
if err != nil {
|
||
logger.Error("read prompt failed", "path", path, "err", err)
|
||
os.Exit(1)
|
||
}
|
||
return string(b)
|
||
}
|
||
|
||
llm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)
|
||
|
||
router := &routing.Router{
|
||
Fetcher: routing.NewFetcher(cfg.BrainURL, "7d", time.Duration(cfg.PassRateTTLSeconds)*time.Second),
|
||
Logger: routing.NewLogger(cfg.BrainURL),
|
||
Policy: routing.Policy{Floor: cfg.RouteLocalFloor, Ceil: cfg.RouteLocalCeil},
|
||
LocalModel: cfg.LocalModel,
|
||
ClaudeModel: cfg.ClaudeModel,
|
||
Complete: llm.Complete,
|
||
}
|
||
|
||
// Skill packages call CompleteFunc(ctx, model, system, user) — no session_id
|
||
// or project_root in the signature. Rather than modifying every skill's API
|
||
// (and inflating Plan 6's blast radius), the routing pod logs every decision
|
||
// under a fixed session_id "_routing". Operators query
|
||
// `GET /pass-rate?skill=_routing&window=...` to inspect routing health; per-
|
||
// session correlation is sacrificed for a much simpler implementation.
|
||
const routingSessionID = "_routing"
|
||
wrap := func(skillName string) routing.CompleteFunc {
|
||
return func(ctx context.Context, _, system, user string) (string, int64, error) {
|
||
// The model param is ignored: the router picks the model based on policy.
|
||
return router.Run(ctx, routing.RunInput{
|
||
Skill: skillName,
|
||
System: system,
|
||
User: user,
|
||
SessionID: routingSessionID,
|
||
ProjectRoot: "",
|
||
})
|
||
}
|
||
}
|
||
|
||
reg := registry.New()
|
||
reg.Register(review.New(review.Config{
|
||
SkillPrompt: mustRead("review.md"),
|
||
DefaultModel: cfg.LocalModel,
|
||
CompleteFunc: wrap("code_review"),
|
||
}))
|
||
reg.Register(debug.New(debug.Config{
|
||
SkillPrompt: mustRead("debug.md"),
|
||
DefaultModel: cfg.LocalModel,
|
||
CompleteFunc: wrap("debug"),
|
||
}))
|
||
reg.Register(retrospective.New(retrospective.Config{
|
||
SkillPrompt: mustRead("retrospective.md"),
|
||
DefaultModel: cfg.LocalModel,
|
||
CompleteFunc: wrap("retrospective"),
|
||
}))
|
||
reg.Register(trainer.New(trainer.Config{
|
||
ReaderPrompt: mustRead("trainer-reader.md"),
|
||
WriterPrompt: mustRead("trainer-writer.md"),
|
||
DefaultModel: cfg.LocalModel,
|
||
CompleteFunc: wrap("trainer"),
|
||
}))
|
||
|
||
srv := mcp.NewServer(reg, cfg.MCPAuthToken)
|
||
mux := http.NewServeMux()
|
||
mux.Handle("/mcp", srv)
|
||
mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
|
||
w.WriteHeader(http.StatusOK)
|
||
})
|
||
|
||
	addr := ":" + cfg.Port
	logger.Info("routing pod starting", "addr", addr, "version", version,
		"local", cfg.LocalModel, "claude", cfg.ClaudeModel,
		"floor", cfg.RouteLocalFloor, "ceil", cfg.RouteLocalCeil)
	if err := http.ListenAndServe(addr, mux); err != nil {
		logger.Error("server stopped", "err", err)
		os.Exit(1)
	}
}

// version is overridden at build time via -ldflags "-X main.version=...".
var version = "dev"

func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}
|
||
```
|
||
|
||
If the existing skill packages' `Config` field names differ from what's used here (e.g. `SkillPrompt` vs `Prompt`), adjust by reading each package's `skill.go`.
|
||
|
||
- [ ] **Step 4: Run integration test + `task check`**
|
||
|
||
```bash
|
||
go test ./cmd/routing/... -v
|
||
task check 2>&1 | tail -15
|
||
```
|
||
|
||
Expected: PASS for both.
|
||
|
||
- [ ] **Step 5: Commit**
|
||
|
||
```bash
|
||
git add cmd/routing/main.go cmd/routing/main_test.go
|
||
git commit -m "feat(routing): cmd/routing binary"
|
||
```
|
||
|
||
---
|
||
|
||
## Task 9: Update `mode client-local` template
|
||
|
||
**Worktree:** hyperguild
|
||
|
||
Replace the `_routing_pending` placeholder with a real `headers` block carrying `X-Hyperguild-Mode: client-local`. URL stays at `koala:30310/mcp`.
|
||
|
||
**Files:**
|
||
- Modify: `cmd/hyperguild/mode.go`
|
||
- Modify: `cmd/hyperguild/mode_test.go`
|
||
- Modify: `cmd/hyperguild/README.md`
|
||
|
||
- [ ] **Step 1: Update the failing test**
|
||
|
||
In `cmd/hyperguild/mode_test.go`, find the existing `TestModeClientLocal` (or equivalent). Add an assertion for the new shape:
|
||
|
||
```go
|
||
func TestModeClientLocalHasRoutingHeader(t *testing.T) {
|
||
tmp := t.TempDir() + "/mcp.json"
|
||
out := &bytes.Buffer{}
|
||
stderr := &bytes.Buffer{}
|
||
require.NoError(t, runMode(context.Background(), []string{"client-local", "--out", tmp}, nil, out, stderr))
|
||
|
||
body, err := os.ReadFile(tmp)
|
||
require.NoError(t, err)
|
||
var doc map[string]any
|
||
require.NoError(t, json.Unmarshal(body, &doc))
|
||
|
||
servers := doc["mcpServers"].(map[string]any)
|
||
routing := servers["routing"].(map[string]any)
|
||
assert.Equal(t, "http://koala:30310/mcp", routing["url"])
|
||
assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")
|
||
|
||
headers, ok := routing["headers"].(map[string]any)
|
||
require.True(t, ok, "routing entry should have headers block")
|
||
assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 2: Run the test**
|
||
|
||
```bash
|
||
go test ./cmd/hyperguild/... -run TestModeClientLocal -v
|
||
```
|
||
|
||
Expected: FAIL — `_routing_pending` is still there OR `headers` is missing.
|
||
|
||
- [ ] **Step 3: Update `mode.go`**
|
||
|
||
Replace the `routing` entry inside `modeClientLocal`:
|
||
|
||
```go
|
||
"routing": map[string]any{
|
||
"url": "http://koala:30310/mcp",
|
||
"description": "Mode 2 routing pod — routes skill calls to LiteLLM/local",
|
||
"headers": map[string]any{
|
||
"X-Hyperguild-Mode": "client-local",
|
||
},
|
||
},
|
||
```
|
||
|
||
- [ ] **Step 4: Update `cmd/hyperguild/README.md`**
|
||
|
||
Find the section that mentions "Plan 6 — routing pod not deployed yet" and rewrite that paragraph:
|
||
|
||
```markdown
|
||
The `routing` entry points at `koala:30310/mcp` (the routing pod, deployed
in Plan 6). The `X-Hyperguild-Mode: client-local` header is sent for forward
compatibility with future modes; the pod treats absent or unknown values as
`client-local`.
|
||
```
|
||
|
||
- [ ] **Step 5: Run tests + `task check`**
|
||
|
||
```bash
|
||
go test ./cmd/hyperguild/... -run TestModeClientLocal -v
|
||
task check 2>&1 | tail -10
|
||
```
|
||
|
||
- [ ] **Step 6: Commit**
|
||
|
||
```bash
|
||
git add cmd/hyperguild/mode.go cmd/hyperguild/mode_test.go cmd/hyperguild/README.md
|
||
git commit -m "feat(hyperguild): mode client-local writes routing headers"
|
||
```
|
||
|
||
---
|
||
|
||
## Task 10: `Dockerfile.routing` + CD workflow extension
|
||
|
||
**Worktree:** hyperguild
|
||
|
||
Add a Dockerfile for the routing binary and extend the CD workflow to build + push the image and update the infra repo's routing deployment manifest.
|
||
|
||
**Files:**
|
||
- Create: `Dockerfile.routing`
|
||
- Modify: `.gitea/workflows/cd.yml`
|
||
|
||
- [ ] **Step 1: Write `Dockerfile.routing`**
|
||
|
||
```dockerfile
|
||
# syntax=docker/dockerfile:1
|
||
|
||
# ── Build stage ───────────────────────────────────────────────────────────────
|
||
FROM golang:1.26-bookworm AS builder
|
||
|
||
ARG VERSION=dev
|
||
WORKDIR /src
|
||
|
||
COPY go.mod go.sum ./
|
||
RUN go mod download
|
||
|
||
COPY . .
|
||
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
|
||
go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
|
||
-o /out/routing ./cmd/routing
|
||
|
||
# ── Runtime stage ─────────────────────────────────────────────────────────────
|
||
FROM gcr.io/distroless/base-debian12
|
||
|
||
COPY --from=builder /out/routing /usr/local/bin/routing
|
||
COPY config/ /app/config/
|
||
|
||
ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
|
||
ENV ROUTING_PORT=3210
|
||
|
||
EXPOSE 3210
|
||
|
||
USER 65532:65532
|
||
|
||
ENTRYPOINT ["/usr/local/bin/routing"]
|
||
```
|
||
|
||
- [ ] **Step 2: Extend `.gitea/workflows/cd.yml`**
|
||
|
||
Add an `env:` entry:
|
||
|
||
```yaml
|
||
env:
|
||
SERVICE: supervisor
|
||
IMAGE: gitea.d-ma.be/mathias/supervisor
|
||
INGESTION_IMAGE: gitea.d-ma.be/mathias/ingestion
|
||
ROUTING_IMAGE: gitea.d-ma.be/mathias/routing
|
||
INFRA_REPO: git@gitea.d-ma.be:mathias/infra.git
|
||
BUILDKIT_HOST: unix:///run/buildkit/buildkitd.sock
|
||
```
|
||
|
||
Add a new step after the ingestion build step:

```yaml
      - name: Build and push routing image
        run: |
          set -e
          trap 'rm -f /tmp/routing-image.tar' EXIT
          IMAGE_TAG="${{ github.sha }}"
          echo "Building ${ROUTING_IMAGE}:${IMAGE_TAG}"

          buildctl --addr "${BUILDKIT_HOST}" build \
            --frontend dockerfile.v0 \
            --local context=. \
            --local dockerfile=. \
            --opt filename=Dockerfile.routing \
            --opt build-arg:VERSION="${IMAGE_TAG}" \
            --output type=oci,dest=/tmp/routing-image.tar

          skopeo copy \
            oci-archive:/tmp/routing-image.tar \
            docker://${ROUTING_IMAGE}:${IMAGE_TAG} \
            --dest-creds "${{ secrets.REGISTRY_CREDS }}"

          echo "Built and pushed ${ROUTING_IMAGE}:${IMAGE_TAG}"
```

In the "Update infra repo" step, add a third `sed` for the routing manifest, and extend the `git add` and the commit message to cover it:

```yaml
          sed -i "s|gitea.d-ma.be/mathias/routing:.*|gitea.d-ma.be/mathias/routing:${IMAGE_TAG}|" \
            "k3s/apps/routing/deployment.yaml"

          git config user.email "cd-bot@d-ma.be"
          git config user.name "CD Bot"
          git add "k3s/apps/${SERVICE}/deployment.yaml" \
                  "k3s/apps/${SERVICE}/ingestion-deployment.yaml" \
                  "k3s/apps/routing/deployment.yaml"
          git commit -m "chore(deploy): supervisor+ingestion+routing → ${IMAGE_TAG}"
```

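The tag-bump `sed` expression can be tried against a scratch copy of a manifest before relying on it in CD. The following is a minimal sketch, not part of the workflow: the helper name `bump_image_tag` is hypothetical, and it writes through a temp file rather than `sed -i` so that it behaves the same with GNU and BSD sed.

```bash
# Hypothetical helper for local experimentation only: rewrite the tag of one
# image reference in a manifest, using the same sed expression shape as the
# workflow step.
bump_image_tag() {
  local manifest="$1"   # path to a deployment manifest
  local image="$2"      # image name without tag
  local tag="$3"        # new tag, e.g. a commit SHA
  local tmp
  tmp="$(mktemp)" || return 1
  # "|" is the sed delimiter because image names contain "/".
  if sed "s|${image}:.*|${image}:${tag}|" "$manifest" > "$tmp"; then
    mv "$tmp" "$manifest"
  else
    # sed fails when the manifest does not exist; report that to the caller.
    rm -f "$tmp"
    return 1
  fi
}
```

Calling it as `bump_image_tag /tmp/deployment.yaml gitea.d-ma.be/mathias/routing "$IMAGE_TAG"` on a copy of the manifest shows the resulting `image:` line. The non-zero return on a missing file is the same failure mode that matters for the ordering discussion in Step 4.
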
- [ ] **Step 3: Validate the YAML locally**

```bash
yq eval '.jobs.deploy.steps | length' .gitea/workflows/cd.yml
```

Expected: the original step count plus one (the new routing build step).

- [ ] **Step 4: Commit**

The workflow change is hot: once pushed, CD will try to build the routing image. Until the infra repo has `k3s/apps/routing/deployment.yaml`, the "Update infra repo" step cannot succeed: `sed -i` errors out on a missing file, and `git add` fails on the missing path. Two options:

**Option A (preferred):** Land the infra-repo manifests (Tasks 11–12) in the infra worktree FIRST, push them so they exist on `infra` main, then push this commit. Order: Tasks 11 → 12 → 10.

**Option B:** Land the workflow change with a guard, then drop the guard once manifests exist.

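If Option B were chosen, the guard could be as small as a file-existence check around the routing lines of the "Update infra repo" step. This is a sketch under assumptions: the step is assumed to run from the root of the infra checkout, and the function name `routing_manifest_present` is hypothetical.

```bash
# Hypothetical guard: succeed only when the routing manifest exists in the
# infra checkout given as $1 (defaults to the current directory).
routing_manifest_present() {
  local infra_root="${1:-.}"
  [ -f "${infra_root}/k3s/apps/routing/deployment.yaml" ]
}

# Intended use inside the workflow step (sketch, not the current workflow):
#   if routing_manifest_present .; then
#     ...run the routing sed and add the routing manifest to git...
#   else
#     echo "routing manifest not in infra repo yet; skipping tag bump"
#   fi
```

With such a guard in place the supervisor and ingestion bumps keep working while the routing manifests are still missing, at the cost of a temporary branch in the workflow that has to be removed later.
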
The implementer should pick Option A. After the manifests are in place:

```bash
git add Dockerfile.routing .gitea/workflows/cd.yml
git commit -m "build(routing): Dockerfile + CD workflow"
```

DO NOT push this commit until Tasks 11 and 12 have been pushed to the infra repo's `main`.

---

## Task 11: Routing pod manifests (infra worktree)

**Worktree:** infra

Create the k3s manifests for the routing pod. Mirror the supervisor's structure for operator familiarity.

**Files:**
- Create: `k3s/apps/routing/namespace.yaml`
- Create: `k3s/apps/routing/deployment.yaml`
- Create: `k3s/apps/routing/service.yaml`
- Create: `k3s/apps/routing/nodeport.yaml`
- Create: `k3s/apps/routing/kustomization.yaml`
- Modify: `k3s/apps/kustomization.yaml`

- [ ] **Step 1: `namespace.yaml`**

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: routing
```

- [ ] **Step 2: `deployment.yaml`**

The image tag will be bumped by CD; seed it with a placeholder that gets overwritten on first deploy.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: routing
  namespace: routing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: routing
  template:
    metadata:
      labels:
        app: routing
    spec:
      nodeSelector:
        kubernetes.io/hostname: koala
      imagePullSecrets:
        - name: gitea-registry
      containers:
        - name: routing
          image: gitea.d-ma.be/mathias/routing:initial
          ports:
            - containerPort: 3210
          envFrom:
            - secretRef:
                name: routing-secrets
          env:
            - name: ROUTING_PORT
              value: "3210"
            - name: LITELLM_BASE_URL
              value: "http://piguard:4000"
            - name: BRAIN_URL
              value: "http://ingestion.supervisor:3300"
            - name: HYPERGUILD_LOCAL_MODEL
              value: "qwen35"
            - name: HYPERGUILD_CLAUDE_MODEL
              value: "claude-sonnet-4-6"
            - name: HYPERGUILD_ROUTE_LOCAL_FLOOR
              value: "0.90"
            - name: HYPERGUILD_ROUTE_LOCAL_CEIL
              value: "0.70"
            - name: HYPERGUILD_PASS_RATE_TTL_SECONDS
              value: "60"
          readinessProbe:
            httpGet:
              path: /healthz
              port: 3210
            initialDelaySeconds: 2
            periodSeconds: 10
```

The `gitea-registry` imagePullSecret needs to exist in the `routing` namespace. If it is only present in `supervisor`, copy it (Step 6 below).

- [ ] **Step 3: `service.yaml`**

```yaml
apiVersion: v1
kind: Service
metadata:
  name: routing
  namespace: routing
spec:
  selector:
    app: routing
  ports:
    - port: 3210
      targetPort: 3210
      protocol: TCP
```

- [ ] **Step 4: `nodeport.yaml`**

```yaml
apiVersion: v1
kind: Service
metadata:
  name: routing-nodeport
  namespace: routing
spec:
  type: NodePort
  selector:
    app: routing
  ports:
    - port: 3210
      targetPort: 3210
      nodePort: 30310
      protocol: TCP
```

- [ ] **Step 5: `kustomization.yaml`** (inside `k3s/apps/routing/`)

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml
  - nodeport.yaml
  - secrets.enc.yaml
```

`secrets.enc.yaml` is added in Task 12; reference it now so the directory is complete.

- [ ] **Step 6: Add `routing` to the apps `kustomization.yaml`**

Modify `k3s/apps/kustomization.yaml`:

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - imagepullsecret
  - registry
  - gitea
  - infra-mcp
  - supervisor
  - cobalt-dingo
  - routing
```

If `imagepullsecret/` only seeds the secret in specific namespaces, ensure `routing` is added to that list: inspect `k3s/apps/imagepullsecret/` and follow the existing pattern.

- [ ] **Step 7: Validate manifest syntax with `kustomize build`**

```bash
cd ~/Documents/local-dev/AI/infra/.worktrees/mode-2-routing-pod
kustomize build k3s/apps/routing 2>&1 | head -20
```

Expected: valid YAML output, no errors. If the build fails because `secrets.enc.yaml` is referenced but does not exist yet, temporarily comment out that line in `k3s/apps/routing/kustomization.yaml`; uncomment it in Task 12.

- [ ] **Step 8: Commit (do NOT push yet)**

```bash
git add k3s/apps/routing/ k3s/apps/kustomization.yaml
git commit -m "feat(routing): k3s manifests for the new pod"
```

The push happens after Task 12 (with the encrypted Secret) so the kustomization is consistent on first Flux apply.

---

## Task 12: Routing-secrets Secret + Flux verification

**Worktree:** infra

Encrypt and add the `routing-secrets` Secret. The Secret carries `LITELLM_API_KEY` (reused from supervisor's secret) and optionally a `ROUTING_MCP_TOKEN` for bearer auth.

**Files:**
- Create: `k3s/apps/routing/secrets.enc.yaml`

- [ ] **Step 1: Generate a token (or skip auth for first deploy)**

```bash
# generate (or omit ROUTING_MCP_TOKEN for unauthenticated first deploy):
openssl rand -hex 32
```

Record the value; it will be set in the operator's shell env when Mode 2 is exercised in any project.

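Because the token is later pasted by hand into a YAML file, a quick shape check can catch a truncated or mangled paste. `openssl rand -hex 32` prints 64 lowercase hexadecimal characters; the helper below is a hypothetical sketch (the name `is_hex_token` is not from the repo) that only verifies that shape.

```bash
# Hypothetical sanity check: succeed only for exactly 64 characters drawn
# from 0-9 and a-f, which is the output format of `openssl rand -hex 32`.
is_hex_token() {
  local token="$1"
  [ "${#token}" -eq 64 ] || return 1
  case "$token" in
    *[!0-9a-f]*) return 1 ;;   # contains a character outside the hex set
  esac
  return 0
}
```

For example, `is_hex_token "$ROUTING_MCP_TOKEN" || echo "token looks wrong"` could be run in the operator's shell after exporting the value.
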
- [ ] **Step 2: Decode the cluster's age key**

```bash
export SOPS_AGE_KEY="$(kubectl get secret sops-age -n flux-system -o jsonpath='{.data.age\.agekey}' | base64 -d)"
[ -n "$SOPS_AGE_KEY" ] && echo "age key loaded ($(echo -n "$SOPS_AGE_KEY" | wc -c) bytes)" || (echo "FAIL"; exit 1)
```

- [ ] **Step 3: Pull `LITELLM_API_KEY` value from the supervisor's secret**

Decrypt the supervisor's Secret to read the existing value:

```bash
LITELLM_API_KEY="$(sops -d k3s/apps/supervisor/secrets.enc.yaml | yq eval '.stringData.DMABE_LLMAPI_KEY' -)"
[ -n "$LITELLM_API_KEY" ] && echo "found litellm key" || (echo "FAIL: empty"; exit 1)
```

(`DMABE_LLMAPI_KEY` is the supervisor's name for the LiteLLM key: same value, different env-var name in the consumer.)

- [ ] **Step 4: Create the routing Secret**

```bash
cat > /tmp/routing-secrets.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: routing-secrets
  namespace: routing
type: Opaque
stringData:
  LITELLM_API_KEY: "${LITELLM_API_KEY}"
  ROUTING_MCP_TOKEN: "<paste-token-from-step-1-or-leave-empty>"
EOF
```

Edit `/tmp/routing-secrets.yaml` and paste the token (or leave the field as `""` for unauthenticated first deploy).

- [ ] **Step 5: Encrypt with SOPS**

```bash
sops --encrypt --age age15xez8pcmgg3daxpuqnye9ewawvzjtallheddcrq88ph573yle3nsr5hdq6 \
  --encrypted-regex '^(stringData|data)$' \
  /tmp/routing-secrets.yaml \
  > k3s/apps/routing/secrets.enc.yaml

rm /tmp/routing-secrets.yaml
unset SOPS_AGE_KEY LITELLM_API_KEY
```

Verify the file:

```bash
head -10 k3s/apps/routing/secrets.enc.yaml
```

Expected: `apiVersion: v1`, `kind: Secret`, `stringData:` with `ENC[...]` values.

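The visual `head` check can be backed by a mechanical one, so that a plaintext value never reaches a commit. The sketch below assumes the SOPS output keeps `stringData:` (or `data:`) as a top-level key with indented `KEY: ENC[...]` lines beneath it; the function name `secret_values_encrypted` is hypothetical.

```bash
# Hypothetical check: succeed only if every key under a top-level
# `stringData:` or `data:` block has a value starting with `ENC[`,
# and at least one such key exists.
secret_values_encrypted() {
  awk '
    /^(stringData|data):/ { in_block = 1; next }   # entering a secret block
    /^[^ \t]/             { in_block = 0 }         # another top-level key ends it
    in_block && /:/ {
      value = $0
      sub(/^[^:]*:[ \t]*/, "", value)              # drop the leading "KEY: "
      seen++
      if (value !~ /^ENC\[/) bad++
    }
    END { exit (seen > 0 && bad == 0) ? 0 : 1 }
  ' "$1"
}
```

Running `secret_values_encrypted k3s/apps/routing/secrets.enc.yaml || echo "plaintext value found"` before the commit in Step 7 would flag a file that was written without encryption.
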
- [ ] **Step 6: `kustomize build` re-check**

```bash
kustomize build k3s/apps/routing | head -30
```

Expected: the Namespace, the Deployment, both Services, and a Secret with encrypted data fields. The build should succeed.

- [ ] **Step 7: Commit and push (this is the Flux activation)**

```bash
git add k3s/apps/routing/secrets.enc.yaml
git commit -m "feat(routing): SOPS-encrypted routing-secrets"
git pull --rebase origin main
git push origin main
```

`git pull --rebase` accommodates intervening CD-bot commits on `main` (per the auth-rollout precedent earlier today).

- [ ] **Step 8: Wait for Flux to reconcile**

```bash
NEW_SHA=$(git rev-parse HEAD)
until kubectl -n flux-system get kustomization apps -o jsonpath='{.status.lastAppliedRevision}' 2>/dev/null | grep -qE "${NEW_SHA:0:7}"; do
  sleep 3
done
echo "Flux applied $NEW_SHA"
```

The pod will be in `ImagePullBackOff` because the `:initial` placeholder image doesn't exist yet; that is expected. The CD workflow (Task 10) will publish the real image and bump the tag.

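The `until` loop above polls without an upper bound, so it hangs if Flux never applies the revision. A bounded variant could wrap the check in a small retry helper. This is only a sketch: the name `retry` and its argument order are assumptions, not something defined in the repo.

```bash
# Hypothetical helper: run a command up to N times, sleeping between
# attempts. Succeeds as soon as the command does; fails once the attempts
# are used up.
retry() {
  local attempts="$1"   # maximum number of tries
  local delay="$2"      # seconds to sleep between tries
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # No sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

Wrapping the `kubectl ... | grep` pipeline in a shell function and calling, say, `retry 100 3 that_function` gives roughly a five-minute ceiling, after which the operator can inspect the Flux Kustomization status instead of waiting indefinitely.
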
- [ ] **Step 9: Verify expected partial state**

```bash
kubectl -n routing get all
```

Expected: the deployment (0/1 ready), both services (`routing` and `routing-nodeport`), and a pod in `ErrImagePull` or `ImagePullBackOff` until Task 10 runs end-to-end. (`kubectl get all` does not list the namespace itself.)

---

## Task 13: `task smoke:routing` live-contract test

**Worktree:** hyperguild

Boots the routing binary against the real `piguard:4000` LiteLLM and the real `koala:30330` brain, calls each of the four advertised tools once, and verifies that at least four `_routing` entries appear in the brain.

**Files:**
- Create: `scripts/smoke-routing.sh`
- Modify: `Taskfile.yml`

- [ ] **Step 1: Write `scripts/smoke-routing.sh`**

```bash
#!/usr/bin/env bash
set -euo pipefail

# Boot the routing binary and exercise its four tools against live deps.
# Skipped when LITELLM_BASE_URL or BRAIN_URL is unreachable.

LITELLM_BASE_URL="${LITELLM_BASE_URL:-http://piguard:4000}"
BRAIN_URL="${BRAIN_URL:-http://koala:30330}"

if ! curl -sS --max-time 2 "${LITELLM_BASE_URL}/v1/models" >/dev/null 2>&1; then
  echo "SKIP: LITELLM at ${LITELLM_BASE_URL} unreachable"
  exit 0
fi
if ! curl -sS --max-time 2 "${BRAIN_URL}/query" -X POST -d '{"query":"x","k":1}' -H 'Content-Type: application/json' >/dev/null 2>&1; then
  echo "SKIP: BRAIN at ${BRAIN_URL} unreachable"
  exit 0
fi

PORT=33310
BIN=$(mktemp)
BIN_PID=""

# Stop the routing process (if it was started) and remove the temp binary.
cleanup() {
  if [ -n "$BIN_PID" ]; then
    kill "$BIN_PID" 2>/dev/null || true
  fi
  rm -f "$BIN"
}
trap cleanup EXIT

go build -o "$BIN" ./cmd/routing

LITELLM_BASE_URL="$LITELLM_BASE_URL" BRAIN_URL="$BRAIN_URL" \
  ROUTING_PORT="$PORT" SUPERVISOR_CONFIG_DIR="$(pwd)/config/supervisor" \
  "$BIN" &
BIN_PID=$!

# Wait for the binary to bind; give up after 50 attempts.
READY=0
for _ in $(seq 1 50); do
  if curl -sS "http://127.0.0.1:${PORT}/healthz" >/dev/null 2>&1; then
    READY=1
    break
  fi
  sleep 0.1
done
if [ "$READY" -ne 1 ]; then
  echo "FAIL: routing binary never answered /healthz on port ${PORT}"
  exit 1
fi

call_tool() {
  local tool="$1"
  local args="$2"
  local resp
  resp=$(curl -sS -X POST "http://127.0.0.1:${PORT}/mcp" \
    -H 'Content-Type: application/json' \
    -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"${tool}\",\"arguments\":${args}}}")
  # A JSON-RPC error still counts as a response, but print it so that a
  # missing-argument failure is visible.
  echo "$resp" | jq -r --arg tool "$tool" 'select(.error) | "WARN: \($tool): \(.error | tojson)"' >&2
  echo "$resp" | jq -e '.result // .error' > /dev/null
}

echo "calling tools/list..."
curl -sS -X POST "http://127.0.0.1:${PORT}/mcp" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  | jq -r '.result.tools | map(.name) | sort | .[]'

echo "calling each tool..."
call_tool code_review '{"project_root":"/tmp","files":["README.md"],"session_id":"smoke-1"}'
call_tool debug '{"project_root":"/tmp","problem":"smoke test","session_id":"smoke-1"}'
call_tool retrospective '{"project_root":"/tmp","session_id":"smoke-1"}'
call_tool trainer '{"project_root":"/tmp","session_id":"smoke-1"}'

echo "checking brain has _routing entries..."
sleep 2
COUNT=$(curl -sS "${BRAIN_URL}/pass-rate?skill=_routing&window=1h" | jq -r '.total // 0')
if [ "${COUNT}" -lt 4 ]; then
  echo "FAIL: expected ≥4 _routing entries in last 1h, got ${COUNT}"
  exit 1
fi

echo "PASS: smoke:routing"
```

Make it executable:

```bash
chmod +x scripts/smoke-routing.sh
```

The exact `arguments` shape per tool may need to be adjusted based on each skill's required fields. If a smoke call prints a JSON-RPC error like "missing required argument", read the failing tool's `Tools()` definition in `internal/skills/<skill>/skill.go` and add the required field with a stub value.

- [ ] **Step 2: Add the Taskfile target**

In `Taskfile.yml`, append to the `tasks:` map:

```yaml
  smoke:routing:
    desc: Boot the routing pod against live LiteLLM + brain and verify _routing logs land
    cmds:
      - bash scripts/smoke-routing.sh
```

- [ ] **Step 3: Run it**

```bash
task smoke:routing
```

Expected: SKIP if offline; PASS otherwise.

- [ ] **Step 4: Commit**

```bash
git add scripts/smoke-routing.sh Taskfile.yml
git commit -m "test(routing): live-contract smoke target"
```

---

## Task 14: Documentation updates

**Worktree:** hyperguild

Update the project-level docs to describe Mode 2 + the new env vars + the routing-pod URL.

**Files:**
- Modify: `README.md`
- Modify: `.context/PROJECT.md`

- [ ] **Step 1: Update `README.md`'s "Key env vars" table**

Append:

```markdown
| `ROUTING_PORT` | `3210` | Routing pod's listen port |
| `ROUTING_MCP_TOKEN` | — | Optional bearer token for the routing MCP HTTP endpoint |
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | Routing pod → brain (in-cluster) |
| `HYPERGUILD_LOCAL_MODEL` | `qwen35` | Local model for routed-to-local skill calls |
| `HYPERGUILD_CLAUDE_MODEL` | `claude-sonnet-4-6` | Claude model for routed-to-Claude skill calls |
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At or above this pass rate, route to local |
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below this pass rate, route to Claude. Between CEIL and FLOOR is the sample band. |
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill pass-rate cache TTL |
```

In the architecture diagram block at the top of the README, add the routing pod:

```
Your Claude Code session (in any project)
│
│ MCP over HTTP (Tailscale)
├──▶ supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
├──▶ routing :3210 (NodePort 30310 on koala) — Mode 2 only: code_review, debug, retrospective, trainer
└──▶ brain :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
```

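The FLOOR and CEIL rows in the env-var table describe a three-way split on the per-skill pass rate. The sketch below only illustrates those threshold semantics as worded in the table; it is not the policy code in `internal/routing`, the function name `route_for_pass_rate` is hypothetical, and what the router actually does inside the sample band is not specified here.

```bash
# Hypothetical illustration of the threshold semantics from the table.
# Prints "local", "claude", or "sample" for a given pass rate.
route_for_pass_rate() {
  local rate="$1"
  local floor="${2:-0.90}"   # HYPERGUILD_ROUTE_LOCAL_FLOOR default
  local ceil="${3:-0.70}"    # HYPERGUILD_ROUTE_LOCAL_CEIL default
  # awk does the comparison because POSIX shell arithmetic is integer-only.
  awk -v r="$rate" -v f="$floor" -v c="$ceil" 'BEGIN {
    if (r + 0 >= f + 0)     print "local"    # at or above FLOOR
    else if (r + 0 < c + 0) print "claude"   # below CEIL
    else                    print "sample"   # between CEIL and FLOOR
  }'
}
```

With the defaults, `route_for_pass_rate 0.80` lands in the sample band, while a rate exactly equal to FLOOR routes to local and a rate exactly equal to CEIL stays in the band, matching the "at/above" and "below" wording of the table.
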
- [ ] **Step 2: Update `.context/PROJECT.md`**

Find the "MCP endpoints" section and add a third bullet:

```markdown
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
  the same four cost-routable skills as the supervisor (`code_review`,
  `debug`, `retrospective`, `trainer`) but per-call decides whether to use
  a local model or Claude based on the brain's `/pass-rate` response.
  Bearer auth via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local`
  registers this endpoint; Mode 1 and Mode 3 do not.
```

- [ ] **Step 3: Run `task context:sync` so derived adapters update**

```bash
task context:sync
```

This regenerates `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`, and `.context/system-prompt.txt` from the canonical sources.

- [ ] **Step 4: `task check`**

```bash
task check 2>&1 | tail -10
```

Expected: drift check green (regenerated adapters tracked).

- [ ] **Step 5: Commit**

```bash
git add README.md .context/PROJECT.md CLAUDE.md AGENTS.md .cursorrules .aider.conventions.md .context/system-prompt.txt
git commit -m "docs(routing): document Mode 2 routing pod + env vars"
```

---

## Final verification before merge

After all 14 tasks land, on the hyperguild worktree's branch:

- [ ] **Run the full check chain**

```bash
cd ~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod
task check 2>&1 | tail -15
```

Expected: 0 issues across lint, test, vet, drift, govulncheck.

- [ ] **Run the smoke test if Tailscale is available**

```bash
task smoke:routing
```

Expected: PASS or SKIP (with a clear reason).

- [ ] **Verify the snapshot test still passes**

The skill packages can drift between when the snapshot was captured and merge time. Re-run:

```bash
go test ./internal/routing/... -run TestToolsListMatchesSupervisorSnapshot -v
```

If it fails because of an intentional schema change in the merge window, re-capture the snapshot per Task 7's Step 1 and commit the update with a clear message.

- [ ] **Push the hyperguild branch and merge**

```bash
git push -u origin feat/mode-2-routing-pod
```

Open a PR (or merge to main if the workflow allows direct push). Once merged, gitea CI builds the routing image and CD pushes the image-tag bump to the infra repo.

- [ ] **Verify Flux applies the new image and the pod becomes Ready**

```bash
NEW_SHA=$(git -C ~/Documents/local-dev/AI/hyperguild rev-parse main)
echo "Watching for image tag $NEW_SHA on routing deployment..."
until kubectl -n routing get deployment routing -o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null | grep -qE "${NEW_SHA:0:7}"; do
  sleep 5
done
kubectl -n routing rollout status deployment/routing --timeout=120s
```

Expected: deployment becomes `1/1 Ready` with the new image.

If the pod stays `Pending` or `ImagePullBackOff` past 2 minutes, check:

```bash
kubectl -n routing describe pod -l app=routing | tail -30
kubectl -n routing logs -l app=routing --tail=50
```

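The watch loop above greps the whole image reference for a seven-character SHA prefix. A slightly stricter comparison looks only at the tag, that is, the text after the last colon. The helper below is a sketch with a hypothetical name (`image_tag_matches_sha`), assuming the CD workflow keeps tagging images with the commit SHA.

```bash
# Hypothetical helper: succeed when the tag of an image reference starts
# with the first seven characters of the given commit SHA.
image_tag_matches_sha() {
  local image="$1"
  local sha="$2"
  local tag="${image##*:}"                      # text after the last colon
  local short
  short="$(printf '%s' "$sha" | cut -c1-7)"     # short SHA prefix
  [ -n "$short" ] || return 1
  case "$tag" in
    "$short"*) return 0 ;;
    *)         return 1 ;;
  esac
}
```

It could replace the `grep -qE` in the loop condition by feeding it the `kubectl ... -o jsonpath=...` output as the first argument and `$NEW_SHA` as the second.
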
- [ ] **Final live verification**

```bash
# tools/list should return 4 tools
curl -sS -X POST http://koala:30310/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  | jq '.result.tools | length'
# expected: 4

# auth check (only meaningful if ROUTING_MCP_TOKEN is set on the pod)
curl -isS -X POST http://koala:30310/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | head -1
# expected: 401 if token set, 200 otherwise
```

- [ ] **Restart pod the Flux-friendly way if needed**

For any post-merge restart that doesn't ride a fresh image bump, use `kubectl delete pod` rather than `kubectl rollout restart` (Flux strips the annotation that `rollout restart` adds):

```bash
kubectl -n routing delete pod -l app=routing
```

The existing ReplicaSet recreates the pod, picking up any Secret data changes on startup.