hyperguild/docs/superpowers/plans/2026-05-04-mode-2-routing-pod.md
Mathias Bergqvist b6bcc93048
docs(plan6): implementation plan for Mode 2 routing pod
14 TDD-shaped tasks across two worktrees: hyperguild for code
(internal/routing package, cmd/routing binary, Dockerfile, CD
workflow, mode template, smoke test, docs) and infra for the
k3s manifests (deployment, service, nodeport, SOPS-encrypted
secret). Plan 7 amendment baked in: internal/skills/{review,
debug,retrospective,trainer} survive Plan 6 — Plan 7 only
deletes tdd, spec, and the supervisor binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:53:03 +02:00


Mode 2 Routing Pod Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Ship a thin policy pod at koala:30310 that routes the four cost-routable skill calls (code_review, debug, retrospective, trainer) to a LiteLLM-proxied local or Claude model based on per-skill pass rate. Replaces the unconditional supervisor-runs-locally behavior in client-local mode.

Architecture: New Go binary at cmd/routing/, reusing internal/skills/{review,debug,retrospective,trainer}/, internal/exec/litellm.go, internal/registry, and internal/mcp (bearer-auth handler from f49850d). A new internal/routing package adds (a) a pure-function decision policy, (b) a TTL-cached pass-rate fetcher, (c) a session-log decision logger, and (d) a router that wraps a CompleteFunc so the existing skill packages stay routing-oblivious. Deployed via Flux at NodePort :30310 alongside the supervisor and ingestion pods.

Tech Stack: Go 1.26 stdlib (net/http, crypto/sha256, encoding/json, time, sync); existing testify for tests; SOPS-encrypted Secret in the infra repo; gitea CI buildctl→skopeo; Flux Kustomize reconciliation.


Plan 6 of 7 — Hyperguild Skill Migration

Plans 1 through 5 are merged. Plan 6 is the substantive routing-pod plan; Plan 7 (supervisor retirement) follows.

Spec: docs/superpowers/specs/2026-05-04-mode-2-routing-pod-design.md (committed 51e0123).

Two worktrees

  • Hyperguild worktree: ~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod/ on branch feat/mode-2-routing-pod. Contains the Go code, Dockerfile addition, CD workflow update, mode-template update, README, and smoke test.
  • Infra worktree: ~/Documents/local-dev/AI/infra/.worktrees/mode-2-routing-pod/ on branch feat/routing-pod-manifests. Contains the k3s manifests for the new pod plus the SOPS-encrypted Secret.

Each task's "Files" header names the worktree. Implementer subagents must cd into the named worktree before any read/edit/git operation. Plan paths describe the post-merge canonical state (per 2026-05-03-plan-canonical-dispatch-ephemeral brain entry); dispatch prompts add the worktree translation.

Verification convention

Per task, the implementer runs task check (lint + test + vet + drift + govulncheck), not just go test ./.... CI's lint gate caught a Plan-1 errcheck regression that local tests missed (per feedback_per_task_verification memory). Append //nolint:errcheck to any fmt.Fprint* call that writes to stdout/stderr and ignores its return value. To discard the error from a deferred resp.Body.Close(), use defer func() { _ = resp.Body.Close() }().
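
The two errcheck conventions above can be sketched in one small program. This is a minimal illustration rather than code from the plan: fetchBody and newPongServer are hypothetical names, and the httptest server only stands in for a real upstream.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
)

// newPongServer starts a local test server that always answers "pong".
// Hypothetical helper, used only to make the sketch runnable.
func newPongServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		_, _ = io.WriteString(w, "pong")
	}))
}

// fetchBody reads the whole response body from url.
func fetchBody(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", fmt.Errorf("fetch: %w", err)
	}
	// The Close error is deliberately discarded; the closure makes that explicit.
	defer func() { _ = resp.Body.Close() }()

	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("fetch: read: %w", err)
	}
	return string(b), nil
}

func main() {
	srv := newPongServer()
	defer srv.Close()

	body, err := fetchBody(srv.URL)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err) //nolint:errcheck
		return
	}
	fmt.Fprintln(os.Stdout, body) //nolint:errcheck
}
```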

Status taxonomy for implementer subagents

  • DONE — task completed, all checks green, verification commands ran clean.
  • DONE_WITH_CONCERNS — task completed, but the implementer noticed a plan bug, an environmental anomaly, or related code that looks suspicious. Controller decides: doc-patch, follow-up commit, or accept and roll on (per 2026-05-03-done-with-concerns-vs-blocked brain entry).
  • BLOCKED — implementer cannot complete the assigned work. Controller re-dispatches with more context.
  • NEEDS_CONTEXT — implementer needs information not in the dispatch (rare; usually a doc bug).

Code-reviewer expectations

The reviewer agent surfaces candidate improvements; the controller filters. Per 2026-05-03-code-reviewer-output-as-candidates, reject reviewer suggestions that add helpers for single-use sites, abstractions for hypothetical futures, or stylistic refactors that diverge from the plan's heredocs. Apply genuine bugs and security findings; defer the rest.

Flux operational note

The auth rollout (commit afe9a08 in infra) demonstrated that Flux server-side-applies the routing Deployment every ~30s and strips any kubectl rollout restart annotation, deleting the new ReplicaSet's pod. To force a pod restart on a Flux-managed deployment, use kubectl -n <ns> delete pod -l app=<name> — the existing ReplicaSet recreates without an annotation Flux can revert.

Plan 7 amendment baked in

internal/skills/{review,debug,retrospective,trainer}/ are reused by the routing pod and must not be deleted in Plan 7. Plan 7 deletes only internal/skills/{tdd,spec}/, the supervisor binary, the supervisor manifests, and frees NodePort :30320. The implementer of Plan 7 must read this paragraph and the matching note in the spec before deleting anything.

File Structure

Hyperguild worktree

| Path | Action | Responsibility |
| --- | --- | --- |
| internal/config/routing.go | create | RoutingConfig typed struct, LoadRouting() env parser |
| internal/config/routing_test.go | create | Defaults + env-override tests |
| internal/routing/policy.go | create | Decision enum, Policy.Decide(passRate, hash) Decision |
| internal/routing/policy_test.go | create | Table-driven coverage of all four rules |
| internal/routing/hash.go | create | CanonicalHash(system, user) uint64 (SHA-256 prefix) |
| internal/routing/hash_test.go | create | Determinism + low-bit distribution sanity |
| internal/routing/passrate.go | create | Fetcher with TTL cache, calls GET /pass-rate |
| internal/routing/passrate_test.go | create | httptest.Server; cache hit/miss, error path |
| internal/routing/log.go | create | Logger.LogDecision(...) posts to brain MCP session_log |
| internal/routing/log_test.go | create | httptest.Server capture + body shape assertion |
| internal/routing/router.go | create | Router.Run(...) wraps fetcher + policy + logger + LiteLLM |
| internal/routing/router_test.go | create | Mocked fetcher/logger/litellm; route + fail-open paths |
| internal/routing/snapshot_test.go | create | Asserts routing pod's tools/list byte-equals captured snapshot |
| internal/routing/testdata/tools_list.snapshot.json | create | Snapshot from current supervisor advertisement |
| cmd/routing/main.go | create | Wires Config → LiteLLM → Router → Skills → Registry → MCP server |
| cmd/routing/main_test.go | create | Integration test with fakes for LiteLLM + brain |
| cmd/hyperguild/mode.go:74-87 | modify | modeClientLocal adds headers: X-Hyperguild-Mode, removes _routing_pending |
| cmd/hyperguild/mode_test.go | modify | Updated assertion for the new shape |
| cmd/hyperguild/README.md | modify | Drop "not deployed yet" note; document the header |
| Dockerfile.routing | create | Builds cmd/routing, bakes config/, runs as non-root, no claude CLI |
| .gitea/workflows/cd.yml | modify | Build + push routing image; sed routing/deployment.yaml in infra |
| Taskfile.yml | modify | Add smoke:routing task |
| scripts/smoke-routing.sh | create | Boots binary, hits each tool, asserts brain has _routing entries |
| README.md | modify | Mode 2 + new env vars + routing pod URL |
| .context/PROJECT.md | modify | Document koala:30310/mcp + the four routed skills |

Infra worktree

| Path | Action | Responsibility |
| --- | --- | --- |
| k3s/apps/routing/namespace.yaml | create | Namespace routing |
| k3s/apps/routing/deployment.yaml | create | One-replica Deployment, koala nodeSelector, image from gitea registry |
| k3s/apps/routing/service.yaml | create | ClusterIP routing on port 3210 |
| k3s/apps/routing/nodeport.yaml | create | NodePort 30310 → service 3210 |
| k3s/apps/routing/secrets.enc.yaml | create | SOPS-encrypted LITELLM_API_KEY + optional ROUTING_MCP_TOKEN |
| k3s/apps/routing/kustomization.yaml | create | Bundles the above |
| k3s/apps/kustomization.yaml | modify | Add routing to the apps list |

Task 1: RoutingConfig struct + env parser

Worktree: hyperguild

Typed config struct for the routing pod. New struct (not appended to Config) because the routing pod's surface differs from the supervisor's; merging would force every routing field onto the supervisor and vice versa.

Files:

  • Create: internal/config/routing.go

  • Create: internal/config/routing_test.go

  • Step 1: Write the failing test

Create internal/config/routing_test.go:

package config_test

import (
	"testing"

	"github.com/mathiasbq/supervisor/internal/config"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestLoadRoutingDefaults(t *testing.T) {
	for _, k := range []string{
		"ROUTING_PORT", "ROUTING_MCP_TOKEN", "LITELLM_BASE_URL", "LITELLM_API_KEY",
		"BRAIN_URL", "HYPERGUILD_LOCAL_MODEL", "HYPERGUILD_CLAUDE_MODEL",
		"HYPERGUILD_ROUTE_LOCAL_FLOOR", "HYPERGUILD_ROUTE_LOCAL_CEIL",
		"HYPERGUILD_PASS_RATE_TTL_SECONDS",
	} {
		t.Setenv(k, "")
	}

	cfg, err := config.LoadRouting()
	require.NoError(t, err)
	assert.Equal(t, "3210", cfg.Port)
	assert.Equal(t, "", cfg.MCPAuthToken)
	assert.Equal(t, "http://piguard:4000", cfg.LiteLLMBaseURL)
	assert.Equal(t, "http://ingestion.supervisor:3300", cfg.BrainURL)
	assert.Equal(t, "qwen35", cfg.LocalModel)
	assert.Equal(t, "claude-sonnet-4-6", cfg.ClaudeModel)
	assert.InDelta(t, 0.90, cfg.RouteLocalFloor, 1e-9)
	assert.InDelta(t, 0.70, cfg.RouteLocalCeil, 1e-9)
	assert.Equal(t, 60, cfg.PassRateTTLSeconds)
}

func TestLoadRoutingFromEnv(t *testing.T) {
	t.Setenv("ROUTING_PORT", "3250")
	t.Setenv("ROUTING_MCP_TOKEN", "tok-xyz")
	t.Setenv("LITELLM_BASE_URL", "http://localhost:4000")
	t.Setenv("LITELLM_API_KEY", "lk")
	t.Setenv("BRAIN_URL", "http://localhost:3300")
	t.Setenv("HYPERGUILD_LOCAL_MODEL", "qwen2-7b")
	t.Setenv("HYPERGUILD_CLAUDE_MODEL", "claude-opus-4-7")
	t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "0.85")
	t.Setenv("HYPERGUILD_ROUTE_LOCAL_CEIL", "0.65")
	t.Setenv("HYPERGUILD_PASS_RATE_TTL_SECONDS", "30")

	cfg, err := config.LoadRouting()
	require.NoError(t, err)
	assert.Equal(t, "3250", cfg.Port)
	assert.Equal(t, "tok-xyz", cfg.MCPAuthToken)
	assert.Equal(t, "http://localhost:4000", cfg.LiteLLMBaseURL)
	assert.Equal(t, "lk", cfg.LiteLLMAPIKey)
	assert.Equal(t, "http://localhost:3300", cfg.BrainURL)
	assert.Equal(t, "qwen2-7b", cfg.LocalModel)
	assert.Equal(t, "claude-opus-4-7", cfg.ClaudeModel)
	assert.InDelta(t, 0.85, cfg.RouteLocalFloor, 1e-9)
	assert.InDelta(t, 0.65, cfg.RouteLocalCeil, 1e-9)
	assert.Equal(t, 30, cfg.PassRateTTLSeconds)
}

func TestLoadRoutingRejectsBadFloat(t *testing.T) {
	t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "not-a-number")
	_, err := config.LoadRouting()
	require.Error(t, err)
	assert.Contains(t, err.Error(), "HYPERGUILD_ROUTE_LOCAL_FLOOR")
}
  • Step 2: Run the test to confirm it fails
cd ~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod
go test ./internal/config/... -run TestLoadRouting -v

Expected: FAIL with undefined: config.LoadRouting. The test never names config.RoutingConfig directly, so the compiler reports only the missing function.

  • Step 3: Write the implementation

Create internal/config/routing.go:

package config

import (
	"fmt"
	"os"
	"strconv"
)

// RoutingConfig holds the runtime configuration for the routing pod.
// Separate from Config because the routing pod's surface differs from the supervisor's.
type RoutingConfig struct {
	Port               string  // ROUTING_PORT, default 3210
	MCPAuthToken       string  // ROUTING_MCP_TOKEN, optional bearer token
	LiteLLMBaseURL     string  // LITELLM_BASE_URL, default http://piguard:4000
	LiteLLMAPIKey      string  // LITELLM_API_KEY
	BrainURL           string  // BRAIN_URL, default http://ingestion.supervisor:3300
	LocalModel         string  // HYPERGUILD_LOCAL_MODEL, default qwen35
	ClaudeModel        string  // HYPERGUILD_CLAUDE_MODEL, default claude-sonnet-4-6
	RouteLocalFloor    float64 // HYPERGUILD_ROUTE_LOCAL_FLOOR, default 0.90
	RouteLocalCeil     float64 // HYPERGUILD_ROUTE_LOCAL_CEIL, default 0.70
	PassRateTTLSeconds int     // HYPERGUILD_PASS_RATE_TTL_SECONDS, default 60
}

func LoadRouting() (RoutingConfig, error) {
	cfg := RoutingConfig{
		Port:           envOr("ROUTING_PORT", "3210"),
		MCPAuthToken:   os.Getenv("ROUTING_MCP_TOKEN"),
		LiteLLMBaseURL: envOr("LITELLM_BASE_URL", "http://piguard:4000"),
		LiteLLMAPIKey:  os.Getenv("LITELLM_API_KEY"),
		BrainURL:       envOr("BRAIN_URL", "http://ingestion.supervisor:3300"),
		LocalModel:     envOr("HYPERGUILD_LOCAL_MODEL", "qwen35"),
		ClaudeModel:    envOr("HYPERGUILD_CLAUDE_MODEL", "claude-sonnet-4-6"),
	}

	floor, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_FLOOR", 0.90)
	if err != nil {
		return RoutingConfig{}, err
	}
	cfg.RouteLocalFloor = floor

	ceil, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_CEIL", 0.70)
	if err != nil {
		return RoutingConfig{}, err
	}
	cfg.RouteLocalCeil = ceil

	ttl, err := parseIntEnv("HYPERGUILD_PASS_RATE_TTL_SECONDS", 60)
	if err != nil {
		return RoutingConfig{}, err
	}
	cfg.PassRateTTLSeconds = ttl

	return cfg, nil
}

func parseFloatEnv(key string, def float64) (float64, error) {
	v := os.Getenv(key)
	if v == "" {
		return def, nil
	}
	f, err := strconv.ParseFloat(v, 64)
	if err != nil {
		return 0, fmt.Errorf("config: %s: %w", key, err)
	}
	return f, nil
}

func parseIntEnv(key string, def int) (int, error) {
	v := os.Getenv(key)
	if v == "" {
		return def, nil
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		return 0, fmt.Errorf("config: %s: %w", key, err)
	}
	return n, nil
}
  • Step 4: Run the test to confirm it passes
go test ./internal/config/... -run TestLoadRouting -v

Expected: PASS, with all three TestLoadRouting tests green.

  • Step 5: Run task check
task check 2>&1 | tail -20

Expected: lint clean, test green, vet clean, no drift, govulncheck clean.

  • Step 6: Commit
git add internal/config/routing.go internal/config/routing_test.go
git commit -m "feat(routing): RoutingConfig + LoadRouting"
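
LoadRouting relies on an existing envOr helper whose body is not shown in this plan. TestLoadRoutingDefaults clears variables with t.Setenv(k, ""), which leaves them set but empty, so the defaults only apply if that helper treats an empty value like an unset one. The sketch below shows that assumed behavior; envOrSketch and setAndRead are hypothetical names, not the repository's code.

```go
package main

import (
	"fmt"
	"os"
)

// envOrSketch returns the value of key, or def when the variable is unset
// or empty. Hypothetical stand-in for the repository's envOr helper.
func envOrSketch(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// setAndRead sets key to val in the process environment, then resolves it
// through envOrSketch. Mirrors what t.Setenv followed by LoadRouting would see.
func setAndRead(key, val, def string) string {
	_ = os.Setenv(key, val)
	return envOrSketch(key, def)
}

func main() {
	// Empty counts as unset, so the default wins.
	fmt.Println(setAndRead("ROUTING_PORT", "", "3210")) // prints 3210

	// A non-empty value overrides the default.
	fmt.Println(setAndRead("ROUTING_PORT", "3250", "3210")) // prints 3250
}
```

If the real helper used os.LookupEnv and accepted empty strings as values, the defaults test would see Port == "" and fail, which is worth checking before Step 4.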

Task 2: Decision policy

Worktree: hyperguild

Pure-function policy with no I/O. Decision rules in priority order: null → local; ≥floor → local; <ceil → claude; otherwise sample-band hash split.

Files:

  • Create: internal/routing/policy.go

  • Create: internal/routing/policy_test.go

  • Step 1: Write the failing test

Create internal/routing/policy_test.go:

package routing_test

import (
	"testing"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
)

func ptr(f float64) *float64 { return &f }

func TestPolicyDecide(t *testing.T) {
	p := routing.Policy{Floor: 0.9, Ceil: 0.7}

	cases := []struct {
		name     string
		passRate *float64
		hash     uint64
		want     routing.Decision
	}{
		{"null pass rate → local", nil, 0, routing.DecideLocal},
		{"null pass rate, hash irrelevant → local", nil, 0xDEADBEEF, routing.DecideLocal},
		{"at floor → local", ptr(0.9), 0, routing.DecideLocal},
		{"above floor → local", ptr(0.95), 0, routing.DecideLocal},
		{"below ceil → claude", ptr(0.5), 0, routing.DecideClaude},
		{"at ceil → sample-band even-hash → local", ptr(0.7), 0, routing.DecideLocal},
		{"sample band, even hash → local", ptr(0.8), 2, routing.DecideLocal},
		{"sample band, odd hash → claude", ptr(0.8), 3, routing.DecideClaude},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			assert.Equal(t, tc.want, p.Decide(tc.passRate, tc.hash))
		})
	}
}
  • Step 2: Run the test
go test ./internal/routing/... -run TestPolicyDecide -v

Expected: FAIL — package internal/routing does not exist.

  • Step 3: Write the implementation

Create internal/routing/policy.go:

package routing

// Decision is the route picked for a single skill call.
type Decision int

const (
	DecideLocal Decision = iota
	DecideClaude
)

func (d Decision) String() string {
	if d == DecideLocal {
		return "local"
	}
	return "claude"
}

// Policy holds the floor/ceil thresholds for routing decisions.
//
// Rules (in order):
//
//  1. passRate == nil           → DecideLocal (default-to-local for cost-routable skills)
//  2. *passRate >= Floor        → DecideLocal (trust local)
//  3. *passRate <  Ceil         → DecideClaude (don't trust local)
//  4. otherwise (sample band)   → requestHash low bit picks: 0=local, 1=claude
type Policy struct {
	Floor float64
	Ceil  float64
}

// Decide returns the routing decision for a single call.
// requestHash is consulted only when passRate is in the sample band [Ceil, Floor).
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision {
	if passRate == nil {
		return DecideLocal
	}
	if *passRate >= p.Floor {
		return DecideLocal
	}
	if *passRate < p.Ceil {
		return DecideClaude
	}
	if requestHash&1 == 0 {
		return DecideLocal
	}
	return DecideClaude
}
  • Step 4: Run the test to confirm it passes
go test ./internal/routing/... -run TestPolicyDecide -v

Expected: PASS — eight subtests green.

  • Step 5: Run task check
task check 2>&1 | tail -10
  • Step 6: Commit
git add internal/routing/policy.go internal/routing/policy_test.go
git commit -m "feat(routing): decision policy"
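
To make the four rules concrete with the default thresholds (floor 0.90, ceil 0.70), the sketch below restates the rule order as a standalone function and prints the outcome for a few pass rates. The decide function is a hypothetical helper written for this illustration; it mirrors Policy.Decide but returns strings so the block is self-contained.

```go
package main

import "fmt"

// decide restates the rule order from policy.go as a standalone function.
// Hypothetical helper: returns "local" or "claude" instead of a Decision.
func decide(passRate *float64, floor, ceil float64, hash uint64) string {
	switch {
	case passRate == nil: // rule 1: no data, default to local
		return "local"
	case *passRate >= floor: // rule 2: trust local
		return "local"
	case *passRate < ceil: // rule 3: do not trust local
		return "claude"
	case hash&1 == 0: // rule 4: sample band, even hash
		return "local"
	default: // rule 4: sample band, odd hash
		return "claude"
	}
}

func main() {
	const floor, ceil = 0.90, 0.70

	for _, r := range []float64{0.95, 0.90, 0.89, 0.70, 0.69} {
		fmt.Printf("pass_rate=%.2f  even hash: %-6s  odd hash: %s\n",
			r, decide(&r, floor, ceil, 2), decide(&r, floor, ceil, 3))
	}
	fmt.Printf("pass_rate=nil   even hash: %-6s  odd hash: %s\n",
		decide(nil, floor, ceil, 2), decide(nil, floor, ceil, 3))
}
```

Only 0.89 and 0.70 fall in the sample band [0.70, 0.90), so those are the only rows where the hash parity changes the result. Both boundaries are inclusive on the lower side of their rule: 0.90 is already trusted, and 0.70 is still sampled rather than sent to Claude outright.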

Task 3: Canonical request hash

Worktree: hyperguild

SHA-256-based hash of (system, user) for deterministic sample-band routing. Same prompt pair → same decision across calls.

Files:

  • Create: internal/routing/hash.go

  • Create: internal/routing/hash_test.go

  • Step 1: Write the failing test

Create internal/routing/hash_test.go:

package routing_test

import (
	"testing"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
)

func TestCanonicalHashDeterministic(t *testing.T) {
	a := routing.CanonicalHash("system one", "user one")
	b := routing.CanonicalHash("system one", "user one")
	assert.Equal(t, a, b, "same inputs must produce same hash")
}

func TestCanonicalHashDistinguishesInputs(t *testing.T) {
	cases := [][2]string{
		{"sys", "user"},
		{"sys", "user2"},
		{"sys2", "user"},
		{"", "system\x00user"}, // separator collision attempt
		{"system\x00user", ""},
	}
	seen := make(map[uint64]bool)
	for _, c := range cases {
		h := routing.CanonicalHash(c[0], c[1])
		assert.False(t, seen[h], "collision on %v", c)
		seen[h] = true
	}
}

func TestCanonicalHashLowBitDistribution(t *testing.T) {
	// Sanity check: across 1000 distinct inputs, low-bit split is roughly even.
	zeros, ones := 0, 0
	for i := 0; i < 1000; i++ {
		h := routing.CanonicalHash("sys", string(rune('a'+(i%26)))+string(rune(i)))
		if h&1 == 0 {
			zeros++
		} else {
			ones++
		}
	}
	// Allow ±150 around the ideal 500/500 split (15% of the 1,000 samples).
	assert.InDelta(t, 500, zeros, 150)
	assert.InDelta(t, 500, ones, 150)
}
  • Step 2: Run the test
go test ./internal/routing/... -run TestCanonicalHash -v

Expected: FAIL — undefined: routing.CanonicalHash.

  • Step 3: Write the implementation

Create internal/routing/hash.go:

package routing

import (
	"crypto/sha256"
	"encoding/binary"
)

// CanonicalHash returns a deterministic 64-bit hash of (system, user).
// Used to make sample-band routing decisions reproducible: identical input
// strings produce the same hash on every call, independent of process state.
//
// Inputs are joined with a 0x00 byte separator before hashing — distinguishes
// (system="ab", user="cd") from (system="abcd", user="").
func CanonicalHash(system, user string) uint64 {
	h := sha256.New()
	h.Write([]byte(system))
	h.Write([]byte{0})
	h.Write([]byte(user))
	sum := h.Sum(nil)
	return binary.BigEndian.Uint64(sum[:8])
}
  • Step 4: Run tests + task check
go test ./internal/routing/... -run TestCanonicalHash -v
task check 2>&1 | tail -10

Expected: PASS, all checks green.

  • Step 5: Commit
git add internal/routing/hash.go internal/routing/hash_test.go
git commit -m "feat(routing): canonical request hash"
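
One property of the separator scheme is worth knowing: it keeps ("ab", "cd") apart from ("abcd", ""), but two pairs can still serialize to the same bytes if an input itself contains a 0x00 byte. The sketch below demonstrates that case and shows a length-prefixed variant that avoids it. lengthPrefixedHash is a hypothetical alternative, not part of the plan; whether it is needed depends on whether prompts can ever contain NUL bytes, and the only consequence of a collision here is that two requests share a routing decision.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// separatorHash mirrors CanonicalHash from hash.go: system, 0x00, user.
func separatorHash(system, user string) uint64 {
	h := sha256.New()
	h.Write([]byte(system))
	h.Write([]byte{0})
	h.Write([]byte(user))
	return binary.BigEndian.Uint64(h.Sum(nil)[:8])
}

// lengthPrefixedHash is a hypothetical variant: it writes len(system) as a
// fixed 8-byte prefix, so the field boundary is unambiguous for any input.
func lengthPrefixedHash(system, user string) uint64 {
	h := sha256.New()
	var n [8]byte
	binary.BigEndian.PutUint64(n[:], uint64(len(system)))
	h.Write(n[:])
	h.Write([]byte(system))
	h.Write([]byte(user))
	return binary.BigEndian.Uint64(h.Sum(nil)[:8])
}

func main() {
	// Both pairs serialize to "a\x00b\x00c" under the separator scheme.
	fmt.Println("separator collides:",
		separatorHash("a\x00b", "c") == separatorHash("a", "b\x00c"))

	// The length prefix differs (3 vs 1), so the hashed bytes differ.
	fmt.Println("length-prefixed collides:",
		lengthPrefixedHash("a\x00b", "c") == lengthPrefixedHash("a", "b\x00c"))
}
```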

Task 4: Pass-rate fetcher with TTL cache

Worktree: hyperguild

HTTP client that calls GET ${BrainURL}/pass-rate?skill=X&window=7d, caches the response (*float64, possibly nil) for TTL. On error, returns (nil, err) so the dispatch wrapper falls through to default-to-local.

Files:

  • Create: internal/routing/passrate.go

  • Create: internal/routing/passrate_test.go

  • Step 1: Write the failing test

Create internal/routing/passrate_test.go:

package routing_test

import (
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"testing"
	"time"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestFetcherGetReturnsPassRate(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		assert.Equal(t, http.MethodGet, r.Method)
		assert.Equal(t, "/pass-rate", r.URL.Path)
		assert.Equal(t, "tdd", r.URL.Query().Get("skill"))
		assert.Equal(t, "7d", r.URL.Query().Get("window"))
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]any{"skill": "tdd", "pass_rate": 0.94})
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	pr, err := f.Get(context.Background(), "tdd")
	require.NoError(t, err)
	require.NotNil(t, pr)
	assert.InDelta(t, 0.94, *pr, 1e-9)
}

func TestFetcherGetReturnsNilWhenNoData(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_ = json.NewEncoder(w).Encode(map[string]any{"skill": "novel", "pass_rate": nil})
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	pr, err := f.Get(context.Background(), "novel")
	require.NoError(t, err)
	assert.Nil(t, pr)
}

func TestFetcherCachesWithinTTL(t *testing.T) {
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		atomic.AddInt32(&calls, 1)
		_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.5})
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	for i := 0; i < 5; i++ {
		_, err := f.Get(context.Background(), "tdd")
		require.NoError(t, err)
	}
	assert.Equal(t, int32(1), atomic.LoadInt32(&calls), "should hit upstream once and serve four times from cache")
}

func TestFetcherSurfacesUpstreamError(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "boom", http.StatusInternalServerError)
	}))
	defer srv.Close()

	f := routing.NewFetcher(srv.URL, "7d", time.Minute)
	pr, err := f.Get(context.Background(), "tdd")
	require.Error(t, err)
	assert.Nil(t, pr)
}
  • Step 2: Run the test
go test ./internal/routing/... -run TestFetcher -v

Expected: FAIL — undefined: routing.NewFetcher.

  • Step 3: Write the implementation

Create internal/routing/passrate.go:

package routing

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"sync"
	"time"
)

// Fetcher reads /pass-rate from the brain pod with a per-skill TTL cache.
type Fetcher struct {
	BaseURL string
	Window  string
	TTL     time.Duration
	HTTP    *http.Client

	mu    sync.Mutex
	cache map[string]cachedRate
}

type cachedRate struct {
	value *float64
	at    time.Time
}

type passRateResponse struct {
	PassRate *float64 `json:"pass_rate"`
}

// NewFetcher returns a Fetcher that calls baseURL + /pass-rate with the
// given window string. If ttl is zero, defaults to 60 seconds. The HTTP
// client uses a 1-second total timeout.
func NewFetcher(baseURL, window string, ttl time.Duration) *Fetcher {
	if ttl == 0 {
		ttl = 60 * time.Second
	}
	return &Fetcher{
		BaseURL: baseURL,
		Window:  window,
		TTL:     ttl,
		HTTP:    &http.Client{Timeout: time.Second},
		cache:   make(map[string]cachedRate),
	}
}

// Get returns the pass rate for the named skill, or nil if no data exists,
// or an error if the brain is unreachable. Caches successful results.
func (f *Fetcher) Get(ctx context.Context, skill string) (*float64, error) {
	f.mu.Lock()
	if c, ok := f.cache[skill]; ok && time.Since(c.at) < f.TTL {
		v := c.value
		f.mu.Unlock()
		return v, nil
	}
	f.mu.Unlock()

	u := fmt.Sprintf("%s/pass-rate?skill=%s&window=%s",
		f.BaseURL, url.QueryEscape(skill), url.QueryEscape(f.Window))
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return nil, fmt.Errorf("passrate: build request: %w", err)
	}
	resp, err := f.HTTP.Do(req)
	if err != nil {
		return nil, fmt.Errorf("passrate: request: %w", err)
	}
	defer func() { _ = resp.Body.Close() }()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("passrate: server returned status %d", resp.StatusCode)
	}

	var body passRateResponse
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, fmt.Errorf("passrate: decode: %w", err)
	}

	f.mu.Lock()
	f.cache[skill] = cachedRate{value: body.PassRate, at: time.Now()}
	f.mu.Unlock()

	return body.PassRate, nil
}
  • Step 4: Run tests + task check
go test ./internal/routing/... -run TestFetcher -v
task check 2>&1 | tail -10
  • Step 5: Commit
git add internal/routing/passrate.go internal/routing/passrate_test.go
git commit -m "feat(routing): pass-rate fetcher with TTL cache"
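
The tests shown for this task cover a cache hit within the TTL; expiry is harder to exercise without sleeping. One common approach is to inject the clock. The sketch below is a stripped-down, hypothetical cache (ttlCache, fakeClock, and sketchTTL are names invented here) that uses the same freshness rule as Fetcher.Get, age < TTL, but reads the time through a function field. The plan's Fetcher calls time.Since directly and has no such field.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const sketchTTL = 60 * time.Second

// fakeClock is a manually advanced clock for tests.
type fakeClock struct{ t time.Time }

func (f *fakeClock) now() time.Time       { return f.t }
func (f *fakeClock) advanceSeconds(n int) { f.t = f.t.Add(time.Duration(n) * time.Second) }

type entry struct {
	value *float64
	at    time.Time
}

// ttlCache is a hypothetical, minimal version of the Fetcher's cache.
// The now field is an assumption added for testability.
type ttlCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	now   func() time.Time
	items map[string]entry
}

func newTTLCache(ttl time.Duration, now func() time.Time) *ttlCache {
	return &ttlCache{ttl: ttl, now: now, items: make(map[string]entry)}
}

// get returns the cached value and true while the entry is younger than ttl.
func (c *ttlCache) get(key string) (*float64, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok || c.now().Sub(e.at) >= c.ttl {
		return nil, false
	}
	return e.value, true
}

// put stores v under key, stamped with the current clock reading.
func (c *ttlCache) put(key string, v *float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: v, at: c.now()}
}

func main() {
	clk := &fakeClock{}
	c := newTTLCache(sketchTTL, clk.now)

	rate := 0.94
	c.put("code_review", &rate)

	clk.advanceSeconds(59)
	_, hit := c.get("code_review")
	fmt.Println("after 59s:", hit) // prints after 59s: true

	clk.advanceSeconds(1)
	_, hit = c.get("code_review")
	fmt.Println("after 60s:", hit) // prints after 60s: false
}
```

Note that a nil value is a legitimate cached result (the brain has no data for the skill), which is why get reports presence through a separate bool.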

Task 5: Decision logger

Worktree: hyperguild

Posts a session_log MCP call to the brain pod's /mcp endpoint after every routing decision. Best-effort: returns errors but the caller does not block real work on them.

Files:

  • Create: internal/routing/log.go

  • Create: internal/routing/log_test.go

  • Step 1: Write the failing test

Create internal/routing/log_test.go:

package routing_test

import (
	"context"
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"testing"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestLoggerLogDecision(t *testing.T) {
	var captured map[string]any
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		assert.Equal(t, http.MethodPost, r.Method)
		assert.Equal(t, "/mcp", r.URL.Path)
		body, _ := io.ReadAll(r.Body)
		require.NoError(t, json.Unmarshal(body, &captured))
		_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{"content": []map[string]any{{"type": "text", "text": "ok"}}}})
	}))
	defer srv.Close()

	l := routing.NewLogger(srv.URL)
	err := l.LogDecision(context.Background(), routing.LogEntry{
		SessionID:   "sess-1",
		Skill:       "code_review",
		Decision:    "local",
		Message:     "model=qwen35, pass_rate=0.94",
		ProjectRoot: "/home/x/proj",
		DurationMs:  1234,
		Failed:      false,
	})
	require.NoError(t, err)

	params := captured["params"].(map[string]any)
	assert.Equal(t, "tools/call", captured["method"])
	assert.Equal(t, "session_log", params["name"])

	args := params["arguments"].(map[string]any)
	assert.Equal(t, "_routing", args["skill"])
	assert.Equal(t, "decide", args["phase"])
	assert.Equal(t, "skip", args["final_status"])
	assert.Contains(t, args["message"].(string), "code_review: local")
	assert.Equal(t, "sess-1", args["session_id"])
	assert.Equal(t, "/home/x/proj", args["project_root"])
	assert.Equal(t, float64(1234), args["duration_ms"])
}

func TestLoggerLogFailure(t *testing.T) {
	var captured map[string]any
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		_ = json.Unmarshal(body, &captured)
		_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
	}))
	defer srv.Close()

	l := routing.NewLogger(srv.URL)
	err := l.LogDecision(context.Background(), routing.LogEntry{
		SessionID: "s", Skill: "debug", Decision: "local", Message: "litellm down", Failed: true,
	})
	require.NoError(t, err)

	args := captured["params"].(map[string]any)["arguments"].(map[string]any)
	assert.Equal(t, "fail", args["final_status"])
}

func TestLoggerSurfacesUpstreamError(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "down", http.StatusBadGateway)
	}))
	defer srv.Close()

	l := routing.NewLogger(srv.URL)
	err := l.LogDecision(context.Background(), routing.LogEntry{Skill: "x", SessionID: "y", Decision: "local"})
	require.Error(t, err)
}
  • Step 2: Run the test
go test ./internal/routing/... -run TestLogger -v

Expected: FAIL — undefined: routing.NewLogger.

  • Step 3: Write the implementation

Create internal/routing/log.go:

package routing

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// LogEntry describes a single routing decision to log via the brain MCP.
type LogEntry struct {
	SessionID   string
	Skill       string // the original skill the call routed (e.g., "code_review")
	Decision    string // "local" or "claude" or "claude_fallback"
	Message     string // free-form, e.g. "model=qwen35, pass_rate=0.94"
	ProjectRoot string
	DurationMs  int64
	Failed      bool // true → final_status: "fail"; false → "skip"
}

// Logger posts session_log entries to a brain MCP at BrainURL + /mcp.
type Logger struct {
	BrainURL string
	HTTP     *http.Client
}

// NewLogger creates a Logger with a 2-second HTTP timeout.
func NewLogger(brainURL string) *Logger {
	return &Logger{
		BrainURL: brainURL,
		HTTP:     &http.Client{Timeout: 2 * time.Second},
	}
}

// LogDecision posts a session_log MCP call. Errors are returned but the caller
// MUST NOT block real work on them — logging is best-effort.
func (l *Logger) LogDecision(ctx context.Context, e LogEntry) error {
	status := "skip"
	if e.Failed {
		status = "fail"
	}
	payload := map[string]any{
		"jsonrpc": "2.0",
		"id":      1,
		"method":  "tools/call",
		"params": map[string]any{
			"name": "session_log",
			"arguments": map[string]any{
				"session_id":   e.SessionID,
				"skill":        "_routing",
				"phase":        "decide",
				"final_status": status,
				"message":      fmt.Sprintf("%s: %s — %s", e.Skill, e.Decision, e.Message),
				"duration_ms":  e.DurationMs,
				"project_root": e.ProjectRoot,
			},
		},
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return fmt.Errorf("log: marshal: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, l.BrainURL+"/mcp", bytes.NewReader(body))
	if err != nil {
		return fmt.Errorf("log: build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := l.HTTP.Do(req)
	if err != nil {
		return fmt.Errorf("log: request: %w", err)
	}
	defer func() { _ = resp.Body.Close() }()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("log: server returned status %d", resp.StatusCode)
	}
	return nil
}
  • Step 4: Run tests + task check
go test ./internal/routing/... -run TestLogger -v
task check 2>&1 | tail -10
  • Step 5: Commit
git add internal/routing/log.go internal/routing/log_test.go
git commit -m "feat(routing): decision logger via brain MCP session_log"
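
The "best-effort" contract puts a burden on the caller: a logging failure must not change the result of the skill call. A minimal sketch of one way to honor that is shown below, under stated assumptions. logFunc stands in for LogDecision with its entry already bound, and logBestEffort, runDemo, and errBrainDown are hypothetical names; the router in this plan may handle it differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errBrainDown simulates the brain pod being unreachable.
var errBrainDown = errors.New("brain unreachable")

// logFunc stands in for (*Logger).LogDecision with the entry already bound.
type logFunc func(ctx context.Context) error

// logBestEffort runs log under its own deadline and reports a failure through
// onErr instead of returning it, so the caller's result is unaffected.
func logBestEffort(ctx context.Context, timeout time.Duration, log logFunc, onErr func(error)) {
	// WithoutCancel keeps request-scoped values but drops cancellation, so a
	// client that disconnects right after the reply does not abort the log write.
	logCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), timeout)
	defer cancel()
	if err := log(logCtx); err != nil {
		onErr(err)
	}
}

// runDemo simulates one routed call: the skill result is produced first, then
// the decision is logged. A logging failure is observed but never returned as
// the call's error.
func runDemo(logErr error) (result string, observed error) {
	result = "review text"
	logBestEffort(context.Background(), 2*time.Second,
		func(context.Context) error { return logErr },
		func(err error) { observed = err })
	return result, observed
}

func main() {
	res, logErr := runDemo(errBrainDown)
	fmt.Printf("result=%q log error=%v\n", res, logErr)
}
```

The call is still synchronous, so the timeout bounds how long a slow brain pod can delay the response; the 2-second value simply matches the HTTP timeout NewLogger already sets.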

Task 6: Router (dispatch wrapper)

Worktree: hyperguild

Composes Fetcher + Policy + Logger + a CompleteFunc. The wrapper is what the four skill packages receive as their CompleteFunc. On a local-route error, it fails open by retrying once on the Claude model.
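
The retry-once behavior on a local-route error can be sketched in isolation before reading the tests. This is a minimal illustration under stated assumptions, not the Router itself: completeFunc copies the call shape the tests use (the meaning of the int64 return is not shown in this section, so the sketch ignores it), and completeFailOpen, flaky, and runDemo are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// completeFunc matches the call shape used by the router tests:
// (ctx, model, system, user) -> (text, int64, error).
type completeFunc func(ctx context.Context, model, system, user string) (string, int64, error)

// completeFailOpen tries the local model first and, on any error, retries
// exactly once on the Claude model. route names the path that was taken.
func completeFailOpen(ctx context.Context, complete completeFunc,
	localModel, claudeModel, system, user string) (out, route string, err error) {

	out, _, err = complete(ctx, localModel, system, user)
	if err == nil {
		return out, "local", nil
	}
	localErr := err

	out, _, err = complete(ctx, claudeModel, system, user)
	if err != nil {
		return "", "claude_fallback",
			fmt.Errorf("fallback failed: %w (local error: %v)", err, localErr)
	}
	return out, "claude_fallback", nil
}

// flaky returns a completeFunc that fails for failModel and records every
// model it is asked for.
func flaky(failModel string, calls *[]string) completeFunc {
	return func(_ context.Context, model, _, _ string) (string, int64, error) {
		*calls = append(*calls, model)
		if model == failModel {
			return "", 0, errors.New(model + " unavailable")
		}
		return "answer from " + model, 100, nil
	}
}

// runDemo wires flaky into completeFailOpen with a background context.
func runDemo(failModel string) (out, route string, calls []string, err error) {
	out, route, err = completeFailOpen(context.Background(), flaky(failModel, &calls),
		"qwen35", "claude-sonnet-4-6", "sys", "user")
	return out, route, calls, err
}

func main() {
	out, route, calls, err := runDemo("qwen35")
	fmt.Println(out, route, calls, err)
}
```

The fallback applies only to a call that was routed local; a call routed to Claude in the first place has nowhere cheaper to fall back to, so its error would be returned as-is.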

Files:

  • Create: internal/routing/router.go

  • Create: internal/routing/router_test.go

  • Step 1: Write the failing test

Create internal/routing/router_test.go:

package routing_test

import (
	"context"
	"encoding/json"
	"errors"
	"net/http"
	"net/http/httptest"
	"sync"
	"testing"
	"time"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

type fakeLLM struct {
	mu    sync.Mutex
	calls []struct{ Model, System, User string }
	resp  string
	err   error
	errOn string // if non-empty, only the named model errors
}

func (f *fakeLLM) Complete(_ context.Context, model, system, user string) (string, int64, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.calls = append(f.calls, struct{ Model, System, User string }{model, system, user})
	if f.errOn == model {
		return "", 0, f.err
	}
	if f.err != nil && f.errOn == "" {
		return "", 0, f.err
	}
	return f.resp, 100, nil
}

func newRouter(t *testing.T, llm *fakeLLM, passRate float64) (*routing.Router, *httptest.Server, *httptest.Server) {
	t.Helper()
	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/pass-rate":
			_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": passRate})
		case "/mcp":
			_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
		}
	}))
	t.Cleanup(brain.Close)

	r := &routing.Router{
		Fetcher:     routing.NewFetcher(brain.URL, "7d", time.Minute),
		Logger:      routing.NewLogger(brain.URL),
		Policy:      routing.Policy{Floor: 0.9, Ceil: 0.7},
		LocalModel:  "qwen35",
		ClaudeModel: "claude-sonnet-4-6",
		Complete:    llm.Complete,
	}
	return r, brain, brain
}

func TestRouterRoutesLocalAtHighPassRate(t *testing.T) {
	llm := &fakeLLM{resp: "ok"}
	r, _, _ := newRouter(t, llm, 0.95)

	out, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "code_review", System: "sys", User: "user", SessionID: "s1", ProjectRoot: "/p",
	})
	require.NoError(t, err)
	assert.Equal(t, "ok", out)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 1)
	assert.Equal(t, "qwen35", llm.calls[0].Model)
}

func TestRouterRoutesClaudeAtLowPassRate(t *testing.T) {
	llm := &fakeLLM{resp: "ok"}
	r, _, _ := newRouter(t, llm, 0.3)

	_, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "code_review", System: "sys", User: "user", SessionID: "s2",
	})
	require.NoError(t, err)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 1)
	assert.Equal(t, "claude-sonnet-4-6", llm.calls[0].Model)
}

func TestRouterFailsOpenLocalErrorToClaude(t *testing.T) {
	llm := &fakeLLM{resp: "ok-after-fallback", err: errors.New("local boom"), errOn: "qwen35"}
	r, _, _ := newRouter(t, llm, 0.95) // would route local

	out, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "code_review", System: "sys", User: "user", SessionID: "s3",
	})
	require.NoError(t, err)
	assert.Equal(t, "ok-after-fallback", out)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 2)
	assert.Equal(t, "qwen35", llm.calls[0].Model)
	assert.Equal(t, "claude-sonnet-4-6", llm.calls[1].Model)
}

func TestRouterDefaultsToLocalWhenBrainUnreachable(t *testing.T) {
	// Brain returns 500 → fetcher errors → router treats pass rate as nil → local.
	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "down", http.StatusInternalServerError)
	}))
	defer brain.Close()

	llm := &fakeLLM{resp: "ok"}
	r := &routing.Router{
		Fetcher:     routing.NewFetcher(brain.URL, "7d", time.Minute),
		Logger:      routing.NewLogger(brain.URL),
		Policy:      routing.Policy{Floor: 0.9, Ceil: 0.7},
		LocalModel:  "qwen35",
		ClaudeModel: "claude-sonnet-4-6",
		Complete:    llm.Complete,
	}

	_, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "code_review", System: "sys", User: "user", SessionID: "s4",
	})
	require.NoError(t, err)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 1)
	assert.Equal(t, "qwen35", llm.calls[0].Model)
}
  • Step 2: Run the test
go test ./internal/routing/... -run TestRouter -v

Expected: FAIL — undefined: routing.Router, undefined: routing.RunInput.

  • Step 3: Write the implementation

Create internal/routing/router.go:

package routing

import (
	"context"
	"fmt"
	"log/slog"
)

// CompleteFunc matches the signature used by every skill package's Config.
type CompleteFunc func(ctx context.Context, model, system, user string) (string, int64, error)

// RunInput captures the per-call inputs the dispatch wrapper needs.
type RunInput struct {
	Skill       string
	System      string
	User        string
	SessionID   string
	ProjectRoot string
}

// Router composes a pass-rate fetcher, a decision policy, a session logger,
// and a LiteLLM client. Skill packages receive a thin closure around
// Router.Run as their CompleteFunc (Run itself takes a RunInput).
type Router struct {
	Fetcher     *Fetcher
	Logger      *Logger
	Policy      Policy
	LocalModel  string
	ClaudeModel string
	Complete    CompleteFunc
}

// Run executes one skill call: decides local vs claude, calls LiteLLM, logs the
// decision. On local-side error, fails open by retrying once on the Claude model.
func (r *Router) Run(ctx context.Context, in RunInput) (string, int64, error) {
	pr, ferr := r.Fetcher.Get(ctx, in.Skill)
	if ferr != nil {
		slog.Warn("router: pass-rate unreachable, defaulting to local", "skill", in.Skill, "err", ferr)
		pr = nil
	}
	hash := CanonicalHash(in.System, in.User)
	decision := r.Policy.Decide(pr, hash)

	model := r.ClaudeModel
	if decision == DecideLocal {
		model = r.LocalModel
	}

	out, ms, err := r.Complete(ctx, model, in.System, in.User)
	_ = r.Logger.LogDecision(ctx, LogEntry{
		SessionID:   in.SessionID,
		Skill:       in.Skill,
		Decision:    decision.String(),
		Message:     fmt.Sprintf("model=%s, pass_rate=%s", model, formatPassRate(pr)),
		ProjectRoot: in.ProjectRoot,
		DurationMs:  ms,
		Failed:      err != nil,
	})

	if err != nil && decision == DecideLocal {
		slog.Warn("router: local failed, falling open to claude", "skill", in.Skill, "err", err)
		out, ms, err = r.Complete(ctx, r.ClaudeModel, in.System, in.User)
		_ = r.Logger.LogDecision(ctx, LogEntry{
			SessionID:   in.SessionID,
			Skill:       in.Skill,
			Decision:    "claude_fallback",
			Message:     fmt.Sprintf("model=%s, after-local-error", r.ClaudeModel),
			ProjectRoot: in.ProjectRoot,
			DurationMs:  ms,
			Failed:      err != nil,
		})
	}
	return out, ms, err
}

func formatPassRate(pr *float64) string {
	if pr == nil {
		return "null"
	}
	return fmt.Sprintf("%.2f", *pr)
}
  • Step 4: Run tests + task check
go test ./internal/routing/... -run TestRouter -v
task check 2>&1 | tail -10
  • Step 5: Commit
git add internal/routing/router.go internal/routing/router_test.go
git commit -m "feat(routing): router dispatch wrapper"

Task 7: Snapshot test for tool-schema parity

Worktree: hyperguild

Capture the supervisor's current advertisement of the four routed skills (code_review, debug, retrospective, trainer) into a JSON snapshot file. Add a test that builds a registry with the same four skill packages and asserts that its tool definitions equal the snapshot after both sides are normalized through a JSON round-trip, so whitespace differences do not matter. This pins the schema contract: a downstream change in any skill package fails the routing pod's test loudly.
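The normalized comparison described above can be sketched in isolation. This is a minimal illustration, not the test from this plan: jsonEqual is a hypothetical helper name, and the sample documents are made up. It decodes both sides into generic values and compares them structurally, so whitespace and object key order are ignored while array order still matters.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// jsonEqual reports whether a and b encode the same JSON value, ignoring
// whitespace and object key order. Hypothetical helper, not part of the repo.
func jsonEqual(a, b []byte) (bool, error) {
	var av, bv any
	if err := json.Unmarshal(a, &av); err != nil {
		return false, fmt.Errorf("decode first document: %w", err)
	}
	if err := json.Unmarshal(b, &bv); err != nil {
		return false, fmt.Errorf("decode second document: %w", err)
	}
	return reflect.DeepEqual(av, bv), nil
}

func main() {
	// Made-up sample data: same content, different layout and key order.
	snapshot := []byte(`[{"name":"debug","inputSchema":{"type":"object"}}]`)
	live := []byte(`[ { "inputSchema": {"type": "object"}, "name": "debug" } ]`)

	same, err := jsonEqual(snapshot, live)
	fmt.Println(same, err)
}
```

Because array order is significant, sorting the tool list by name on both sides (as the capture command and the test do) is what keeps the comparison stable.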

Files:

  • Create: internal/routing/testdata/tools_list.snapshot.json

  • Create: internal/routing/snapshot_test.go

  • Step 1: Capture the supervisor's current advertisement

cd ~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod
mkdir -p internal/routing/testdata
go run ./cmd/supervisor &
SUPERVISOR_PID=$!
sleep 2
curl -sS -X POST http://localhost:3200/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  | jq '.result.tools | map(select(.name == "code_review" or .name == "debug" or .name == "retrospective" or .name == "trainer")) | sort_by(.name)' \
  > internal/routing/testdata/tools_list.snapshot.json
kill $SUPERVISOR_PID
wait $SUPERVISOR_PID 2>/dev/null

If the supervisor binary requires extra env vars to start, set them inline:

SUPERVISOR_CONFIG_DIR=./config/supervisor go run ./cmd/supervisor &

Inspect the file:

cat internal/routing/testdata/tools_list.snapshot.json | jq 'length'

Expected: 4.

  • Step 2: Write the failing test

Create internal/routing/snapshot_test.go:

package routing_test

import (
	"context"
	"encoding/json"
	"os"
	"sort"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
	"github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/mathiasbq/supervisor/internal/skills/retrospective"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestToolsListMatchesSupervisorSnapshot pins the four routed skills' tool
// definitions to the supervisor's current advertisement. If a skill package
// changes its schema, this test fails loudly so the snapshot can be updated
// in lockstep with the consumer.
func TestToolsListMatchesSupervisorSnapshot(t *testing.T) {
	complete := func(_ context.Context, _, _, _ string) (string, int64, error) {
		return "", 0, nil
	}
	_ = iexec.NewLiteLLM // keep import for future use

	reg := registry.New()
	reg.Register(review.New(review.Config{
		SkillPrompt:  "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))
	reg.Register(debug.New(debug.Config{
		SkillPrompt:  "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))
	reg.Register(retrospective.New(retrospective.Config{
		SkillPrompt:  "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))
	reg.Register(trainer.New(trainer.Config{
		ReaderPrompt: "stub",
		WriterPrompt: "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))

	tools := reg.Tools()
	// Filter to the four routed skills only (registry may expose additional tools).
	wanted := map[string]bool{"code_review": true, "debug": true, "retrospective": true, "trainer": true}
	var routed []registry.ToolDef
	for _, td := range tools {
		if wanted[td.Name] {
			routed = append(routed, td)
		}
	}
	sort.Slice(routed, func(i, j int) bool { return routed[i].Name < routed[j].Name })

	got, err := json.MarshalIndent(routed, "", "  ")
	require.NoError(t, err)

	want, err := os.ReadFile("testdata/tools_list.snapshot.json")
	require.NoError(t, err)

	// Normalize both via re-encode so whitespace differences don't dominate.
	var gotV, wantV any
	require.NoError(t, json.Unmarshal(got, &gotV))
	require.NoError(t, json.Unmarshal(want, &wantV))

	gotN, _ := json.MarshalIndent(gotV, "", "  ")
	wantN, _ := json.MarshalIndent(wantV, "", "  ")

	assert.Equal(t, string(wantN), string(gotN),
		"tool advertisement drifted from supervisor snapshot — update testdata/tools_list.snapshot.json deliberately if the schema change is intentional")
}

If the actual skill tool name is review rather than code_review (or vice versa), discover by inspecting internal/skills/review/skill.go's Tools() and adjust both the snapshot capture filter and the test's wanted map. Use the discovered name throughout the rest of the plan.

  • Step 3: Run the test
go test ./internal/routing/... -run TestToolsListMatchesSupervisorSnapshot -v

Expected: PASS — the snapshot was captured from the same registry the test exercises. If FAIL, the captured names differ from the wanted map; reconcile names per the note above.

  • Step 4: task check
task check 2>&1 | tail -10
  • Step 5: Commit
git add internal/routing/snapshot_test.go internal/routing/testdata/tools_list.snapshot.json
git commit -m "test(routing): pin tool-schema parity with supervisor"

Task 8: cmd/routing/main.go wiring

Worktree: hyperguild

Compose the binary: load config, build LiteLLM client, build Fetcher/Logger/Router, register the four skills, mount on the existing internal/mcp server with bearer auth.

Files:

  • Create: cmd/routing/main.go

  • Create: cmd/routing/main_test.go

  • Step 1: Write the integration test first

Create cmd/routing/main_test.go:

package main_test

import (
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"os/exec"
	"strings"
	"sync/atomic"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestRoutingPodEndToEnd boots the binary against fake LiteLLM + brain servers,
// calls tools/list and one tools/call, and verifies the brain saw a session_log POST.
func TestRoutingPodEndToEnd(t *testing.T) {
	if testing.Short() {
		t.Skip("end-to-end binary boot")
	}

	// brainHits counts every request the fake brain receives (/pass-rate and
	// /mcp). It is atomic because the handler runs on server goroutines while
	// the test goroutine polls the value.
	var brainHits atomic.Int64
	llm := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		_ = json.NewEncoder(w).Encode(map[string]any{
			"choices": []map[string]any{{"message": map[string]any{"role": "assistant", "content": "stub"}}},
		})
	}))
	defer llm.Close()

	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		brainHits.Add(1)
		switch r.URL.Path {
		case "/pass-rate":
			_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.95})
		case "/mcp":
			_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
		}
	}))
	defer brain.Close()

	bin := buildRouting(t)
	cmd := exec.Command(bin)
	cmd.Env = append(cmd.Env,
		"ROUTING_PORT=33310",
		"LITELLM_BASE_URL="+llm.URL,
		"LITELLM_API_KEY=stub",
		"BRAIN_URL="+brain.URL,
		// go test runs in cmd/routing, so the repo's config dir is two levels up.
		"SUPERVISOR_CONFIG_DIR=../../config/supervisor",
		"PATH="+osPath(),
	)
	require.NoError(t, cmd.Start())
	t.Cleanup(func() { _ = cmd.Process.Kill() })

	require.NoError(t, waitForPort(t, "127.0.0.1:33310", 5*time.Second))

	resp := mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":1,"method":"tools/list"}`)
	assert.Contains(t, resp, "code_review")

	resp = mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"code_review","arguments":{"project_root":"/tmp","files":["README.md"]}}}`)
	_ = resp // shape varies by skill; only the brain-side effects are asserted below

	// Wait briefly for the async session_log to land.
	deadline := time.Now().Add(2 * time.Second)
	for time.Now().Before(deadline) && brainHits.Load() < 2 {
		time.Sleep(50 * time.Millisecond)
	}
	assert.GreaterOrEqual(t, brainHits.Load(), int64(2), "expected at least one /pass-rate hit and one /mcp session_log hit")
}

Add helpers in the same file:

func buildRouting(t *testing.T) string {
	t.Helper()
	bin := t.TempDir() + "/routing"
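	// go test runs with the package directory (cmd/routing) as its working
	// directory, so build the package in "." rather than "./cmd/routing".
	out, err := exec.Command("go", "build", "-o", bin, ".").CombinedOutput()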
	require.NoError(t, err, "build failed: %s", out)
	return bin
}

func waitForPort(_ *testing.T, addr string, dur time.Duration) error {
	deadline := time.Now().Add(dur)
	for time.Now().Before(deadline) {
		c, err := http.Get("http://" + addr + "/healthz")
		if err == nil {
			c.Body.Close()
			return nil
		}
		// fallback: try /mcp tools/list — it'll 400 but TCP open is enough
		conn, err := http.NewRequest(http.MethodPost, "http://"+addr+"/mcp", strings.NewReader(`{}`))
		if err == nil {
			r, err := http.DefaultClient.Do(conn)
			if err == nil {
				r.Body.Close()
				return nil
			}
		}
		time.Sleep(50 * time.Millisecond)
	}
	return context.DeadlineExceeded
}

func mcpCall(t *testing.T, url, body string) string {
	t.Helper()
	r, err := http.Post(url, "application/json", strings.NewReader(body))
	require.NoError(t, err)
	defer r.Body.Close()
	var b strings.Builder
	_, _ = b.ReadFrom(r.Body)
	return b.String()
}

func osPath() string {
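	// Cmd.Environ returns the inherited process environment when Env is unset;
	// the Env field itself is nil on a fresh Cmd, so ranging over it finds nothing.
	for _, e := range exec.Command("env").Environ() {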
		if strings.HasPrefix(e, "PATH=") {
			return strings.TrimPrefix(e, "PATH=")
		}
	}
	return "/usr/bin:/bin"
}
  • Step 2: Run the test
go test ./cmd/routing/... -v

Expected: FAIL — cmd/routing/main.go doesn't exist.

  • Step 3: Write the binary

Create cmd/routing/main.go:

// cmd/routing/main.go
package main

import (
	"context"
	"log/slog"
	"net/http"
	"os"
	"time"

	"github.com/mathiasbq/supervisor/internal/config"
	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/mcp"
	"github.com/mathiasbq/supervisor/internal/registry"
	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/mathiasbq/supervisor/internal/skills/retrospective"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
)

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	slog.SetDefault(logger)

	cfg, err := config.LoadRouting()
	if err != nil {
		logger.Error("config load failed", "err", err)
		os.Exit(1)
	}

	// Load prompts from config dir (same files the supervisor uses).
	configDir := envOr("SUPERVISOR_CONFIG_DIR", "/app/config/supervisor")
	mustRead := func(path string) string {
		b, err := os.ReadFile(configDir + "/" + path)
		if err != nil {
			logger.Error("read prompt failed", "path", path, "err", err)
			os.Exit(1)
		}
		return string(b)
	}

	llm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)

	router := &routing.Router{
		Fetcher:     routing.NewFetcher(cfg.BrainURL, "7d", time.Duration(cfg.PassRateTTLSeconds)*time.Second),
		Logger:      routing.NewLogger(cfg.BrainURL),
		Policy:      routing.Policy{Floor: cfg.RouteLocalFloor, Ceil: cfg.RouteLocalCeil},
		LocalModel:  cfg.LocalModel,
		ClaudeModel: cfg.ClaudeModel,
		Complete:    llm.Complete,
	}

	// Skill packages call CompleteFunc(ctx, model, system, user) — no session_id
	// or project_root in the signature. Rather than modifying every skill's API
	// (and inflating Plan 6's blast radius), the routing pod logs every decision
	// under a fixed session_id "_routing". Operators query
	// `GET /pass-rate?skill=_routing&window=...` to inspect routing health; per-
	// session correlation is sacrificed for a much simpler implementation.
	const routingSessionID = "_routing"
	wrap := func(skillName string) routing.CompleteFunc {
		return func(ctx context.Context, _, system, user string) (string, int64, error) {
			// The model param is ignored: the router picks the model based on policy.
			return router.Run(ctx, routing.RunInput{
				Skill:       skillName,
				System:      system,
				User:        user,
				SessionID:   routingSessionID,
				ProjectRoot: "",
			})
		}
	}

	reg := registry.New()
	reg.Register(review.New(review.Config{
		SkillPrompt:  mustRead("review.md"),
		DefaultModel: cfg.LocalModel,
		CompleteFunc: wrap("code_review"),
	}))
	reg.Register(debug.New(debug.Config{
		SkillPrompt:  mustRead("debug.md"),
		DefaultModel: cfg.LocalModel,
		CompleteFunc: wrap("debug"),
	}))
	reg.Register(retrospective.New(retrospective.Config{
		SkillPrompt:  mustRead("retrospective.md"),
		DefaultModel: cfg.LocalModel,
		CompleteFunc: wrap("retrospective"),
	}))
	reg.Register(trainer.New(trainer.Config{
		ReaderPrompt: mustRead("trainer-reader.md"),
		WriterPrompt: mustRead("trainer-writer.md"),
		DefaultModel: cfg.LocalModel,
		CompleteFunc: wrap("trainer"),
	}))

	srv := mcp.NewServer(reg, cfg.MCPAuthToken)
	mux := http.NewServeMux()
	mux.Handle("/mcp", srv)
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	addr := ":" + cfg.Port
	logger.Info("routing pod starting", "addr", addr,
		"local", cfg.LocalModel, "claude", cfg.ClaudeModel,
		"floor", cfg.RouteLocalFloor, "ceil", cfg.RouteLocalCeil)
	if err := http.ListenAndServe(addr, mux); err != nil {
		logger.Error("server stopped", "err", err)
		os.Exit(1)
	}
}

func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

If the existing skill packages' Config field names differ from what's used here (e.g. SkillPrompt vs Prompt), adjust by reading each package's skill.go.

  • Step 4: Run integration test + task check
go test ./cmd/routing/... -v
task check 2>&1 | tail -15

Expected: PASS for both.

  • Step 5: Commit
git add cmd/routing/main.go cmd/routing/main_test.go
git commit -m "feat(routing): cmd/routing binary"

Task 9: Update mode client-local template

Worktree: hyperguild

Replace the _routing_pending placeholder with a real headers block carrying X-Hyperguild-Mode: client-local. URL stays at koala:30310/mcp.

Files:

  • Modify: cmd/hyperguild/mode.go

  • Modify: cmd/hyperguild/mode_test.go

  • Modify: cmd/hyperguild/README.md

  • Step 1: Update the failing test

In cmd/hyperguild/mode_test.go, find the existing TestModeClientLocal (or equivalent). Add an assertion for the new shape:

func TestModeClientLocalHasRoutingHeader(t *testing.T) {
	tmp := t.TempDir() + "/mcp.json"
	out := &bytes.Buffer{}
	stderr := &bytes.Buffer{}
	require.NoError(t, runMode(context.Background(), []string{"client-local", "--out", tmp}, nil, out, stderr))

	body, err := os.ReadFile(tmp)
	require.NoError(t, err)
	var doc map[string]any
	require.NoError(t, json.Unmarshal(body, &doc))

	servers := doc["mcpServers"].(map[string]any)
	routing := servers["routing"].(map[string]any)
	assert.Equal(t, "http://koala:30310/mcp", routing["url"])
	assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")

	headers, ok := routing["headers"].(map[string]any)
	require.True(t, ok, "routing entry should have headers block")
	assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
}
  • Step 2: Run the test
go test ./cmd/hyperguild/... -run TestModeClientLocal -v

Expected: FAIL — _routing_pending is still there OR headers is missing.

  • Step 3: Update mode.go

Replace the routing entry inside modeClientLocal:

"routing": map[string]any{
    "url":         "http://koala:30310/mcp",
    "description": "Mode 2 routing pod — routes skill calls to LiteLLM/local",
    "headers": map[string]any{
        "X-Hyperguild-Mode": "client-local",
    },
},
  • Step 4: Update cmd/hyperguild/README.md

Find the section that mentions "Plan 6 — routing pod not deployed yet" and rewrite that paragraph:

The `routing` entry points at `koala:30310/mcp` (the routing pod, deployed
in Plan 6). The `X-Hyperguild-Mode: client-local` header is forward-compat
for future modes; the pod treats absent or unknown values as `client-local`.
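The fallback rule in that README paragraph (absent or unknown header values are treated as client-local) could look roughly like the sketch below. The plan's cmd/routing/main.go does not read the header at all, so this is purely illustrative: normalizeMode, knownModes, and the constant names are hypothetical, and only the header name and the client-local value come from the plan.

```go
package main

import (
	"fmt"
	"net/http"
)

const (
	modeHeader  = "X-Hyperguild-Mode" // header name from the mode template
	defaultMode = "client-local"      // the only mode that exists today
)

// knownModes lists the values the pod would recognize (hypothetical registry).
var knownModes = map[string]bool{defaultMode: true}

// normalizeMode maps a raw header value to a supported mode, falling back to
// client-local when the value is empty (header absent) or not recognized.
func normalizeMode(v string) string {
	if knownModes[v] {
		return v
	}
	return defaultMode
}

func main() {
	h := http.Header{}
	fmt.Println(normalizeMode(h.Get(modeHeader))) // header absent

	h.Set(modeHeader, "some-future-mode")
	fmt.Println(normalizeMode(h.Get(modeHeader))) // unknown value
}
```

Both calls print client-local. Adding a future mode would then only require a new entry in the map.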
  • Step 5: Run tests + task check
go test ./cmd/hyperguild/... -run TestModeClientLocal -v
task check 2>&1 | tail -10
  • Step 6: Commit
git add cmd/hyperguild/mode.go cmd/hyperguild/mode_test.go cmd/hyperguild/README.md
git commit -m "feat(hyperguild): mode client-local writes routing headers"

Task 10: Dockerfile.routing + CD workflow extension

Worktree: hyperguild

Add a Dockerfile for the routing binary and extend the CD workflow to build + push the image and update the infra repo's routing deployment manifest.

Files:

  • Create: Dockerfile.routing

  • Modify: .gitea/workflows/cd.yml

  • Step 1: Write Dockerfile.routing

# syntax=docker/dockerfile:1

# ── Build stage ───────────────────────────────────────────────────────────────
FROM golang:1.26-bookworm AS builder

ARG VERSION=dev
WORKDIR /src

COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
    -o /out/routing ./cmd/routing

# ── Runtime stage ─────────────────────────────────────────────────────────────
FROM gcr.io/distroless/base-debian12

COPY --from=builder /out/routing /usr/local/bin/routing
COPY config/ /app/config/

ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
ENV ROUTING_PORT=3210

EXPOSE 3210

USER 65532:65532

ENTRYPOINT ["/usr/local/bin/routing"]
  • Step 2: Extend .gitea/workflows/cd.yml

Add an env: entry:

env:
  SERVICE: supervisor
  IMAGE: gitea.d-ma.be/mathias/supervisor
  INGESTION_IMAGE: gitea.d-ma.be/mathias/ingestion
  ROUTING_IMAGE: gitea.d-ma.be/mathias/routing
  INFRA_REPO: git@gitea.d-ma.be:mathias/infra.git
  BUILDKIT_HOST: unix:///run/buildkit/buildkitd.sock

Add a new step after the ingestion build step:

- name: Build and push routing image
  run: |
    set -e
    trap 'rm -f /tmp/routing-image.tar' EXIT
    IMAGE_TAG="${{ github.sha }}"
    echo "Building ${ROUTING_IMAGE}:${IMAGE_TAG}"

    buildctl --addr "${BUILDKIT_HOST}" build \
      --frontend dockerfile.v0 \
      --local context=. \
      --local dockerfile=. \
      --opt filename=Dockerfile.routing \
      --opt build-arg:VERSION="${IMAGE_TAG}" \
      --output type=oci,dest=/tmp/routing-image.tar

    skopeo copy \
      oci-archive:/tmp/routing-image.tar \
      docker://${ROUTING_IMAGE}:${IMAGE_TAG} \
      --dest-creds "${{ secrets.REGISTRY_CREDS }}"

    echo "Built and pushed ${ROUTING_IMAGE}:${IMAGE_TAG}"

In the "Update infra repo" step, add a third sed and update the commit:

sed -i "s|gitea.d-ma.be/mathias/routing:.*|gitea.d-ma.be/mathias/routing:${IMAGE_TAG}|" \
  "k3s/apps/routing/deployment.yaml"

git config user.email "cd-bot@d-ma.be"
git config user.name "CD Bot"
git add "k3s/apps/${SERVICE}/deployment.yaml" \
        "k3s/apps/${SERVICE}/ingestion-deployment.yaml" \
        "k3s/apps/routing/deployment.yaml"
git commit -m "chore(deploy): supervisor+ingestion+routing → ${IMAGE_TAG}"
  • Step 3: Validate the YAML locally
yq eval '.jobs.deploy.steps | length' .gitea/workflows/cd.yml

Expected: a number greater than the original (one new step added).

  • Step 4: Commit

The workflow change is hot: once pushed, CD will try to build the routing image. Until the infra repo has k3s/apps/routing/deployment.yaml, the "Update infra repo" step fails, because sed -i exits non-zero when its target file is missing and the subsequent git add fails on the unknown path. Two options:

Option A (preferred): Land the infra-repo manifests (Tasks 11–12) in the infra worktree FIRST, push them so they exist on infra main, then push this commit. Order: Tasks 11 → 12 → 10.

Option B: Land the workflow change with a guard, then drop the guard once manifests exist.

Implementer should pick Option A. After the manifests are in place:

git add Dockerfile.routing .gitea/workflows/cd.yml
git commit -m "build(routing): Dockerfile + CD workflow"

DO NOT push this commit until Tasks 11 and 12 have been pushed to the infra repo's main.


Task 11: Routing pod manifests (infra worktree)

Worktree: infra

Create the k3s manifests for the routing pod. Mirror the supervisor's structure for operator familiarity.

Files:

  • Create: k3s/apps/routing/namespace.yaml

  • Create: k3s/apps/routing/deployment.yaml

  • Create: k3s/apps/routing/service.yaml

  • Create: k3s/apps/routing/nodeport.yaml

  • Create: k3s/apps/routing/kustomization.yaml

  • Modify: k3s/apps/kustomization.yaml

  • Step 1: namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: routing
  • Step 2: deployment.yaml

The image tag will be bumped by CD; seed it with a placeholder that gets overwritten on first deploy.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: routing
  namespace: routing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: routing
  template:
    metadata:
      labels:
        app: routing
    spec:
      nodeSelector:
        kubernetes.io/hostname: koala
      imagePullSecrets:
        - name: gitea-registry
      containers:
        - name: routing
          image: gitea.d-ma.be/mathias/routing:initial
          ports:
            - containerPort: 3210
          envFrom:
            - secretRef:
                name: routing-secrets
          env:
            - name: ROUTING_PORT
              value: "3210"
            - name: LITELLM_BASE_URL
              value: "http://piguard:4000"
            - name: BRAIN_URL
              value: "http://ingestion.supervisor:3300"
            - name: HYPERGUILD_LOCAL_MODEL
              value: "qwen35"
            - name: HYPERGUILD_CLAUDE_MODEL
              value: "claude-sonnet-4-6"
            - name: HYPERGUILD_ROUTE_LOCAL_FLOOR
              value: "0.90"
            - name: HYPERGUILD_ROUTE_LOCAL_CEIL
              value: "0.70"
            - name: HYPERGUILD_PASS_RATE_TTL_SECONDS
              value: "60"
          readinessProbe:
            httpGet:
              path: /healthz
              port: 3210
            initialDelaySeconds: 2
            periodSeconds: 10

The gitea-registry imagePullSecret needs to exist in the routing namespace. If only present in supervisor, copy it (Step 6 below).

  • Step 3: service.yaml
apiVersion: v1
kind: Service
metadata:
  name: routing
  namespace: routing
spec:
  selector:
    app: routing
  ports:
    - port: 3210
      targetPort: 3210
      protocol: TCP
  • Step 4: nodeport.yaml
apiVersion: v1
kind: Service
metadata:
  name: routing-nodeport
  namespace: routing
spec:
  type: NodePort
  selector:
    app: routing
  ports:
    - port: 3210
      targetPort: 3210
      nodePort: 30310
      protocol: TCP
  • Step 5: kustomization.yaml (inside k3s/apps/routing/)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml
  - nodeport.yaml
  - secrets.enc.yaml

secrets.enc.yaml is added in Task 12; reference it now so the directory is complete.

  • Step 6: Add routing to the apps kustomization.yaml

Modify k3s/apps/kustomization.yaml:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - imagepullsecret
  - registry
  - gitea
  - infra-mcp
  - supervisor
  - cobalt-dingo
  - routing

If imagepullsecret/ only seeds the secret in specific namespaces, ensure routing is added to that list — inspect k3s/apps/imagepullsecret/ and follow the existing pattern.

  • Step 7: Validate manifest syntax with kustomize build
cd ~/Documents/local-dev/AI/infra/.worktrees/mode-2-routing-pod
kustomize build k3s/apps/routing 2>&1 | head -20

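Expected: valid YAML output, no errors. If the build fails because secrets.enc.yaml is referenced but does not exist yet, comment that line out for this local check only and restore it before the Step 8 commit. Task 12 stages only the Secret file, so the committed kustomization.yaml must already reference it.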

  • Step 8: Commit (do NOT push yet)
git add k3s/apps/routing/ k3s/apps/kustomization.yaml
git commit -m "feat(routing): k3s manifests for the new pod"

Push happens after Task 12 (with the encrypted Secret) so the kustomization is consistent on first Flux apply.


Task 12: Routing-secrets Secret + Flux verification

Worktree: infra

Encrypt and add the routing-secrets Secret. The Secret carries LITELLM_API_KEY (reused from supervisor's secret) and optionally a ROUTING_MCP_TOKEN for bearer auth.
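For orientation, the optional-token behavior can be pictured as below. This is a sketch under stated assumptions, not the internal/mcp handler: it assumes an empty token disables the check (matching the "unauthenticated first deploy" option in this task) and that clients send a standard "Authorization: Bearer <token>" header. The function name authorized is hypothetical.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"strings"
)

// authorized reports whether an Authorization header value satisfies token.
// Assumption: an empty token means auth is disabled, so every request passes.
func authorized(authHeader, token string) bool {
	if token == "" {
		return true
	}
	got, ok := strings.CutPrefix(authHeader, "Bearer ")
	if !ok {
		return false
	}
	// Constant-time comparison avoids leaking how many leading bytes matched.
	return subtle.ConstantTimeCompare([]byte(got), []byte(token)) == 1
}

func main() {
	fmt.Println(authorized("", ""))                  // auth disabled
	fmt.Println(authorized("Bearer s3cret", "s3cret")) // matching token
	fmt.Println(authorized("Bearer wrong", "s3cret"))  // mismatched token
}
```

If the real handler rejects an empty token instead of disabling auth, the "leave empty" path in Step 4 would not work as described, so it is worth confirming against internal/mcp before the first deploy.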

Files:

  • Create: k3s/apps/routing/secrets.enc.yaml

  • Step 1: Generate a token (or skip auth for first deploy)

# generate (or omit ROUTING_MCP_TOKEN for unauthenticated first deploy):
openssl rand -hex 32

Record the value; it will be set in the operator's shell env when Mode 2 is exercised in any project.

  • Step 2: Decode the cluster's age key
export SOPS_AGE_KEY="$(kubectl get secret sops-age -n flux-system -o jsonpath='{.data.age\.agekey}' | base64 -d)"
[ -n "$SOPS_AGE_KEY" ] && echo "age key loaded ($(echo -n "$SOPS_AGE_KEY" | wc -c) bytes)" || (echo "FAIL"; exit 1)
  • Step 3: Pull LITELLM_API_KEY value from the supervisor's secret

Decrypt the supervisor's Secret to read the existing value:

LITELLM_API_KEY="$(sops -d k3s/apps/supervisor/secrets.enc.yaml | yq eval '.stringData.DMABE_LLMAPI_KEY' -)"
[ -n "$LITELLM_API_KEY" ] && echo "found litellm key" || (echo "FAIL: empty"; exit 1)

(DMABE_LLMAPI_KEY is the supervisor's name for the LiteLLM key — same value, different env-var name in the consumer.)

  • Step 4: Create the routing Secret
cat > /tmp/routing-secrets.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: routing-secrets
  namespace: routing
type: Opaque
stringData:
  LITELLM_API_KEY: "${LITELLM_API_KEY}"
  ROUTING_MCP_TOKEN: "<paste-token-from-step-1-or-leave-empty>"
EOF

Edit /tmp/routing-secrets.yaml and paste the token (or leave the field as "" for unauthenticated first deploy).

  • Step 5: Encrypt with SOPS
sops --encrypt --age age15xez8pcmgg3daxpuqnye9ewawvzjtallheddcrq88ph573yle3nsr5hdq6 \
  --encrypted-regex '^(stringData|data)$' \
  /tmp/routing-secrets.yaml \
  > k3s/apps/routing/secrets.enc.yaml

rm /tmp/routing-secrets.yaml
unset SOPS_AGE_KEY LITELLM_API_KEY

Verify the file:

head -10 k3s/apps/routing/secrets.enc.yaml

Expected: apiVersion: v1, kind: Secret, stringData: with ENC[...] values.

  • Step 6: kustomize build re-check
kustomize build k3s/apps/routing | head -30

Expected: namespaces, deployment, services, and a Secret with encrypted data fields. Should succeed.

  • Step 7: Commit and push (this is the Flux activation)
git add k3s/apps/routing/secrets.enc.yaml
git commit -m "feat(routing): SOPS-encrypted routing-secrets"
git pull --rebase origin main
git push origin main

git pull --rebase accommodates intervening CD-bot commits on main (per the auth-rollout precedent earlier today).

  • Step 8: Wait for Flux to reconcile
NEW_SHA=$(git rev-parse HEAD)
until kubectl -n flux-system get kustomization apps -o jsonpath='{.status.lastAppliedRevision}' 2>/dev/null | grep -qE "${NEW_SHA:0:7}"; do
  sleep 3
done
echo "Flux applied $NEW_SHA"

The pod will be in ImagePullBackOff because the :initial placeholder image doesn't exist yet — that's expected. The CD workflow (Task 10) will publish the real image and bump the tag.

  • Step 9: Verify expected partial state
kubectl -n routing get all

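Expected: deployment (0/1 ready), its pod, service, nodeport-service. `kubectl get all` does not list the namespace itself; confirm it with `kubectl get ns routing` if needed. The pod stays in ErrImagePull or ImagePullBackOff until Task 10 runs end-to-end.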


Task 13: task smoke:routing live-contract test

Worktree: hyperguild

Boots the routing binary against the real piguard:4000 LiteLLM and the real koala:30330 brain, calls each of the four advertised tools once, and verifies that at least four _routing entries appear in the brain within the last hour.

Files:

  • Create: scripts/smoke-routing.sh

  • Modify: Taskfile.yml

  • Step 1: Write scripts/smoke-routing.sh

#!/usr/bin/env bash
set -euo pipefail

# Boot the routing binary and exercise its four tools against live deps.
# Skipped when LITELLM_BASE_URL or BRAIN_URL is unreachable.

LITELLM_BASE_URL="${LITELLM_BASE_URL:-http://piguard:4000}"
BRAIN_URL="${BRAIN_URL:-http://koala:30330}"

if ! curl -sS --max-time 2 "${LITELLM_BASE_URL}/v1/models" >/dev/null 2>&1; then
  echo "SKIP: LITELLM at ${LITELLM_BASE_URL} unreachable"
  exit 0
fi
if ! curl -sS --max-time 2 "${BRAIN_URL}/query" -X POST -d '{"query":"x","k":1}' -H 'Content-Type: application/json' >/dev/null 2>&1; then
  echo "SKIP: BRAIN at ${BRAIN_URL} unreachable"
  exit 0
fi

PORT=33310
BIN=$(mktemp)
# Stop the routing process first, then remove the temp binary.
trap 'kill "${BIN_PID:-}" 2>/dev/null || true; rm -f "$BIN"' EXIT

go build -o "$BIN" ./cmd/routing

LITELLM_BASE_URL="$LITELLM_BASE_URL" BRAIN_URL="$BRAIN_URL" \
  ROUTING_PORT="$PORT" SUPERVISOR_CONFIG_DIR="$(pwd)/config/supervisor" \
  "$BIN" &
BIN_PID=$!

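# Wait up to ~5s for the binary to bind; fail loudly if it never does.
READY=0
for _ in $(seq 1 50); do
  if curl -sS "http://127.0.0.1:${PORT}/healthz" >/dev/null 2>&1; then READY=1; break; fi
  sleep 0.1
done
[ "$READY" -eq 1 ] || { echo "FAIL: routing binary did not bind on :${PORT}"; exit 1; }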

call_tool() {
  local tool="$1"
  local args="$2"
  curl -sS -X POST "http://127.0.0.1:${PORT}/mcp" \
    -H 'Content-Type: application/json' \
    -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"${tool}\",\"arguments\":${args}}}" \
    | jq -e '.result // .error' > /dev/null
}

echo "calling tools/list..."
curl -sS -X POST "http://127.0.0.1:${PORT}/mcp" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  | jq -r '.result.tools | map(.name) | sort | .[]'

echo "calling each tool..."
call_tool code_review     '{"project_root":"/tmp","files":["README.md"],"session_id":"smoke-1"}'
call_tool debug           '{"project_root":"/tmp","problem":"smoke test","session_id":"smoke-1"}'
call_tool retrospective   '{"project_root":"/tmp","session_id":"smoke-1"}'
call_tool trainer         '{"project_root":"/tmp","session_id":"smoke-1"}'

echo "checking brain has _routing entries..."
sleep 2
COUNT=$(curl -sS "${BRAIN_URL}/pass-rate?skill=_routing&window=1h" | jq -r '.total // 0')
if [ "${COUNT}" -lt 4 ]; then
  echo "FAIL: expected ≥4 _routing entries in last 1h, got ${COUNT}"
  exit 1
fi

echo "PASS: smoke:routing"

Make it executable:

chmod +x scripts/smoke-routing.sh

The exact argument shape per tool may need adjusting to match each skill's required fields. Note that call_tool discards the response body and accepts either .result or .error, so a JSON-RPC error such as "missing required argument" will not fail the script; to see it, temporarily drop the > /dev/null redirect. Then read the failing tool's Tools() definition in internal/skills/<skill>/skill.go and add the required field with a stub value.

  • Step 2: Add the Taskfile target

In Taskfile.yml, append to the tasks: map:

  smoke:routing:
    desc: Boot the routing pod against live LiteLLM + brain and verify _routing logs land
    cmds:
      - bash scripts/smoke-routing.sh
  • Step 3: Run it
task smoke:routing

Expected: SKIP if offline; PASS otherwise.

  • Step 4: Commit
git add scripts/smoke-routing.sh Taskfile.yml
git commit -m "test(routing): live-contract smoke target"

Task 14: Documentation updates

Worktree: hyperguild

Update the project-level docs to describe Mode 2, the new env vars, and the routing-pod URL.

Files:

  • Modify: README.md

  • Modify: .context/PROJECT.md

  • Step 1: Update README.md's "Key env vars" table

Append:

| `ROUTING_PORT` | `3210` | Routing pod's listen port |
| `ROUTING_MCP_TOKEN` | — | Optional bearer token for the routing MCP HTTP endpoint |
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | Routing pod → brain (in-cluster) |
| `HYPERGUILD_LOCAL_MODEL` | `qwen35` | Local model for routed-to-local skill calls |
| `HYPERGUILD_CLAUDE_MODEL` | `claude-sonnet-4-6` | Claude model for routed-to-Claude skill calls |
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | Pass rate at or above this value routes to local |
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Pass rate below this value routes to Claude. Rates from CEIL up to (but not including) FLOOR fall in the sample band. |
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill pass-rate cache TTL |
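
To make the two thresholds concrete, here is a small sketch of the banding they describe, using the defaults from the table. It is only an illustration in shell: the real policy lives in the Go `internal/routing` package, and `route_for` is a hypothetical name. What the router does inside the sample band is not specified in this section, so the sketch just reports the band.

```shell
# route_for PASS_RATE
# Hypothetical illustration of the threshold bands; not the Go implementation.
# Prints "local", "claude", or "sample" for a pass rate in [0, 1].
route_for() {
  awk -v rate="$1" \
      -v floor="${HYPERGUILD_ROUTE_LOCAL_FLOOR:-0.90}" \
      -v ceil="${HYPERGUILD_ROUTE_LOCAL_CEIL:-0.70}" '
    BEGIN {
      if (rate + 0 >= floor + 0)     print "local"    # at or above FLOOR
      else if (rate + 0 < ceil + 0)  print "claude"   # below CEIL
      else                           print "sample"   # CEIL <= rate < FLOOR
    }'
}

route_for 0.95   # prints: local
route_for 0.80   # prints: sample
route_for 0.50   # prints: claude
```

Note the naming: FLOOR is the higher number because it is the floor for always-local routing, while CEIL is the ceiling for always-Claude routing.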

In the architecture diagram block at the top of the README, add the routing pod:

Your Claude Code session (in any project)
    │
    │  MCP over HTTP (Tailscale)
    ├──▶ supervisor  :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
    ├──▶ routing     :3210 (NodePort 30310 on koala) — Mode 2 only: code_review, debug, retrospective, trainer
    └──▶ brain       :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
  • Step 2: Update .context/PROJECT.md

Find the "MCP endpoints" section and add a third bullet:

- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
  the same four cost-routable skills as the supervisor (`code_review`,
  `debug`, `retrospective`, `trainer`) but per-call decides whether to use
  a local model or Claude based on the brain's `/pass-rate` response.
  Bearer auth via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local`
  registers this endpoint; Mode 1 and Mode 3 do not.
  • Step 3: Run task context:sync so derived adapters update
task context:sync

This regenerates CLAUDE.md, AGENTS.md, .cursorrules, .aider.conventions.md, and .context/system-prompt.txt from the canonical sources.

  • Step 4: task check
task check 2>&1 | tail -10

Expected: drift check green (regenerated adapters tracked).

  • Step 5: Commit
git add README.md .context/PROJECT.md CLAUDE.md AGENTS.md .cursorrules .aider.conventions.md .context/system-prompt.txt
git commit -m "docs(routing): document Mode 2 routing pod + env vars"

Final verification before merge

After all 14 tasks land, on the hyperguild worktree's branch:

  • Run the full check chain
cd ~/Documents/local-dev/AI/hyperguild/.worktrees/mode-2-routing-pod
task check 2>&1 | tail -15

Expected: 0 issues across lint, test, vet, drift, govulncheck.

  • Run smoke test if Tailscale available
task smoke:routing

Expected: PASS or SKIP (with a clear reason).

  • Verify the snapshot test still passes

The skill packages can drift between when the snapshot was captured and merge time. Re-run:

go test ./internal/routing/... -run TestToolsListMatchesSupervisorSnapshot -v

If it fails because of an intentional schema change in the merge window, re-capture the snapshot per Task 7's Step 1 and commit the update with a clear message.

  • Push the hyperguild branch and merge
git push -u origin feat/mode-2-routing-pod

Open a PR (or merge to main if the workflow allows direct push). Once merged, gitea CI builds the routing image and CD pushes the image-tag bump to the infra repo.

  • Verify Flux applies the new image and the pod becomes Ready
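# Fetch first: after a PR merge the local main ref may be stale.
git -C ~/Documents/local-dev/AI/hyperguild fetch origin main
NEW_SHA=$(git -C ~/Documents/local-dev/AI/hyperguild rev-parse origin/main)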
echo "Watching for image tag $NEW_SHA on routing deployment..."
until kubectl -n routing get deployment routing -o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null | grep -qE "${NEW_SHA:0:7}"; do
  sleep 5
done
kubectl -n routing rollout status deployment/routing --timeout=120s

Expected: deployment becomes 1/1 Ready with the new image.

If the pod stays Pending or ImagePullBackOff past 2 minutes, check:

kubectl -n routing describe pod -l app=routing | tail -30
kubectl -n routing logs -l app=routing --tail=50
  • Final live verification
# tools/list should return 4 tools
curl -sS -X POST http://koala:30310/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  | jq '.result.tools | length'
# expected: 4

# auth check (only meaningful if ROUTING_MCP_TOKEN is set on the pod)
curl -isS -X POST http://koala:30310/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | head -1
# expected: 401 if token set, 200 otherwise
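
The auth check above only exercises the unauthenticated path. When a token is set, the authenticated counterpart is worth confirming too. The sketch below assumes the bearer-auth handler reads the standard `Authorization: Bearer <token>` header; `mcp_status` is a hypothetical helper that prints just the HTTP status code.

```shell
# mcp_status URL [TOKEN]
# Hypothetical helper: sends tools/list and prints only the HTTP status code.
# Assumes the pod's bearer-auth handler reads "Authorization: Bearer <token>".
mcp_status() {
  local url="$1" token="${2:-}"
  # Rebuild the positional parameters as the curl argument list.
  set -- -sS -o /dev/null -w '%{http_code}\n' -X POST "$url" \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
  if [ -n "$token" ]; then
    set -- "$@" -H "Authorization: Bearer ${token}"
  fi
  curl "$@"
}

# Usage:
#   mcp_status http://koala:30310/mcp                         # 401 if a token is set on the pod
#   mcp_status http://koala:30310/mcp "$ROUTING_MCP_TOKEN"    # 200 expected with the right token
```

Seeing 401 without the token and 200 with it confirms both that auth is enforced and that the token in the Secret matches the one the client holds.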
  • Restart pod the Flux-friendly way if needed

For any post-merge restart that doesn't ride a fresh image bump, use kubectl delete pod, not kubectl rollout restart (Flux strips the restart annotation that rollout restart adds to the pod template):

kubectl -n routing delete pod -l app=routing

The existing ReplicaSet recreates the pod, picking up any Secret data changes on startup.