docs: update CLAUDE.md and DECISIONS.md for completed 7-plan migration
Some checks failed
CI / Mirror to GitHub (push) Has been cancelled
CI / Lint / Test / Vet (push) Has been cancelled

Reflects Plan 7 (supervisor retirement) and brain_answer/brain_classify
addition. Supervisor MCP endpoint removed; brain now exposes HTTPS domain
with Dex JWT auth. Routing decisions documented for LLM berget→iguana chain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Mathias Bergqvist
2026-05-12 14:53:08 +02:00
parent 46adaf2148
commit 87cf9d0afc
2 changed files with 51 additions and 35 deletions

View File

@@ -47,31 +47,28 @@
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
the same four cost-routable skills as the supervisor (`review`, `debug`,
`retrospective`, `trainer`) but per-call decides whether to use a local
model or Claude based on the brain's `/pass-rate` response. Bearer auth
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
endpoint; Mode 1 and Mode 3 do not.
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
endpoint that aggregates `final_status` counts from session logs and returns
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
reads this to decide whether to route skill calls to local models. Pass rate
is `null` when no logged invocations are in the window.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions

View File

@@ -72,23 +72,42 @@ Record *why* things are the way they are. Future-you will thank present-you.
Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of
the four cost-routable skill packages. The routing pod constructs each
skill via `<pkg>.New(Config{...})` and hands it `routing.Router.Run` as
the `CompleteFunc`. Plan 7 (supervisor retirement) MUST NOT delete the
four packages.
the `CompleteFunc`.
**Plan 7's allowed deletions:**
- `internal/skills/{tdd,spec,tier}/` (not consumed by the routing pod)
- `cmd/supervisor/` (binary)
- `Dockerfile` (supervisor's, at repo root — distinct from `Dockerfile.routing`)
- supervisor manifests in the infra repo
- NodePort `:30320`
**Plan 7's preserved code:**
**Preserved code (do not delete):**
- `internal/skills/{review,debug,retrospective,trainer}/`
- `internal/registry`
- `internal/mcp`
- `internal/exec/litellm.go`
- `internal/routing/` (entirely new in Plan 6)
- `cmd/routing/`
- `internal/registry`, `internal/mcp`, `internal/exec/litellm.go`
- `internal/routing/`, `cmd/routing/`
---
## Plan 7: supervisor pod retired (2026-05-12)
**What was deleted:** `cmd/supervisor/`, `internal/skills/{tdd,spec}/`,
root `Dockerfile`, supervisor k8s manifests (Deployment, Service, Ingress,
NodePort 30320), `supervisor` entry removed from all `.mcp.json` configs.
**Coverage:** `tdd`/`spec` → SKILL.md files in `~/dev/.skills/`; `review`,
`debug`, `retrospective`, `trainer` → routing pod; `brain_*`/`session_log`
brain MCP; `tier``hyperguild tier` CLI.
---
## 2026-05-12 — brain_answer and brain_classify: LLM routing via berget.ai → iguana
**Context:** Brain MCP returned raw BM25 excerpts with no synthesis. Adding
LLM-backed tools enables Q&A and ingestion enrichment without a separate service.
**Decision:** Two new MCP tools in the ingestion service (`ingestion/internal/mcp/`):
- `brain_answer(query)` — BM25 top-10 → LLM synthesis → answer + sources
- `brain_classify(text)` — LLM classifies doc into type/title/tags
Primary LLM: berget.ai `gemma4:31b` (EU cloud, spend tokens while available).
Fallback: iguana `gemma4:31b` (local Ollama). Reranker deferred to follow-up.
Router lives in `ingestion/internal/llm.Router`; opt-in via `BRAIN_LLM_PRIMARY_URL`.
**Consequences:** Brain becomes a knowledge assistant, not just a search index.
When berget.ai tokens run out, flip `BRAIN_LLM_PRIMARY_URL` to iguana.
---