docs: update CLAUDE.md and DECISIONS.md for completed 7-plan migration
Reflects Plan 7 (supervisor retirement) and brain_answer/brain_classify addition. Supervisor MCP endpoint removed; brain now exposes HTTPS domain with Dex JWT auth. Routing decisions documented for LLM berget→iguana chain. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
37
CLAUDE.md
37
CLAUDE.md
@@ -47,31 +47,28 @@
|
||||
|
||||
## MCP endpoints
|
||||
|
||||
Two MCP servers expose this project's tooling, both reachable over Tailscale:
|
||||
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
|
||||
|
||||
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
|
||||
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
|
||||
by the ingestion service directly.
|
||||
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
|
||||
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
|
||||
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
|
||||
migration.
|
||||
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
|
||||
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
|
||||
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
|
||||
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
the same four cost-routable skills as the supervisor (`review`, `debug`,
|
||||
`retrospective`, `trainer`) but per-call decides whether to use a local
|
||||
model or Claude based on the brain's `/pass-rate` response. Bearer auth
|
||||
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
|
||||
endpoint; Mode 1 and Mode 3 do not.
|
||||
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
|
||||
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
|
||||
(opt-in). Only `mode client-local` registers this endpoint.
|
||||
|
||||
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
|
||||
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
|
||||
the routing pod; brain tools moved to the brain MCP.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
|
||||
shell scripts and non-MCP clients.
|
||||
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
|
||||
for shell scripts and non-MCP clients.
|
||||
|
||||
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
|
||||
endpoint that aggregates `final_status` counts from session logs and returns
|
||||
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
|
||||
reads this to decide whether to route skill calls to local models. Pass rate
|
||||
is `null` when no logged invocations are in the window.
|
||||
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
|
||||
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
|
||||
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
|
||||
|
||||
## Agent instructions
|
||||
|
||||
|
||||
49
DECISIONS.md
49
DECISIONS.md
@@ -72,23 +72,42 @@ Record *why* things are the way they are. Future-you will thank present-you.
|
||||
Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of
|
||||
the four cost-routable skill packages. The routing pod constructs each
|
||||
skill via `<pkg>.New(Config{...})` and hands it `routing.Router.Run` as
|
||||
the `CompleteFunc`. Plan 7 (supervisor retirement) MUST NOT delete the
|
||||
four packages.
|
||||
the `CompleteFunc`.
|
||||
|
||||
**Plan 7's allowed deletions:**
|
||||
- `internal/skills/{tdd,spec,tier}/` (not consumed by the routing pod)
|
||||
- `cmd/supervisor/` (binary)
|
||||
- `Dockerfile` (supervisor's, at repo root — distinct from `Dockerfile.routing`)
|
||||
- supervisor manifests in the infra repo
|
||||
- NodePort `:30320`
|
||||
|
||||
**Plan 7's preserved code:**
|
||||
**Preserved code (do not delete):**
|
||||
- `internal/skills/{review,debug,retrospective,trainer}/`
|
||||
- `internal/registry`
|
||||
- `internal/mcp`
|
||||
- `internal/exec/litellm.go`
|
||||
- `internal/routing/` (entirely new in Plan 6)
|
||||
- `cmd/routing/`
|
||||
- `internal/registry`, `internal/mcp`, `internal/exec/litellm.go`
|
||||
- `internal/routing/`, `cmd/routing/`
|
||||
|
||||
---
|
||||
|
||||
## Plan 7: supervisor pod retired (2026-05-12)
|
||||
|
||||
**What was deleted:** `cmd/supervisor/`, `internal/skills/{tdd,spec}/`,
|
||||
root `Dockerfile`, supervisor k8s manifests (Deployment, Service, Ingress,
|
||||
NodePort 30320), `supervisor` entry removed from all `.mcp.json` configs.
|
||||
|
||||
**Coverage:** `tdd`/`spec` → SKILL.md files in `~/dev/.skills/`; `review`,
|
||||
`debug`, `retrospective`, `trainer` → routing pod; `brain_*`/`session_log` →
|
||||
brain MCP; `tier` → `hyperguild tier` CLI.
|
||||
|
||||
---
|
||||
|
||||
## 2026-05-12 — brain_answer and brain_classify: LLM routing via berget.ai → iguana
|
||||
|
||||
**Context:** Brain MCP returned raw BM25 excerpts with no synthesis. Adding
|
||||
LLM-backed tools enables Q&A and ingestion enrichment without a separate service.
|
||||
|
||||
**Decision:** Two new MCP tools in the ingestion service (`ingestion/internal/mcp/`):
|
||||
- `brain_answer(query)` — BM25 top-10 → LLM synthesis → answer + sources
|
||||
- `brain_classify(text)` — LLM classifies doc into type/title/tags
|
||||
|
||||
Primary LLM: berget.ai `gemma4:31b` (EU cloud, spend tokens while available).
|
||||
Fallback: iguana `gemma4:31b` (local Ollama). Reranker deferred to follow-up.
|
||||
Router lives in `ingestion/internal/llm.Router`; opt-in via `BRAIN_LLM_PRIMARY_URL`.
|
||||
|
||||
**Consequences:** Brain becomes a knowledge assistant, not just a search index.
|
||||
When berget.ai tokens run out, flip `BRAIN_LLM_PRIMARY_URL` to iguana.
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user