19 Commits

Author SHA1 Message Date
Mathias
a220fcaf2b feat(routing): create GitHub destination repo before configuring push-mirror
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Has been skipped
Gitea's push-mirror cannot push to a non-existent remote — it just
runs 'git push' against whatever URL it's given. So a project_create
flow that only configures the mirror leaves the GitHub side as an
unfulfillable URL.

New internal/githubclient package: single-purpose client that POSTs
/user/repos to create an empty private repo (auto_init=false so the
first mirror push doesn't conflict with a generated README). Treats
422 'name already exists' as idempotent success via ErrAlreadyExists.
401/403 are surfaced as 'PAT missing repo scope or invalid' so the
operator sees the real cause instead of a vague upstream error.

Skill wiring:
- New stepCreateGitHub between stepCreateRepo and stepMirror in the
  orchestrator.
- Skipped entirely when Config.GitHub is nil (degraded mode — the
  routing pod runs without GITHUB_PAT, mirror config still lands,
  but the actual sync to github fails until the repo exists).
- cmd/routing/main.go constructs githubclient.New(GitHubPAT) only
  when the PAT is set; the skill receives nil otherwise.
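
The wiring above can be sketched as follows, assuming a hypothetical RepoCreator interface, a stub patClient, and helper names newConfig and planSteps that are not taken from the repository. Only the step names and the "nil means degraded mode" rule come from the commit message.

```go
package main

// RepoCreator is a hypothetical interface over the GitHub client.
type RepoCreator interface {
	CreateRepo(name string) error
}

// patClient is a stub standing in for the real GitHub client.
type patClient struct{ pat string }

// CreateRepo is a no-op in this sketch.
func (c *patClient) CreateRepo(name string) error { return nil }

// Config holds the skill's dependencies; GitHub stays nil in degraded mode.
type Config struct {
	GitHub RepoCreator
}

// newConfig builds the client only when a PAT is present. The field is left
// unset otherwise, so the interface value is truly nil (assigning a nil
// *patClient would produce a non-nil interface).
func newConfig(pat string) Config {
	if pat == "" {
		return Config{}
	}
	return Config{GitHub: &patClient{pat: pat}}
}

// planSteps returns the ordered step names, omitting create_github_repo
// when no GitHub client is configured.
func planSteps(cfg Config) []string {
	steps := []string{"create_repo"}
	if cfg.GitHub != nil {
		steps = append(steps, "create_github_repo")
	}
	return append(steps, "mirror", "infra_commit", "issue")
}
```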

Tests:
- happy path: fake github 201 + assertions that the 'reached' array
  is [create_repo, create_github_repo, mirror, infra_commit, issue].
- github 422 already-exists: idempotent, all gitea steps still run.
- github 401: returns failed_step=create_github_repo, no mirror or
  later steps.
- degraded mode (Config.GitHub nil): reached omits create_github_repo,
  rest of the flow runs unchanged.

Updated existing tests to read [skill, gh] from newSkill instead of
just skill, and adjusted reached-array expectations to include the
new step.

Tracks #10.
2026-05-18 13:42:03 +02:00
Mathias
d1c8e3396f fix(cd): drop retired supervisor build, add routing rollout verification
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Plan 7 (2026-05-12) retired the supervisor pod, deleted cmd/supervisor/
and the root Dockerfile, but cd.yml still tried to:

- buildctl a supervisor image using the (non-existent) root Dockerfile
- sed gitea.d-ma.be/mathias/supervisor: in k3s/apps/supervisor/deployment.yaml
  (also non-existent — k3s/apps/supervisor/ only ships ingestion-* files now)
- wait for and rollout-verify a supervisor Deployment that no longer exists

Result: every CD run since the retirement has been failing at 'Build and push
supervisor image', leaving ingestion + routing un-deployed despite the binaries
being built. The routing pod was last deployed at sha 189ff89c (committed
2026-05-12, so about six days stale).

This commit:
- Removes the supervisor build step and supervisor sed/git add lines.
- Adds 'Wait for Flux to apply new routing image' + 'Verify routing rollout'
  steps that mirror the ingestion equivalents, so failures land loudly rather
  than 5 min later when something tries to call the new tool.
- Updates the chore(deploy) commit message to 'ingestion+routing' to match
  reality.

Unblocks deployment of feat: project_create (#10).
2026-05-18 11:48:57 +02:00
Mathias
3b79311fdd feat(routing): project_create MCP tool — gitea-first new-project pipeline (#10)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
Adds the project_create tool to the routing pod that automates the
"new project" bootstrap end-to-end from claude.ai. Gitea-first
architecture: GitHub receives the repo only via push-mirror, never
via a direct GitHub API call from this server.

Four sequential calls to the gitea-mcp server (configured via
GITEA_MCP_URL):

  1. create_project_from_template — Gitea repo from
     template-go-{agent,web} per the 'stack' arg
  2. repo_mirror_push (action=add) — push-mirror to
     github.com/<GITHUB_OWNER>/<name>.git, interval 8h, sync_on_commit
  3. file_write_branch — k3s/staging/<name>/namespace.yaml committed
     on a staging/<name> branch in the infra repo
  4. issue_create — experiment brief (hypothesis + description + stack
     + provisioning log) on the new repo, returns the issue_url

Returns gitea_url, github_url, issue_url, next_steps. The next_steps
string is the exact shell sequence the operator runs locally to
clone, scaffold via local-dev 'task new-project', and push.

Idempotency: create_project_from_template + repo_mirror_push +
file_write_branch all return JSON-RPC code -32003 (Conflict) when
their target already exists; the orchestrator swallows the conflict
and continues. Re-running on an existing repo restates the brief in
a fresh issue.

Error handling: on any non-conflict downstream failure the response
returns {reached: ["<step>",...], failed_step: "<step>"} alongside
a JSON-RPC error. No rollback — partial state stays so the operator
can resume manually.

New env vars (all optional except GITEA_MCP_URL):
  GITEA_MCP_URL    enables the tool
  GITEA_MCP_TOKEN  bearer auth for gitea-mcp
  GITEA_OWNER      default mathias
  GITHUB_OWNER     default mathiasb
  INFRA_REPO       default infra
  GITHUB_PAT       repo scope, used as mirror remote_password; never logged

Without GITEA_MCP_URL set, the tool is not registered and the
routing pod starts normally (degrades open).

internal/mcpclient/: new minimal JSON-RPC tools/call client with
bearer auth, used by project_create. Unwraps MCP's
content[0].text envelope and surfaces typed errors via mcpclient.Error.

Tests: table-driven against an httptest fake gitea-mcp covering happy
path (4-step success + correct PATCH-style arg shapes), idempotent
repo-exists, mirror failure (partial-success response with reached=
[create_repo] + failed_step=mirror), infra-commit failure (reached up
to mirror + failed_step=infra_commit), and validation errors.

Closes #10
2026-05-18 11:44:39 +02:00
Mathias
7baf8d7e7a chore: re-sync context adapters from updated root AGENT.md
2026-05-18 11:44:02 +02:00
Mathias Bergqvist
a8de04c7b6 docs: update canonical PROJECT.md for completed 7-plan migration
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Updates MCP endpoints section: supervisor retired, brain gets HTTPS
domain + Dex JWT auth + brain_answer/brain_classify. Regenerate all
derived adapter files via context:sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:53:46 +02:00
Mathias Bergqvist
87cf9d0afc docs: update CLAUDE.md and DECISIONS.md for completed 7-plan migration
Some checks failed
CI / Mirror to GitHub (push) Has been cancelled
CI / Lint / Test / Vet (push) Has been cancelled
Reflects Plan 7 (supervisor retirement) and brain_answer/brain_classify
addition. Supervisor MCP endpoint removed; brain now exposes HTTPS domain
with Dex JWT auth. Routing decisions documented for LLM berget→iguana chain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:53:08 +02:00
Mathias Bergqvist
46adaf2148 chore(mcp): remove supervisor entry from .mcp.json
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
2026-05-12 14:49:46 +02:00
Mathias Bergqvist
c11763472c feat(plan7): retire supervisor pod — delete cmd/supervisor, tdd/spec skills, Dockerfile
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Removes the supervisor binary and its two exclusive skill packages (tdd,
spec) now that all functionality is covered by SKILL.md files, the routing
pod, and the brain MCP. Routing pod reuses review/debug/retrospective/trainer
skill packages which are intentionally preserved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 12:18:30 +02:00
Mathias Bergqvist
189ff89c34 feat(brain): add brain_answer and brain_classify MCP tools
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Adds two new LLM-backed MCP tools to the ingestion service:

- brain_answer(query): BM25 retrieval + LLM synthesis → answer + sources
- brain_classify(text): classifies doc into type/title/tags via LLM

Adds llm.Router for primary→fallback routing (berget.ai → iguana).
Wired via BRAIN_LLM_PRIMARY_URL/BRAIN_LLM_FALLBACK_URL env vars;
no-op when unset so existing deployments are unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 11:06:17 +02:00
Mathias Bergqvist
c7e0192486 feat(auth): add Dex JWT middleware to supervisor, routing pod, and brain MCP
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 3s
Closes #6 on gitea.d-ma.be/mathias/hyperguild.

Dex is deployed at auth.d-ma.be. All three MCP servers now accept JWTs
issued by Dex in addition to static bearer tokens, enabling claude.ai
OAuth 2.0 integration without abandoning backward-compat CLI auth.

Changes:
- internal/auth/: new Validator (JWKS auto-refresh via lestrrat-go/jwx/v2),
  ProtectedResourceHandler (RFC 9728 /.well-known/oauth-protected-resource)
- internal/mcp/Server: adds optional *auth.Validator; checkAuth tries JWT
  first, then static token fallback; both-nil = auth disabled (unchanged default)
- cmd/supervisor, cmd/routing: construct Validator from DEX_ISSUER_URL +
  MCP_AUDIENCE env vars; register protected-resource handler when set
- ingestion/internal/auth/: same Validator + handler (separate module)
- ingestion/internal/mcp/BearerAuth: same JWT-or-static chain
- ingestion/cmd/server: same wiring pattern
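
The JWT-or-static chain can be illustrated with a small sketch. TokenValidator, validatorFunc, and errInvalidToken are hypothetical stand-ins (the real code uses *auth.Validator backed by lestrrat-go/jwx/v2), and the constant-time comparison is a choice made here, not something the commit states.

```go
package main

import (
	"crypto/subtle"
	"errors"
)

// errInvalidToken is a placeholder error for rejected JWTs.
var errInvalidToken = errors.New("invalid token")

// TokenValidator is a hypothetical stand-in for *auth.Validator.
type TokenValidator interface {
	Validate(token string) error
}

// validatorFunc adapts a plain function to TokenValidator.
type validatorFunc func(string) error

func (f validatorFunc) Validate(token string) error { return f(token) }

// checkAuth reports whether the presented bearer token is acceptable.
// Order: auth disabled when nothing is configured, then JWT, then the
// static token as a fallback.
func checkAuth(v TokenValidator, static, presented string) bool {
	if v == nil && static == "" {
		return true // both unset: auth disabled, the unchanged default
	}
	if v != nil && presented != "" && v.Validate(presented) == nil {
		return true
	}
	if static != "" {
		return subtle.ConstantTimeCompare([]byte(presented), []byte(static)) == 1
	}
	return false
}
```

Note that v must be a truly nil interface for the disabled case; wrapping a nil pointer in the interface would make the v == nil check fail.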

New env vars (all optional; absent = static-token-only, same as before):
  DEX_ISSUER_URL   — Dex issuer URL (e.g. https://auth.d-ma.be)
  MCP_AUDIENCE     — expected aud claim (e.g. brain, supervisor)
  MCP_RESOURCE_URL — resource identifier for RFC 9728 metadata response

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 20:10:05 +02:00
1c3c9de550 Merge pull request 'refactor(routing): rename local/claude to fast/thinking model pair' (#4) from agent/thinking-fast-routing into main
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
2026-05-08 14:43:29 +00:00
d0edc1a725 Merge pull request 'chore(mcp): switch MCP endpoints to HTTPS domain URLs' (#3) from agent/mcp-domain-urls into main
Some checks failed
CI / Lint / Test / Vet (push) Has been cancelled
CI / Mirror to GitHub (push) Has been cancelled
2026-05-08 14:43:18 +00:00
Mathias Bergqvist
5b207425ed refactor(routing): rename local/claude to fast/thinking model pair
All checks were successful
CI / Lint / Test / Vet (pull_request) Successful in 10s
CI / Mirror to GitHub (pull_request) Has been skipped
The routing decision is about reasoning capacity, not cost or provider.
Fast model (koala/qwen35-9b-fast) handles high-pass-rate calls; thinking
model (iguana/gemma4-26b) handles low-pass-rate calls. Removes the
implicit Anthropic dependency from the routing pod — both models go
through LiteLLM.

Renames: HYPERGUILD_LOCAL_MODEL → HYPERGUILD_FAST_MODEL,
HYPERGUILD_CLAUDE_MODEL → HYPERGUILD_THINKING_MODEL,
Router.LocalModel → FastModel, Router.ClaudeModel → ThinkingModel,
log decision "claude_fallback" → "thinking_fallback".
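
The fast/thinking decision might look roughly like the sketch below. ModelRouter stands in for the real Router; the Threshold field, the Pick method, the "fast" decision label, and the choice to send unknown pass rates to the thinking model are all assumptions. FastModel, ThinkingModel, and the "thinking_fallback" label come from the commit message.

```go
package main

// ModelRouter is a hypothetical stand-in for the routing pod's Router.
type ModelRouter struct {
	FastModel     string  // e.g. value of HYPERGUILD_FAST_MODEL
	ThinkingModel string  // e.g. value of HYPERGUILD_THINKING_MODEL
	Threshold     float64 // assumed cut-off; not specified in the commit
}

// Pick chooses a model from the skill's pass rate. A nil pass rate means
// no logged invocations; this sketch then prefers the thinking model.
func (r ModelRouter) Pick(passRate *float64) (model, decision string) {
	if passRate != nil && *passRate >= r.Threshold {
		return r.FastModel, "fast"
	}
	return r.ThinkingModel, "thinking_fallback"
}
```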

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 16:39:42 +02:00
Mathias Bergqvist
cb51ff7ba1 chore(mcp): switch MCP endpoints to HTTPS domain URLs
All checks were successful
CI / Lint / Test / Vet (pull_request) Successful in 10s
CI / Mirror to GitHub (pull_request) Has been skipped
Brain and supervisor now behind NPM with Let's Encrypt. Use canonical
hostnames (brain-mcp.d-ma.be, supervisor-mcp.d-ma.be) over NodePorts so
connections work across networks without Tailscale for DNS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 14:10:25 +02:00
Mathias Bergqvist
43a8255272 fix(mcp): add SSE GET handler for streamable HTTP transport
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
claude.ai probes with GET before initialize; without this the supervisor
returned application/json parse error instead of text/event-stream, causing
"Couldn't reach the MCP server" in the claude.ai connector setup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 23:27:56 +02:00
Mathias Bergqvist
78be3d1f9c fix(ingestion): support GET/SSE on /mcp endpoint for claude.ai compatibility
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
2026-05-07 23:20:47 +02:00
Mathias Bergqvist
7139a3ca74 ci: add environment gate and Flux rollout verification to cd pipeline
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Aligns hyperguild's cd.yml with the cobalt-dingo reference pattern:
- Add environment: staging to the deploy job
- Add Flux reconcile trigger after infra repo push
- Add polling wait for supervisor and ingestion image tags to propagate
- Add rollout status verification for both deployments with failure
  diagnostics (pod status, events, describe)
2026-05-07 21:52:52 +02:00
Mathias Bergqvist
c509ae2a5f refactor(ingestion): use strings.CutPrefix for explicit Bearer scheme check
2026-05-07 21:02:14 +02:00
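
A small illustration of an explicit Bearer scheme check with strings.CutPrefix (available since Go 1.20). The function name bearerToken is hypothetical, and the strict case-sensitive match on "Bearer " is an assumption about what the refactor does.

```go
package main

import "strings"

// bearerToken extracts the token from an Authorization header value.
// It returns false unless the value starts with "Bearer " and carries
// a non-empty token.
func bearerToken(header string) (string, bool) {
	token, ok := strings.CutPrefix(header, "Bearer ")
	if !ok || token == "" {
		return "", false
	}
	return token, true
}
```

Compared with strings.TrimPrefix, CutPrefix reports whether the prefix was actually present, so a header such as "Basic abc" is rejected instead of being passed through unchanged.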
Mathias Bergqvist
228ee57d4c feat(ingestion): add bearer token auth middleware for MCP endpoint
2026-05-07 20:58:16 +02:00
58 changed files with 3135 additions and 1120 deletions


@@ -36,6 +36,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -46,9 +58,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -58,9 +71,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -68,7 +84,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -100,18 +116,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -218,31 +280,28 @@ Key skills:
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
the same four cost-routable skills as the supervisor (`review`, `debug`,
`retrospective`, `trainer`) but per-call decides whether to use a local
model or Claude based on the brain's `/pass-rate` response. Bearer auth
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
endpoint; Mode 1 and Mode 3 do not.
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
endpoint that aggregates `final_status` counts from session logs and returns
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
reads this to decide whether to route skill calls to local models. Pass rate
is `null` when no logged invocations are in the window.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -47,31 +47,28 @@
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
the same four cost-routable skills as the supervisor (`review`, `debug`,
`retrospective`, `trainer`) but per-call decides whether to use a local
model or Claude based on the brain's `/pass-rate` response. Bearer auth
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
endpoint; Mode 1 and Mode 3 do not.
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
endpoint that aggregates `final_status` counts from session logs and returns
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
reads this to decide whether to route skill calls to local models. Pass rate
is `null` when no logged invocations are in the window.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -41,6 +41,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -51,9 +63,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -63,9 +76,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -73,7 +89,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -105,18 +121,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -223,31 +285,28 @@ Key skills:
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
the same four cost-routable skills as the supervisor (`review`, `debug`,
`retrospective`, `trainer`) but per-call decides whether to use a local
model or Claude based on the brain's `/pass-rate` response. Bearer auth
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
endpoint; Mode 1 and Mode 3 do not.
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
endpoint that aggregates `final_status` counts from session logs and returns
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
reads this to decide whether to route skill calls to local models. Pass rate
is `null` when no logged invocations are in the window.
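As a rough illustration of how a consumer might read the `/pass-rate` response, the sketch below decodes the documented fields and applies a threshold. The field types, the `"7d"` window format, the `0.8` threshold, and the `useLocal` policy are all assumptions made for the example; the routing pod's actual decision logic is not shown in this diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// passRate mirrors the documented response fields. PassRate is a pointer
// because the endpoint returns null when the window has no logged invocations.
// Assumption: window is a string and the counts are integers.
type passRate struct {
	Skill    string   `json:"skill"`
	Window   string   `json:"window"`
	Pass     int      `json:"pass"`
	Fail     int      `json:"fail"`
	Skip     int      `json:"skip"`
	Total    int      `json:"total"`
	PassRate *float64 `json:"pass_rate"`
}

// parsePassRate decodes one /pass-rate response body.
func parsePassRate(body []byte) (passRate, error) {
	var pr passRate
	if err := json.Unmarshal(body, &pr); err != nil {
		return passRate{}, fmt.Errorf("decode pass-rate: %w", err)
	}
	return pr, nil
}

// useLocal is a hypothetical policy: route to the local model only when a
// pass rate exists and meets the threshold; a null rate means "no evidence".
func useLocal(pr passRate, threshold float64) bool {
	if pr.PassRate == nil {
		return false
	}
	return *pr.PassRate >= threshold
}

func main() {
	samples := []string{
		`{"skill":"review","window":"7d","pass":9,"fail":1,"skip":0,"total":10,"pass_rate":0.9}`,
		`{"skill":"debug","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`,
	}
	for _, s := range samples {
		pr, err := parsePassRate([]byte(s))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: local=%v\n", pr.Skill, useLocal(pr, 0.8))
	}
}
```

Using a pointer for `pass_rate` keeps "no data" distinct from a genuine rate of zero, which matters if an empty window should fall back to Claude rather than count as a failing skill.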
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -39,6 +39,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -49,9 +61,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -61,9 +74,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -71,7 +87,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -103,18 +119,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -221,31 +283,28 @@ Key skills:
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
the same four cost-routable skills as the supervisor (`review`, `debug`,
`retrospective`, `trainer`) but per-call decides whether to use a local
model or Claude based on the brain's `/pass-rate` response. Bearer auth
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
endpoint; Mode 1 and Mode 3 do not.
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
endpoint that aggregates `final_status` counts from session logs and returns
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
reads this to decide whether to route skill calls to local models. Pass rate
is `null` when no logged invocations are in the window.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -11,9 +11,8 @@ jobs:
name: Build and deploy
runs-on: self-hosted
if: ${{ github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' }}
environment: staging
env:
SERVICE: supervisor
IMAGE: gitea.d-ma.be/mathias/supervisor
INGESTION_IMAGE: gitea.d-ma.be/mathias/ingestion
ROUTING_IMAGE: gitea.d-ma.be/mathias/routing
INFRA_REPO: git@gitea.d-ma.be:mathias/infra.git
@@ -22,27 +21,6 @@ jobs:
- name: Checkout
uses: actions/checkout@v4
- name: Build and push supervisor image
run: |
set -e
trap 'rm -f /tmp/supervisor-image.tar' EXIT
IMAGE_TAG="${{ github.sha }}"
echo "Building ${IMAGE}:${IMAGE_TAG}"
buildctl --addr "${BUILDKIT_HOST}" build \
--frontend dockerfile.v0 \
--local context=. \
--local dockerfile=. \
--opt build-arg:VERSION="${IMAGE_TAG}" \
--output type=oci,dest=/tmp/supervisor-image.tar
skopeo copy \
oci-archive:/tmp/supervisor-image.tar \
docker://${IMAGE}:${IMAGE_TAG} \
--dest-creds "${{ secrets.REGISTRY_CREDS }}"
echo "Built and pushed ${IMAGE}:${IMAGE_TAG}"
- name: Build and push ingestion image
run: |
set -e
@@ -100,22 +78,89 @@ jobs:
cd /tmp/infra-update
sed -i "s|gitea.d-ma.be/mathias/supervisor:.*|gitea.d-ma.be/mathias/supervisor:${IMAGE_TAG}|" \
"k3s/apps/${SERVICE}/deployment.yaml"
sed -i "s|gitea.d-ma.be/mathias/ingestion:.*|gitea.d-ma.be/mathias/ingestion:${IMAGE_TAG}|" \
"k3s/apps/${SERVICE}/ingestion-deployment.yaml"
"k3s/apps/supervisor/ingestion-deployment.yaml"
sed -i "s|gitea.d-ma.be/mathias/routing:.*|gitea.d-ma.be/mathias/routing:${IMAGE_TAG}|" \
"k3s/apps/routing/deployment.yaml"
git config user.email "cd-bot@d-ma.be"
git config user.name "CD Bot"
git add "k3s/apps/${SERVICE}/deployment.yaml" \
"k3s/apps/${SERVICE}/ingestion-deployment.yaml" \
git add "k3s/apps/supervisor/ingestion-deployment.yaml" \
"k3s/apps/routing/deployment.yaml"
git commit -m "chore(deploy): supervisor+ingestion+routing → ${IMAGE_TAG}"
git commit -m "chore(deploy): ingestion+routing → ${IMAGE_TAG}"
GIT_SSH_COMMAND="ssh -i ~/.ssh/infra_deploy_key -o IdentitiesOnly=yes" \
git push
echo "Infra repo updated: ${SERVICE}+ingestion → ${IMAGE_TAG}"
echo "Infra repo updated: ingestion+routing → ${IMAGE_TAG}"
- name: Trigger Flux reconcile (immediate)
run: |
kubectl -n flux-system annotate gitrepository flux-system \
reconcile.fluxcd.io/requestedAt="$(date +%s)" --overwrite
kubectl -n flux-system annotate kustomization apps \
reconcile.fluxcd.io/requestedAt="$(date +%s)" --overwrite
- name: Wait for Flux to apply new ingestion image
run: |
EXPECTED="gitea.d-ma.be/mathias/ingestion:${{ github.sha }}"
for i in $(seq 1 60); do
CURRENT=$(kubectl get deploy ingestion -n supervisor \
-o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null || echo "")
if [ "$CURRENT" = "$EXPECTED" ]; then
echo "✓ Flux applied ingestion image after ${i}s"
break
fi
sleep 1
done
kubectl get deploy ingestion -n supervisor \
-o jsonpath='{.spec.template.spec.containers[0].image}' \
| grep -qx "$EXPECTED" \
|| { echo "✗ Flux did not apply ingestion image within 60s"; exit 1; }
- name: Verify ingestion rollout
run: |
kubectl rollout status deployment/ingestion \
--namespace supervisor \
--timeout=120s \
|| {
echo "── pod status ──"
kubectl get pods -n supervisor -o wide
echo "── events ──"
kubectl get events -n supervisor --sort-by='.lastTimestamp' | tail -20
echo "── describe ──"
kubectl describe pods -n supervisor -l app=ingestion | tail -40
exit 1
}
- name: Wait for Flux to apply new routing image
run: |
EXPECTED="gitea.d-ma.be/mathias/routing:${{ github.sha }}"
for i in $(seq 1 60); do
CURRENT=$(kubectl get deploy routing -n routing \
-o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null || echo "")
if [ "$CURRENT" = "$EXPECTED" ]; then
echo "✓ Flux applied routing image after ${i}s"
break
fi
sleep 1
done
kubectl get deploy routing -n routing \
-o jsonpath='{.spec.template.spec.containers[0].image}' \
| grep -qx "$EXPECTED" \
|| { echo "✗ Flux did not apply routing image within 60s"; exit 1; }
- name: Verify routing rollout
run: |
kubectl rollout status deployment/routing \
--namespace routing \
--timeout=120s \
|| {
echo "── pod status ──"
kubectl get pods -n routing -o wide
echo "── events ──"
kubectl get events -n routing --sort-by='.lastTimestamp' | tail -20
echo "── describe ──"
kubectl describe pods -n routing -l app=routing | tail -40
exit 1
}


@@ -1,15 +1,11 @@
{
"mcpServers": {
"supervisor": {
"type": "http",
"url": "http://koala:30320/mcp",
"headers": {
"Authorization": "Bearer ${SUPERVISOR_MCP_TOKEN}"
}
},
"brain": {
"type": "http",
"url": "http://koala:30330/mcp"
"url": "https://brain-mcp.d-ma.be/mcp",
"headers": {
"Authorization": "Bearer ${BRAIN_MCP_TOKEN}"
}
}
}
}

AGENTS.md

@@ -36,6 +36,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -46,9 +58,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -58,9 +71,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -68,7 +84,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -100,18 +116,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -218,31 +280,28 @@ Key skills:
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
the same four cost-routable skills as the supervisor (`review`, `debug`,
`retrospective`, `trainer`) but per-call decides whether to use a local
model or Claude based on the brain's `/pass-rate` response. Bearer auth
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
endpoint; Mode 1 and Mode 3 do not.
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
endpoint that aggregates `final_status` counts from session logs and returns
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
reads this to decide whether to route skill calls to local models. Pass rate
is `null` when no logged invocations are in the window.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -47,31 +47,28 @@
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
the same four cost-routable skills as the supervisor (`review`, `debug`,
`retrospective`, `trainer`) but per-call decides whether to use a local
model or Claude based on the brain's `/pass-rate` response. Bearer auth
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
endpoint; Mode 1 and Mode 3 do not.
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
endpoint that aggregates `final_status` counts from session logs and returns
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
reads this to decide whether to route skill calls to local models. Pass rate
is `null` when no logged invocations are in the window.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -72,23 +72,42 @@ Record *why* things are the way they are. Future-you will thank present-you.
Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of
the four cost-routable skill packages. The routing pod constructs each
skill via `<pkg>.New(Config{...})` and hands it `routing.Router.Run` as
the `CompleteFunc`. Plan 7 (supervisor retirement) MUST NOT delete the
four packages.
the `CompleteFunc`.
**Plan 7's allowed deletions:**
- `internal/skills/{tdd,spec,tier}/` (not consumed by the routing pod)
- `cmd/supervisor/` (binary)
- `Dockerfile` (supervisor's, at repo root — distinct from `Dockerfile.routing`)
- supervisor manifests in the infra repo
- NodePort `:30320`
**Plan 7's preserved code:**
**Preserved code (do not delete):**
- `internal/skills/{review,debug,retrospective,trainer}/`
- `internal/registry`
- `internal/mcp`
- `internal/exec/litellm.go`
- `internal/routing/` (entirely new in Plan 6)
- `cmd/routing/`
- `internal/registry`, `internal/mcp`, `internal/exec/litellm.go`
- `internal/routing/`, `cmd/routing/`
---
## Plan 7: supervisor pod retired (2026-05-12)
**What was deleted:** `cmd/supervisor/`, `internal/skills/{tdd,spec}/`,
root `Dockerfile`, supervisor k8s manifests (Deployment, Service, Ingress,
NodePort 30320), `supervisor` entry removed from all `.mcp.json` configs.
**Coverage:** `tdd`/`spec` → SKILL.md files in `~/dev/.skills/`; `review`,
`debug`, `retrospective`, `trainer` → routing pod; `brain_*`/`session_log` →
brain MCP; `tier` → `hyperguild tier` CLI.
---
## 2026-05-12 — brain_answer and brain_classify: LLM routing via berget.ai → iguana
**Context:** Brain MCP returned raw BM25 excerpts with no synthesis. Adding
LLM-backed tools enables Q&A and ingestion enrichment without a separate service.
**Decision:** Two new MCP tools in the ingestion service (`ingestion/internal/mcp/`):
- `brain_answer(query)` — BM25 top-10 → LLM synthesis → answer + sources
- `brain_classify(text)` — LLM classifies doc into type/title/tags
Primary LLM: berget.ai `gemma4:31b` (EU cloud, spend tokens while available).
Fallback: iguana `gemma4:31b` (local Ollama). Reranker deferred to follow-up.
Router lives in `ingestion/internal/llm.Router`; opt-in via `BRAIN_LLM_PRIMARY_URL`.
**Consequences:** Brain becomes a knowledge assistant, not just a search index.
When berget.ai tokens run out, flip `BRAIN_LLM_PRIMARY_URL` to iguana.
---


@@ -1,50 +0,0 @@
# syntax=docker/dockerfile:1
# ── Build stage ───────────────────────────────────────────────────────────────
FROM golang:1.26-bookworm AS builder
ARG VERSION=dev
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
-o /out/supervisor ./cmd/supervisor
# ── Runtime stage ─────────────────────────────────────────────────────────────
# Node.js 22 slim — needed for claude CLI subprocess
FROM node:22-slim
# Install claude CLI (provides the `claude` binary the supervisor shells out to)
RUN npm install -g @anthropic-ai/claude-code \
&& claude --version \
&& echo "claude CLI installed"
# Copy supervisor binary
COPY --from=builder /out/supervisor /usr/local/bin/supervisor
# Bake in config (models.yaml + skill discipline files)
COPY config/ /app/config/
# Run as non-root
RUN groupadd -r supervisor && useradd -r -g supervisor -d /app supervisor
WORKDIR /app
# brain/ is writable state — mount a PersistentVolume here
VOLUME /app/brain
ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
ENV SUPERVISOR_MODELS_FILE=/app/config/models.yaml
ENV SUPERVISOR_BRAIN_DIR=/app/brain
ENV SUPERVISOR_SESSIONS_DIR=/app/brain/sessions
ENV SUPERVISOR_PORT=3200
USER supervisor
EXPOSE 3200
ENTRYPOINT ["/usr/local/bin/supervisor"]


@@ -116,13 +116,13 @@ The supervisor probes connectivity at call time:
| `ROUTING_PORT` | `3210` | Routing pod's listen port |
| `ROUTING_MCP_TOKEN` | — | Optional bearer token for the routing MCP HTTP endpoint |
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | Routing pod → brain (in-cluster) |
| `HYPERGUILD_LOCAL_MODEL` | `qwen35` | Local model for routed-to-local skill calls |
| `HYPERGUILD_CLAUDE_MODEL` | `claude-sonnet-4-6` | Claude model for routed-to-Claude skill calls |
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At/above pass rate, route to local |
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below pass rate, route to Claude. Between CEIL and FLOOR is the sample band. |
| `HYPERGUILD_FAST_MODEL` | `koala/qwen35-9b-fast` | Fast model for high-pass-rate skill calls |
| `HYPERGUILD_THINKING_MODEL` | `iguana/gemma4-26b` | Thinking model for low-pass-rate skill calls |
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At/above pass rate, route to fast model |
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below pass rate, route to thinking model. Between CEIL and FLOOR is the sample band. |
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill pass-rate cache TTL |
> **Operator note:** LiteLLM at `LITELLM_BASE_URL` must register both `HYPERGUILD_LOCAL_MODEL` and `HYPERGUILD_CLAUDE_MODEL` for routing to do useful work. If a model is missing, LiteLLM returns 4xx, the routing pod's local route fails, the fail-open retry on Claude likely also fails (since both are missing), and the only signal is `final_status: "fail"` on `_routing` entries in the brain.
> **Operator note:** LiteLLM at `LITELLM_BASE_URL` must register both `HYPERGUILD_FAST_MODEL` and `HYPERGUILD_THINKING_MODEL` for routing to do useful work. If a model is missing, LiteLLM returns 4xx, the routing pod's fast route fails, the fail-open retry on the thinking model likely also fails (since both are missing), and the only signal is `final_status: "fail"` on `_routing` entries in the brain.
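The two thresholds in the table split the pass rate into three bands. A minimal sketch of that decision, assuming the boundaries exactly as the table words them (at or above `FLOOR` goes to the fast model, strictly below `CEIL` goes to the thinking model, everything else is the sample band). The `Policy` field names follow `routing.Policy{Floor, Ceil}` from `cmd/routing/main.go`; the `Decide` method and the `Route` type are hypothetical, and what the router does inside the sample band is not specified here.

```go
package main

import "fmt"

// Route names the band a pass rate falls into (hypothetical type).
type Route string

const (
	RouteFast     Route = "fast"
	RouteThinking Route = "thinking"
	RouteSample   Route = "sample"
)

// Policy holds the two thresholds. Note that Floor (0.90 by default) is the
// higher value and Ceil (0.70 by default) the lower one, as in the table.
type Policy struct {
	Floor float64
	Ceil  float64
}

// Decide maps a pass rate to a band. This is an illustrative reading of the
// table, not the routing package's implementation.
func (p Policy) Decide(passRate float64) Route {
	switch {
	case passRate >= p.Floor:
		return RouteFast
	case passRate < p.Ceil:
		return RouteThinking
	default:
		return RouteSample
	}
}

func main() {
	p := Policy{Floor: 0.90, Ceil: 0.70}
	for _, rate := range []float64{0.95, 0.90, 0.80, 0.70, 0.50} {
		fmt.Printf("%.2f -> %s\n", rate, p.Decide(rate))
	}
}
```

With the defaults, a pass rate of exactly 0.70 lands in the sample band because only values strictly below `CEIL` route to the thinking model.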
## Phase 2 (planned)


@@ -14,12 +14,16 @@ import (
"os"
"time"
"github.com/mathiasbq/supervisor/internal/auth"
"github.com/mathiasbq/supervisor/internal/config"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/githubclient"
"github.com/mathiasbq/supervisor/internal/mcp"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/mathiasbq/supervisor/internal/registry"
"github.com/mathiasbq/supervisor/internal/routing"
"github.com/mathiasbq/supervisor/internal/skills/debug"
"github.com/mathiasbq/supervisor/internal/skills/project"
"github.com/mathiasbq/supervisor/internal/skills/retrospective"
"github.com/mathiasbq/supervisor/internal/skills/review"
"github.com/mathiasbq/supervisor/internal/skills/trainer"
@@ -48,12 +52,12 @@ func main() {
llm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)
router := &routing.Router{
Fetcher: routing.NewFetcher(cfg.BrainURL, "7d", time.Duration(cfg.PassRateTTLSeconds)*time.Second),
Logger: routing.NewLogger(cfg.BrainURL),
Policy: routing.Policy{Floor: cfg.RouteLocalFloor, Ceil: cfg.RouteLocalCeil},
LocalModel: cfg.LocalModel,
ClaudeModel: cfg.ClaudeModel,
Complete: llm.Complete,
Fetcher: routing.NewFetcher(cfg.BrainURL, "7d", time.Duration(cfg.PassRateTTLSeconds)*time.Second),
Logger: routing.NewLogger(cfg.BrainURL),
Policy: routing.Policy{Floor: cfg.RouteLocalFloor, Ceil: cfg.RouteLocalCeil},
FastModel: cfg.FastModel,
ThinkingModel: cfg.ThinkingModel,
Complete: llm.Complete,
}
// Skill packages call CompleteFunc(ctx, model, system, user) — no session_id
@@ -78,36 +82,74 @@ func main() {
reg := registry.New()
reg.Register(review.New(review.Config{
SkillPrompt: mustRead("review.md"),
DefaultModel: cfg.LocalModel,
DefaultModel: cfg.FastModel,
CompleteFunc: review.CompleteFunc(wrap("review")),
}))
reg.Register(debug.New(debug.Config{
SkillPrompt: mustRead("debug.md"),
DefaultModel: cfg.LocalModel,
DefaultModel: cfg.FastModel,
CompleteFunc: debug.CompleteFunc(wrap("debug")),
}))
reg.Register(retrospective.New(retrospective.Config{
SkillPrompt: mustRead("retrospective.md"),
DefaultModel: cfg.LocalModel,
DefaultModel: cfg.FastModel,
CompleteFunc: retrospective.CompleteFunc(wrap("retrospective")),
}))
reg.Register(trainer.New(trainer.Config{
ReaderPrompt: mustRead("trainer-reader.md"),
WriterPrompt: mustRead("trainer-writer.md"),
DefaultModel: cfg.LocalModel,
DefaultModel: cfg.FastModel,
CompleteFunc: trainer.CompleteFunc(wrap("trainer")),
}))
srv := mcp.NewServer(reg, cfg.MCPAuthToken)
if cfg.GiteaMCPURL != "" {
var ghClient *githubclient.Client
if cfg.GitHubPAT != "" {
ghClient = githubclient.New(cfg.GitHubPAT)
}
reg.Register(project.New(project.Config{
Client: mcpclient.New(cfg.GiteaMCPURL, cfg.GiteaMCPToken),
GitHub: ghClient,
GiteaOwner: cfg.GiteaOwner,
GitHubOwner: cfg.GitHubOwner,
GitHubPAT: cfg.GitHubPAT,
InfraRepo: cfg.InfraRepo,
}))
logger.Info("project_create registered", "gitea_mcp_url", cfg.GiteaMCPURL,
"gitea_owner", cfg.GiteaOwner, "github_owner", cfg.GitHubOwner,
"infra_repo", cfg.InfraRepo, "github_pat_set", cfg.GitHubPAT != "")
} else {
logger.Info("project_create skipped — GITEA_MCP_URL not set")
}
var validator *auth.Validator
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
audience := os.Getenv("MCP_AUDIENCE")
v, err := auth.NewValidator(dexURL, audience)
if err != nil {
logger.Error("build jwt validator", "err", err)
os.Exit(1)
}
validator = v
logger.Info("jwt auth enabled", "issuer", dexURL)
}
srv := mcp.NewServer(reg, cfg.MCPAuthToken, validator)
mux := http.NewServeMux()
mux.Handle("/mcp", srv)
mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusOK)
})
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
resourceURL := os.Getenv("MCP_RESOURCE_URL")
mux.HandleFunc("GET /.well-known/oauth-protected-resource",
auth.ProtectedResourceHandler(resourceURL, dexURL))
}
addr := ":" + cfg.Port
logger.Info("routing pod starting", "addr", addr,
"local", cfg.LocalModel, "claude", cfg.ClaudeModel,
"fast", cfg.FastModel, "thinking", cfg.ThinkingModel,
"floor", cfg.RouteLocalFloor, "ceil", cfg.RouteLocalCeil)
if err := http.ListenAndServe(addr, mux); err != nil { //nolint:gosec
logger.Error("server stopped", "err", err)


@@ -1,163 +0,0 @@
package main
import (
"context"
"log/slog"
"net/http"
"os"
"github.com/mathiasbq/supervisor/internal/config"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/mcp"
"github.com/mathiasbq/supervisor/internal/registry"
"github.com/mathiasbq/supervisor/internal/skills/brain"
"github.com/mathiasbq/supervisor/internal/skills/org"
"github.com/mathiasbq/supervisor/internal/skills/retrospective"
skilldebug "github.com/mathiasbq/supervisor/internal/skills/debug"
"github.com/mathiasbq/supervisor/internal/skills/review"
"github.com/mathiasbq/supervisor/internal/skills/spec"
"github.com/mathiasbq/supervisor/internal/skills/trainer"
"github.com/mathiasbq/supervisor/internal/skills/sessionlog"
"github.com/mathiasbq/supervisor/internal/skills/tdd"
"github.com/mathiasbq/supervisor/internal/tier"
)
func main() {
logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
cfg, err := config.Load()
if err != nil {
logger.Error("load config", "err", err)
os.Exit(1)
}
models, err := config.LoadModels(cfg.ModelsFile)
if err != nil {
logger.Error("load models", "err", err)
os.Exit(1)
}
protocolsPrompt, err := os.ReadFile(cfg.ConfigDir + "/protocols.md")
if err != nil {
logger.Error("read protocols.md", "path", cfg.ConfigDir+"/protocols.md", "err", err)
os.Exit(1)
}
// prependProtocols prepends the shared protocols to a skill discipline file.
prependProtocols := func(skillPrompt []byte) string {
return string(protocolsPrompt) + "\n---\n\n" + string(skillPrompt)
}
tddPrompt, err := os.ReadFile(cfg.ConfigDir + "/tdd.md")
if err != nil {
logger.Error("read tdd.md", "path", cfg.ConfigDir+"/tdd.md", "err", err)
os.Exit(1)
}
retroPrompt, err := os.ReadFile(cfg.ConfigDir + "/retrospective.md")
if err != nil {
logger.Error("read retrospective.md", "path", cfg.ConfigDir+"/retrospective.md", "err", err)
os.Exit(1)
}
reviewPrompt, err := os.ReadFile(cfg.ConfigDir + "/review.md")
if err != nil {
logger.Error("read review.md", "path", cfg.ConfigDir+"/review.md", "err", err)
os.Exit(1)
}
debugPrompt, err := os.ReadFile(cfg.ConfigDir + "/debug.md")
if err != nil {
logger.Error("read debug.md", "path", cfg.ConfigDir+"/debug.md", "err", err)
os.Exit(1)
}
specPrompt, err := os.ReadFile(cfg.ConfigDir + "/spec.md")
if err != nil {
logger.Error("read spec.md", "path", cfg.ConfigDir+"/spec.md", "err", err)
os.Exit(1)
}
trainerReaderPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-reader.md")
if err != nil {
logger.Error("read trainer-reader.md", "path", cfg.ConfigDir+"/trainer-reader.md", "err", err)
os.Exit(1)
}
trainerWriterPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-writer.md")
if err != nil {
logger.Error("read trainer-writer.md", "path", cfg.ConfigDir+"/trainer-writer.md", "err", err)
os.Exit(1)
}
litellm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)
tierFn := func(ctx context.Context) tier.Info {
return tier.Detect(ctx, "https://api.anthropic.com", cfg.LiteLLMBaseURL)
}
reg := registry.New()
reg.Register(tdd.New(tdd.Config{
SkillPrompt: prependProtocols(tddPrompt),
DefaultModel: models.ModelFor("tdd", ""),
CompleteFunc: litellm.Complete,
SessionsDir: cfg.SessionsDir,
IngestBaseURL: cfg.IngestBaseURL,
}))
reg.Register(brain.New(brain.Config{
IngestBaseURL: cfg.IngestBaseURL,
IngestSvcURL: cfg.IngestSvcURL,
KBRetrievalURL: cfg.KBRetrievalURL,
}))
reg.Register(org.New(org.Config{
TierFn: tierFn,
}))
reg.Register(sessionlog.New(sessionlog.Config{
SessionsDir: cfg.SessionsDir,
}))
reg.Register(retrospective.New(retrospective.Config{
SkillPrompt: prependProtocols(retroPrompt),
DefaultModel: models.ModelFor("retrospective", ""),
SessionsDir: cfg.SessionsDir,
CompleteFunc: litellm.Complete,
}))
reg.Register(review.New(review.Config{
SkillPrompt: prependProtocols(reviewPrompt),
DefaultModel: models.ModelFor("review", ""),
CompleteFunc: litellm.Complete,
SessionsDir: cfg.SessionsDir,
IngestBaseURL: cfg.IngestBaseURL,
}))
reg.Register(skilldebug.New(skilldebug.Config{
SkillPrompt: prependProtocols(debugPrompt),
DefaultModel: models.ModelFor("debug", ""),
CompleteFunc: litellm.Complete,
SessionsDir: cfg.SessionsDir,
IngestBaseURL: cfg.IngestBaseURL,
}))
reg.Register(spec.New(spec.Config{
SkillPrompt: prependProtocols(specPrompt),
DefaultModel: models.ModelFor("spec", ""),
CompleteFunc: litellm.Complete,
SessionsDir: cfg.SessionsDir,
IngestBaseURL: cfg.IngestBaseURL,
}))
reg.Register(trainer.New(trainer.Config{
ReaderPrompt: prependProtocols(trainerReaderPrompt),
WriterPrompt: prependProtocols(trainerWriterPrompt),
DefaultModel: models.ModelFor("trainer", ""),
CompleteFunc: litellm.Complete,
SessionsDir: cfg.SessionsDir,
BrainDir: cfg.BrainDir,
}))
srv := mcp.NewServer(reg, cfg.MCPAuthToken)
mux := http.NewServeMux()
mux.Handle("/mcp", srv)
addr := ":" + cfg.Port
logger.Info("supervisor starting", "addr", addr, "version", "v0.5.0")
if err := http.ListenAndServe(addr, mux); err != nil {
logger.Error("server stopped", "err", err)
os.Exit(1)
}
}


@@ -1,14 +0,0 @@
package main
import (
"os/exec"
"testing"
)
func TestBinaryCompiles(t *testing.T) {
cmd := exec.Command("go", "build", "./...")
out, err := cmd.CombinedOutput()
if err != nil {
t.Fatalf("build failed: %s\n%s", err, out)
}
}

go.mod

@@ -2,10 +2,23 @@ module github.com/mathiasbq/supervisor
go 1.26.1
require github.com/stretchr/testify v1.11.1
require (
github.com/lestrrat-go/jwx/v2 v2.1.6
github.com/stretchr/testify v1.11.1
gopkg.in/yaml.v3 v3.0.1
)
require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 // indirect
github.com/goccy/go-json v0.10.3 // indirect
github.com/lestrrat-go/blackmagic v1.0.3 // indirect
github.com/lestrrat-go/httpcc v1.0.1 // indirect
github.com/lestrrat-go/httprc v1.0.6 // indirect
github.com/lestrrat-go/iter v1.0.2 // indirect
github.com/lestrrat-go/option v1.0.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
github.com/segmentio/asm v1.2.0 // indirect
golang.org/x/crypto v0.32.0 // indirect
golang.org/x/sys v0.31.0 // indirect
)

go.sum

@@ -1,10 +1,37 @@
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 h1:NMZiJj8QnKe1LgsbDayM4UoHwbvwDRwnI3hwNaAHRnc=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA=
github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
github.com/lestrrat-go/blackmagic v1.0.3 h1:94HXkVLxkZO9vJI/w2u1T0DAoprShFd13xtnSINtDWs=
github.com/lestrrat-go/blackmagic v1.0.3/go.mod h1:6AWFyKNNj0zEXQYfTMPfZrAXUWUfTIZ5ECEUEJaijtw=
github.com/lestrrat-go/httpcc v1.0.1 h1:ydWCStUeJLkpYyjLDHihupbn2tYmZ7m22BGkcvZZrIE=
github.com/lestrrat-go/httpcc v1.0.1/go.mod h1:qiltp3Mt56+55GPVCbTdM9MlqhvzyuL6W/NMDA8vA5E=
github.com/lestrrat-go/httprc v1.0.6 h1:qgmgIRhpvBqexMJjA/PmwSvhNk679oqD1RbovdCGW8k=
github.com/lestrrat-go/httprc v1.0.6/go.mod h1:mwwz3JMTPBjHUkkDv/IGJ39aALInZLrhBp0X7KGUZlo=
github.com/lestrrat-go/iter v1.0.2 h1:gMXo1q4c2pHmC3dn8LzRhJfP1ceCbgSiT9lUydIzltI=
github.com/lestrrat-go/iter v1.0.2/go.mod h1:Momfcq3AnRlRjI5b5O8/G5/BvpzrhoFTZcn06fEOPt4=
github.com/lestrrat-go/jwx/v2 v2.1.6 h1:hxM1gfDILk/l5ylers6BX/Eq1m/pnxe9NBwW6lVfecA=
github.com/lestrrat-go/jwx/v2 v2.1.6/go.mod h1:Y722kU5r/8mV7fYDifjug0r8FK8mZdw0K0GpJw/l8pU=
github.com/lestrrat-go/option v1.0.1 h1:oAzP2fvZGQKWkvHa1/SAcFolBEca1oN+mQ7eooNBEYU=
github.com/lestrrat-go/option v1.0.1/go.mod h1:5ZHFbivi4xwXxhxY9XHDe2FHo6/Z7WWmtT7T5nBBp3I=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/segmentio/asm v1.2.0 h1:9BQrFxC+YOHJlTlHGkTrFWf59nbL3XnCoFLTwDCI7ys=
github.com/segmentio/asm v1.2.0/go.mod h1:BqMnlJP91P8d+4ibuonYZw9mfnzI9HfxselHZr5aAcs=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
golang.org/x/crypto v0.32.0 h1:euUpcYgM8WcP71gNpTqQCn6rC2t6ULUPiOzfWaXVVfc=
golang.org/x/crypto v0.32.0/go.mod h1:ZnnJkOaASj8g0AjIduWNlq2NRxL0PlBrbKVyZ6V/Ugc=
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=


@@ -11,6 +11,7 @@ import (
"time"
"github.com/mathiasbq/hyperguild/ingestion/internal/api"
"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
"github.com/mathiasbq/hyperguild/ingestion/internal/llm"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
@@ -55,7 +56,31 @@ func main() {
h := api.NewHandler(brainDir, logger, pipelineCfg)
mcpSrv := mcp.NewServer(brainDir, &pipelineCfg, llmClient.Complete)
var answerComplete pipeline.CompleteFunc
if primaryURL := os.Getenv("BRAIN_LLM_PRIMARY_URL"); primaryURL != "" {
primaryModel := envOr("BRAIN_LLM_PRIMARY_MODEL", "gemma4:31b")
primaryKey := os.Getenv("BERGET_API_KEY")
timeoutMS := envInt("BRAIN_LLM_TIMEOUT_MS", 10000)
timeout := time.Duration(timeoutMS) * time.Millisecond
primary := llm.New(primaryURL, primaryKey, primaryModel, timeout)
router := &llm.Router{Primary: primary}
if fallbackURL := os.Getenv("BRAIN_LLM_FALLBACK_URL"); fallbackURL != "" {
fallbackModel := envOr("BRAIN_LLM_FALLBACK_MODEL", "gemma4:31b")
router.Fallback = llm.New(fallbackURL, "", fallbackModel, timeout)
}
answerComplete = router.Complete
logger.Info("brain answer LLM configured", "primary", primaryURL, "model", primaryModel)
}
mcpSrv := mcp.NewServer(brainDir, &pipelineCfg, llmClient.Complete, answerComplete)
mcpToken := os.Getenv("BRAIN_MCP_TOKEN")
if mcpToken == "" {
logger.Error("BRAIN_MCP_TOKEN not set")
os.Exit(1)
}
ctx := context.Background()
if watchInterval > 0 {
@@ -74,7 +99,25 @@ func main() {
mux.HandleFunc("POST /ingest-raw", h.IngestRaw)
mux.HandleFunc("POST /backfill-refs", h.BackfillRefs)
mux.HandleFunc("GET /pass-rate", h.PassRate)
mux.Handle("POST /mcp", mcpSrv)
var jwtValidator *auth.Validator
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
audience := os.Getenv("MCP_AUDIENCE")
v, err := auth.NewValidator(dexURL, audience)
if err != nil {
logger.Error("build jwt validator", "err", err)
os.Exit(1)
}
jwtValidator = v
logger.Info("jwt auth enabled", "issuer", dexURL)
}
mux.Handle("/mcp", mcp.BearerAuth(mcpToken, jwtValidator, mcpSrv))
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
resourceURL := os.Getenv("MCP_RESOURCE_URL")
mux.HandleFunc("GET /.well-known/oauth-protected-resource",
auth.ProtectedResourceHandler(resourceURL, os.Getenv("DEX_ISSUER_URL")))
}
addr := ":" + port
watchIntervalLog := "disabled"


@@ -2,10 +2,23 @@ module github.com/mathiasbq/hyperguild/ingestion
go 1.26.1
require github.com/stretchr/testify v1.11.1
require (
github.com/lestrrat-go/jwx/v2 v2.1.6
github.com/stretchr/testify v1.11.1
)
require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 // indirect
github.com/goccy/go-json v0.10.3 // indirect
github.com/lestrrat-go/blackmagic v1.0.3 // indirect
github.com/lestrrat-go/httpcc v1.0.1 // indirect
github.com/lestrrat-go/httprc v1.0.6 // indirect
github.com/lestrrat-go/iter v1.0.2 // indirect
github.com/lestrrat-go/option v1.0.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/segmentio/asm v1.2.0 // indirect
golang.org/x/crypto v0.32.0 // indirect
golang.org/x/sys v0.31.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)


@@ -1,9 +1,37 @@
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 h1:NMZiJj8QnKe1LgsbDayM4UoHwbvwDRwnI3hwNaAHRnc=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA=
github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
github.com/lestrrat-go/blackmagic v1.0.3 h1:94HXkVLxkZO9vJI/w2u1T0DAoprShFd13xtnSINtDWs=
github.com/lestrrat-go/blackmagic v1.0.3/go.mod h1:6AWFyKNNj0zEXQYfTMPfZrAXUWUfTIZ5ECEUEJaijtw=
github.com/lestrrat-go/httpcc v1.0.1 h1:ydWCStUeJLkpYyjLDHihupbn2tYmZ7m22BGkcvZZrIE=
github.com/lestrrat-go/httpcc v1.0.1/go.mod h1:qiltp3Mt56+55GPVCbTdM9MlqhvzyuL6W/NMDA8vA5E=
github.com/lestrrat-go/httprc v1.0.6 h1:qgmgIRhpvBqexMJjA/PmwSvhNk679oqD1RbovdCGW8k=
github.com/lestrrat-go/httprc v1.0.6/go.mod h1:mwwz3JMTPBjHUkkDv/IGJ39aALInZLrhBp0X7KGUZlo=
github.com/lestrrat-go/iter v1.0.2 h1:gMXo1q4c2pHmC3dn8LzRhJfP1ceCbgSiT9lUydIzltI=
github.com/lestrrat-go/iter v1.0.2/go.mod h1:Momfcq3AnRlRjI5b5O8/G5/BvpzrhoFTZcn06fEOPt4=
github.com/lestrrat-go/jwx/v2 v2.1.6 h1:hxM1gfDILk/l5ylers6BX/Eq1m/pnxe9NBwW6lVfecA=
github.com/lestrrat-go/jwx/v2 v2.1.6/go.mod h1:Y722kU5r/8mV7fYDifjug0r8FK8mZdw0K0GpJw/l8pU=
github.com/lestrrat-go/option v1.0.1 h1:oAzP2fvZGQKWkvHa1/SAcFolBEca1oN+mQ7eooNBEYU=
github.com/lestrrat-go/option v1.0.1/go.mod h1:5ZHFbivi4xwXxhxY9XHDe2FHo6/Z7WWmtT7T5nBBp3I=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/segmentio/asm v1.2.0 h1:9BQrFxC+YOHJlTlHGkTrFWf59nbL3XnCoFLTwDCI7ys=
github.com/segmentio/asm v1.2.0/go.mod h1:BqMnlJP91P8d+4ibuonYZw9mfnzI9HfxselHZr5aAcs=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
golang.org/x/crypto v0.32.0 h1:euUpcYgM8WcP71gNpTqQCn6rC2t6ULUPiOzfWaXVVfc=
golang.org/x/crypto v0.32.0/go.mod h1:ZnnJkOaASj8g0AjIduWNlq2NRxL0PlBrbKVyZ6V/Ugc=
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=


@@ -0,0 +1,84 @@
package auth
import (
"context"
"encoding/json"
"fmt"
"net/http"
"time"
"github.com/lestrrat-go/jwx/v2/jwk"
"github.com/lestrrat-go/jwx/v2/jwt"
)
// Validator validates Bearer JWTs issued by a Dex (OIDC) authorization server.
// Audience is optional; leave empty to skip audience validation.
type Validator struct {
issuer string
audience string
jwksURI string
cache *jwk.Cache
}
// NewValidator fetches the OIDC discovery document from issuerURL, extracts
// jwks_uri, seeds the JWKS cache, and returns a ready Validator.
// If DEX_ISSUER_URL is not set the caller should pass "" and skip construction.
func NewValidator(issuerURL, audience string) (*Validator, error) {
resp, err := http.Get(issuerURL + "/.well-known/openid-configuration") //nolint:noctx
if err != nil {
return nil, fmt.Errorf("fetch oidc discovery: %w", err)
}
defer resp.Body.Close() //nolint:errcheck
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("oidc discovery: status %d", resp.StatusCode)
}
var doc struct {
JWKSURI string `json:"jwks_uri"`
}
if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
return nil, fmt.Errorf("decode oidc discovery: %w", err)
}
if doc.JWKSURI == "" {
return nil, fmt.Errorf("oidc discovery: empty jwks_uri")
}
ctx := context.Background()
cache := jwk.NewCache(ctx)
if err := cache.Register(doc.JWKSURI, jwk.WithMinRefreshInterval(time.Hour)); err != nil {
return nil, fmt.Errorf("register jwks cache: %w", err)
}
if _, err := cache.Refresh(ctx, doc.JWKSURI); err != nil {
return nil, fmt.Errorf("initial jwks fetch: %w", err)
}
return &Validator{
issuer: issuerURL,
audience: audience,
jwksURI: doc.JWKSURI,
cache: cache,
}, nil
}
// Validate parses and validates rawToken. Returns the subject claim on success.
func (v *Validator) Validate(ctx context.Context, rawToken string) (string, error) {
keySet, err := v.cache.Get(ctx, v.jwksURI)
if err != nil {
return "", fmt.Errorf("get jwks: %w", err)
}
opts := []jwt.ParseOption{
jwt.WithKeySet(keySet),
jwt.WithValidate(true),
jwt.WithIssuer(v.issuer),
}
if v.audience != "" {
opts = append(opts, jwt.WithAudience(v.audience))
}
tok, err := jwt.ParseString(rawToken, opts...)
if err != nil {
return "", fmt.Errorf("validate jwt: %w", err)
}
return tok.Subject(), nil
}


@@ -0,0 +1,169 @@
package auth_test
import (
"context"
"crypto/rand"
"crypto/rsa"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/lestrrat-go/jwx/v2/jwa"
"github.com/lestrrat-go/jwx/v2/jwk"
"github.com/lestrrat-go/jwx/v2/jwt"
"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
type testKeys struct {
priv jwk.Key
pub jwk.Key
}
func generateRSAKeys(t *testing.T) testKeys {
t.Helper()
raw, err := rsa.GenerateKey(rand.Reader, 2048)
require.NoError(t, err)
priv, err := jwk.FromRaw(raw)
require.NoError(t, err)
require.NoError(t, priv.Set(jwk.KeyIDKey, "test-kid"))
require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
pub, err := jwk.PublicKeyOf(priv)
require.NoError(t, err)
return testKeys{priv: priv, pub: pub}
}
func mockOIDCServer(t *testing.T, keys testKeys) *httptest.Server {
t.Helper()
set := jwk.NewSet()
require.NoError(t, set.AddKey(keys.pub))
jwksBytes, err := json.Marshal(set)
require.NoError(t, err)
mux := http.NewServeMux()
var srv *httptest.Server
mux.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string]string{
"issuer": srv.URL,
"jwks_uri": srv.URL + "/jwks",
})
})
mux.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write(jwksBytes)
})
srv = httptest.NewServer(mux)
t.Cleanup(srv.Close)
return srv
}
func signToken(t *testing.T, keys testKeys, issuer, audience, subject string, exp time.Time) string {
t.Helper()
b := jwt.NewBuilder().
Issuer(issuer).
Subject(subject).
Expiration(exp)
if audience != "" {
b = b.Audience([]string{audience})
}
tok, err := b.Build()
require.NoError(t, err)
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
require.NoError(t, err)
return string(signed)
}
func TestValidator(t *testing.T) {
keys := generateRSAKeys(t)
srv := mockOIDCServer(t, keys)
ctx := context.Background()
v, err := auth.NewValidator(srv.URL, "brain")
require.NoError(t, err)
tests := []struct {
name string
token string
wantSub string
wantErr bool
}{
{
name: "valid jwt",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)),
wantSub: "test-user",
},
{
name: "expired jwt",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(-time.Hour)),
wantErr: true,
},
{
name: "wrong issuer",
token: signToken(t, keys, "https://evil.example.com", "brain", "test-user", time.Now().Add(time.Hour)),
wantErr: true,
},
{
name: "wrong audience",
token: signToken(t, keys, srv.URL, "other-service", "test-user", time.Now().Add(time.Hour)),
wantErr: true,
},
{
name: "tampered token",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)) + "tampered",
wantErr: true,
},
{
name: "not a jwt",
token: "not-a-jwt",
wantErr: true,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
sub, err := v.Validate(ctx, tc.token)
if tc.wantErr {
assert.Error(t, err)
assert.Empty(t, sub)
} else {
require.NoError(t, err)
assert.Equal(t, tc.wantSub, sub)
}
})
}
}
func TestNewValidator_NoAudience(t *testing.T) {
keys := generateRSAKeys(t)
srv := mockOIDCServer(t, keys)
ctx := context.Background()
v, err := auth.NewValidator(srv.URL, "")
require.NoError(t, err)
// Token without audience passes when audience validation is disabled.
tok, err := jwt.NewBuilder().
Issuer(srv.URL).
Subject("sub").
Expiration(time.Now().Add(time.Hour)).
Build()
require.NoError(t, err)
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
require.NoError(t, err)
sub, err := v.Validate(ctx, string(signed))
require.NoError(t, err)
assert.Equal(t, "sub", sub)
}
func TestNewValidator_BadDiscoveryURL(t *testing.T) {
_, err := auth.NewValidator("http://127.0.0.1:1", "brain")
assert.Error(t, err)
}


@@ -0,0 +1,23 @@
package auth
import (
"encoding/json"
"net/http"
)
// ProtectedResourceHandler returns an RFC 9728 oauth-protected-resource metadata
// handler. Mount at GET /.well-known/oauth-protected-resource (no auth required).
func ProtectedResourceHandler(resourceURL, issuerURL string) http.HandlerFunc {
type metadata struct {
Resource string `json:"resource"`
AuthorizationServers []string `json:"authorization_servers"`
}
body, _ := json.Marshal(metadata{
Resource: resourceURL,
AuthorizationServers: []string{issuerURL},
})
return func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write(body)
}
}


@@ -0,0 +1,28 @@
package auth_test
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestProtectedResourceHandler(t *testing.T) {
h := auth.ProtectedResourceHandler("https://brain-mcp.d-ma.be", "https://auth.d-ma.be")
req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-protected-resource", nil)
rr := httptest.NewRecorder()
h(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.Equal(t, "application/json", rr.Header().Get("Content-Type"))
var body map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
assert.Equal(t, "https://brain-mcp.d-ma.be", body["resource"])
servers := body["authorization_servers"].([]any)
assert.Equal(t, "https://auth.d-ma.be", servers[0])
}


@@ -0,0 +1,29 @@
package llm
import (
"context"
"fmt"
)
// Router calls Primary first; on any error falls back to Fallback.
// Fallback may be nil, in which case primary errors are returned directly.
type Router struct {
Primary *Client
Fallback *Client
}
// Complete implements pipeline.CompleteFunc, routing through Primary then Fallback.
func (r *Router) Complete(ctx context.Context, system, user string) (string, error) {
out, err := r.Primary.Complete(ctx, system, user)
if err == nil {
return out, nil
}
if r.Fallback == nil {
return "", fmt.Errorf("primary llm: %w", err)
}
out, err2 := r.Fallback.Complete(ctx, system, user)
if err2 != nil {
return "", fmt.Errorf("primary llm: %w; fallback llm: %v", err, err2)
}
return out, nil
}


@@ -0,0 +1,71 @@
package llm
import (
"context"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestRouter_PrimarySucceeds(t *testing.T) {
primary := mockServer(t, "from-primary")
defer primary.Close()
fallback := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Error("fallback must not be called when primary succeeds")
}))
defer fallback.Close()
r := &Router{
Primary: New(primary.URL, "", "m", time.Second),
Fallback: New(fallback.URL, "", "m", time.Second),
}
out, err := r.Complete(context.Background(), "sys", "user")
require.NoError(t, err)
assert.Equal(t, "from-primary", out)
}
func TestRouter_FallsBackOnPrimaryError(t *testing.T) {
primary := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.Error(w, "unavailable", http.StatusServiceUnavailable)
}))
defer primary.Close()
fallback := mockServer(t, "from-fallback")
defer fallback.Close()
r := &Router{
Primary: New(primary.URL, "", "m", time.Second),
Fallback: New(fallback.URL, "", "m", time.Second),
}
out, err := r.Complete(context.Background(), "sys", "user")
require.NoError(t, err)
assert.Equal(t, "from-fallback", out)
}
func TestRouter_BothFail(t *testing.T) {
fail := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.Error(w, "err", http.StatusBadGateway)
}))
defer fail.Close()
r := &Router{
Primary: New(fail.URL, "", "m", time.Second),
Fallback: New(fail.URL, "", "m", time.Second),
}
_, err := r.Complete(context.Background(), "sys", "user")
assert.Error(t, err)
}
func TestRouter_NilFallback(t *testing.T) {
fail := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.Error(w, "err", http.StatusBadGateway)
}))
defer fail.Close()
r := &Router{Primary: New(fail.URL, "", "m", time.Second)}
_, err := r.Complete(context.Background(), "sys", "user")
assert.Error(t, err)
}


@@ -0,0 +1,36 @@
package mcp
import (
"crypto/subtle"
"net/http"
"strings"
"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
)
// BearerAuth returns a middleware that enforces authentication on every request.
// It first tries to validate the bearer value as a Dex JWT (when v is non-nil),
// then falls back to a constant-time comparison with the static token. An empty
// static token never matches, so with token == "" only a valid JWT is accepted.
func BearerAuth(token string, v *auth.Validator, next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
rawToken, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
if !ok {
http.Error(w, "unauthorized", http.StatusUnauthorized)
return
}
if v != nil {
if _, err := v.Validate(r.Context(), rawToken); err == nil {
next.ServeHTTP(w, r)
return
}
}
if token != "" && subtle.ConstantTimeCompare([]byte(rawToken), []byte(token)) == 1 {
next.ServeHTTP(w, r)
return
}
http.Error(w, "unauthorized", http.StatusUnauthorized)
})
}


@@ -0,0 +1,161 @@
package mcp_test
import (
"context"
"crypto/rand"
"crypto/rsa"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/lestrrat-go/jwx/v2/jwa"
"github.com/lestrrat-go/jwx/v2/jwk"
"github.com/lestrrat-go/jwx/v2/jwt"
"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func okHandler() http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusOK)
})
}
func TestBearerAuth_MissingHeader(t *testing.T) {
handler := mcp.BearerAuth("secret", nil, okHandler())
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
assert.Equal(t, http.StatusUnauthorized, rr.Code)
}
func TestBearerAuth_WrongToken(t *testing.T) {
handler := mcp.BearerAuth("secret", nil, okHandler())
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
req.Header.Set("Authorization", "Bearer wrong")
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
assert.Equal(t, http.StatusUnauthorized, rr.Code)
}
func TestBearerAuth_CorrectToken(t *testing.T) {
called := false
handler := mcp.BearerAuth("secret", nil, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
called = true
w.WriteHeader(http.StatusOK)
}))
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
req.Header.Set("Authorization", "Bearer secret")
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.True(t, called)
}
func TestBearerAuth_EmptyConfiguredToken(t *testing.T) {
handler := mcp.BearerAuth("", nil, okHandler())
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
assert.Equal(t, http.StatusUnauthorized, rr.Code)
}
// JWT auth tests
func buildOIDCServer(t *testing.T) (*httptest.Server, jwk.Key) {
t.Helper()
raw, err := rsa.GenerateKey(rand.Reader, 2048)
require.NoError(t, err)
priv, err := jwk.FromRaw(raw)
require.NoError(t, err)
require.NoError(t, priv.Set(jwk.KeyIDKey, "k1"))
require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
pub, err := jwk.PublicKeyOf(priv)
require.NoError(t, err)
set := jwk.NewSet()
require.NoError(t, set.AddKey(pub))
jwksBytes, err := json.Marshal(set)
require.NoError(t, err)
muxSrv := http.NewServeMux()
var srv *httptest.Server
muxSrv.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
_ = json.NewEncoder(w).Encode(map[string]string{
"issuer": srv.URL,
"jwks_uri": srv.URL + "/jwks",
})
})
muxSrv.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
_, _ = w.Write(jwksBytes)
})
srv = httptest.NewServer(muxSrv)
t.Cleanup(srv.Close)
return srv, priv
}
func signJWT(t *testing.T, priv jwk.Key, issuer, audience string, exp time.Time) string {
t.Helper()
tok, err := jwt.NewBuilder().
Issuer(issuer).Audience([]string{audience}).
Subject("s").Expiration(exp).
Build()
require.NoError(t, err)
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, priv))
require.NoError(t, err)
return string(signed)
}
func TestBearerAuth_ValidJWT(t *testing.T) {
oidcSrv, priv := buildOIDCServer(t)
v, err := auth.NewValidator(oidcSrv.URL, "brain")
require.NoError(t, err)
called := false
handler := mcp.BearerAuth("static-secret", v, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
called = true
w.WriteHeader(http.StatusOK)
}))
token := signJWT(t, priv, oidcSrv.URL, "brain", time.Now().Add(time.Hour))
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
req.Header.Set("Authorization", "Bearer "+token)
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.True(t, called)
}
func TestBearerAuth_InvalidJWT_FallsBackToStaticToken(t *testing.T) {
oidcSrv, _ := buildOIDCServer(t)
v, err := auth.NewValidator(oidcSrv.URL, "brain")
require.NoError(t, err)
handler := mcp.BearerAuth("static-secret", v, okHandler())
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
req.Header.Set("Authorization", "Bearer static-secret")
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
}
func TestBearerAuth_InvalidJWT_WrongStaticToken(t *testing.T) {
oidcSrv, priv := buildOIDCServer(t)
v, err := auth.NewValidator(oidcSrv.URL, "brain")
require.NoError(t, err)
handler := mcp.BearerAuth("static-secret", v, okHandler())
// Expired JWT — JWT fails, static token doesn't match either
token := signJWT(t, priv, oidcSrv.URL, "brain", time.Now().Add(-time.Hour))
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
req.Header.Set("Authorization", "Bearer "+token)
_ = context.Background() // satisfies import
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
assert.Equal(t, http.StatusUnauthorized, rr.Code)
}


@@ -69,6 +69,20 @@ func (s *Server) tools() []map[string]any {
"dry_run": map[string]any{"type": "boolean"},
}),
},
+{
+"name": "brain_answer",
+"description": "Retrieve relevant brain content via BM25 and synthesize a coherent answer using an LLM.",
+"inputSchema": schema([]string{"query"}, map[string]any{
+"query": str("question to answer"),
+}),
+},
+{
+"name": "brain_classify",
+"description": "Classify raw text into doc type, title, and tags using an LLM.",
+"inputSchema": schema([]string{"text"}, map[string]any{
+"text": str("raw document text to classify (first 3000 chars used)"),
+}),
+},
{
"name": "session_log",
"description": "Append a structured entry to brain/sessions/<session_id>.jsonl.",


@@ -40,7 +40,7 @@ func TestBrainQueryReturnsResults(t *testing.T) {
0o644,
))
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_query", map[string]any{"query": "tdd"})
require.Nil(t, resp["error"])
@@ -53,7 +53,7 @@ func TestBrainQueryReturnsResults(t *testing.T) {
func TestBrainWriteCreatesFile(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "# Test\n\nbody",
@@ -72,7 +72,7 @@ func TestBrainWriteCreatesFile(t *testing.T) {
func TestBrainWriteRejectsTraversal(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "x",
@@ -83,7 +83,7 @@ func TestBrainWriteRejectsTraversal(t *testing.T) {
func TestBrainWriteAcceptsDoubleDotInName(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "x",
@@ -98,7 +98,7 @@ func TestBrainWriteAcceptsDoubleDotInName(t *testing.T) {
func TestBrainIngestRawDryRun(t *testing.T) {
brainDir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(brainDir, "wiki", "concepts"), 0o755))
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest_raw", map[string]any{
"source": "test-source",
@@ -130,7 +130,7 @@ func TestBrainIngestRawDryRun(t *testing.T) {
func TestBrainIngestRejectsBoth(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "x",
@@ -142,7 +142,7 @@ func TestBrainIngestRejectsBoth(t *testing.T) {
func TestBrainIngestRequiresOne(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{})
require.NotNil(t, resp["error"])
@@ -150,7 +150,7 @@ func TestBrainIngestRequiresOne(t *testing.T) {
func TestBrainIngestRejectsContentWithoutSource(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "x",
@@ -160,7 +160,7 @@ func TestBrainIngestRejectsContentWithoutSource(t *testing.T) {
func TestBrainIngestRequiresLLMConfigured(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil) // nil pipelineCfg → no LLM
+srv := mcp.NewServer(brainDir, nil, nil, nil) // nil pipelineCfg → no LLM
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "some content",
@@ -173,7 +173,7 @@ func TestBrainIngestRequiresLLMConfigured(t *testing.T) {
func TestSessionLogAppends(t *testing.T) {
brainDir := t.TempDir()
-srv := mcp.NewServer(brainDir, nil, nil)
+srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "session_log", map[string]any{
"session_id": "session-x",
@@ -190,7 +190,7 @@ func TestSessionLogAppends(t *testing.T) {
}
func TestSessionLogRequiresSessionID(t *testing.T) {
-srv := mcp.NewServer(t.TempDir(), nil, nil)
+srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
resp := toolCall(t, srv, "session_log", map[string]any{"skill": "tdd"})
require.NotNil(t, resp["error"])
}


@@ -14,7 +14,7 @@ import (
)
func TestMCPMountedHandler(t *testing.T) {
-srv := mcp.NewServer(t.TempDir(), nil, nil)
+srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
mux := http.NewServeMux()
mux.Handle("POST /mcp", srv)


@@ -32,22 +32,38 @@ type rpcError struct {
// Server handles MCP JSON-RPC over HTTP for the ingestion service.
type Server struct {
-brainDir string
-pipeline pipeline.Config
-llm pipeline.CompleteFunc
+brainDir string
+pipeline pipeline.Config
+llm pipeline.CompleteFunc
+answerLLM pipeline.CompleteFunc // nil = brain_answer and brain_classify unavailable
}
// NewServer constructs a Server bound to brainDir. pipelineCfg supplies the
// LLM-backed pipeline; llm may be nil for non-LLM tools only.
-func NewServer(brainDir string, pipelineCfg *pipeline.Config, llm pipeline.CompleteFunc) *Server {
+// answerLLM drives brain_answer and brain_classify; nil disables those tools.
+func NewServer(brainDir string, pipelineCfg *pipeline.Config, llm pipeline.CompleteFunc, answerLLM pipeline.CompleteFunc) *Server {
cfg := pipeline.Config{}
if pipelineCfg != nil {
cfg = *pipelineCfg
}
-return &Server{brainDir: brainDir, pipeline: cfg, llm: llm}
+return &Server{brainDir: brainDir, pipeline: cfg, llm: llm, answerLLM: answerLLM}
}
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+// MCP streamable HTTP: GET establishes the SSE stream for server-to-client events.
+if r.Method == http.MethodGet {
+w.Header().Set("Content-Type", "text/event-stream")
+w.Header().Set("Cache-Control", "no-cache")
+w.Header().Set("Connection", "keep-alive")
+w.Header().Set("X-Accel-Buffering", "no")
+w.WriteHeader(http.StatusOK)
+if f, ok := w.(http.Flusher); ok {
+f.Flush()
+}
+<-r.Context().Done()
+return
+}
var req request
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, nil, -32700, "parse error")
@@ -126,6 +142,10 @@ func (s *Server) handleCall(ctx context.Context, name string, args json.RawMessa
return s.brainIngest(ctx, args)
case "session_log":
return s.sessionLog(ctx, args)
+case "brain_answer":
+return s.brainAnswer(ctx, args)
+case "brain_classify":
+return s.brainClassify(ctx, args)
default:
return nil, fmt.Errorf("unknown tool: %s", name)
}


@@ -21,7 +21,7 @@ func body(t *testing.T, v any) *bytes.Buffer {
}
func TestServerInitialize(t *testing.T) {
-srv := mcp.NewServer(t.TempDir(), nil, nil)
+srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "initialize",
@@ -38,7 +38,7 @@ func TestServerInitialize(t *testing.T) {
}
func TestServerToolsList(t *testing.T) {
-srv := mcp.NewServer(t.TempDir(), nil, nil)
+srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 2, "method": "tools/list",
@@ -55,12 +55,13 @@ func TestServerToolsList(t *testing.T) {
names = append(names, t.(map[string]any)["name"].(string))
}
assert.ElementsMatch(t, []string{
-"brain_query", "brain_write", "brain_ingest_raw", "brain_ingest", "session_log",
+"brain_query", "brain_write", "brain_ingest_raw", "brain_ingest",
+"brain_answer", "brain_classify", "session_log",
}, names)
}
func TestServerNotificationGetsNoBody(t *testing.T) {
-srv := mcp.NewServer(t.TempDir(), nil, nil)
+srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "method": "notifications/initialized",
@@ -73,7 +74,7 @@ func TestServerNotificationGetsNoBody(t *testing.T) {
}
func TestServerUnknownMethodReturnsError(t *testing.T) {
-srv := mcp.NewServer(t.TempDir(), nil, nil)
+srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 3, "method": "unknown/method",


@@ -0,0 +1,114 @@
package mcp
import (
"context"
"encoding/json"
"fmt"
"strings"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
)
const (
answerSystemPrompt = `You are a knowledge assistant. Answer the question using ONLY the provided sources.
Cite source file paths inline when referencing specific content.
If the context does not contain enough information to answer, say so clearly.`
classifySystemPrompt = `Classify the document. Respond with JSON only, no markdown fences.
{"type":"...","title":"...","tags":["..."]}
Valid types: spec, plan, decision, note, wiki, log, code, unknown.`
)
type brainAnswerArgs struct {
Query string `json:"query"`
}
func (s *Server) brainAnswer(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
if s.answerLLM == nil {
return nil, fmt.Errorf("answer LLM not configured: set BRAIN_LLM_PRIMARY_URL")
}
var a brainAnswerArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Query == "" {
return nil, fmt.Errorf("query is required")
}
results, err := search.Query(s.brainDir, a.Query, 10)
if err != nil {
return nil, fmt.Errorf("search: %w", err)
}
if len(results) == 0 {
return json.Marshal(map[string]any{
"answer": "No relevant content found in brain.",
"sources": []string{},
})
}
var sb strings.Builder
sources := make([]string, 0, len(results))
for _, r := range results {
fmt.Fprintf(&sb, "<source path=%q>\n%s\n</source>\n\n", r.Path, r.Excerpt)
sources = append(sources, r.Path)
}
answer, err := s.answerLLM(ctx, answerSystemPrompt, sb.String()+"Question: "+a.Query)
if err != nil {
return nil, fmt.Errorf("llm: %w", err)
}
return json.Marshal(map[string]any{
"answer": answer,
"sources": sources,
})
}
type brainClassifyArgs struct {
Text string `json:"text"`
}
type classifyResult struct {
Type string `json:"type"`
Title string `json:"title"`
Tags []string `json:"tags"`
}
func (s *Server) brainClassify(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
if s.answerLLM == nil {
return nil, fmt.Errorf("answer LLM not configured: set BRAIN_LLM_PRIMARY_URL")
}
var a brainClassifyArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Text == "" {
return nil, fmt.Errorf("text is required")
}
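text := a.Text
if len(text) > 3000 {
// Cut at 3000 bytes, then drop any partial multi-byte rune left at the cut
// so the LLM never receives invalid UTF-8.
text = strings.ToValidUTF8(text[:3000], "")
}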
raw, err := s.answerLLM(ctx, classifySystemPrompt, text)
if err != nil {
return nil, fmt.Errorf("llm: %w", err)
}
// Strip markdown fences if model adds them despite the instruction.
raw = strings.TrimSpace(raw)
raw = strings.TrimPrefix(raw, "```json")
raw = strings.TrimPrefix(raw, "```")
raw = strings.TrimSuffix(raw, "```")
raw = strings.TrimSpace(raw)
var cr classifyResult
if err := json.Unmarshal([]byte(raw), &cr); err != nil {
return nil, fmt.Errorf("parse classify response %q: %w", raw, err)
}
if cr.Tags == nil {
cr.Tags = []string{}
}
return json.Marshal(cr)
}
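
The fence-stripping and parsing step at the end of `brainClassify` can be sketched in isolation. This is an illustration under assumptions: `parseClassify` is a hypothetical helper that does not exist in the commit, and the three-backtick marker is built with `strings.Repeat` purely so the example can sit inside a fenced block.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fence is the three-backtick markdown fence marker.
var fence = strings.Repeat("`", 3)

type classifyResult struct {
	Type  string   `json:"type"`
	Title string   `json:"title"`
	Tags  []string `json:"tags"`
}

// parseClassify (hypothetical) applies the same cleanup as brainClassify:
// trim whitespace, drop a leading fence with or without a "json" tag, drop a
// trailing fence, then decode. Nil tags are normalised to an empty slice
// so the JSON output carries [] rather than null.
func parseClassify(raw string) (classifyResult, error) {
	raw = strings.TrimSpace(raw)
	raw = strings.TrimPrefix(raw, fence+"json")
	raw = strings.TrimPrefix(raw, fence)
	raw = strings.TrimSuffix(raw, fence)
	raw = strings.TrimSpace(raw)

	var cr classifyResult
	if err := json.Unmarshal([]byte(raw), &cr); err != nil {
		return classifyResult{}, fmt.Errorf("parse classify response %q: %w", raw, err)
	}
	if cr.Tags == nil {
		cr.Tags = []string{}
	}
	return cr, nil
}

func main() {
	fenced := fence + "json\n" + `{"type":"note","title":"T","tags":null}` + "\n" + fence
	cr, err := parseClassify(fenced)
	fmt.Println(cr.Type, cr.Title, len(cr.Tags), err)
}
```

The cleanup only handles a response that is entirely wrapped in one fence. A model that adds prose before or after the JSON still produces a parse error, which the tool surfaces together with the raw response.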


@@ -0,0 +1,103 @@
package mcp_test
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func mockAnswerLLM(response string) pipeline.CompleteFunc {
return func(_ context.Context, _, _ string) (string, error) {
return response, nil
}
}
func brainDirWithContent(t *testing.T) string {
t.Helper()
dir := t.TempDir()
wikiDir := filepath.Join(dir, "wiki")
require.NoError(t, os.MkdirAll(wikiDir, 0o755))
require.NoError(t, os.WriteFile(filepath.Join(wikiDir, "test.md"), []byte(
"---\ntitle: Pass-rate Logging\ntype: spec\n---\n\nPass-rate logging tracks skill invocations.",
), 0o644))
return dir
}
func callTool(t *testing.T, ts *httptest.Server, name string, arguments map[string]any) map[string]any {
t.Helper()
req := map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "tools/call",
"params": map[string]any{"name": name, "arguments": arguments},
}
resp, err := http.Post(ts.URL, "application/json", body(t, req))
require.NoError(t, err)
defer resp.Body.Close() //nolint:errcheck
var out map[string]any
require.NoError(t, json.NewDecoder(resp.Body).Decode(&out))
return out
}
func TestBrainAnswer_NoLLM(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_answer", map[string]any{"query": "test"})
assert.NotNil(t, rpc["error"], "expected error when answerLLM is nil")
}
func TestBrainAnswer_Synthesizes(t *testing.T) {
brainDir := brainDirWithContent(t)
srv := mcp.NewServer(brainDir, nil, nil, mockAnswerLLM("Pass-rate logging is described in spec."))
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_answer", map[string]any{"query": "pass-rate logging"})
require.Nil(t, rpc["error"])
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
var result map[string]any
require.NoError(t, json.Unmarshal([]byte(content), &result))
assert.Equal(t, "Pass-rate logging is described in spec.", result["answer"])
assert.NotEmpty(t, result["sources"])
}
func TestBrainClassify_ReturnsJSON(t *testing.T) {
llmResp := `{"type":"spec","title":"My Spec","tags":["go","mcp"]}`
srv := mcp.NewServer(t.TempDir(), nil, nil, mockAnswerLLM(llmResp))
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_classify", map[string]any{"text": "# My Spec\n\nThis is a Go MCP spec."})
require.Nil(t, rpc["error"])
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
var result map[string]any
require.NoError(t, json.Unmarshal([]byte(content), &result))
assert.Equal(t, "spec", result["type"])
assert.Equal(t, "My Spec", result["title"])
}
func TestBrainClassify_StripsFences(t *testing.T) {
llmResp := "```json\n{\"type\":\"note\",\"title\":\"T\",\"tags\":[]}\n```"
srv := mcp.NewServer(t.TempDir(), nil, nil, mockAnswerLLM(llmResp))
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_classify", map[string]any{"text": "some text"})
require.Nil(t, rpc["error"])
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
var result map[string]any
require.NoError(t, json.Unmarshal([]byte(content), &result))
assert.Equal(t, "note", result["type"])
}

84
internal/auth/jwt.go Normal file

@@ -0,0 +1,84 @@
package auth
import (
"context"
"encoding/json"
"fmt"
"net/http"
"time"
"github.com/lestrrat-go/jwx/v2/jwk"
"github.com/lestrrat-go/jwx/v2/jwt"
)
// Validator validates Bearer JWTs issued by a Dex (OIDC) authorization server.
// Audience is optional; leave empty to skip audience validation.
type Validator struct {
issuer string
audience string
jwksURI string
cache *jwk.Cache
}
// NewValidator fetches the OIDC discovery document from issuerURL, extracts
// jwks_uri, seeds the JWKS cache, and returns a ready Validator.
// If DEX_ISSUER_URL is not set, the caller should skip construction rather
// than pass an empty issuerURL.
func NewValidator(issuerURL, audience string) (*Validator, error) {
resp, err := http.Get(issuerURL + "/.well-known/openid-configuration") //nolint:noctx
if err != nil {
return nil, fmt.Errorf("fetch oidc discovery: %w", err)
}
defer resp.Body.Close() //nolint:errcheck
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("oidc discovery: status %d", resp.StatusCode)
}
var doc struct {
JWKSURI string `json:"jwks_uri"`
}
if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
return nil, fmt.Errorf("decode oidc discovery: %w", err)
}
if doc.JWKSURI == "" {
return nil, fmt.Errorf("oidc discovery: empty jwks_uri")
}
ctx := context.Background()
cache := jwk.NewCache(ctx)
if err := cache.Register(doc.JWKSURI, jwk.WithMinRefreshInterval(time.Hour)); err != nil {
return nil, fmt.Errorf("register jwks cache: %w", err)
}
if _, err := cache.Refresh(ctx, doc.JWKSURI); err != nil {
return nil, fmt.Errorf("initial jwks fetch: %w", err)
}
return &Validator{
issuer: issuerURL,
audience: audience,
jwksURI: doc.JWKSURI,
cache: cache,
}, nil
}
// Validate parses and validates rawToken. Returns the subject claim on success.
func (v *Validator) Validate(ctx context.Context, rawToken string) (string, error) {
keySet, err := v.cache.Get(ctx, v.jwksURI)
if err != nil {
return "", fmt.Errorf("get jwks: %w", err)
}
opts := []jwt.ParseOption{
jwt.WithKeySet(keySet),
jwt.WithValidate(true),
jwt.WithIssuer(v.issuer),
}
if v.audience != "" {
opts = append(opts, jwt.WithAudience(v.audience))
}
tok, err := jwt.ParseString(rawToken, opts...)
if err != nil {
return "", fmt.Errorf("validate jwt: %w", err)
}
return tok.Subject(), nil
}

169
internal/auth/jwt_test.go Normal file

@@ -0,0 +1,169 @@
package auth_test
import (
"context"
"crypto/rand"
"crypto/rsa"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/lestrrat-go/jwx/v2/jwa"
"github.com/lestrrat-go/jwx/v2/jwk"
"github.com/lestrrat-go/jwx/v2/jwt"
"github.com/mathiasbq/supervisor/internal/auth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
type testKeys struct {
priv jwk.Key
pub jwk.Key
}
func generateRSAKeys(t *testing.T) testKeys {
t.Helper()
raw, err := rsa.GenerateKey(rand.Reader, 2048)
require.NoError(t, err)
priv, err := jwk.FromRaw(raw)
require.NoError(t, err)
require.NoError(t, priv.Set(jwk.KeyIDKey, "test-kid"))
require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
pub, err := jwk.PublicKeyOf(priv)
require.NoError(t, err)
return testKeys{priv: priv, pub: pub}
}
func mockOIDCServer(t *testing.T, keys testKeys) *httptest.Server {
t.Helper()
set := jwk.NewSet()
require.NoError(t, set.AddKey(keys.pub))
jwksBytes, err := json.Marshal(set)
require.NoError(t, err)
mux := http.NewServeMux()
var srv *httptest.Server
mux.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string]string{
"issuer": srv.URL,
"jwks_uri": srv.URL + "/jwks",
})
})
mux.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write(jwksBytes)
})
srv = httptest.NewServer(mux)
t.Cleanup(srv.Close)
return srv
}
func signToken(t *testing.T, keys testKeys, issuer, audience, subject string, exp time.Time) string {
t.Helper()
b := jwt.NewBuilder().
Issuer(issuer).
Subject(subject).
Expiration(exp)
if audience != "" {
b = b.Audience([]string{audience})
}
tok, err := b.Build()
require.NoError(t, err)
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
require.NoError(t, err)
return string(signed)
}
func TestValidator(t *testing.T) {
keys := generateRSAKeys(t)
srv := mockOIDCServer(t, keys)
ctx := context.Background()
v, err := auth.NewValidator(srv.URL, "brain")
require.NoError(t, err)
tests := []struct {
name string
token string
wantSub string
wantErr bool
}{
{
name: "valid jwt",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)),
wantSub: "test-user",
},
{
name: "expired jwt",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(-time.Hour)),
wantErr: true,
},
{
name: "wrong issuer",
token: signToken(t, keys, "https://evil.example.com", "brain", "test-user", time.Now().Add(time.Hour)),
wantErr: true,
},
{
name: "wrong audience",
token: signToken(t, keys, srv.URL, "other-service", "test-user", time.Now().Add(time.Hour)),
wantErr: true,
},
{
name: "tampered token",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)) + "tampered",
wantErr: true,
},
{
name: "not a jwt",
token: "not-a-jwt",
wantErr: true,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
sub, err := v.Validate(ctx, tc.token)
if tc.wantErr {
assert.Error(t, err)
assert.Empty(t, sub)
} else {
require.NoError(t, err)
assert.Equal(t, tc.wantSub, sub)
}
})
}
}
func TestNewValidator_NoAudience(t *testing.T) {
keys := generateRSAKeys(t)
srv := mockOIDCServer(t, keys)
ctx := context.Background()
v, err := auth.NewValidator(srv.URL, "")
require.NoError(t, err)
// Token without audience passes when audience validation is disabled.
tok, err := jwt.NewBuilder().
Issuer(srv.URL).
Subject("sub").
Expiration(time.Now().Add(time.Hour)).
Build()
require.NoError(t, err)
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
require.NoError(t, err)
sub, err := v.Validate(ctx, string(signed))
require.NoError(t, err)
assert.Equal(t, "sub", sub)
}
func TestNewValidator_BadDiscoveryURL(t *testing.T) {
_, err := auth.NewValidator("http://127.0.0.1:1", "brain")
assert.Error(t, err)
}


@@ -0,0 +1,23 @@
package auth
import (
"encoding/json"
"net/http"
)
// ProtectedResourceHandler returns an RFC 9728 oauth-protected-resource metadata
// handler. Mount at GET /.well-known/oauth-protected-resource (no auth required).
func ProtectedResourceHandler(resourceURL, issuerURL string) http.HandlerFunc {
type metadata struct {
Resource string `json:"resource"`
AuthorizationServers []string `json:"authorization_servers"`
}
body, _ := json.Marshal(metadata{
Resource: resourceURL,
AuthorizationServers: []string{issuerURL},
})
return func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write(body)
}
}


@@ -0,0 +1,28 @@
package auth_test
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/supervisor/internal/auth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestProtectedResourceHandler(t *testing.T) {
h := auth.ProtectedResourceHandler("https://brain-mcp.d-ma.be", "https://auth.d-ma.be")
req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-protected-resource", nil)
rr := httptest.NewRecorder()
h(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.Equal(t, "application/json", rr.Header().Get("Content-Type"))
var body map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
assert.Equal(t, "https://brain-mcp.d-ma.be", body["resource"])
servers := body["authorization_servers"].([]any)
assert.Equal(t, "https://auth.d-ma.be", servers[0])
}


@@ -14,8 +14,8 @@ type RoutingConfig struct {
LiteLLMBaseURL string // LITELLM_BASE_URL, default http://piguard:4000
LiteLLMAPIKey string // LITELLM_API_KEY
BrainURL string // BRAIN_URL, default http://ingestion.supervisor:3300
-LocalModel string // HYPERGUILD_LOCAL_MODEL, default qwen35
-ClaudeModel string // HYPERGUILD_CLAUDE_MODEL, default claude-sonnet-4-6
+FastModel string // HYPERGUILD_FAST_MODEL, default koala/qwen35-9b-fast
+ThinkingModel string // HYPERGUILD_THINKING_MODEL, default iguana/gemma4-26b
// RouteLocalFloor and RouteLocalCeil intentionally invert the usual
// floor < ceil mathematical convention: Floor (default 0.90) is the
// UPPER boundary — at/above it, always route local; Ceil (default 0.70)
@@ -25,6 +25,16 @@ type RoutingConfig struct {
RouteLocalFloor float64 // HYPERGUILD_ROUTE_LOCAL_FLOOR, default 0.90
RouteLocalCeil float64 // HYPERGUILD_ROUTE_LOCAL_CEIL, default 0.70
PassRateTTLSeconds int // HYPERGUILD_PASS_RATE_TTL_SECONDS, default 60
// project_create configuration. Empty GiteaMCPURL disables the
// project_create tool registration so the routing pod still starts
// in environments where it's not wired up.
GiteaMCPURL string // GITEA_MCP_URL, e.g. http://koala:30340/mcp
GiteaMCPToken string // GITEA_MCP_TOKEN, bearer for gitea-mcp
GiteaOwner string // GITEA_OWNER, default mathias
GitHubOwner string // GITHUB_OWNER, default mathiasb
InfraRepo string // INFRA_REPO, default infra
GitHubPAT string // GITHUB_PAT, repo scope; never logged
}
func LoadRouting() (RoutingConfig, error) {
@@ -34,8 +44,8 @@ func LoadRouting() (RoutingConfig, error) {
LiteLLMBaseURL: envOr("LITELLM_BASE_URL", "http://piguard:4000"),
LiteLLMAPIKey: os.Getenv("LITELLM_API_KEY"),
BrainURL: envOr("BRAIN_URL", "http://ingestion.supervisor:3300"),
LocalModel: envOr("HYPERGUILD_LOCAL_MODEL", "qwen35"),
ClaudeModel: envOr("HYPERGUILD_CLAUDE_MODEL", "claude-sonnet-4-6"),
FastModel: envOr("HYPERGUILD_FAST_MODEL", "koala/qwen35-9b-fast"),
ThinkingModel: envOr("HYPERGUILD_THINKING_MODEL", "iguana/gemma4-26b"),
}
floor, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_FLOOR", 0.90)
@@ -56,6 +66,13 @@ func LoadRouting() (RoutingConfig, error) {
}
cfg.PassRateTTLSeconds = ttl
cfg.GiteaMCPURL = os.Getenv("GITEA_MCP_URL")
cfg.GiteaMCPToken = os.Getenv("GITEA_MCP_TOKEN")
cfg.GiteaOwner = envOr("GITEA_OWNER", "mathias")
cfg.GitHubOwner = envOr("GITHUB_OWNER", "mathiasb")
cfg.InfraRepo = envOr("INFRA_REPO", "infra")
cfg.GitHubPAT = os.Getenv("GITHUB_PAT")
return cfg, nil
}
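The diff calls `envOr` and `parseFloatEnv` without showing them. The sketch below is one plausible shape for those helpers, assuming "unset" and "empty" are treated the same; it is not taken from the repository. `featuresFor` is a hypothetical name that only illustrates the two optional-feature rules stated in the config comments and the commit message: an empty `GITEA_MCP_URL` disables `project_create`, and an empty `GITHUB_PAT` skips GitHub repo creation.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envOr returns the environment value for key, or def when the variable is
// unset or empty. Assumed behavior; the real helper is not in this diff.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// parseFloatEnv parses key as a float64 and falls back to def when unset.
// Assumed behavior; the real helper is not in this diff.
func parseFloatEnv(key string, def float64) (float64, error) {
	v := os.Getenv(key)
	if v == "" {
		return def, nil
	}
	f, err := strconv.ParseFloat(v, 64)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return f, nil
}

// features is a hypothetical summary of which optional pieces are wired up.
type features struct {
	ProjectCreate bool // project_create tool is registered
	GitHubCreate  bool // GitHub destination repo is created before mirroring
}

// featuresFor derives the optional features from the two settings that gate them.
func featuresFor(giteaMCPURL, githubPAT string) features {
	return features{
		ProjectCreate: giteaMCPURL != "",
		GitHubCreate:  giteaMCPURL != "" && githubPAT != "",
	}
}

func main() {
	owner := envOr("GITEA_OWNER", "mathias")
	floor, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_FLOOR", 0.90)
	fmt.Println(owner, floor, err)
	fmt.Printf("%+v\n", featuresFor(os.Getenv("GITEA_MCP_URL"), os.Getenv("GITHUB_PAT")))
}
```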

@@ -11,7 +11,7 @@ import (
func TestLoadRoutingDefaults(t *testing.T) {
for _, k := range []string{
"ROUTING_PORT", "ROUTING_MCP_TOKEN", "LITELLM_BASE_URL", "LITELLM_API_KEY",
"BRAIN_URL", "HYPERGUILD_LOCAL_MODEL", "HYPERGUILD_CLAUDE_MODEL",
"BRAIN_URL", "HYPERGUILD_FAST_MODEL", "HYPERGUILD_THINKING_MODEL",
"HYPERGUILD_ROUTE_LOCAL_FLOOR", "HYPERGUILD_ROUTE_LOCAL_CEIL",
"HYPERGUILD_PASS_RATE_TTL_SECONDS",
} {
@@ -24,8 +24,8 @@ func TestLoadRoutingDefaults(t *testing.T) {
assert.Equal(t, "", cfg.MCPAuthToken)
assert.Equal(t, "http://piguard:4000", cfg.LiteLLMBaseURL)
assert.Equal(t, "http://ingestion.supervisor:3300", cfg.BrainURL)
assert.Equal(t, "qwen35", cfg.LocalModel)
assert.Equal(t, "claude-sonnet-4-6", cfg.ClaudeModel)
assert.Equal(t, "koala/qwen35-9b-fast", cfg.FastModel)
assert.Equal(t, "iguana/gemma4-26b", cfg.ThinkingModel)
assert.InDelta(t, 0.90, cfg.RouteLocalFloor, 1e-9)
assert.InDelta(t, 0.70, cfg.RouteLocalCeil, 1e-9)
assert.Equal(t, 60, cfg.PassRateTTLSeconds)
@@ -38,8 +38,8 @@ func TestLoadRoutingFromEnv(t *testing.T) {
t.Setenv("LITELLM_BASE_URL", "http://localhost:4000")
t.Setenv("LITELLM_API_KEY", "lk")
t.Setenv("BRAIN_URL", "http://localhost:3300")
t.Setenv("HYPERGUILD_LOCAL_MODEL", "qwen2-7b")
t.Setenv("HYPERGUILD_CLAUDE_MODEL", "claude-opus-4-7")
t.Setenv("HYPERGUILD_FAST_MODEL", "koala/phi4-14b")
t.Setenv("HYPERGUILD_THINKING_MODEL", "iguana/qwen3-14b-think")
t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "0.85")
t.Setenv("HYPERGUILD_ROUTE_LOCAL_CEIL", "0.65")
t.Setenv("HYPERGUILD_PASS_RATE_TTL_SECONDS", "30")
@@ -51,8 +51,8 @@ func TestLoadRoutingFromEnv(t *testing.T) {
assert.Equal(t, "http://localhost:4000", cfg.LiteLLMBaseURL)
assert.Equal(t, "lk", cfg.LiteLLMAPIKey)
assert.Equal(t, "http://localhost:3300", cfg.BrainURL)
assert.Equal(t, "qwen2-7b", cfg.LocalModel)
assert.Equal(t, "claude-opus-4-7", cfg.ClaudeModel)
assert.Equal(t, "koala/phi4-14b", cfg.FastModel)
assert.Equal(t, "iguana/qwen3-14b-think", cfg.ThinkingModel)
assert.InDelta(t, 0.85, cfg.RouteLocalFloor, 1e-9)
assert.InDelta(t, 0.65, cfg.RouteLocalCeil, 1e-9)
assert.Equal(t, 30, cfg.PassRateTTLSeconds)

@@ -0,0 +1,108 @@
// Package githubclient is a minimal GitHub REST API client. The hyperguild
// project_create flow is gitea-first; this client exists only to create an
// empty repo on GitHub before the gitea→github push-mirror is configured,
// since the mirror cannot push to a non-existent remote.
package githubclient
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"time"
)
const defaultAPI = "https://api.github.com"
type Client struct {
api string
token string
http *http.Client
}
// New returns a Client with the given personal access token (repo scope).
func New(token string) *Client {
return &Client{
api: defaultAPI,
token: token,
http: &http.Client{Timeout: 30 * time.Second},
}
}
// WithBaseURL overrides the API base (test injection).
func (c *Client) WithBaseURL(u string) *Client {
c.api = u
return c
}
// Repo is the subset of GitHub's repo response we surface upstream.
type Repo struct {
FullName string `json:"full_name"`
HTMLURL string `json:"html_url"`
CloneURL string `json:"clone_url"`
Private bool `json:"private"`
}
type createRepoArgs struct {
Name string `json:"name"`
Description string `json:"description,omitempty"`
Private bool `json:"private"`
AutoInit bool `json:"auto_init"`
}
// ErrAlreadyExists is returned by CreateRepo when GitHub responds 422 with
// "name already exists". Callers treat it as idempotent success.
var ErrAlreadyExists = fmt.Errorf("github repo already exists")
// CreateRepo creates a repo under the authenticated user's account.
// auto_init is always false — the push-mirror will populate the repo from
// gitea, so an auto-generated README would conflict on first push.
func (c *Client) CreateRepo(ctx context.Context, name, description string, private bool) (*Repo, error) {
if c.token == "" {
return nil, fmt.Errorf("github pat not configured")
}
body, _ := json.Marshal(createRepoArgs{
Name: name,
Description: description,
Private: private,
AutoInit: false,
})
req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.api+"/user/repos", bytes.NewReader(body))
if err != nil {
return nil, fmt.Errorf("new request: %w", err)
}
req.Header.Set("Authorization", "token "+c.token)
req.Header.Set("Accept", "application/vnd.github+json")
req.Header.Set("Content-Type", "application/json")
req.Header.Set("X-GitHub-Api-Version", "2022-11-28")
resp, err := c.http.Do(req)
if err != nil {
return nil, fmt.Errorf("http: %w", err)
}
defer func() { _ = resp.Body.Close() }()
raw, _ := io.ReadAll(resp.Body)
switch resp.StatusCode {
case http.StatusCreated:
var r Repo
if err := json.Unmarshal(raw, &r); err != nil {
return nil, fmt.Errorf("decode response: %w", err)
}
return &r, nil
case http.StatusUnprocessableEntity:
// 422 covers "name already exists" + a handful of other validation
// errors. Treat any 422 that mentions "already exists" as idempotent
// success; everything else surfaces verbatim.
if bytes.Contains(raw, []byte("already exists")) {
return nil, ErrAlreadyExists
}
return nil, fmt.Errorf("github 422: %s", string(raw))
case http.StatusUnauthorized, http.StatusForbidden:
return nil, fmt.Errorf("github auth %d: PAT missing repo scope or invalid", resp.StatusCode)
default:
return nil, fmt.Errorf("github %d: %s", resp.StatusCode, string(raw))
}
}
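The status handling in `CreateRepo` can be read as a small classification table. The sketch below restates it without any network I/O and shows how a caller might fold the already-exists case into success. `classify` and `stepSucceeded` are illustrative names, not functions from the package.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"net/http"
)

// errAlreadyExists stands in for githubclient.ErrAlreadyExists.
var errAlreadyExists = errors.New("github repo already exists")

// classify maps a GitHub response status and body to the error that
// CreateRepo in the diff would return for it.
func classify(status int, raw []byte) error {
	switch status {
	case http.StatusCreated:
		return nil
	case http.StatusUnprocessableEntity:
		// Only a 422 that mentions "already exists" is idempotent; other
		// validation failures are surfaced verbatim.
		if bytes.Contains(raw, []byte("already exists")) {
			return errAlreadyExists
		}
		return fmt.Errorf("github 422: %s", raw)
	case http.StatusUnauthorized, http.StatusForbidden:
		return fmt.Errorf("github auth %d: PAT missing repo scope or invalid", status)
	default:
		return fmt.Errorf("github %d: %s", status, raw)
	}
}

// stepSucceeded is how an orchestrator might decide whether to continue:
// a fresh repo and a pre-existing repo both count as "done".
func stepSucceeded(err error) bool {
	return err == nil || errors.Is(err, errAlreadyExists)
}

func main() {
	exists := []byte(`{"message":"name already exists on this account"}`)
	fmt.Println(stepSucceeded(classify(http.StatusCreated, nil)))
	fmt.Println(stepSucceeded(classify(http.StatusUnprocessableEntity, exists)))
	fmt.Println(stepSucceeded(classify(http.StatusUnauthorized, nil)))
}
```

Matching on the substring rather than a structured error code keeps the client small, at the cost of depending on GitHub's message wording.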

@@ -0,0 +1,71 @@
package githubclient_test
import (
"context"
"encoding/json"
"errors"
"io"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/supervisor/internal/githubclient"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestCreateRepo_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodPost, r.Method)
assert.Equal(t, "/user/repos", r.URL.Path)
assert.Equal(t, "token ghp_test", r.Header.Get("Authorization"))
var args map[string]any
b, _ := io.ReadAll(r.Body)
_ = json.Unmarshal(b, &args)
assert.Equal(t, "test-repo", args["name"])
assert.Equal(t, true, args["private"])
assert.Equal(t, false, args["auto_init"])
w.WriteHeader(http.StatusCreated)
_, _ = w.Write([]byte(`{"full_name":"mathiasb/test-repo","html_url":"https://github.com/mathiasb/test-repo","clone_url":"https://github.com/mathiasb/test-repo.git","private":true}`))
}))
defer srv.Close()
c := githubclient.New("ghp_test").WithBaseURL(srv.URL)
r, err := c.CreateRepo(context.Background(), "test-repo", "desc", true)
require.NoError(t, err)
assert.Equal(t, "mathiasb/test-repo", r.FullName)
assert.True(t, r.Private)
}
func TestCreateRepo_AlreadyExists(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusUnprocessableEntity)
_, _ = w.Write([]byte(`{"message":"Validation Failed","errors":[{"resource":"Repository","code":"custom","field":"name","message":"name already exists on this account"}]}`))
}))
defer srv.Close()
c := githubclient.New("ghp_test").WithBaseURL(srv.URL)
_, err := c.CreateRepo(context.Background(), "x", "", false)
require.Error(t, err)
assert.True(t, errors.Is(err, githubclient.ErrAlreadyExists))
}
func TestCreateRepo_Unauthorized(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusUnauthorized)
_, _ = w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := githubclient.New("ghp_test").WithBaseURL(srv.URL)
_, err := c.CreateRepo(context.Background(), "x", "", false)
require.Error(t, err)
assert.Contains(t, err.Error(), "PAT missing repo scope")
}
func TestCreateRepo_NoToken(t *testing.T) {
c := githubclient.New("")
_, err := c.CreateRepo(context.Background(), "x", "", false)
require.Error(t, err)
assert.Contains(t, err.Error(), "github pat not configured")
}

@@ -8,6 +8,7 @@ import (
"net/http"
"strings"
"github.com/mathiasbq/supervisor/internal/auth"
"github.com/mathiasbq/supervisor/internal/registry"
)
@@ -32,15 +33,16 @@ type rpcError struct {
// Server is an HTTP handler implementing the MCP JSON-RPC protocol.
type Server struct {
reg *registry.Registry
token string
reg *registry.Registry
token string
validator *auth.Validator
}
// NewServer constructs an MCP HTTP handler. If token is non-empty, every
// request must carry "Authorization: Bearer <token>" or it is rejected with
// HTTP 401 and JSON-RPC error -32001. Empty token disables auth (default).
func NewServer(reg *registry.Registry, token string) *Server {
return &Server{reg: reg, token: token}
// NewServer constructs an MCP HTTP handler. token is the static bearer token
// (empty disables static auth). validator is optional; when non-nil, a valid
// JWT from Dex is accepted in addition to the static token.
func NewServer(reg *registry.Registry, token string, validator *auth.Validator) *Server {
return &Server{reg: reg, token: token, validator: validator}
}
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
@@ -48,6 +50,22 @@ func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
return
}
// GET opens the SSE stream for server-to-client events (MCP streamable HTTP).
// claude.ai probes with GET before sending initialize, so accept without a session.
if r.Method == http.MethodGet {
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("X-Accel-Buffering", "no")
w.WriteHeader(http.StatusOK)
if f, ok := w.(http.Flusher); ok {
_, _ = w.Write([]byte(": stream open\n\n"))
f.Flush()
}
<-r.Context().Done()
return
}
var req request
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, nil, -32700, "parse error")
@@ -104,27 +122,42 @@ func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
})
}
// checkAuth verifies the bearer token when one is configured. Returns true if
// the request may proceed, false if it has been rejected (401 already written).
// checkAuth verifies the bearer token. Accepts a valid Dex JWT (when validator
// is configured) or the static token. Returns true if the request may proceed.
// When neither token nor validator is configured, auth is disabled (default).
func (s *Server) checkAuth(w http.ResponseWriter, r *http.Request) bool {
if s.token == "" {
if s.token == "" && s.validator == nil {
return true
}
const prefix = "Bearer "
hdr := r.Header.Get("Authorization")
if !strings.HasPrefix(hdr, prefix) ||
subtle.ConstantTimeCompare([]byte(hdr[len(prefix):]), []byte(s.token)) != 1 {
slog.Warn("mcp auth rejected", "remote", r.RemoteAddr, "method", r.Method)
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusUnauthorized)
_ = json.NewEncoder(w).Encode(response{
JSONRPC: "2.0",
Error: &rpcError{Code: -32001, Message: "unauthorized"},
})
rawToken, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
if !ok {
s.rejectAuth(w, r)
return false
}
return true
if s.validator != nil {
if _, err := s.validator.Validate(r.Context(), rawToken); err == nil {
return true
}
}
if s.token != "" && subtle.ConstantTimeCompare([]byte(rawToken), []byte(s.token)) == 1 {
return true
}
s.rejectAuth(w, r)
return false
}
func (s *Server) rejectAuth(w http.ResponseWriter, r *http.Request) {
slog.Warn("mcp auth rejected", "remote", r.RemoteAddr, "method", r.Method)
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusUnauthorized)
_ = json.NewEncoder(w).Encode(response{
JSONRPC: "2.0",
Error: &rpcError{Code: -32001, Message: "unauthorized"},
})
}
func writeError(w http.ResponseWriter, id any, code int, msg string) {
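The acceptance order in the new `checkAuth` (JWT first, then static token) can be sketched as a pure function. This is an illustration under assumptions: `validateFunc` and `fakeValidator` are hypothetical stand-ins for `auth.Validator`, whose real interface is not fully shown in this diff.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
	"strings"
)

// validateFunc is a hypothetical stand-in for the Dex JWT validator.
type validateFunc func(token string) error

// authorize mirrors the decision order of checkAuth: auth is disabled when
// nothing is configured; otherwise a valid JWT or a matching static token
// lets the request through.
func authorize(header, staticToken string, validate validateFunc) bool {
	if staticToken == "" && validate == nil {
		return true
	}
	raw, ok := strings.CutPrefix(header, "Bearer ")
	if !ok {
		return false
	}
	if validate != nil && validate(raw) == nil {
		return true
	}
	// The non-empty guard matters: without it, an empty bearer value would
	// compare equal to an unset static token.
	return staticToken != "" &&
		subtle.ConstantTimeCompare([]byte(raw), []byte(staticToken)) == 1
}

// fakeValidator accepts exactly one token, for demonstration only.
func fakeValidator(token string) error {
	if token == "valid-jwt" {
		return nil
	}
	return errors.New("invalid token")
}

func main() {
	fmt.Println(authorize("Bearer valid-jwt", "static", fakeValidator)) // JWT path
	fmt.Println(authorize("Bearer static", "static", fakeValidator))    // static path
	fmt.Println(authorize("Bearer ", "", fakeValidator))                // rejected
}
```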

@@ -23,7 +23,7 @@ func jsonBody(t *testing.T, v any) *bytes.Buffer {
func TestMCPInitialize(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg, "")
srv := mcp.NewServer(reg, "", nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
"jsonrpc": "2.0",
@@ -45,7 +45,7 @@ func TestMCPInitialize(t *testing.T) {
func TestMCPToolsList(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg, "")
srv := mcp.NewServer(reg, "", nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": map[string]any{},
@@ -63,7 +63,7 @@ func TestMCPToolsList(t *testing.T) {
func TestMCPUnknownMethod(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg, "")
srv := mcp.NewServer(reg, "", nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
"jsonrpc": "2.0", "id": 3, "method": "unknown/method", "params": map[string]any{},
@@ -80,7 +80,7 @@ func TestMCPUnknownMethod(t *testing.T) {
func TestMCPNotificationKnownMethodGetsNoResponseBody(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg, "")
srv := mcp.NewServer(reg, "", nil)
// JSON-RPC 2.0 notification: "id" field absent. Per spec, server MUST NOT
// reply. notifications/initialized is part of the standard MCP handshake.
@@ -116,7 +116,7 @@ func TestMCPAuth(t *testing.T) {
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg, tc.token)
srv := mcp.NewServer(reg, tc.token, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": map[string]any{},
@@ -142,7 +142,7 @@ func TestMCPAuth(t *testing.T) {
func TestMCPNotificationUnknownMethodGetsNoResponseBody(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg, "")
srv := mcp.NewServer(reg, "", nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
"jsonrpc": "2.0",

@@ -0,0 +1,135 @@
// Package mcpclient is a minimal JSON-RPC over HTTP client for talking to
// MCP servers from inside hyperguild components. It only implements
// `tools/call` because that's all consumer skills need today.
package mcpclient
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"time"
)
// Client calls an MCP server over Streamable HTTP / JSON-RPC.
type Client struct {
url string
token string
http *http.Client
}
// New returns a Client. token may be empty for unauthenticated servers.
func New(url, token string) *Client {
return &Client{
url: url,
token: token,
http: &http.Client{Timeout: 60 * time.Second},
}
}
// WithHTTPClient overrides the underlying HTTP client (test injection).
func (c *Client) WithHTTPClient(h *http.Client) *Client {
c.http = h
return c
}
type rpcRequest struct {
JSONRPC string `json:"jsonrpc"`
ID int `json:"id"`
Method string `json:"method"`
Params map[string]any `json:"params"`
}
type rpcError struct {
Code int `json:"code"`
Message string `json:"message"`
}
type rpcResponse struct {
JSONRPC string `json:"jsonrpc"`
ID int `json:"id"`
Result json.RawMessage `json:"result,omitempty"`
Error *rpcError `json:"error,omitempty"`
}
// Error is returned when the remote MCP server signals a typed failure.
// Code follows JSON-RPC conventions; see gitea-mcp internal/mcp/jsonrpc.go
// for the codes the server uses (e.g. -32002 NotFound, -32003 Conflict).
type Error struct {
Code int
Message string
}
func (e *Error) Error() string { return fmt.Sprintf("mcp error %d: %s", e.Code, e.Message) }
// CallTool issues `tools/call`. result is JSON-unmarshalled from the
// server's content[0].text field; pass nil to discard.
func (c *Client) CallTool(ctx context.Context, name string, args any, result any) error {
body, err := json.Marshal(rpcRequest{
JSONRPC: "2.0",
ID: 1,
Method: "tools/call",
Params: map[string]any{
"name": name,
"arguments": args,
},
})
if err != nil {
return fmt.Errorf("marshal request: %w", err)
}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.url, bytes.NewReader(body))
if err != nil {
return fmt.Errorf("new request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
if c.token != "" {
req.Header.Set("Authorization", "Bearer "+c.token)
}
resp, err := c.http.Do(req)
if err != nil {
return fmt.Errorf("http: %w", err)
}
defer func() { _ = resp.Body.Close() }()
raw, err := io.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("read body: %w", err)
}
if resp.StatusCode >= 400 {
return fmt.Errorf("mcp http %d: %s", resp.StatusCode, string(raw))
}
var rpc rpcResponse
if err := json.Unmarshal(raw, &rpc); err != nil {
return fmt.Errorf("decode response: %w (body=%s)", err, string(raw))
}
if rpc.Error != nil {
return &Error{Code: rpc.Error.Code, Message: rpc.Error.Message}
}
if result == nil {
return nil
}
// MCP success result shape: { content: [{type:"text", text:"<json>"}] }
var wrap struct {
Content []struct {
Type string `json:"type"`
Text string `json:"text"`
} `json:"content"`
}
if err := json.Unmarshal(rpc.Result, &wrap); err != nil {
return fmt.Errorf("decode wrap: %w (result=%s)", err, string(rpc.Result))
}
if len(wrap.Content) == 0 {
return fmt.Errorf("empty content in tool response")
}
if err := json.Unmarshal([]byte(wrap.Content[0].Text), result); err != nil {
return fmt.Errorf("decode tool result text: %w (text=%s)", err, wrap.Content[0].Text)
}
return nil
}
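`CallTool` unwraps two layers of JSON: the MCP result envelope, then the JSON document carried as a string in `content[0].text`. The sketch below isolates that step. `decodeToolResult` is an illustrative name, and the payload mirrors the one used in the package's own test.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// decodeToolResult unwraps an MCP tools/call result of the shape
// {"content":[{"type":"text","text":"<json>"}]} and unmarshals the inner
// JSON string into out.
func decodeToolResult(result []byte, out any) error {
	var wrap struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	}
	if err := json.Unmarshal(result, &wrap); err != nil {
		return fmt.Errorf("decode wrap: %w", err)
	}
	if len(wrap.Content) == 0 {
		return errors.New("empty content in tool response")
	}
	// Only the first content item is consulted, as in the client above.
	if err := json.Unmarshal([]byte(wrap.Content[0].Text), out); err != nil {
		return fmt.Errorf("decode tool result text: %w", err)
	}
	return nil
}

func main() {
	var out struct {
		OK bool `json:"ok"`
		N  int  `json:"n"`
	}
	raw := []byte(`{"content":[{"type":"text","text":"{\"ok\":true,\"n\":7}"}]}`)
	if err := decodeToolResult(raw, &out); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out.OK, out.N)
}
```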

@@ -0,0 +1,82 @@
package mcpclient_test
import (
"context"
"encoding/json"
"errors"
"io"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestCallTool_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodPost, r.Method)
assert.Equal(t, "Bearer tok", r.Header.Get("Authorization"))
b, _ := io.ReadAll(r.Body)
var got map[string]any
_ = json.Unmarshal(b, &got)
assert.Equal(t, "tools/call", got["method"])
params := got["params"].(map[string]any)
assert.Equal(t, "x_y", params["name"])
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{\"ok\":true,\"n\":7}"}]}}`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "tok")
var out struct {
OK bool `json:"ok"`
N int `json:"n"`
}
err := c.CallTool(context.Background(), "x_y", map[string]any{"a": 1}, &out)
require.NoError(t, err)
assert.True(t, out.OK)
assert.Equal(t, 7, out.N)
}
func TestCallTool_RPCError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"error":{"code":-32003,"message":"already exists"}}`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "")
err := c.CallTool(context.Background(), "x", nil, nil)
require.Error(t, err)
var me *mcpclient.Error
require.True(t, errors.As(err, &me))
assert.Equal(t, -32003, me.Code)
assert.Contains(t, me.Message, "already exists")
}
func TestCallTool_HTTPError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusUnauthorized)
_, _ = w.Write([]byte(`unauthorized`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "")
err := c.CallTool(context.Background(), "x", nil, nil)
require.Error(t, err)
assert.Contains(t, err.Error(), "401")
}
func TestCallTool_NilResult(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{}"}]}}`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "")
require.NoError(t, c.CallTool(context.Background(), "x", nil, nil))
}

@@ -13,7 +13,7 @@ import (
type LogEntry struct {
SessionID string
Skill string // the original skill the call routed (e.g., "review")
Decision string // "local" or "claude" or "claude_fallback"
Decision string // "local" or "thinking" or "thinking_fallback"
Message string // free-form, e.g. "model=qwen35, pass_rate=0.94"
ProjectRoot string
DurationMs int64

@@ -24,8 +24,8 @@ type Router struct {
Fetcher *Fetcher
Logger *Logger
Policy Policy
LocalModel string
ClaudeModel string
FastModel string
ThinkingModel string
Complete CompleteFunc
}
@@ -40,9 +40,9 @@ func (r *Router) Run(ctx context.Context, in RunInput) (string, int64, error) {
hash := CanonicalHash(in.System, in.User)
decision := r.Policy.Decide(pr, hash)
model := r.ClaudeModel
model := r.ThinkingModel
if decision == DecideLocal {
model = r.LocalModel
model = r.FastModel
}
out, ms, err := r.Complete(ctx, model, in.System, in.User)
@@ -59,13 +59,13 @@ func (r *Router) Run(ctx context.Context, in RunInput) (string, int64, error) {
}
if err != nil && decision == DecideLocal {
slog.Warn("router: local failed, falling open to claude", "skill", in.Skill, "err", err)
out, ms, err = r.Complete(ctx, r.ClaudeModel, in.System, in.User)
slog.Warn("router: fast failed, falling open to thinking model", "skill", in.Skill, "err", err)
out, ms, err = r.Complete(ctx, r.ThinkingModel, in.System, in.User)
if lerr := r.Logger.LogDecision(ctx, LogEntry{
SessionID: in.SessionID,
Skill: in.Skill,
Decision: "claude_fallback",
Message: fmt.Sprintf("model=%s, after-local-error", r.ClaudeModel),
Decision: "thinking_fallback",
Message: fmt.Sprintf("model=%s, after-fast-error", r.ThinkingModel),
ProjectRoot: in.ProjectRoot,
DurationMs: ms,
Failed: err != nil,

@@ -49,12 +49,12 @@ func newRouter(t *testing.T, llm *fakeLLM, passRate float64) (*routing.Router, *
t.Cleanup(brain.Close)
r := &routing.Router{
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
Logger: routing.NewLogger(brain.URL),
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
LocalModel: "qwen35",
ClaudeModel: "claude-sonnet-4-6",
Complete: llm.Complete,
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
Logger: routing.NewLogger(brain.URL),
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
FastModel: "koala/qwen35-9b-fast",
ThinkingModel: "iguana/gemma4-26b",
Complete: llm.Complete,
}
return r, brain, brain
}
@@ -72,10 +72,10 @@ func TestRouterRoutesLocalAtHighPassRate(t *testing.T) {
llm.mu.Lock()
defer llm.mu.Unlock()
require.Len(t, llm.calls, 1)
assert.Equal(t, "qwen35", llm.calls[0].Model)
assert.Equal(t, "koala/qwen35-9b-fast", llm.calls[0].Model)
}
func TestRouterRoutesClaudeAtLowPassRate(t *testing.T) {
func TestRouterRoutesThinkingAtLowPassRate(t *testing.T) {
llm := &fakeLLM{resp: "ok"}
r, _, _ := newRouter(t, llm, 0.3)
@@ -87,12 +87,12 @@ func TestRouterRoutesClaudeAtLowPassRate(t *testing.T) {
llm.mu.Lock()
defer llm.mu.Unlock()
require.Len(t, llm.calls, 1)
assert.Equal(t, "claude-sonnet-4-6", llm.calls[0].Model)
assert.Equal(t, "iguana/gemma4-26b", llm.calls[0].Model)
}
func TestRouterFailsOpenLocalErrorToClaude(t *testing.T) {
llm := &fakeLLM{resp: "ok-after-fallback", err: errors.New("local boom"), errOn: "qwen35"}
r, _, _ := newRouter(t, llm, 0.95) // would route local
func TestRouterFailsOpenFastErrorToThinking(t *testing.T) {
llm := &fakeLLM{resp: "ok-after-fallback", err: errors.New("fast boom"), errOn: "koala/qwen35-9b-fast"}
r, _, _ := newRouter(t, llm, 0.95) // would route fast
out, _, err := r.Run(context.Background(), routing.RunInput{
Skill: "review", System: "sys", User: "user", SessionID: "s3",
@@ -103,12 +103,12 @@ func TestRouterFailsOpenLocalErrorToClaude(t *testing.T) {
llm.mu.Lock()
defer llm.mu.Unlock()
require.Len(t, llm.calls, 2)
assert.Equal(t, "qwen35", llm.calls[0].Model)
assert.Equal(t, "claude-sonnet-4-6", llm.calls[1].Model)
assert.Equal(t, "koala/qwen35-9b-fast", llm.calls[0].Model)
assert.Equal(t, "iguana/gemma4-26b", llm.calls[1].Model)
}
func TestRouterDefaultsToLocalWhenBrainUnreachable(t *testing.T) {
// Brain returns 500 → fetcher errors → router treats pass rate as nil → local.
func TestRouterDefaultsToFastWhenBrainUnreachable(t *testing.T) {
// Brain returns 500 → fetcher errors → router treats pass rate as nil → fast.
brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
http.Error(w, "down", http.StatusInternalServerError)
}))
@@ -116,12 +116,12 @@ func TestRouterDefaultsToLocalWhenBrainUnreachable(t *testing.T) {
llm := &fakeLLM{resp: "ok"}
r := &routing.Router{
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
Logger: routing.NewLogger(brain.URL),
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
LocalModel: "qwen35",
ClaudeModel: "claude-sonnet-4-6",
Complete: llm.Complete,
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
Logger: routing.NewLogger(brain.URL),
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
FastModel: "koala/qwen35-9b-fast",
ThinkingModel: "iguana/gemma4-26b",
Complete: llm.Complete,
}
_, _, err := r.Run(context.Background(), routing.RunInput{
@@ -132,5 +132,5 @@ func TestRouterDefaultsToLocalWhenBrainUnreachable(t *testing.T) {
llm.mu.Lock()
defer llm.mu.Unlock()
require.Len(t, llm.calls, 1)
assert.Equal(t, "qwen35", llm.calls[0].Model)
assert.Equal(t, "koala/qwen35-9b-fast", llm.calls[0].Model)
}
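The fail-open behavior exercised by these tests can be reduced to a few lines. The sketch below is a simplified model, not the router itself: `runWithFallback`, `completeFunc`, and `failingOn` are illustrative names, and logging, timing, and the pass-rate policy are omitted. The decision labels follow the `LogEntry.Decision` comment ("local", "thinking", "thinking_fallback").

```go
package main

import (
	"errors"
	"fmt"
)

// completeFunc is a simplified stand-in for the router's CompleteFunc.
type completeFunc func(model string) (string, error)

// runWithFallback tries the fast model when the policy chose local routing
// and retries once on the thinking model if that call fails. A failure on
// the thinking model is returned as-is; there is no further fallback.
func runWithFallback(complete completeFunc, fast, thinking string, routeFast bool) (string, string, error) {
	model, decision := thinking, "thinking"
	if routeFast {
		model, decision = fast, "local"
	}
	out, err := complete(model)
	if err != nil && routeFast {
		out, err = complete(thinking)
		decision = "thinking_fallback"
	}
	return out, decision, err
}

// failingOn returns a fake completer that errors for exactly one model.
func failingOn(bad string) completeFunc {
	return func(model string) (string, error) {
		if model == bad {
			return "", errors.New("boom")
		}
		return "ok from " + model, nil
	}
}

func main() {
	out, decision, err := runWithFallback(failingOn("fast-model"), "fast-model", "thinking-model", true)
	fmt.Println(out, decision, err)
}
```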

@@ -0,0 +1,286 @@
package project
import (
"context"
"encoding/json"
"errors"
"fmt"
"strings"
"time"
"github.com/mathiasbq/supervisor/internal/githubclient"
"github.com/mathiasbq/supervisor/internal/mcpclient"
)
type createArgs struct {
Name string `json:"name"`
Description string `json:"description"`
Hypothesis string `json:"hypothesis"`
Folder string `json:"folder"`
Stack string `json:"stack"`
Private bool `json:"private"`
}
type createResult struct {
GiteaURL string `json:"gitea_url"`
GitHubURL string `json:"github_url"`
IssueURL string `json:"issue_url"`
NextSteps string `json:"next_steps"`
// Reached records the steps that completed. Populated on partial failure
// so callers can resume manually instead of guessing what already ran.
Reached []string `json:"reached,omitempty"`
// FailedStep is non-empty when a downstream gitea-mcp or GitHub call
// returned an error. The error itself is surfaced via the JSON-RPC error
// response; this field tells the operator which step it happened in.
FailedStep string `json:"failed_step,omitempty"`
}
func errUnknownTool(name string) error { return fmt.Errorf("unknown tool: %s", name) }
// step names — must match what we surface in failed_step / reached.
const (
stepCreateRepo = "create_repo"
stepCreateGitHub = "create_github_repo"
stepMirror = "mirror"
stepInfraCommit = "infra_commit"
stepIssue = "issue"
)
func (s *Skill) handleCreate(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
var args createArgs
if err := json.Unmarshal(raw, &args); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if err := validate(args); err != nil {
return nil, err
}
tmpl := templateFor(args.Stack)
giteaURL := fmt.Sprintf("http://gitea.d-ma.be/%s/%s", s.cfg.GiteaOwner, args.Name)
githubURL := fmt.Sprintf("https://github.com/%s/%s", s.cfg.GitHubOwner, args.Name)
res := createResult{
GiteaURL: giteaURL,
GitHubURL: githubURL,
}
// Step 1: create_project_from_template. If the repo already exists,
// gitea-mcp returns -32003 Conflict; we treat that as idempotent success
// and continue to the next steps so re-running self-heals partial runs.
existed, err := s.callCreateRepo(ctx, args, tmpl)
if err != nil {
return marshalPartial(res, stepCreateRepo, err)
}
res.Reached = append(res.Reached, stepCreateRepo)
// Step 2: create empty GitHub repo. Gitea's push-mirror cannot push
// to a non-existent remote, so the destination must exist before
// step 3 configures the mirror. Skipped when GitHub client is unset
// (degraded mode — see Config.GitHub doc).
if s.cfg.GitHub != nil {
if err := s.callCreateGitHubRepo(ctx, args); err != nil && !errors.Is(err, githubclient.ErrAlreadyExists) {
return marshalPartial(res, stepCreateGitHub, err)
}
res.Reached = append(res.Reached, stepCreateGitHub)
}
// Step 3: configure push mirror to GitHub. Idempotent: if a mirror with
// the same remote already exists, gitea-mcp returns Conflict; we swallow it.
if err := s.callMirror(ctx, args.Name); err != nil {
if !isConflict(err) {
return marshalPartial(res, stepMirror, err)
}
}
res.Reached = append(res.Reached, stepMirror)
// Step 4: commit staging namespace manifest to infra repo. Done before
// the issue so the staging env is reconciling by the time the issue lands.
branch := fmt.Sprintf("staging/%s", args.Name)
if err := s.callInfraCommit(ctx, args.Name, branch); err != nil {
if !isConflict(err) {
return marshalPartial(res, stepInfraCommit, err)
}
}
res.Reached = append(res.Reached, stepInfraCommit)
// Step 5: open the experiment-brief issue on the new repo.
issueURL, err := s.callIssue(ctx, args, existed)
if err != nil {
return marshalPartial(res, stepIssue, err)
}
res.IssueURL = issueURL
res.Reached = append(res.Reached, stepIssue)
folder := args.Folder
if folder == "" {
folder = "."
}
res.NextSteps = fmt.Sprintf(
"cd ~/dev/%s/%s && task new-project -- %s personal %s %s && git remote add origin http://gitea.d-ma.be/%s/%s.git && git push -u origin main",
folder, args.Name, args.Name, folder, args.Stack, s.cfg.GiteaOwner, args.Name,
)
return marshalResult(res)
}
// callCreateRepo invokes create_project_from_template. Returns (existed, err)
// where existed=true means the destination was already present and we should
// treat it as a no-op success (idempotency).
func (s *Skill) callCreateRepo(ctx context.Context, args createArgs, template string) (bool, error) {
var out struct {
HTMLURL string `json:"html_url"`
}
err := s.cfg.Client.CallTool(ctx, "create_project_from_template", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": args.Name,
"description": args.Description,
"private": args.Private,
"template_name": template,
}, &out)
if err == nil {
return false, nil
}
if isConflict(err) {
return true, nil
}
return false, err
}
// callCreateGitHubRepo creates the empty destination repo on GitHub.
// auto_init=false in githubclient so first push from gitea doesn't conflict
// with an auto-generated README.
func (s *Skill) callCreateGitHubRepo(ctx context.Context, args createArgs) error {
_, err := s.cfg.GitHub.CreateRepo(ctx, args.Name, args.Description, args.Private)
return err
}
// callMirror configures the push mirror to GitHub.
func (s *Skill) callMirror(ctx context.Context, name string) error {
remote := fmt.Sprintf("https://github.com/%s/%s.git", s.cfg.GitHubOwner, name)
return s.cfg.Client.CallTool(ctx, "repo_mirror_push", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": name,
"action": "add",
"remote_address": remote,
"remote_username": s.cfg.GitHubOwner,
"remote_password": s.cfg.GitHubPAT,
"interval": "8h0m0s",
"sync_on_commit": true,
}, nil)
}
// callInfraCommit writes the staging namespace manifest into the infra repo
// on a dedicated branch. Flux picks it up after merge.
func (s *Skill) callInfraCommit(ctx context.Context, name, branch string) error {
manifest := stagingNamespaceManifest(name, time.Now().UTC().Format(time.RFC3339))
return s.cfg.Client.CallTool(ctx, "file_write_branch", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": s.cfg.InfraRepo,
"path": fmt.Sprintf("k3s/staging/%s/namespace.yaml", name),
"content": manifest,
"branch": branch,
"base": "main",
"message": fmt.Sprintf("feat(staging): add namespace for %s\n\nGenerated by hyperguild project_create.", name),
}, nil)
}
// callIssue opens the experiment-brief issue on the newly-created repo.
// existed=true (repo pre-existed) still posts a new brief — repeated runs
// can intentionally restate intent without colliding.
func (s *Skill) callIssue(ctx context.Context, args createArgs, existed bool) (string, error) {
body := experimentBrief(args, existed)
var out struct {
HTMLURL string `json:"html_url"`
}
err := s.cfg.Client.CallTool(ctx, "issue_create", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": args.Name,
"title": "experiment brief: " + args.Description,
"body": body,
}, &out)
if err != nil {
return "", err
}
return out.HTMLURL, nil
}
func stagingNamespaceManifest(name, createdAt string) string {
return fmt.Sprintf(`apiVersion: v1
kind: Namespace
metadata:
name: staging-%s
labels:
managed-by: hyperguild
project: %s
created-at: "%s"
`, name, name, createdAt)
}
func experimentBrief(args createArgs, existed bool) string {
var b strings.Builder
b.WriteString("## Hypothesis\n\n")
b.WriteString(args.Hypothesis)
b.WriteString("\n\n## Description\n\n")
b.WriteString(args.Description)
b.WriteString("\n\n## Stack\n\n`")
b.WriteString(args.Stack)
b.WriteString("`\n\n## Provisioning\n\n")
b.WriteString("- Repo created from `template-")
b.WriteString(args.Stack)
b.WriteString("` on Gitea.\n")
b.WriteString("- Push-mirror configured to GitHub.\n")
b.WriteString("- Staging namespace manifest committed to infra repo.\n\n")
if existed {
b.WriteString("> Note: this repo already existed when `project_create` ran — provisioning steps were re-applied idempotently.\n")
}
return b.String()
}
func validate(args createArgs) error {
if args.Name == "" {
return errors.New("name is required")
}
if args.Description == "" {
return errors.New("description is required")
}
if args.Hypothesis == "" {
return errors.New("hypothesis is required")
}
if args.Stack != "go-agent" && args.Stack != "go-web" {
return fmt.Errorf("stack must be go-agent or go-web, got %q", args.Stack)
}
return nil
}
func templateFor(stack string) string {
switch stack {
case "go-agent":
return "template-go-agent"
default:
return "template-go-web"
}
}
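// isConflict reports whether err is (or wraps) an mcpclient.Error carrying
// code -32003, which this package treats as "already exists".
func isConflict(err error) bool {
var me *mcpclient.Error
return errors.As(err, &me) && me.Code == -32003
}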
func marshalResult(r createResult) (json.RawMessage, error) {
b, err := json.Marshal(r)
if err != nil {
return nil, fmt.Errorf("marshal result: %w", err)
}
return b, nil
}
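// marshalPartial records the failing step and returns the partial result
// alongside a wrapped error, so the caller sees both how far the pipeline got
// and why it stopped. The marshal error is ignored; if marshalling fails the
// payload is nil and the step error is still returned.
func marshalPartial(r createResult, step string, inner error) (json.RawMessage, error) {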
r.FailedStep = step
b, _ := json.Marshal(r)
return b, fmt.Errorf("project_create step %q failed: %w", step, inner)
}
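
The handler above appends each completed step to `reached` and, on failure, returns the partial result together with the error. Below is a minimal sketch of that pattern in isolation. The `step` type, `result` struct, `runSteps` function and `demoSteps` helper are hypothetical names introduced for illustration and are not part of the repository; only the `reached` and `failed_step` JSON keys and the step names come from the tests in this commit.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// step is a hypothetical unit of work in the pipeline.
type step struct {
	name string
	run  func() error
}

// result mirrors the two JSON keys the tests assert on.
type result struct {
	Reached    []string `json:"reached"`
	FailedStep string   `json:"failed_step,omitempty"`
}

// runSteps executes steps in order. On the first failure it returns the
// partial result as JSON together with a wrapped error.
func runSteps(steps []step) (json.RawMessage, error) {
	res := result{Reached: []string{}}
	for _, s := range steps {
		if err := s.run(); err != nil {
			res.FailedStep = s.name
			b, mErr := json.Marshal(res)
			if mErr != nil {
				return nil, fmt.Errorf("marshal partial result: %w", mErr)
			}
			return b, fmt.Errorf("step %q failed: %w", s.name, err)
		}
		res.Reached = append(res.Reached, s.name)
	}
	b, err := json.Marshal(res)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}

// demoSteps builds the five-step sequence; the step named failAt fails.
// Pass "" for a run in which every step succeeds.
func demoSteps(failAt string) []step {
	names := []string{"create_repo", "create_github_repo", "mirror", "infra_commit", "issue"}
	steps := make([]step, 0, len(names))
	for _, n := range names {
		n := n // per-iteration copy for Go versions before 1.22
		steps = append(steps, step{name: n, run: func() error {
			if n == failAt {
				return errors.New("simulated failure")
			}
			return nil
		}})
	}
	return steps
}

func main() {
	out, err := runSteps(demoSteps("mirror"))
	fmt.Println(string(out))
	fmt.Println(err)
}
```

Returning the payload and the error together lets a caller report `failed_step` without parsing the error string, which is what the failure tests in this commit rely on.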


@@ -0,0 +1,349 @@
package project_test
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"sync"
"testing"
"github.com/mathiasbq/supervisor/internal/githubclient"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/mathiasbq/supervisor/internal/skills/project"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// fakeGitHub captures POST /user/repos calls.
type fakeGitHub struct {
mu sync.Mutex
Calls []map[string]any
ReturnError int // 0 = 201 Created, 422 = already exists, etc.
}
func (g *fakeGitHub) handler() http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var args map[string]any
_ = json.NewDecoder(r.Body).Decode(&args)
g.mu.Lock()
g.Calls = append(g.Calls, args)
code := g.ReturnError
g.mu.Unlock()
switch code {
case 0:
w.WriteHeader(http.StatusCreated)
_, _ = w.Write([]byte(`{"full_name":"mathiasb/x","html_url":"https://github.com/mathiasb/x","clone_url":"https://github.com/mathiasb/x.git"}`))
case 422:
w.WriteHeader(http.StatusUnprocessableEntity)
_, _ = w.Write([]byte(`{"errors":[{"message":"name already exists on this account"}]}`))
default:
w.WriteHeader(code)
_, _ = w.Write([]byte(`{"message":"boom"}`))
}
})
}
// fakeGiteaMCP implements just enough of the JSON-RPC tools/call surface
// to drive project_create end-to-end without an actual gitea-mcp server.
type fakeGiteaMCP struct {
mu sync.Mutex
// Recorded calls in order.
Calls []recordedCall
// Per-tool response. Default is a generic success object.
Responses map[string]any
// Per-tool error response, takes precedence over Responses.
Errors map[string]rpcErr
}
type rpcErr struct {
Code int
Message string
}
type recordedCall struct {
Tool string
Args map[string]any
}
func (f *fakeGiteaMCP) handler() http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var req struct {
ID int `json:"id"`
Params json.RawMessage `json:"params"`
}
_ = json.NewDecoder(r.Body).Decode(&req)
var p struct {
Name string `json:"name"`
Arguments json.RawMessage `json:"arguments"`
}
_ = json.Unmarshal(req.Params, &p)
var args map[string]any
_ = json.Unmarshal(p.Arguments, &args)
f.mu.Lock()
f.Calls = append(f.Calls, recordedCall{Tool: p.Name, Args: args})
errResp, hasErr := f.Errors[p.Name]
var resp any
if custom, ok := f.Responses[p.Name]; ok {
resp = custom
} else {
resp = map[string]any{"html_url": "http://gitea.example/" + p.Name}
}
f.mu.Unlock()
w.Header().Set("Content-Type", "application/json")
if hasErr {
body, _ := json.Marshal(map[string]any{
"jsonrpc": "2.0",
"id": req.ID,
"error": map[string]any{"code": errResp.Code, "message": errResp.Message},
})
_, _ = w.Write(body)
return
}
respText, _ := json.Marshal(resp)
body, _ := json.Marshal(map[string]any{
"jsonrpc": "2.0",
"id": req.ID,
"result": map[string]any{
"content": []map[string]any{{"type": "text", "text": string(respText)}},
},
})
_, _ = w.Write(body)
})
}
func newSkill(t *testing.T, f *fakeGiteaMCP) (*project.Skill, *fakeGitHub) {
t.Helper()
srv := httptest.NewServer(f.handler())
t.Cleanup(srv.Close)
gh := &fakeGitHub{}
ghSrv := httptest.NewServer(gh.handler())
t.Cleanup(ghSrv.Close)
return project.New(project.Config{
Client: mcpclient.New(srv.URL, ""),
GitHub: githubclient.New("ghp_test").WithBaseURL(ghSrv.URL),
GiteaOwner: "mathias",
GitHubOwner: "mathiasb",
GitHubPAT: "ghp_test",
InfraRepo: "infra",
}), gh
}
// newSkillNoGitHub builds a skill with the GitHub client unset — degraded
// mode where the github-repo-creation step is skipped.
func newSkillNoGitHub(t *testing.T, f *fakeGiteaMCP) *project.Skill {
t.Helper()
srv := httptest.NewServer(f.handler())
t.Cleanup(srv.Close)
return project.New(project.Config{
Client: mcpclient.New(srv.URL, ""),
GiteaOwner: "mathias",
GitHubOwner: "mathiasb",
InfraRepo: "infra",
})
}
func happyArgs() json.RawMessage {
return json.RawMessage(`{
"name":"my-experiment",
"description":"One-line desc",
"hypothesis":"We believe X produces Y",
"folder":"AGENTS",
"stack":"go-agent",
"private":true
}`)
}
func TestProjectCreate_HappyPath(t *testing.T) {
f := &fakeGiteaMCP{
Responses: map[string]any{
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
},
}
skill, gh := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.NoError(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment", res["gitea_url"])
assert.Equal(t, "https://github.com/mathiasb/my-experiment", res["github_url"])
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment/issues/1", res["issue_url"])
assert.Contains(t, res["next_steps"], "cd ~/dev/AGENTS/my-experiment")
assert.Contains(t, res["next_steps"], "git remote add origin")
// All 4 gitea-mcp calls in order.
require.Len(t, f.Calls, 4)
assert.Equal(t, "create_project_from_template", f.Calls[0].Tool)
assert.Equal(t, "repo_mirror_push", f.Calls[1].Tool)
assert.Equal(t, "file_write_branch", f.Calls[2].Tool)
assert.Equal(t, "issue_create", f.Calls[3].Tool)
// GitHub repo created between create_project_from_template and mirror.
require.Len(t, gh.Calls, 1)
assert.Equal(t, "my-experiment", gh.Calls[0]["name"])
assert.Equal(t, true, gh.Calls[0]["private"])
assert.Equal(t, false, gh.Calls[0]["auto_init"])
// template selection wired from stack
assert.Equal(t, "template-go-agent", f.Calls[0].Args["template_name"])
// mirror config
assert.Equal(t, "add", f.Calls[1].Args["action"])
assert.Equal(t, "https://github.com/mathiasb/my-experiment.git", f.Calls[1].Args["remote_address"])
assert.Equal(t, "ghp_test", f.Calls[1].Args["remote_password"])
// infra commit path
assert.Equal(t, "k3s/staging/my-experiment/namespace.yaml", f.Calls[2].Args["path"])
assert.Contains(t, f.Calls[2].Args["content"], "name: staging-my-experiment")
assert.Contains(t, f.Calls[2].Args["content"], "managed-by: hyperguild")
// PAT must NOT appear in the response
assert.NotContains(t, string(out), "ghp_test")
// reached records the github step too.
reached := res["reached"].([]any)
assert.Equal(t, []any{"create_repo", "create_github_repo", "mirror", "infra_commit", "issue"}, reached)
}
func TestProjectCreate_GitHubExists_Idempotent(t *testing.T) {
f := &fakeGiteaMCP{
Responses: map[string]any{
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
},
}
skill, gh := newSkill(t, f)
gh.ReturnError = 422 // already exists
_, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.NoError(t, err, "422 already-exists should be idempotent")
require.Len(t, f.Calls, 4, "all gitea steps still run despite github 422")
}
func TestProjectCreate_GitHubFails(t *testing.T) {
f := &fakeGiteaMCP{}
skill, gh := newSkill(t, f)
gh.ReturnError = 401 // bad PAT
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.Error(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "create_github_repo", res["failed_step"])
assert.Equal(t, []any{"create_repo"}, res["reached"])
require.Len(t, f.Calls, 1, "mirror + later steps must not run when github creation fails")
}
func TestProjectCreate_NoGitHubClient_DegradedMode(t *testing.T) {
f := &fakeGiteaMCP{
Responses: map[string]any{
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
},
}
skill := newSkillNoGitHub(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.NoError(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
// reached does NOT include create_github_repo when client is nil.
reached := res["reached"].([]any)
assert.Equal(t, []any{"create_repo", "mirror", "infra_commit", "issue"}, reached)
}
func TestProjectCreate_Idempotent_RepoExists(t *testing.T) {
f := &fakeGiteaMCP{
Errors: map[string]rpcErr{
"create_project_from_template": {Code: -32003, Message: "already exists"},
},
Responses: map[string]any{
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
},
}
skill, _ := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.NoError(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment", res["gitea_url"])
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment/issues/1", res["issue_url"])
// Still ran all 4 gitea-mcp steps; idempotent flow falls through.
require.Len(t, f.Calls, 4)
}
func TestProjectCreate_MirrorFails(t *testing.T) {
f := &fakeGiteaMCP{
Errors: map[string]rpcErr{
"repo_mirror_push": {Code: -32000, Message: "github unreachable"},
},
}
skill, _ := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.Error(t, err)
assert.Contains(t, err.Error(), `"mirror" failed`)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "mirror", res["failed_step"])
reached := res["reached"].([]any)
assert.Equal(t, []any{"create_repo", "create_github_repo"}, reached)
// Steps 1 (create) + 2 (mirror attempt) reached gitea; github made 1 call.
require.Len(t, f.Calls, 2)
}
func TestProjectCreate_InfraCommitFails(t *testing.T) {
f := &fakeGiteaMCP{
Errors: map[string]rpcErr{
"file_write_branch": {Code: -32000, Message: "write rejected"},
},
}
skill, _ := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.Error(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "infra_commit", res["failed_step"])
reached := res["reached"].([]any)
assert.Equal(t, []any{"create_repo", "create_github_repo", "mirror"}, reached)
require.Len(t, f.Calls, 3)
}
func TestProjectCreate_ValidationErrors(t *testing.T) {
f := &fakeGiteaMCP{}
skill, _ := newSkill(t, f)
cases := []struct {
name string
body string
want string
}{
{"missing name", `{"description":"d","hypothesis":"h","stack":"go-agent"}`, "name"},
{"missing description", `{"name":"x","hypothesis":"h","stack":"go-agent"}`, "description"},
{"missing hypothesis", `{"name":"x","description":"d","stack":"go-agent"}`, "hypothesis"},
{"bad stack", `{"name":"x","description":"d","hypothesis":"h","stack":"python"}`, "stack"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
_, err := skill.Handle(context.Background(), "project_create", json.RawMessage(tc.body))
require.Error(t, err)
assert.True(t, strings.Contains(err.Error(), tc.want), "want %q in %v", tc.want, err)
})
}
assert.Empty(t, f.Calls, "no upstream calls should occur on validation failure")
}
func TestProjectCreate_UnknownTool(t *testing.T) {
f := &fakeGiteaMCP{}
skill, _ := newSkill(t, f)
_, err := skill.Handle(context.Background(), "nope", happyArgs())
require.Error(t, err)
}
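
The GitHub tests above exercise three outcomes of `POST /user/repos`: 201 (created), 422 "name already exists" (idempotent) and 401 (hard failure). The `githubclient` source is not shown in this part of the diff, so the following is only a sketch of how such a status mapping might look. `createRepo`, `createAgainstFake` and `errAlreadyExists` are hypothetical names, and the real client may differ in headers, payload and error types.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// errAlreadyExists is a hypothetical sentinel for the idempotent 422 case.
var errAlreadyExists = errors.New("github repo already exists")

// createRepo posts to <baseURL>/user/repos and maps the status code to an error.
func createRepo(ctx context.Context, baseURL, token, name string, private bool) error {
	payload, err := json.Marshal(map[string]any{
		"name":      name,
		"private":   private,
		"auto_init": false, // keep the repo empty so the first mirror push cannot conflict
	})
	if err != nil {
		return fmt.Errorf("marshal request: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/user/repos", bytes.NewReader(payload))
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Accept", "application/vnd.github+json")
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return fmt.Errorf("post /user/repos: %w", err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(io.LimitReader(resp.Body, 1<<16))

	switch {
	case resp.StatusCode == http.StatusCreated:
		return nil
	case resp.StatusCode == http.StatusUnprocessableEntity && strings.Contains(string(body), "already exists"):
		return errAlreadyExists
	case resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden:
		return fmt.Errorf("github returned %d: PAT missing repo scope or invalid", resp.StatusCode)
	default:
		return fmt.Errorf("github returned %d: %s", resp.StatusCode, body)
	}
}

// createAgainstFake runs createRepo against a throwaway server that always
// answers with the given status and body.
func createAgainstFake(status int, body string) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
		_, _ = io.WriteString(w, body)
	}))
	defer srv.Close()
	return createRepo(context.Background(), srv.URL, "ghp_example", "my-experiment", true)
}

func main() {
	fmt.Println(createAgainstFake(201, `{}`))
	fmt.Println(createAgainstFake(422, `{"errors":[{"message":"name already exists on this account"}]}`))
	fmt.Println(createAgainstFake(401, `{"message":"Bad credentials"}`))
}
```

A caller that wants idempotent behaviour can treat `errAlreadyExists` as success and continue to the mirror step, while any other error stops the pipeline.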


@@ -0,0 +1,100 @@
// Package project implements the `project_create` MCP tool: a single-call
// pipeline that creates a Gitea repo from a template, creates the empty
// destination repo on GitHub (when a GitHub client is configured), configures
// push-mirror to GitHub, commits a staging namespace manifest to the infra
// repo, and opens an experiment-brief issue on the new repo. See hyperguild
// gitea issue #10 for the design.
package project
import (
"context"
"encoding/json"
"github.com/mathiasbq/supervisor/internal/githubclient"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/mathiasbq/supervisor/internal/registry"
)
// Config holds the orchestration dependencies for the project skill.
type Config struct {
// Client talks to the gitea-mcp server. project_create makes
// sequential calls (create_project_from_template, repo_mirror_push,
// file_write_branch, issue_create) through this client.
Client *mcpclient.Client
// GitHub is the client used to create the empty destination repo on
// GitHub before the push-mirror is configured. Gitea's push-mirror
// cannot push to a non-existent remote, so this step is mandatory
// when GitHubPAT is set. Pass nil to skip github repo creation
// entirely (degraded mode — mirror config will land but the actual
// sync to github will fail until the repo exists).
GitHub *githubclient.Client
// GiteaOwner is the org/user that owns the new repo and the infra repo
// the namespace manifest is committed to (typically "mathias").
GiteaOwner string
// GitHubOwner is the GitHub org/user the push-mirror targets
// (typically "mathiasb").
GitHubOwner string
// GitHubPAT is the personal access token used as the push-mirror
// password and to create the destination repo on GitHub. Must have
// `repo` scope. Never logged.
GitHubPAT string
// InfraRepo is the name of the infra repo on Gitea where the
// k3s/staging/<name>/namespace.yaml manifest gets committed
// (typically "infra").
InfraRepo string
}
// Skill exposes project_create as an MCP tool.
type Skill struct{ cfg Config }
// New constructs the project Skill.
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
// Name returns the skill identifier.
func (s *Skill) Name() string { return "project" }
// Tools returns the MCP tool definitions for this skill.
func (s *Skill) Tools() []registry.ToolDef {
schema, _ := json.Marshal(map[string]any{
"type": "object",
"properties": map[string]any{
"name": map[string]any{
"type": "string",
"pattern": `^[a-z][a-z0-9-]{1,38}[a-z0-9]$`,
"description": "Lowercase repo name. 3-40 chars, must start with a letter.",
},
"description": map[string]any{"type": "string"},
"hypothesis": map[string]any{"type": "string"},
"folder": map[string]any{
"type": "string",
"description": "Informational only — appears in next_steps. Example: AGENTS, AI, QKX.",
},
"stack": map[string]any{
"type": "string",
"enum": []string{"go-agent", "go-web"},
"description": "Selects template-go-agent or template-go-web.",
},
"private": map[string]any{"type": "boolean"},
},
"required": []string{"name", "description", "hypothesis", "stack"},
})
return []registry.ToolDef{
{
Name: "project_create",
Description: "Bootstrap a new project: Gitea repo from template, GitHub push-mirror, staging namespace manifest, experiment-brief issue. Idempotent — re-running with an existing repo returns the existing URLs.",
InputSchema: schema,
},
}
}
// Handle dispatches the tool call.
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
if tool != "project_create" {
return nil, errUnknownTool(tool)
}
return s.handleCreate(ctx, args)
}
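
The input schema above constrains `name` with the pattern `^[a-z][a-z0-9-]{1,38}[a-z0-9]$`: one leading letter, 1 to 38 middle characters, and one trailing letter or digit, which gives the documented 3 to 40 characters. Whether the registry enforces the schema before `Handle` runs is not shown here, so the sketch below only illustrates how the same pattern could be checked in Go. `namePattern` and `validName` are hypothetical names, not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern is the pattern from the project_create input schema.
var namePattern = regexp.MustCompile(`^[a-z][a-z0-9-]{1,38}[a-z0-9]$`)

// validName reports whether name satisfies the schema pattern.
func validName(name string) bool {
	return namePattern.MatchString(name)
}

func main() {
	for _, n := range []string{"my-experiment", "abc", "ab", "My-Experiment", "trailing-", "9lives"} {
		fmt.Printf("%q valid=%v\n", n, validName(n))
	}
}
```

Of the sample names, only `my-experiment` and `abc` satisfy the pattern; the others are too short, contain uppercase letters, end in a hyphen, or start with a digit.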


@@ -1,87 +0,0 @@
// internal/skills/spec/handlers.go
package spec
import (
"context"
"encoding/json"
"fmt"
"time"
"github.com/mathiasbq/supervisor/internal/brain"
"github.com/mathiasbq/supervisor/internal/session"
)
type specArgs struct {
ProjectRoot string `json:"project_root"`
Requirements string `json:"requirements"`
OutputPath string `json:"output_path"`
Context string `json:"context"`
Model string `json:"model"`
SessionID string `json:"session_id"`
}
// Handle dispatches the MCP tool call to the appropriate handler.
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
if tool != "spec" {
return nil, fmt.Errorf("unknown tool: %s", tool)
}
var a specArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if a.Requirements == "" {
return nil, fmt.Errorf("requirements is required")
}
outputPath := a.OutputPath
if outputPath == "" {
outputPath = "docs/spec.md"
}
model := a.Model
if model == "" {
model = s.cfg.DefaultModel
}
brainCtx, _ := brain.Query(ctx, s.cfg.IngestBaseURL, a.Requirements+" "+a.Context, 3)
task := fmt.Sprintf(
"phase: spec\nproject_root: %s\nrequirements: %s\noutput_path: %s\ncontext: %s\nmodel: %s",
a.ProjectRoot, a.Requirements, outputPath, a.Context, model,
)
task = session.PrependHistory(s.cfg.SessionsDir, a.SessionID, "spec", task)
if brainCtx != "" {
task = brainCtx + "\n---\n\n" + task
}
if s.cfg.CompleteFunc == nil {
return nil, fmt.Errorf("no executor configured")
}
t0 := time.Now()
text, dur, err := s.cfg.CompleteFunc(ctx, model, s.cfg.SkillPrompt, task)
if err != nil {
return nil, err
}
if a.SessionID != "" && s.cfg.SessionsDir != "" {
msg := text
if len(msg) > 200 {
msg = msg[:200]
}
_ = session.Append(s.cfg.SessionsDir, a.SessionID, session.Entry{
SessionID: a.SessionID,
Timestamp: time.Now(),
Skill: "spec",
Phase: "spec",
ProjectRoot: a.ProjectRoot,
FinalStatus: "ok",
ModelUsed: model,
DurationMs: time.Since(t0).Milliseconds(),
Message: msg,
})
}
return json.Marshal(map[string]any{"text": text, "model": model, "duration_ms": dur})
}


@@ -1,53 +0,0 @@
// internal/skills/spec/handlers_test.go
package spec_test
import (
"context"
"encoding/json"
"testing"
"github.com/mathiasbq/supervisor/internal/skills/spec"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestSpecToolRegistered(t *testing.T) {
sk := spec.New(spec.Config{SkillPrompt: "spec rules"})
names := make([]string, 0)
for _, tool := range sk.Tools() {
names = append(names, tool.Name)
}
assert.Contains(t, names, "spec")
}
func TestSpecRequiresProjectRoot(t *testing.T) {
sk := spec.New(spec.Config{SkillPrompt: "s"})
_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"requirements":"add login"}`))
assert.ErrorContains(t, err, "project_root")
}
func TestSpecRequiresRequirements(t *testing.T) {
sk := spec.New(spec.Config{SkillPrompt: "s"})
_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"project_root":"/tmp"}`))
assert.ErrorContains(t, err, "requirements")
}
func TestSpecCallsCompleteFunc(t *testing.T) {
var capturedTask string
fakeFn := func(_ context.Context, _, _, user string) (string, int64, error) {
capturedTask = user
return "# OAuth2 Login Spec\n\n## Overview\nImplement OAuth2 login flow.", 110, nil
}
sk := spec.New(spec.Config{SkillPrompt: "spec rules", CompleteFunc: fakeFn, SessionsDir: t.TempDir()})
out, err := sk.Handle(context.Background(), "spec", json.RawMessage(
`{"project_root":"/tmp/proj","requirements":"add OAuth2 login","output_path":"docs/login-spec.md"}`,
))
require.NoError(t, err)
assert.Contains(t, capturedTask, "OAuth2 login")
assert.Contains(t, capturedTask, "docs/login-spec.md")
var result map[string]any
require.NoError(t, json.Unmarshal(out, &result))
assert.Contains(t, result["text"], "OAuth2 Login Spec")
}


@@ -1,56 +0,0 @@
// internal/skills/spec/skill.go
package spec
import (
"context"
"encoding/json"
"github.com/mathiasbq/supervisor/internal/registry"
)
// CompleteFunc is the function used to call a local model.
type CompleteFunc func(ctx context.Context, model, system, user string) (string, int64, error)
// Config holds dependencies for the spec skill.
type Config struct {
SkillPrompt string
DefaultModel string
CompleteFunc CompleteFunc
SessionsDir string
IngestBaseURL string
}
// Skill implements the spec MCP tool.
type Skill struct{ cfg Config }
// New creates a new spec Skill.
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
// Name returns the skill identifier.
func (s *Skill) Name() string { return "spec" }
// Tools returns the MCP tool definitions for this skill.
func (s *Skill) Tools() []registry.ToolDef {
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
return b
}
str := map[string]any{"type": "string"}
return []registry.ToolDef{
{
Name: "spec",
Description: "Consult a local model to draft a structured implementation spec from requirements. Returns the spec text.",
InputSchema: schema(
[]string{"project_root", "requirements"},
map[string]any{
"project_root": str,
"requirements": str,
"output_path": str,
"context": str,
"model": str,
"session_id": str,
},
),
},
}
}


@@ -1,173 +0,0 @@
package tdd
import (
"context"
"encoding/json"
"fmt"
"time"
"github.com/mathiasbq/supervisor/internal/brain"
"github.com/mathiasbq/supervisor/internal/session"
)
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
switch tool {
case "tdd_red":
return s.handleRed(ctx, args)
case "tdd_green":
return s.handleGreen(ctx, args)
case "tdd_refactor":
return s.handleRefactor(ctx, args)
default:
return nil, fmt.Errorf("unknown tool: %s", tool)
}
}
type redArgs struct {
ProjectRoot string `json:"project_root"`
Spec string `json:"spec"`
Model string `json:"model"`
TestCmd string `json:"test_cmd"`
}
func (s *Skill) handleRed(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
var args redArgs
if err := json.Unmarshal(raw, &args); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if args.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if args.Spec == "" {
return nil, fmt.Errorf("spec is required")
}
brainCtx, _ := brain.Query(ctx, s.cfg.IngestBaseURL, args.Spec, 3)
task := fmt.Sprintf(
"phase: red\nproject_root: %s\nspec: %s\nmodel: %s\ntest_cmd: %s",
args.ProjectRoot, args.Spec, s.resolveModel(args.Model), args.TestCmd,
)
if brainCtx != "" {
task = brainCtx + "\n---\n\n" + task
}
return s.complete(ctx, s.resolveModel(args.Model), task)
}
type greenArgs struct {
ProjectRoot string `json:"project_root"`
TestPath string `json:"test_path"`
Model string `json:"model"`
TestCmd string `json:"test_cmd"`
SessionID string `json:"session_id"`
}
func (s *Skill) handleGreen(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
var args greenArgs
if err := json.Unmarshal(raw, &args); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if args.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if args.TestPath == "" {
return nil, fmt.Errorf("test_path is required")
}
task := fmt.Sprintf(
"phase: green\nproject_root: %s\ntest_path: %s\nmodel: %s\ntest_cmd: %s",
args.ProjectRoot, args.TestPath, s.resolveModel(args.Model), args.TestCmd,
)
task = session.PrependHistory(s.cfg.SessionsDir, args.SessionID, "green", task)
t0 := time.Now()
result, err := s.complete(ctx, s.resolveModel(args.Model), task)
if err != nil {
return nil, err
}
s.logEntry(args.SessionID, args.ProjectRoot, "tdd", "green", s.resolveModel(args.Model), t0, result)
return result, nil
}
type refactorArgs struct {
ProjectRoot string `json:"project_root"`
TestPath string `json:"test_path"`
ImplPath string `json:"impl_path"`
Model string `json:"model"`
TestCmd string `json:"test_cmd"`
SessionID string `json:"session_id"`
}
func (s *Skill) handleRefactor(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
var args refactorArgs
if err := json.Unmarshal(raw, &args); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if args.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
if args.TestPath == "" {
return nil, fmt.Errorf("test_path is required")
}
if args.ImplPath == "" {
return nil, fmt.Errorf("impl_path is required")
}
task := fmt.Sprintf(
"phase: refactor\nproject_root: %s\ntest_path: %s\nimpl_path: %s\nmodel: %s\ntest_cmd: %s",
args.ProjectRoot, args.TestPath, args.ImplPath, s.resolveModel(args.Model), args.TestCmd,
)
task = session.PrependHistory(s.cfg.SessionsDir, args.SessionID, "refactor", task)
t0 := time.Now()
result, err := s.complete(ctx, s.resolveModel(args.Model), task)
if err != nil {
return nil, err
}
s.logEntry(args.SessionID, args.ProjectRoot, "tdd", "refactor", s.resolveModel(args.Model), t0, result)
return result, nil
}
func (s *Skill) resolveModel(override string) string {
if override != "" {
return override
}
return s.cfg.DefaultModel
}
// complete calls CompleteFunc and returns the text as JSON.
func (s *Skill) complete(ctx context.Context, model, task string) (json.RawMessage, error) {
if s.cfg.CompleteFunc == nil {
return nil, fmt.Errorf("no executor configured")
}
text, dur, err := s.cfg.CompleteFunc(ctx, model, s.cfg.SkillPrompt, task)
if err != nil {
return nil, err
}
return json.Marshal(map[string]any{"text": text, "model": model, "duration_ms": dur})
}
// logEntry writes a session.Entry for a completed phase if session_id is set.
func (s *Skill) logEntry(sessionID, projectRoot, skill, phase, model string, t0 time.Time, raw json.RawMessage) {
if sessionID == "" || s.cfg.SessionsDir == "" {
return
}
var msg string
var result struct {
Text string `json:"text"`
}
if err := json.Unmarshal(raw, &result); err == nil && len(result.Text) > 0 {
msg = result.Text
if len(msg) > 200 {
msg = msg[:200]
}
}
_ = session.Append(s.cfg.SessionsDir, sessionID, session.Entry{
SessionID: sessionID,
Timestamp: time.Now(),
Skill: skill,
Phase: phase,
ProjectRoot: projectRoot,
FinalStatus: "ok",
ModelUsed: model,
DurationMs: time.Since(t0).Milliseconds(),
Message: msg,
})
}


@@ -1,97 +0,0 @@
package tdd_test
import (
"context"
"encoding/json"
"testing"
"github.com/mathiasbq/supervisor/internal/session"
"github.com/mathiasbq/supervisor/internal/skills/tdd"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestTDDSkillTools(t *testing.T) {
skill := tdd.New(tdd.Config{
SkillPrompt: "tdd rules",
})
tools := skill.Tools()
names := make([]string, len(tools))
for i, tool := range tools {
names[i] = tool.Name
}
assert.ElementsMatch(t, []string{"tdd_red", "tdd_green", "tdd_refactor"}, names)
}
func TestTDDSkillHandleUnknown(t *testing.T) {
skill := tdd.New(tdd.Config{SkillPrompt: "t"})
_, err := skill.Handle(context.Background(), "tdd_unknown", json.RawMessage(`{}`))
assert.ErrorContains(t, err, "unknown tool")
}
func TestTDDRedRequiresProjectRoot(t *testing.T) {
skill := tdd.New(tdd.Config{SkillPrompt: "t"})
_, err := skill.Handle(context.Background(), "tdd_red", json.RawMessage(`{"spec":"add two numbers"}`))
assert.ErrorContains(t, err, "project_root")
}
func TestTDDRedRequiresSpec(t *testing.T) {
skill := tdd.New(tdd.Config{SkillPrompt: "t"})
_, err := skill.Handle(context.Background(), "tdd_red", json.RawMessage(`{"project_root":"/tmp/proj"}`))
assert.ErrorContains(t, err, "spec")
}
func TestTDDGreenInjectsSessionHistory(t *testing.T) {
sessDir := t.TempDir()
require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{
SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass",
FilePath: "internal/foo/foo_test.go",
Message: "wrote failing test for Foo",
}))
var capturedTask string
fakeFn := func(_ context.Context, _, _, user string) (string, int64, error) {
capturedTask = user
return "here is my suggestion", 100, nil
}
sk := tdd.New(tdd.Config{SkillPrompt: "tdd", CompleteFunc: fakeFn, SessionsDir: sessDir})
_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go","test_cmd":"go test ./...","session_id":"sess-1"}`,
))
require.NoError(t, err)
assert.Contains(t, capturedTask, "## Session history")
assert.Contains(t, capturedTask, "wrote failing test for Foo")
}
func TestTDDGreenNoHistoryWhenSessionIDEmpty(t *testing.T) {
var capturedTask string
fakeFn := func(_ context.Context, _, _, user string) (string, int64, error) {
capturedTask = user
return "suggestion", 50, nil
}
sk := tdd.New(tdd.Config{SkillPrompt: "tdd", CompleteFunc: fakeFn, SessionsDir: t.TempDir()})
_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go"}`,
))
require.NoError(t, err)
assert.NotContains(t, capturedTask, "## Session history")
}
func TestTDDGreenReturnsTextJSON(t *testing.T) {
fakeFn := func(_ context.Context, _, _, _ string) (string, int64, error) {
return "write a func that adds two ints", 42, nil
}
sk := tdd.New(tdd.Config{SkillPrompt: "tdd", CompleteFunc: fakeFn})
raw, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
`{"project_root":"/tmp","test_path":"foo_test.go"}`,
))
require.NoError(t, err)
var result map[string]any
require.NoError(t, json.Unmarshal(raw, &result))
assert.Equal(t, "write a func that adds two ints", result["text"])
assert.Equal(t, float64(42), result["duration_ms"])
}


@@ -1,86 +0,0 @@
package tdd
import (
"context"
"encoding/json"
"github.com/mathiasbq/supervisor/internal/registry"
)
// CompleteFunc is the function used to call a local model.
type CompleteFunc func(ctx context.Context, model, system, user string) (string, int64, error)
type Config struct {
SkillPrompt string
CompleteFunc CompleteFunc // nil = no executor (tests that don't reach complete())
DefaultModel string
SessionsDir string // optional: path to brain/sessions/ for history injection
IngestBaseURL string // optional: base URL of ingestion server for brain context
}
type Skill struct {
cfg Config
}
func New(cfg Config) *Skill {
return &Skill{cfg: cfg}
}
func (s *Skill) Name() string { return "tdd" }
func (s *Skill) Tools() []registry.ToolDef {
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{
"type": "object",
"required": required,
"properties": props,
})
return b
}
strProp := map[string]any{"type": "string"}
return []registry.ToolDef{
{
Name: "tdd_red",
Description: "Consult a local model for help writing a failing test for the described behavior.",
InputSchema: schema(
[]string{"project_root", "spec"},
map[string]any{
"project_root": strProp,
"spec": strProp,
"model": strProp,
"test_cmd": strProp,
},
),
},
{
Name: "tdd_green",
Description: "Consult a local model for implementation ideas to make the test at test_path pass.",
InputSchema: schema(
[]string{"project_root", "test_path"},
map[string]any{
"project_root": strProp,
"test_path": strProp,
"model": strProp,
"test_cmd": strProp,
"session_id": strProp,
},
),
},
{
Name: "tdd_refactor",
Description: "Consult a local model for refactoring suggestions for impl_path while keeping tests green.",
InputSchema: schema(
[]string{"project_root", "test_path", "impl_path"},
map[string]any{
"project_root": strProp,
"test_path": strProp,
"impl_path": strProp,
"model": strProp,
"test_cmd": strProp,
"session_id": strProp,
},
),
},
}
}