Compare commits
84 Commits
923a665365
...
v0.6.0
| Author | SHA1 | Date | |
|---|---|---|---|
| | a220fcaf2b | | |
| | d1c8e3396f | | |
| | 3b79311fdd | | |
| | 7baf8d7e7a | | |
| | a8de04c7b6 | | |
| | 87cf9d0afc | | |
| | 46adaf2148 | | |
| | c11763472c | | |
| | 189ff89c34 | | |
| | c7e0192486 | | |
| | 1c3c9de550 | | |
| | d0edc1a725 | | |
| | 5b207425ed | | |
| | cb51ff7ba1 | | |
| | 43a8255272 | | |
| | 78be3d1f9c | | |
| | 7139a3ca74 | | |
| | c509ae2a5f | | |
| | 228ee57d4c | | |
| | bee4bb3c1f | | |
| | d72454d929 | | |
| | cf94d14922 | | |
| | 78a43d6a42 | | |
| | ca933eef46 | | |
| | 88782de07c | | |
| | 083c2d7db9 | | |
| | 751f410ca6 | | |
| | 3a99d5e20e | | |
| | 9a258ca32a | | |
| | 2a5a74f7c0 | | |
| | d40a5ac890 | | |
| | b77820534a | | |
| | db64ecb1d9 | | |
| | ea29e5ebb8 | | |
| | ccf080db59 | | |
| | 69c038478b | | |
| | b6bcc93048 | | |
| | 51e01233a4 | | |
| | f49850d23b | | |
| | 928f23ab1b | | |
| | 1b9c4905a5 | | |
| | 400025715a | | |
| | 986e3e1d12 | | |
| | 593d1a4c6d | | |
| | 417bf224eb | | |
| | 37dbd22eff | | |
| | cbf5cab5e7 | | |
| | af52f501fe | | |
| | b3b1fde825 | | |
| | ab4cfaaeb7 | | |
| | eb844edb29 | | |
| | 317ec20392 | | |
| | eab8775f5f | | |
| | a0d0914a85 | | |
| | 8f9642df69 | | |
| | cd5f3c0175 | | |
| | ed4966927c | | |
| | 3c4e8e8bb8 | | |
| | 5c88eff46f | | |
| | 646a86f2c3 | | |
| | adf0504116 | | |
| | d44427e71f | | |
| | 2635cdcaa7 | | |
| | e922471229 | | |
| | 87ff1f907c | | |
| | 9cc179dec6 | | |
| | 370d30e376 | | |
| | bd0c1d75fd | | |
| | 8c87460bff | | |
| | 809d435480 | | |
| | e4a94df4fc | | |
| | 7dcb5610fe | | |
| | 63c8d114e8 | | |
| | 54f7d373bd | | |
| | a412eee427 | | |
| | 3d6f33881b | | |
| | 07e3f341ef | | |
| | 5c532e708c | | |
| | a34c66d7cd | | |
| | cc401d92d6 | | |
| | 9bdf00f51f | | |
| | 7f7524c859 | | |
| | 0a70d9e972 | | |
| | 3e9a648115 | | |
315
.aider.conventions.md
Normal file
@@ -0,0 +1,315 @@
# Agent context — Mathias workspace

<!-- Canonical root context for all AI coding agents.
Lives at: ~/dev/.context/AGENT.md
Applies to every project under ~/dev/ unless overridden.

Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
Project-level context in .context/PROJECT.md layers on top of this. -->

## Who I am

I'm Mathias, a digital product manager and technology consultant based in Sweden.
I build software, research emerging tech, and deliver consulting engagements
for clients under NDA. I work across AI/ML, financial automation, web applications,
and climate/sustainability tech.

## How I work with agents

- I think like a product manager — I care about *why* before *how*
- I want agents to be opinionated and push back, not just execute blindly
- I prefer concise responses; skip ceremony and get to the point
- When I say "build this", I mean production-quality with tests, not a demo
- Ask me before making irreversible changes or adding heavy dependencies
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK

## Behavior rules

These rules apply to every task across every project, regardless of harness.

1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
Think before coding; if the problem is unclear, ask or state assumptions before acting.
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
files, and formatting alone. Diffs should be small and reviewable.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.

**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.

**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.

## Default stack

| Layer | Default | Fallback | Last resort |
|-------|---------|----------|-------------|
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |

Exploratory: Rust, Zig — I'll tell you when I want these.

## Code conventions

- **Go style**: golines, gofumpt, golangci-lint
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
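
The error convention above can be illustrated with a minimal sketch. The names `ErrNotFound`, `loadInvoice`, `processInvoice`, and `isNotFound` are hypothetical and exist only for this example; the point is that wrapping with `%w` adds the operation as context while keeping the original error reachable through `errors.Is`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error, used only for illustration.
var ErrNotFound = errors.New("not found")

// loadInvoice is a hypothetical lookup that always fails in this sketch.
func loadInvoice(id string) error {
	return ErrNotFound
}

// processInvoice wraps the underlying error with the operation name instead
// of returning it bare, and does not log it: the caller decides what to do.
func processInvoice(id string) error {
	if err := loadInvoice(id); err != nil {
		return fmt.Errorf("process invoice %q: %w", id, err)
	}
	return nil
}

// isNotFound reports whether err wraps ErrNotFound anywhere in its chain.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	err := processInvoice("inv-42")
	fmt.Println(err)             // process invoice "inv-42": not found
	fmt.Println(isNotFound(err)) // true
}
```

Logging at the point of failure and also returning the error tends to produce duplicate log lines, which is presumably why the convention rules out log-and-return.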

## Infrastructure

Three machines on Tailscale:

| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |

- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
- **Orchestration**: k3s cluster across all three machines
- **Networking**: Tailscale mesh

## Project landscape

All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).

Organized in thematic folders:

| Folder | Focus | Count |
|--------|-------|-------|
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |

See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.

### Key active projects

- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform

## Knowledge base — actively use it

A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**

### When to query (treat as a reflex)

- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.

### When to write

After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:

- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.

DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.

### How to access (per harness)

| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |

- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
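
For the HTTP-only row in the table above, a request could be assembled roughly as follows. This is a sketch under stated assumptions: the source only says auth is "via `BRAIN_MCP_TOKEN`", so sending it as a `Bearer` token in the `Authorization` header is an assumption, and the helper names `queryBody` and `newQueryRequest` are hypothetical. The example builds the request without sending it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// queryRequest mirrors the documented body shape {"query":"..."}.
type queryRequest struct {
	Query string `json:"query"`
}

// queryBody encodes the documented JSON body for a brain query.
func queryBody(query string) ([]byte, error) {
	b, err := json.Marshal(queryRequest{Query: query})
	if err != nil {
		return nil, fmt.Errorf("marshal query: %w", err)
	}
	return b, nil
}

// newQueryRequest builds (but does not send) a POST to <baseURL>/query.
// ASSUMPTION: the token travels as "Authorization: Bearer <token>"; the
// source does not state the header or scheme.
func newQueryRequest(baseURL, token, query string) (*http.Request, error) {
	body, err := queryBody(query)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/query", bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build query request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	req, err := newQueryRequest("https://brain-mcp.d-ma.be", os.Getenv("BRAIN_MCP_TOKEN"), "k3s nodeport unreachable")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to an `http.Client` with a timeout would perform the actual call; the response format is not described in the source, so it is left out here.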

### Quick reflex checks

If you find yourself about to say any of these out loud, you owe yourself a brain query first:

- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*

## Client work rules

When working on a project tagged with a client name:
1. Never send code, data, or context to cloud APIs — use local models only
2. Never reference other client projects or their data
3. Keep all artifacts within the client's git org / directory
4. Treat everything as confidential unless told otherwise

## Harness-agnostic principles

This context is designed to work with any AI coding tool:
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
- Pi Coding Agent, Mistral Vibe, Antigravity
- Any tool that accepts a system prompt or reads a markdown context file

The canonical source is always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.

## How context propagates

Canonical sources of truth:
- Universal: `~/dev/.context/AGENT.md` (this file)
- Project: `<repo>/.context/PROJECT.md` (per-repo)

Derived files (committed, regenerated by `task context:sync`):
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
`.context/system-prompt.txt`

Workflow:
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
derived together. Push.
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
3. `task check` runs `context:sync` then asserts `git status --porcelain`
is empty over the derived files (catches both modified-tracked drift
and missing-untracked adapters). A drift fails the check with a
message telling you to stage the regenerated files.
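
The drift assertion in step 3 can be sketched as a small filter over `git status --porcelain` output. This is not the project's actual `task check` implementation, which is not shown here; `driftedFiles` is a hypothetical helper, and the sketch assumes the short porcelain format (two status characters, a space, then the path) with plain, unquoted paths.

```go
package main

import (
	"fmt"
	"strings"
)

// derivedFiles lists the generated adapters named in the section above.
var derivedFiles = []string{
	"CLAUDE.md", "AGENTS.md", ".cursorrules", ".aider.conventions.md", ".context/system-prompt.txt",
}

// driftedFiles returns the derived files that appear in porcelain output,
// whether modified (" M") or untracked ("??"). Paths that git quotes or
// reports as renames are not handled in this sketch.
func driftedFiles(porcelain string, derived []string) []string {
	want := make(map[string]bool, len(derived))
	for _, d := range derived {
		want[d] = true
	}
	var drifted []string
	for _, line := range strings.Split(porcelain, "\n") {
		if len(line) < 4 {
			continue // blank or malformed line
		}
		if path := line[3:]; want[path] {
			drifted = append(drifted, path)
		}
	}
	return drifted
}

func main() {
	// Sample output; a real check would capture it from git after the sync.
	sample := " M CLAUDE.md\n?? AGENTS.md\n M main.go\n"
	if drifted := driftedFiles(sample, derivedFiles); len(drifted) > 0 {
		fmt.Println("derived files out of sync, stage the regenerated files:", strings.Join(drifted, ", "))
		return
	}
	fmt.Println("derived files are in sync")
}
```

Because untracked files show up as `??` lines, the same filter covers both cases the workflow mentions: a tracked adapter that changed and an adapter that was generated but never added.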

Behavior rules in this file and per-project rules in `PROJECT.md` apply
unconditionally on every host, every harness.

## Engineering Skills

Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.

See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.

Key skills:
- **TDD**: always write tests first — load `tdd` skill
- **Code Review**: load `code-review` skill before any review
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
- **Problem first**: load `problem-analysis` skill before coding non-trivial features

---

# Project context

<!-- Canonical project context. Edit this, run `task context:sync`.
Root agent context from ~/dev/.context/AGENT.md is automatically
prepended for harnesses that don't walk the directory tree. -->

## Identity

- **Name**: supervisor
- **Owner**: Mathias
- **Client**: personal
- **Repo**:
- **Status**: active

## Stack

- **Primary language**: Go
- **UI layer**: HTMX + Templ (when applicable)
- **Fallback languages**: Python, TypeScript (justify in PR if used)
- **Build**: Task (taskfile.dev), not Make
- **Containers**: Docker (compose for dev, k3s for deploy)
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)

## Conventions

### Code style
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)

### Architecture preferences
- Prefer standard library over frameworks (net/http over gin/echo)
- Dependency injection via constructor functions, not containers
- Configuration via environment variables, parsed at startup into a typed struct
- Structured logging via `slog`
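
A minimal sketch of how these preferences can fit together, using only the standard library. The environment variable names (`APP_LISTEN_ADDR`, `APP_MAX_RESULTS`), the `Config` fields, and the `Service` type are hypothetical and chosen only for illustration; nothing here is taken from the supervisor codebase.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
	"strconv"
)

// Config is the typed result of parsing the environment once at startup.
type Config struct {
	ListenAddr string
	MaxResults int
}

// LoadConfig reads settings through getenv, so callers can pass os.Getenv
// in production and a map-backed function in tests.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{ListenAddr: ":8080", MaxResults: 10} // illustrative defaults
	if v := getenv("APP_LISTEN_ADDR"); v != "" {
		cfg.ListenAddr = v
	}
	if v := getenv("APP_MAX_RESULTS"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil {
			return Config{}, fmt.Errorf("parse APP_MAX_RESULTS: %w", err)
		}
		cfg.MaxResults = n
	}
	return cfg, nil
}

// Service receives its dependencies through the constructor rather than
// looking them up from globals or a container.
type Service struct {
	cfg Config
	log *slog.Logger
}

// NewService is the constructor that performs the injection.
func NewService(cfg Config, log *slog.Logger) *Service {
	return &Service{cfg: cfg, log: log}
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	svc := NewService(cfg, slog.New(slog.NewTextHandler(os.Stderr, nil)))
	svc.log.Info("configured", "addr", svc.cfg.ListenAddr, "max_results", svc.cfg.MaxResults)
}
```

Taking `getenv` as a parameter is one way to keep the parser testable without touching the process environment; other shapes would satisfy the same preferences.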

### Git
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- Branch naming: `feat/short-description`, `fix/short-description`
- PRs: one concern per PR, description explains *why* not *what*
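
As an illustration of the commit prefix rule, a subject line could be checked with a small regular expression. This is a sketch, not a hook used by the project: `validSubject` is a hypothetical helper, and accepting an optional `(scope)` and `!` marker is an assumption borrowed from the general Conventional Commits convention rather than from the list above.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitSubject matches "<type>[(scope)][!]: <description>" for the five
// prefixes listed above. The optional scope and "!" are assumptions.
var commitSubject = regexp.MustCompile(`^(feat|fix|chore|docs|refactor)(\([a-z0-9._/-]+\))?!?: \S.*$`)

// validSubject reports whether s looks like a conventional commit subject.
func validSubject(s string) bool {
	return commitSubject.MatchString(s)
}

func main() {
	for _, s := range []string{
		"feat: add drift check",
		"fix(brain): handle empty query",
		"update stuff",
	} {
		fmt.Printf("%q valid=%v\n", s, validSubject(s))
	}
}
```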

### Security
- No secrets in code, ever — use env vars or SOPS-encrypted files
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding

## MCP endpoints

Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:

- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.

The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.

The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.

`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.

## Agent instructions

When acting as a coding agent on this project:

1. Read this file and all `SKILL.md` files in `.skills/` before starting work
2. Run `task check` before committing (lint + test + vet)
3. If unsure about a convention, check `DECISIONS.md` or ask
4. Never modify files outside the project root without explicit permission
5. When adding a dependency, explain why in the commit message
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM
@@ -45,13 +45,30 @@
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding

## Knowledge base access
## MCP endpoints

This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:

- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.

The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.

The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.

`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.

## Agent instructions

322
.context/system-prompt.txt
Normal file
@@ -0,0 +1,322 @@
You are a coding assistant working on a specific project.
Follow all conventions from both the root agent context and project context.

---

# Agent context — Mathias workspace

<!-- Canonical root context for all AI coding agents.
Lives at: ~/dev/.context/AGENT.md
Applies to every project under ~/dev/ unless overridden.

Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
Project-level context in .context/PROJECT.md layers on top of this. -->

## Who I am

I'm Mathias, a digital product manager and technology consultant based in Sweden.
I build software, research emerging tech, and deliver consulting engagements
for clients under NDA. I work across AI/ML, financial automation, web applications,
and climate/sustainability tech.

## How I work with agents

- I think like a product manager — I care about *why* before *how*
- I want agents to be opinionated and push back, not just execute blindly
- I prefer concise responses; skip ceremony and get to the point
- When I say "build this", I mean production-quality with tests, not a demo
- Ask me before making irreversible changes or adding heavy dependencies
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK

## Behavior rules

These rules apply to every task across every project, regardless of harness.

1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
Think before coding; if the problem is unclear, ask or state assumptions before acting.
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
files, and formatting alone. Diffs should be small and reviewable.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.

**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.

**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.

## Default stack

| Layer | Default | Fallback | Last resort |
|-------|---------|----------|-------------|
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |

Exploratory: Rust, Zig — I'll tell you when I want these.

## Code conventions

- **Go style**: golines, gofumpt, golangci-lint
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message

## Infrastructure

Three machines on Tailscale:

| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |

- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
- **Orchestration**: k3s cluster across all three machines
- **Networking**: Tailscale mesh

## Project landscape

All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).

Organized in thematic folders:

| Folder | Focus | Count |
|--------|-------|-------|
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |

See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.

### Key active projects

- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform

## Knowledge base — actively use it

A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**

### When to query (treat as a reflex)

- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.

### When to write

After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:

- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.

DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.

### How to access (per harness)

| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |

- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
|
||||
|
||||
### Quick reflex checks
|
||||
|
||||
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
|
||||
|
||||
- "I think the issue might be..."
|
||||
- "Let me try X and see..."
|
||||
- "I'll just write a script to..."
|
||||
- "This is probably a new bug..."
|
||||
- "Has anyone done this before?" — *yes, probably, go check.*
|
||||
|
||||
## Client work rules
|
||||
|
||||
When working on a project tagged with a client name:
|
||||
1. Never send code, data, or context to cloud APIs — use local models only
|
||||
2. Never reference other client projects or their data
|
||||
3. Keep all artifacts within the client's git org / directory
|
||||
4. Treat everything as confidential unless told otherwise
|
||||
|
||||
## Harness-agnostic principles
|
||||
|
||||
This context is designed to work with any AI coding tool:
|
||||
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
|
||||
- Pi Coding Agent, Mistral Vibe, Antigravity
|
||||
- Any tool that accepts a system prompt or reads a markdown context file
|
||||
|
||||
The canonical source is always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
|
||||
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.
|
||||
|
||||
## How context propagates
|
||||
|
||||
Canonical sources of truth:
|
||||
- Universal: `~/dev/.context/AGENT.md` (this file)
|
||||
- Project: `<repo>/.context/PROJECT.md` (per-repo)
|
||||
|
||||
Derived files (committed, regenerated by `task context:sync`):
|
||||
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
|
||||
`.context/system-prompt.txt`
|
||||
|
||||
Workflow:
|
||||
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
   derived together. Push.
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
   uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
   Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
3. `task check` runs `context:sync` then asserts `git status --porcelain`
   is empty over the derived files (catches both modified-tracked drift
   and missing-untracked adapters). A drift fails the check with a
   message telling you to stage the regenerated files.
|
||||
|
||||
Behavior rules in this file and per-project rules in `PROJECT.md` apply
|
||||
unconditionally on every host, every harness.
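
The drift assertion in the workflow above can be sketched in Go as a pure function over `git status --porcelain` output. This is an illustration of the idea only, not the actual `task check` implementation; the function name `driftedFiles` and the sample status text are hypothetical, and paths that git would quote (spaces, special characters) are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// driftedFiles returns the derived files that show up in porcelain (v1) output.
// Each line is two status characters, a space, then the path, so modified
// tracked files (" M") and untracked files ("??") are both caught.
func driftedFiles(porcelain string, derived []string) []string {
	want := make(map[string]bool, len(derived))
	for _, f := range derived {
		want[f] = true
	}
	var drifted []string
	for _, line := range strings.Split(porcelain, "\n") {
		if len(line) < 4 {
			continue // blank or malformed line
		}
		path := line[3:]
		// Renames are reported as "old -> new"; the new path is what matters.
		if i := strings.Index(path, " -> "); i >= 0 {
			path = path[i+len(" -> "):]
		}
		if want[path] {
			drifted = append(drifted, path)
		}
	}
	return drifted
}

func main() {
	derived := []string{
		"CLAUDE.md", "AGENTS.md", ".cursorrules",
		".aider.conventions.md", ".context/system-prompt.txt",
	}
	// Hypothetical status output: one modified adapter, one untracked adapter,
	// and one unrelated source file that should be ignored.
	status := " M CLAUDE.md\n?? AGENTS.md\n M internal/server.go\n"
	if d := driftedFiles(status, derived); len(d) > 0 {
		fmt.Println("derived context files drifted; stage the regenerated files:",
			strings.Join(d, ", "))
	}
}
```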
|
||||
|
||||
## Engineering Skills
|
||||
|
||||
Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.
|
||||
|
||||
See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.
|
||||
|
||||
Key skills:
|
||||
- **TDD**: always write tests first — load `tdd` skill
|
||||
- **Code Review**: load `code-review` skill before any review
|
||||
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
|
||||
- **Problem first**: load `problem-analysis` skill before coding non-trivial features
|
||||
|
||||
---
|
||||
|
||||
# Project context
|
||||
|
||||
<!-- Canonical project context. Edit this, run `task context:sync`.
|
||||
Root agent context from ~/dev/.context/AGENT.md is automatically
|
||||
prepended for harnesses that don't walk the directory tree. -->
|
||||
|
||||
## Identity
|
||||
|
||||
- **Name**: supervisor
|
||||
- **Owner**: Mathias
|
||||
- **Client**: personal
|
||||
- **Repo**:
|
||||
- **Status**: active
|
||||
|
||||
## Stack
|
||||
|
||||
- **Primary language**: Go
|
||||
- **UI layer**: HTMX + Templ (when applicable)
|
||||
- **Fallback languages**: Python, TypeScript (justify in PR if used)
|
||||
- **Build**: Task (taskfile.dev), not Make
|
||||
- **Containers**: Docker (compose for dev, k3s for deploy)
|
||||
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
|
||||
|
||||
## Conventions
|
||||
|
||||
### Code style
|
||||
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
|
||||
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
|
||||
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
|
||||
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
|
||||
|
||||
### Architecture preferences
|
||||
- Prefer standard library over frameworks (net/http over gin/echo)
|
||||
- Dependency injection via constructor functions, not containers
|
||||
- Configuration via environment variables, parsed at startup into a typed struct
|
||||
- Structured logging via `slog`
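
The preferences above combine naturally. The following is a minimal sketch, not code from this repository: the `Config` fields, the `LISTEN_ADDR` and `MAX_RESULTS` variable names, and the `Service` type are all hypothetical.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
	"strconv"
)

// Config holds settings parsed once at startup (hypothetical fields).
type Config struct {
	ListenAddr string
	MaxResults int
}

// LoadConfig reads settings through getenv so tests can pass a fake environment.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{ListenAddr: ":8080", MaxResults: 10} // defaults
	if v := getenv("LISTEN_ADDR"); v != "" {
		cfg.ListenAddr = v
	}
	if v := getenv("MAX_RESULTS"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil {
			return Config{}, fmt.Errorf("parse MAX_RESULTS: %w", err)
		}
		cfg.MaxResults = n
	}
	return cfg, nil
}

// Service receives its dependencies through the constructor, not a container.
type Service struct {
	cfg Config
	log *slog.Logger
}

// NewService wires the typed config and logger into the service.
func NewService(cfg Config, log *slog.Logger) *Service {
	return &Service{cfg: cfg, log: log}
}

// Start only logs here; a real service would begin serving.
func (s *Service) Start() {
	s.log.Info("service starting", "addr", s.cfg.ListenAddr, "max_results", s.cfg.MaxResults)
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		logger.Error("invalid configuration", "error", err)
		os.Exit(1)
	}
	NewService(cfg, logger).Start()
}
```

Passing `getenv` as a parameter is a design choice for testability; calling `os.Getenv` directly inside `LoadConfig` would work equally well in production.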
|
||||
|
||||
### Git
|
||||
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
|
||||
- Branch naming: `feat/short-description`, `fix/short-description`
|
||||
- PRs: one concern per PR, description explains *why* not *what*
|
||||
|
||||
### Security
|
||||
- No secrets in code, ever — use env vars or SOPS-encrypted files
|
||||
- Client data never leaves local network unless explicitly cleared
|
||||
- Dependencies: audit with `govulncheck` before adding
|
||||
|
||||
## MCP endpoints
|
||||
|
||||
Two MCP servers are live, both reachable over Tailscale; `brain` is also exposed on an HTTPS domain:
|
||||
|
||||
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`):
  `brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
  `brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
  service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp`: Mode 2 routing pod. Advertises
  `review`, `debug`, `retrospective`, `trainer`; routes each call to a local
  model or Claude based on the brain `/pass-rate`. Bearer auth via
  `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this endpoint.
|
||||
|
||||
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
|
||||
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
|
||||
the routing pod; brain tools moved to the brain MCP.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
|
||||
for shell scripts and non-MCP clients.
|
||||
|
||||
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
|
||||
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
|
||||
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
|
||||
|
||||
## Agent instructions
|
||||
|
||||
When acting as a coding agent on this project:
|
||||
|
||||
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
|
||||
2. Run `task check` before committing (lint + test + vet)
|
||||
3. If unsure about a convention, check `DECISIONS.md` or ask
|
||||
4. Never modify files outside the project root without explicit permission
|
||||
5. When adding a dependency, explain why in the commit message
|
||||
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM
|
||||
|
||||
---
|
||||
318
.cursorrules
Normal file
@@ -0,0 +1,318 @@
|
||||
# Cursor rules — auto-generated
|
||||
# Do not edit. Run: task context:sync
|
||||
|
||||
# Agent context — Mathias workspace
|
||||
|
||||
<!-- Canonical root context for all AI coding agents.
|
||||
Lives at: ~/dev/.context/AGENT.md
|
||||
Applies to every project under ~/dev/ unless overridden.
|
||||
|
||||
Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
|
||||
Project-level context in .context/PROJECT.md layers on top of this. -->
|
||||
|
||||
## Who I am
|
||||
|
||||
I'm Mathias, a digital product manager and technology consultant based in Sweden.
|
||||
I build software, research emerging tech, and deliver consulting engagements
|
||||
for clients under NDA. I work across AI/ML, financial automation, web applications,
|
||||
and climate/sustainability tech.
|
||||
|
||||
## How I work with agents
|
||||
|
||||
- I think like a product manager — I care about *why* before *how*
|
||||
- I want agents to be opinionated and push back, not just execute blindly
|
||||
- I prefer concise responses; skip ceremony and get to the point
|
||||
- When I say "build this", I mean production-quality with tests, not a demo
|
||||
- Ask me before making irreversible changes or adding heavy dependencies
|
||||
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK
|
||||
|
||||
## Behavior rules
|
||||
|
||||
These rules apply to every task across every project, regardless of harness.
|
||||
|
||||
1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
|
||||
Think before coding; if the problem is unclear, ask or state assumptions before acting.
|
||||
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
|
||||
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
|
||||
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
|
||||
files, and formatting alone. Diffs should be small and reviewable.
|
||||
4. **Goal-driven execution.** Define clear success criteria up front for every task.
|
||||
Loop — implement, verify, refine — until those criteria are met. Don't claim
|
||||
completion without evidence (tests pass, command output, observed behavior).
|
||||
5. **Trunk-Based Development — commit directly to main.** Every commit is one
|
||||
logical change (one tool, one fix, one test) with passing tests. Main is always
|
||||
deployable. Never create long-lived feature branches.
|
||||
|
||||
**Exception — parallel agents on same repo:** If another agent is known to be
|
||||
actively working on the same repo simultaneously, create a short-lived branch
|
||||
(`agent/<description>`), finish the task, and merge to main within the same
|
||||
session. Do not leave agent branches open between sessions.
|
||||
|
||||
**Exception — external contributor or client four-eyes requirement:** Use
|
||||
PR flow only when a human reviewer outside the project is required. Document
|
||||
the reason in PROJECT.md.
|
||||
|
||||
## Default stack
|
||||
|
||||
| Layer | Default | Fallback | Last resort |
|
||||
|-------|---------|----------|-------------|
|
||||
| Language | Go | Python | TypeScript, Java, C |
|
||||
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
|
||||
| Build | Task (taskfile.dev) | Make | — |
|
||||
| Containers | Docker Compose (dev), k3s (prod) | — | — |
|
||||
| DB | PostgreSQL + sqlc | SQLite | — |
|
||||
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
|
||||
| Logging | slog (structured) | — | — |
|
||||
| Testing | Table-driven, testify | — | — |
|
||||
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
|
||||
|
||||
Exploratory: Rust, Zig — I'll tell you when I want these.
|
||||
|
||||
## Code conventions
|
||||
|
||||
- **Go style**: golines, gofumpt, golangci-lint
|
||||
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
|
||||
- **Naming**: stdlib conventions, no stuttering
|
||||
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
|
||||
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
|
||||
one logical change per commit, CI is the quality gate
|
||||
- **Never**: long-lived feature branches, PRs for solo work, direct push without
|
||||
passing `task check` locally first
|
||||
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
|
||||
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
|
||||
|
||||
## Infrastructure
|
||||
|
||||
Three machines on Tailscale:
|
||||
|
||||
| Machine | Role | Key specs |
|
||||
|---------|------|-----------|
|
||||
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
|
||||
| iguana | Services, builds | M2 Ultra Mac |
|
||||
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
|
||||
|
||||
- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
|
||||
- **Orchestration**: k3s cluster across all three machines
|
||||
- **Networking**: Tailscale mesh
|
||||
|
||||
## Project landscape
|
||||
|
||||
All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).
|
||||
|
||||
Organized in thematic folders:
|
||||
|
||||
| Folder | Focus | Count |
|
||||
|--------|-------|-------|
|
||||
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
|
||||
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
|
||||
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
|
||||
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
|
||||
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |
|
||||
|
||||
See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
|
||||
|
||||
### Key active projects
|
||||
|
||||
- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
|
||||
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
|
||||
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
|
||||
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
|
||||
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
|
||||
|
||||
## Knowledge base — actively use it
|
||||
|
||||
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
|
||||
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
|
||||
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
|
||||
reference material — query it actively, not just when explicitly told.**
|
||||
|
||||
### When to query (treat as a reflex)
|
||||
|
||||
- **Before** starting a non-trivial task — search for prior art with the symptom
|
||||
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
|
||||
- **When debugging** — search for the error string, the stack frame, the affected
|
||||
service. Past you may have already paid this tax.
|
||||
- **Before adopting** a pattern, library, framework, or model name — check if it
|
||||
was tried and rejected, or what the integration footguns are.
|
||||
- **When making architectural decisions** — search for the domain + "ADR" or
|
||||
"decision" to find prior reasoning before re-deriving it.
|
||||
- **When a recommendation feels novel** — challenge yourself: "has this been
|
||||
documented?" The brain often has it.
|
||||
|
||||
### When to write
|
||||
|
||||
After you discover something that **future-you would forget** and that **isn't
|
||||
recoverable from the code, git log, or PR description alone**:
|
||||
|
||||
- Bugs whose root cause is non-obvious and generalisable beyond this project.
|
||||
- Framework / library / model-name quirks that bit you and would bite anyone.
|
||||
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
|
||||
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
|
||||
|
||||
DON'T write project status, sprint progress, PR summaries, or "what I did this
|
||||
session" — those rot fast and the originals are in git/gitea anyway. Brain
|
||||
entries that age well are about *why*, *how to avoid*, and *what to do when*.
|
||||
|
||||
### How to access (per harness)
|
||||
|
||||
| Harness | Query | Write |
|
||||
|---------|-------|-------|
|
||||
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
|
||||
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
|
||||
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
|
||||
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
|
||||
|
||||
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
|
||||
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
|
||||
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
|
||||
on the koala k3s cluster; don't hardcode local-only model names into the
|
||||
berget URL (see knowledge entry on namespace mismatches).
|
||||
|
||||
### Quick reflex checks
|
||||
|
||||
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
|
||||
|
||||
- "I think the issue might be..."
|
||||
- "Let me try X and see..."
|
||||
- "I'll just write a script to..."
|
||||
- "This is probably a new bug..."
|
||||
- "Has anyone done this before?" — *yes, probably, go check.*
|
||||
|
||||
## Client work rules
|
||||
|
||||
When working on a project tagged with a client name:
|
||||
1. Never send code, data, or context to cloud APIs — use local models only
|
||||
2. Never reference other client projects or their data
|
||||
3. Keep all artifacts within the client's git org / directory
|
||||
4. Treat everything as confidential unless told otherwise
|
||||
|
||||
## Harness-agnostic principles
|
||||
|
||||
This context is designed to work with any AI coding tool:
|
||||
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
|
||||
- Pi Coding Agent, Mistral Vibe, Antigravity
|
||||
- Any tool that accepts a system prompt or reads a markdown context file
|
||||
|
||||
The canonical sources are always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
|
||||
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.
|
||||
|
||||
## How context propagates
|
||||
|
||||
Canonical sources of truth:
|
||||
- Universal: `~/dev/.context/AGENT.md` (this file)
|
||||
- Project: `<repo>/.context/PROJECT.md` (per-repo)
|
||||
|
||||
Derived files (committed, regenerated by `task context:sync`):
|
||||
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
|
||||
`.context/system-prompt.txt`
|
||||
|
||||
Workflow:
|
||||
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
|
||||
derived together. Push.
|
||||
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
|
||||
uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
|
||||
Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
|
||||
3. `task check` runs `context:sync` then asserts `git status --porcelain`
|
||||
is empty over the derived files (catches both modified-tracked drift
|
||||
and missing-untracked adapters). A drift fails the check with a
|
||||
message telling you to stage the regenerated files.
|
||||
|
||||
Behavior rules in this file and per-project rules in `PROJECT.md` apply
|
||||
unconditionally on every host, every harness.
|
||||
|
||||
## Engineering Skills
|
||||
|
||||
Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.
|
||||
|
||||
See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.
|
||||
|
||||
Key skills:
|
||||
- **TDD**: always write tests first — load `tdd` skill
|
||||
- **Code Review**: load `code-review` skill before any review
|
||||
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
|
||||
- **Problem first**: load `problem-analysis` skill before coding non-trivial features
|
||||
|
||||
---
|
||||
|
||||
# Project context
|
||||
|
||||
<!-- Canonical project context. Edit this, run `task context:sync`.
|
||||
Root agent context from ~/dev/.context/AGENT.md is automatically
|
||||
prepended for harnesses that don't walk the directory tree. -->
|
||||
|
||||
## Identity
|
||||
|
||||
- **Name**: supervisor
|
||||
- **Owner**: Mathias
|
||||
- **Client**: personal
|
||||
- **Repo**:
|
||||
- **Status**: active
|
||||
|
||||
## Stack
|
||||
|
||||
- **Primary language**: Go
|
||||
- **UI layer**: HTMX + Templ (when applicable)
|
||||
- **Fallback languages**: Python, TypeScript (justify in PR if used)
|
||||
- **Build**: Task (taskfile.dev), not Make
|
||||
- **Containers**: Docker (compose for dev, k3s for deploy)
|
||||
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
|
||||
|
||||
## Conventions
|
||||
|
||||
### Code style
|
||||
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
|
||||
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
|
||||
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
|
||||
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
|
||||
|
||||
### Architecture preferences
|
||||
- Prefer standard library over frameworks (net/http over gin/echo)
|
||||
- Dependency injection via constructor functions, not containers
|
||||
- Configuration via environment variables, parsed at startup into a typed struct
|
||||
- Structured logging via `slog`
|
||||
|
||||
### Git
|
||||
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
|
||||
- Branch naming: `feat/short-description`, `fix/short-description`
|
||||
- PRs: one concern per PR, description explains *why* not *what*
|
||||
|
||||
### Security
|
||||
- No secrets in code, ever — use env vars or SOPS-encrypted files
|
||||
- Client data never leaves local network unless explicitly cleared
|
||||
- Dependencies: audit with `govulncheck` before adding
|
||||
|
||||
## MCP endpoints
|
||||
|
||||
Two MCP servers are live, both reachable over Tailscale; `brain` is also exposed on an HTTPS domain:
|
||||
|
||||
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
|
||||
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
|
||||
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
|
||||
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
|
||||
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
|
||||
(opt-in). Only `mode client-local` registers this endpoint.
|
||||
|
||||
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
|
||||
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
|
||||
the routing pod; brain tools moved to the brain MCP.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
|
||||
for shell scripts and non-MCP clients.
|
||||
|
||||
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
|
||||
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
|
||||
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
|
||||
|
||||
## Agent instructions
|
||||
|
||||
When acting as a coding agent on this project:
|
||||
|
||||
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
|
||||
2. Run `task check` before committing (lint + test + vet)
|
||||
3. If unsure about a convention, check `DECISIONS.md` or ask
|
||||
4. Never modify files outside the project root without explicit permission
|
||||
5. When adding a dependency, explain why in the commit message
|
||||
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM
|
||||
@@ -11,37 +11,16 @@ jobs:
|
||||
name: Build and deploy
|
||||
runs-on: self-hosted
|
||||
if: ${{ github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' }}
|
||||
environment: staging
|
||||
env:
|
||||
SERVICE: supervisor
|
||||
IMAGE: gitea.d-ma.be/mathias/supervisor
|
||||
INGESTION_IMAGE: gitea.d-ma.be/mathias/ingestion
|
||||
ROUTING_IMAGE: gitea.d-ma.be/mathias/routing
|
||||
INFRA_REPO: git@gitea.d-ma.be:mathias/infra.git
|
||||
BUILDKIT_HOST: unix:///run/buildkit/buildkitd.sock
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Build and push supervisor image
|
||||
run: |
|
||||
set -e
|
||||
trap 'rm -f /tmp/supervisor-image.tar' EXIT
|
||||
IMAGE_TAG="${{ github.sha }}"
|
||||
echo "Building ${IMAGE}:${IMAGE_TAG}"
|
||||
|
||||
buildctl --addr "${BUILDKIT_HOST}" build \
|
||||
--frontend dockerfile.v0 \
|
||||
--local context=. \
|
||||
--local dockerfile=. \
|
||||
--opt build-arg:VERSION="${IMAGE_TAG}" \
|
||||
--output type=oci,dest=/tmp/supervisor-image.tar
|
||||
|
||||
skopeo copy \
|
||||
oci-archive:/tmp/supervisor-image.tar \
|
||||
docker://${IMAGE}:${IMAGE_TAG} \
|
||||
--dest-creds "${{ secrets.REGISTRY_CREDS }}"
|
||||
|
||||
echo "Built and pushed ${IMAGE}:${IMAGE_TAG}"
|
||||
|
||||
- name: Build and push ingestion image
|
||||
run: |
|
||||
set -e
|
||||
@@ -62,6 +41,28 @@ jobs:
|
||||
|
||||
echo "Built and pushed ${INGESTION_IMAGE}:${IMAGE_TAG}"
|
||||
|
||||
- name: Build and push routing image
|
||||
run: |
|
||||
set -e
|
||||
trap 'rm -f /tmp/routing-image.tar' EXIT
|
||||
IMAGE_TAG="${{ github.sha }}"
|
||||
echo "Building ${ROUTING_IMAGE}:${IMAGE_TAG}"
|
||||
|
||||
buildctl --addr "${BUILDKIT_HOST}" build \
|
||||
--frontend dockerfile.v0 \
|
||||
--local context=. \
|
||||
--local dockerfile=. \
|
||||
--opt filename=Dockerfile.routing \
|
||||
--opt build-arg:VERSION="${IMAGE_TAG}" \
|
||||
--output type=oci,dest=/tmp/routing-image.tar
|
||||
|
||||
skopeo copy \
|
||||
oci-archive:/tmp/routing-image.tar \
|
||||
docker://${ROUTING_IMAGE}:${IMAGE_TAG} \
|
||||
--dest-creds "${{ secrets.REGISTRY_CREDS }}"
|
||||
|
||||
echo "Built and pushed ${ROUTING_IMAGE}:${IMAGE_TAG}"
|
||||
|
||||
- name: Update infra repo
|
||||
run: |
|
||||
set -e
|
||||
@@ -77,17 +78,89 @@ jobs:
|
||||
|
||||
cd /tmp/infra-update
|
||||
|
||||
sed -i "s|gitea.d-ma.be/mathias/supervisor:.*|gitea.d-ma.be/mathias/supervisor:${IMAGE_TAG}|" \
|
||||
"k3s/apps/${SERVICE}/deployment.yaml"
|
||||
|
||||
sed -i "s|gitea.d-ma.be/mathias/ingestion:.*|gitea.d-ma.be/mathias/ingestion:${IMAGE_TAG}|" \
|
||||
"k3s/apps/${SERVICE}/ingestion-deployment.yaml"
|
||||
"k3s/apps/supervisor/ingestion-deployment.yaml"
|
||||
|
||||
sed -i "s|gitea.d-ma.be/mathias/routing:.*|gitea.d-ma.be/mathias/routing:${IMAGE_TAG}|" \
|
||||
"k3s/apps/routing/deployment.yaml"
|
||||
|
||||
git config user.email "cd-bot@d-ma.be"
|
||||
git config user.name "CD Bot"
|
||||
git add "k3s/apps/${SERVICE}/deployment.yaml" "k3s/apps/${SERVICE}/ingestion-deployment.yaml"
|
||||
git commit -m "chore(deploy): ${SERVICE}+ingestion → ${IMAGE_TAG}"
|
||||
git add "k3s/apps/supervisor/ingestion-deployment.yaml" \
|
||||
"k3s/apps/routing/deployment.yaml"
|
||||
git commit -m "chore(deploy): ingestion+routing → ${IMAGE_TAG}"
|
||||
GIT_SSH_COMMAND="ssh -i ~/.ssh/infra_deploy_key -o IdentitiesOnly=yes" \
|
||||
git push
|
||||
|
||||
echo "Infra repo updated: ${SERVICE}+ingestion → ${IMAGE_TAG}"
|
||||
echo "Infra repo updated: ingestion+routing → ${IMAGE_TAG}"
|
||||
|
||||
- name: Trigger Flux reconcile (immediate)
|
||||
run: |
|
||||
kubectl -n flux-system annotate gitrepository flux-system \
|
||||
reconcile.fluxcd.io/requestedAt="$(date +%s)" --overwrite
|
||||
kubectl -n flux-system annotate kustomization apps \
|
||||
reconcile.fluxcd.io/requestedAt="$(date +%s)" --overwrite
|
||||
|
||||
- name: Wait for Flux to apply new ingestion image
|
||||
run: |
|
||||
EXPECTED="gitea.d-ma.be/mathias/ingestion:${{ github.sha }}"
|
||||
for i in $(seq 1 60); do
|
||||
CURRENT=$(kubectl get deploy ingestion -n supervisor \
|
||||
-o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null || echo "")
|
||||
if [ "$CURRENT" = "$EXPECTED" ]; then
|
||||
echo "✓ Flux applied ingestion image after ${i}s"
|
||||
break
|
||||
fi
|
||||
sleep 1
|
||||
done
|
||||
kubectl get deploy ingestion -n supervisor \
|
||||
-o jsonpath='{.spec.template.spec.containers[0].image}' \
|
||||
| grep -qx "$EXPECTED" \
|
||||
|| { echo "✗ Flux did not apply ingestion image within 60s"; exit 1; }
|
||||
|
||||
- name: Verify ingestion rollout
|
||||
run: |
|
||||
kubectl rollout status deployment/ingestion \
|
||||
--namespace supervisor \
|
||||
--timeout=120s \
|
||||
|| {
|
||||
echo "── pod status ──"
|
||||
kubectl get pods -n supervisor -o wide
|
||||
echo "── events ──"
|
||||
kubectl get events -n supervisor --sort-by='.lastTimestamp' | tail -20
|
||||
echo "── describe ──"
|
||||
kubectl describe pods -n supervisor -l app=ingestion | tail -40
|
||||
exit 1
|
||||
}
|
||||
|
||||
- name: Wait for Flux to apply new routing image
|
||||
run: |
|
||||
EXPECTED="gitea.d-ma.be/mathias/routing:${{ github.sha }}"
|
||||
for i in $(seq 1 60); do
|
||||
CURRENT=$(kubectl get deploy routing -n routing \
|
||||
-o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null || echo "")
|
||||
if [ "$CURRENT" = "$EXPECTED" ]; then
|
||||
echo "✓ Flux applied routing image after ${i}s"
|
||||
break
|
||||
fi
|
||||
sleep 1
|
||||
done
|
||||
kubectl get deploy routing -n routing \
|
||||
-o jsonpath='{.spec.template.spec.containers[0].image}' \
|
||||
| grep -qx "$EXPECTED" \
|
||||
|| { echo "✗ Flux did not apply routing image within 60s"; exit 1; }
|
||||
|
||||
- name: Verify routing rollout
|
||||
run: |
|
||||
kubectl rollout status deployment/routing \
|
||||
--namespace routing \
|
||||
--timeout=120s \
|
||||
|| {
|
||||
echo "── pod status ──"
|
||||
kubectl get pods -n routing -o wide
|
||||
echo "── events ──"
|
||||
kubectl get events -n routing --sort-by='.lastTimestamp' | tail -20
|
||||
echo "── describe ──"
|
||||
kubectl describe pods -n routing -l app=routing | tail -40
|
||||
exit 1
|
||||
}
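
The two "Wait for Flux" steps above share one pattern: poll the deployed image until it matches the expected tag, then fail if the deadline passes. A minimal Go sketch of that pattern follows, assuming the image lookup is injected as a function; `waitForImage` and the registry names in `main` are hypothetical, and the workflow itself uses the shell loop, not this code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForImage polls current until it reports expected or attempts run out.
// It returns the attempt on which the match was seen. Lookup errors are
// treated as "not yet", as the shell loop does with `|| echo ""`.
func waitForImage(current func() (string, error), expected string, attempts int, interval time.Duration) (int, error) {
	for i := 1; i <= attempts; i++ {
		img, err := current()
		if err == nil && img == expected {
			return i, nil
		}
		if i < attempts {
			time.Sleep(interval)
		}
	}
	return attempts, fmt.Errorf("wait for image %q: not applied after %d attempts", expected, attempts)
}

func main() {
	// Fake lookup standing in for a kubectl or API query of the deployment image.
	calls := 0
	lookup := func() (string, error) {
		calls++
		switch calls {
		case 1:
			return "", errors.New("deployment not found")
		case 2:
			return "registry.example/routing:old", nil
		default:
			return "registry.example/routing:new", nil
		}
	}
	n, err := waitForImage(lookup, "registry.example/routing:new", 60, time.Millisecond)
	fmt.Println(n, err)
}
```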
|
||||
|
||||
8
.gitignore
vendored
@@ -13,15 +13,7 @@ brain/training-data/**/*.jsonl
|
||||
# Go
|
||||
vendor/
|
||||
|
||||
# ── Generated context files (adapter outputs) ──
|
||||
# Canonical sources: .context/PROJECT.md + .skills/*/SKILL.md
|
||||
# Everything below is disposable — regenerate with: task context:sync
|
||||
AGENTS.md
|
||||
CLAUDE.md
|
||||
.cursorrules
|
||||
.aider.conventions.md
|
||||
.aider.conf.yml
|
||||
.context/system-prompt.txt
|
||||
|
||||
# ── Sensitive ──
|
||||
.env
|
||||
|
||||
@@ -1,9 +1,10 @@
|
||||
{
|
||||
"mcpServers": {
|
||||
"supervisor": {
|
||||
"command": "/Users/mathias/dev/AI/supervisor/bin/supervisor-bridge",
|
||||
"env": {
|
||||
"SUPERVISOR_URL": "http://koala:30320/mcp"
|
||||
"brain": {
|
||||
"type": "http",
|
||||
"url": "https://brain-mcp.d-ma.be/mcp",
|
||||
"headers": {
|
||||
"Authorization": "Bearer ${BRAIN_MCP_TOKEN}"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
315
AGENTS.md
Normal file
@@ -0,0 +1,315 @@
|
||||
# Agent context — Mathias workspace
|
||||
|
||||
<!-- Canonical root context for all AI coding agents.
|
||||
Lives at: ~/dev/.context/AGENT.md
|
||||
Applies to every project under ~/dev/ unless overridden.
|
||||
|
||||
Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
|
||||
Project-level context in .context/PROJECT.md layers on top of this. -->
|
||||
|
||||
## Who I am
|
||||
|
||||
I'm Mathias, a digital product manager and technology consultant based in Sweden.
|
||||
I build software, research emerging tech, and deliver consulting engagements
|
||||
for clients under NDA. I work across AI/ML, financial automation, web applications,
|
||||
and climate/sustainability tech.
|
||||
|
||||
## How I work with agents
|
||||
|
||||
- I think like a product manager — I care about *why* before *how*
|
||||
- I want agents to be opinionated and push back, not just execute blindly
|
||||
- I prefer concise responses; skip ceremony and get to the point
|
||||
- When I say "build this", I mean production-quality with tests, not a demo
|
||||
- Ask me before making irreversible changes or adding heavy dependencies
|
||||
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK
|
||||
|
||||
## Behavior rules
|
||||
|
||||
These rules apply to every task across every project, regardless of harness.
|
||||
|
||||
1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
|
||||
Think before coding; if the problem is unclear, ask or state assumptions before acting.
|
||||
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
|
||||
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
|
||||
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
|
||||
files, and formatting alone. Diffs should be small and reviewable.
|
||||
4. **Goal-driven execution.** Define clear success criteria up front for every task.
|
||||
Loop — implement, verify, refine — until those criteria are met. Don't claim
|
||||
completion without evidence (tests pass, command output, observed behavior).
|
||||
5. **Trunk-Based Development — commit directly to main.** Every commit is one
|
||||
logical change (one tool, one fix, one test) with passing tests. Main is always
|
||||
deployable. Never create long-lived feature branches.
|
||||
|
||||
**Exception — parallel agents on same repo:** If another agent is known to be
|
||||
actively working on the same repo simultaneously, create a short-lived branch
|
||||
(`agent/<description>`), finish the task, and merge to main within the same
|
||||
session. Do not leave agent branches open between sessions.
|
||||
|
||||
**Exception — external contributor or client four-eyes requirement:** Use
|
||||
PR flow only when a human reviewer outside the project is required. Document
|
||||
the reason in PROJECT.md.
|
||||
|
||||
## Default stack
|
||||
|
||||
| Layer | Default | Fallback | Last resort |
|-------|---------|----------|-------------|
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
|
||||
|
||||
Exploratory: Rust, Zig — I'll tell you when I want these.
|
||||
|
||||
## Code conventions
|
||||
|
||||
- **Go style**: golines, gofumpt, golangci-lint
|
||||
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
|
||||
- **Naming**: stdlib conventions, no stuttering
|
||||
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
|
||||
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
|
||||
one logical change per commit, CI is the quality gate
|
||||
- **Never**: long-lived feature branches, PRs for solo work, direct push without
|
||||
passing `task check` locally first
|
||||
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
|
||||
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
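
A minimal sketch of two of the conventions above: wrapping errors with `%w`, and parsing env-var config into a typed struct at startup. The `Config` fields, the `EXAMPLE_PORT` variable, and the `ErrMissingEnv` sentinel are hypothetical names chosen for illustration; they are not taken from any project in this workspace.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Config is a hypothetical typed configuration struct.
type Config struct {
	Port        int
	LiteLLMBase string
}

// ErrMissingEnv is a hypothetical sentinel for required variables.
var ErrMissingEnv = errors.New("missing required environment variable")

// LoadConfig parses environment variables once, at startup. The lookup
// function is injected (constructor-style) so tests can pass a map.
func LoadConfig(getenv func(string) string) (Config, error) {
	portStr := getenv("EXAMPLE_PORT") // hypothetical variable name
	if portStr == "" {
		portStr = "3300"
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		// Wrap with context and return; the caller decides whether to log.
		return Config{}, fmt.Errorf("parse EXAMPLE_PORT: %w", err)
	}
	base := getenv("LITELLM_BASE_URL")
	if base == "" {
		return Config{}, fmt.Errorf("load config: LITELLM_BASE_URL: %w", ErrMissingEnv)
	}
	return Config{Port: port, LiteLLMBase: base}, nil
}

func main() {
	// A map-backed lookup stands in for os.Getenv so the sketch runs anywhere.
	env := map[string]string{
		"EXAMPLE_PORT":     "8080",
		"LITELLM_BASE_URL": "http://litellm.example:4000",
	}
	cfg, err := LoadConfig(func(k string) string { return env[k] })
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("port=%d litellm=%s\n", cfg.Port, cfg.LiteLLMBase)
}
```

In a real binary the call would likely be `LoadConfig(os.Getenv)`, with the error reported once at the top level rather than logged at every layer.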
|
||||
|
||||
## Infrastructure
|
||||
|
||||
Three machines on Tailscale:
|
||||
|
||||
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
|
||||
|
||||
- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
|
||||
- **Orchestration**: k3s cluster across all three machines
|
||||
- **Networking**: Tailscale mesh
|
||||
|
||||
## Project landscape
|
||||
|
||||
All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).
|
||||
|
||||
Organized in thematic folders:
|
||||
|
||||
| Folder | Focus | Count |
|--------|-------|-------|
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |
|
||||
|
||||
See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
|
||||
|
||||
### Key active projects
|
||||
|
||||
- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
|
||||
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
|
||||
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
|
||||
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
|
||||
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
|
||||
|
||||
## Knowledge base — actively use it
|
||||
|
||||
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
|
||||
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
|
||||
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
|
||||
reference material — query it actively, not just when explicitly told.**
|
||||
|
||||
### When to query (treat as a reflex)
|
||||
|
||||
- **Before** starting a non-trivial task — search for prior art with the symptom
|
||||
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
|
||||
- **When debugging** — search for the error string, the stack frame, the affected
|
||||
service. Past you may have already paid this tax.
|
||||
- **Before adopting** a pattern, library, framework, or model name — check if it
|
||||
was tried and rejected, or what the integration footguns are.
|
||||
- **When making architectural decisions** — search for the domain + "ADR" or
|
||||
"decision" to find prior reasoning before re-deriving it.
|
||||
- **When a recommendation feels novel** — challenge yourself: "has this been
|
||||
documented?" The brain often has it.
|
||||
|
||||
### When to write
|
||||
|
||||
After you discover something that **future-you would forget** and that **isn't
|
||||
recoverable from the code, git log, or PR description alone**:
|
||||
|
||||
- Bugs whose root cause is non-obvious and generalisable beyond this project.
|
||||
- Framework / library / model-name quirks that bit you and would bite anyone.
|
||||
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
|
||||
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
|
||||
|
||||
DON'T write project status, sprint progress, PR summaries, or "what I did this
|
||||
session" — those rot fast and the originals are in git/gitea anyway. Brain
|
||||
entries that age well are about *why*, *how to avoid*, and *what to do when*.
|
||||
|
||||
### How to access (per harness)
|
||||
|
||||
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files | |
|
||||
|
||||
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
|
||||
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
|
||||
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
|
||||
on the koala k3s cluster; don't hardcode local-only model names into the
|
||||
berget URL (see knowledge entry on namespace mismatches).
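
For the HTTP-only row of the table above, a request could be built roughly as follows. This is a sketch under assumptions: it assumes the token travels as a `Bearer` value in the `Authorization` header, and it only constructs the request, since the response shape is not documented here. `NewBrainQueryRequest` and `queryBody` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// queryBody mirrors the documented request shape {"query":"..."}.
type queryBody struct {
	Query string `json:"query"`
}

// NewBrainQueryRequest builds, but does not send, a POST to <baseURL>/query.
// Assumption: the static token is sent as a Bearer Authorization header.
func NewBrainQueryRequest(baseURL, token, query string) (*http.Request, error) {
	payload, err := json.Marshal(queryBody{Query: query})
	if err != nil {
		return nil, fmt.Errorf("marshal query: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/query", bytes.NewReader(payload))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	// The token would normally come from the BRAIN_MCP_TOKEN environment variable.
	req, err := NewBrainQueryRequest("https://brain-mcp.d-ma.be", "example-token", "pgvector connection refused")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Sending the request is then a matter of passing it to an `http.Client` with a timeout. Note the client-work rule still applies: a script like this must not carry client data to the public endpoint.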
|
||||
|
||||
### Quick reflex checks
|
||||
|
||||
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
|
||||
|
||||
- "I think the issue might be..."
|
||||
- "Let me try X and see..."
|
||||
- "I'll just write a script to..."
|
||||
- "This is probably a new bug..."
|
||||
- "Has anyone done this before?" — *yes, probably, go check.*
|
||||
|
||||
## Client work rules
|
||||
|
||||
When working on a project tagged with a client name:
|
||||
1. Never send code, data, or context to cloud APIs — use local models only
|
||||
2. Never reference other client projects or their data
|
||||
3. Keep all artifacts within the client's git org / directory
|
||||
4. Treat everything as confidential unless told otherwise
|
||||
|
||||
## Harness-agnostic principles
|
||||
|
||||
This context is designed to work with any AI coding tool:
|
||||
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
|
||||
- Pi Coding Agent, Mistral Vibe, Antigravity
|
||||
- Any tool that accepts a system prompt or reads a markdown context file
|
||||
|
||||
The canonical source is always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
|
||||
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.
|
||||
|
||||
## How context propagates
|
||||
|
||||
Canonical sources of truth:
|
||||
- Universal: `~/dev/.context/AGENT.md` (this file)
|
||||
- Project: `<repo>/.context/PROJECT.md` (per-repo)
|
||||
|
||||
Derived files (committed, regenerated by `task context:sync`):
|
||||
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
|
||||
`.context/system-prompt.txt`
|
||||
|
||||
Workflow:
|
||||
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
|
||||
derived together. Push.
|
||||
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
|
||||
uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
|
||||
Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
|
||||
3. `task check` runs `context:sync` then asserts `git status --porcelain`
|
||||
is empty over the derived files (catches both modified-tracked drift
|
||||
and missing-untracked adapters). A drift fails the check with a
|
||||
message telling you to stage the regenerated files.
|
||||
|
||||
Behavior rules in this file and per-project rules in `PROJECT.md` apply
|
||||
unconditionally on every host, every harness.
|
||||
|
||||
## Engineering Skills
|
||||
|
||||
Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.
|
||||
|
||||
See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.
|
||||
|
||||
Key skills:
|
||||
- **TDD**: always write tests first — load `tdd` skill
|
||||
- **Code Review**: load `code-review` skill before any review
|
||||
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
|
||||
- **Problem first**: load `problem-analysis` skill before coding non-trivial features
|
||||
|
||||
---
|
||||
|
||||
# Project context
|
||||
|
||||
<!-- Canonical project context. Edit this, run `task context:sync`.
|
||||
Root agent context from ~/dev/.context/AGENT.md is automatically
|
||||
prepended for harnesses that don't walk the directory tree. -->
|
||||
|
||||
## Identity
|
||||
|
||||
- **Name**: supervisor
|
||||
- **Owner**: Mathias
|
||||
- **Client**: personal
|
||||
- **Repo**:
|
||||
- **Status**: active
|
||||
|
||||
## Stack
|
||||
|
||||
- **Primary language**: Go
|
||||
- **UI layer**: HTMX + Templ (when applicable)
|
||||
- **Fallback languages**: Python, TypeScript (justify in PR if used)
|
||||
- **Build**: Task (taskfile.dev), not Make
|
||||
- **Containers**: Docker (compose for dev, k3s for deploy)
|
||||
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
|
||||
|
||||
## Conventions
|
||||
|
||||
### Code style
|
||||
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
|
||||
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
|
||||
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
|
||||
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
|
||||
|
||||
### Architecture preferences
|
||||
- Prefer standard library over frameworks (net/http over gin/echo)
|
||||
- Dependency injection via constructor functions, not containers
|
||||
- Configuration via environment variables, parsed at startup into a typed struct
|
||||
- Structured logging via `slog`
|
||||
|
||||
### Git
|
||||
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
|
||||
- Branch naming: `feat/short-description`, `fix/short-description`
|
||||
- PRs: one concern per PR, description explains *why* not *what*
|
||||
|
||||
### Security
|
||||
- No secrets in code, ever — use env vars or SOPS-encrypted files
|
||||
- Client data never leaves local network unless explicitly cleared
|
||||
- Dependencies: audit with `govulncheck` before adding
|
||||
|
||||
## MCP endpoints
|
||||
|
||||
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
|
||||
|
||||
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
|
||||
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
|
||||
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
|
||||
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
|
||||
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
|
||||
(opt-in). Only `mode client-local` registers this endpoint.
|
||||
|
||||
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
|
||||
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
|
||||
the routing pod; brain tools moved to the brain MCP.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
|
||||
for shell scripts and non-MCP clients.
|
||||
|
||||
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
|
||||
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
|
||||
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
|
||||
|
||||
## Agent instructions
|
||||
|
||||
When acting as a coding agent on this project:
|
||||
|
||||
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
|
||||
2. Run `task check` before committing (lint + test + vet)
|
||||
3. If unsure about a convention, check `DECISIONS.md` or ask
|
||||
4. Never modify files outside the project root without explicit permission
|
||||
5. When adding a dependency, explain why in the commit message
|
||||
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM
|
||||
82
CLAUDE.md
Normal file
@@ -0,0 +1,82 @@
|
||||
# Project context
|
||||
|
||||
<!-- Canonical project context. Edit this, run `task context:sync`.
|
||||
Root agent context from ~/dev/.context/AGENT.md is automatically
|
||||
prepended for harnesses that don't walk the directory tree. -->
|
||||
|
||||
## Identity
|
||||
|
||||
- **Name**: supervisor
|
||||
- **Owner**: Mathias
|
||||
- **Client**: personal
|
||||
- **Repo**:
|
||||
- **Status**: active
|
||||
|
||||
## Stack
|
||||
|
||||
- **Primary language**: Go
|
||||
- **UI layer**: HTMX + Templ (when applicable)
|
||||
- **Fallback languages**: Python, TypeScript (justify in PR if used)
|
||||
- **Build**: Task (taskfile.dev), not Make
|
||||
- **Containers**: Docker (compose for dev, k3s for deploy)
|
||||
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
|
||||
|
||||
## Conventions
|
||||
|
||||
### Code style
|
||||
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
|
||||
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
|
||||
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
|
||||
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
|
||||
|
||||
### Architecture preferences
|
||||
- Prefer standard library over frameworks (net/http over gin/echo)
|
||||
- Dependency injection via constructor functions, not containers
|
||||
- Configuration via environment variables, parsed at startup into a typed struct
|
||||
- Structured logging via `slog`
|
||||
|
||||
### Git
|
||||
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
|
||||
- Branch naming: `feat/short-description`, `fix/short-description`
|
||||
- PRs: one concern per PR, description explains *why* not *what*
|
||||
|
||||
### Security
|
||||
- No secrets in code, ever — use env vars or SOPS-encrypted files
|
||||
- Client data never leaves local network unless explicitly cleared
|
||||
- Dependencies: audit with `govulncheck` before adding
|
||||
|
||||
## MCP endpoints
|
||||
|
||||
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
|
||||
|
||||
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
|
||||
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
|
||||
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
|
||||
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
|
||||
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
|
||||
(opt-in). Only `mode client-local` registers this endpoint.
|
||||
|
||||
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
|
||||
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
|
||||
the routing pod; brain tools moved to the brain MCP.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
|
||||
for shell scripts and non-MCP clients.
|
||||
|
||||
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
|
||||
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
|
||||
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
|
||||
|
||||
## Agent instructions
|
||||
|
||||
When acting as a coding agent on this project:
|
||||
|
||||
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
|
||||
2. Run `task check` before committing (lint + test + vet)
|
||||
3. If unsure about a convention, check `DECISIONS.md` or ask
|
||||
4. Never modify files outside the project root without explicit permission
|
||||
5. When adding a dependency, explain why in the commit message
|
||||
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM
|
||||
44
DECISIONS.md
@@ -67,6 +67,50 @@ Record *why* things are the way they are. Future-you will thank present-you.
|
||||
|
||||
---
|
||||
|
||||
## Plan 6: routing pod reuses internal/skills/{review,debug,retrospective,trainer}
|
||||
|
||||
Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of
|
||||
the four cost-routable skill packages. The routing pod constructs each
|
||||
skill via `<pkg>.New(Config{...})` and hands it `routing.Router.Run` as
|
||||
the `CompleteFunc`.
|
||||
|
||||
**Preserved code (do not delete):**
|
||||
- `internal/skills/{review,debug,retrospective,trainer}/`
|
||||
- `internal/registry`, `internal/mcp`, `internal/exec/litellm.go`
|
||||
- `internal/routing/`, `cmd/routing/`
|
||||
|
||||
---
|
||||
|
||||
## Plan 7: supervisor pod retired (2026-05-12)
|
||||
|
||||
**What was deleted:** `cmd/supervisor/`, `internal/skills/{tdd,spec}/`,
|
||||
root `Dockerfile`, supervisor k8s manifests (Deployment, Service, Ingress,
|
||||
NodePort 30320), `supervisor` entry removed from all `.mcp.json` configs.
|
||||
|
||||
**Coverage:** `tdd`/`spec` → SKILL.md files in `~/dev/.skills/`; `review`,
|
||||
`debug`, `retrospective`, `trainer` → routing pod; `brain_*`/`session_log` →
|
||||
brain MCP; `tier` → `hyperguild tier` CLI.
|
||||
|
||||
---
|
||||
|
||||
## 2026-05-12 — brain_answer and brain_classify: LLM routing via berget.ai → iguana
|
||||
|
||||
**Context:** Brain MCP returned raw BM25 excerpts with no synthesis. Adding
|
||||
LLM-backed tools enables Q&A and ingestion enrichment without a separate service.
|
||||
|
||||
**Decision:** Two new MCP tools in the ingestion service (`ingestion/internal/mcp/`):
|
||||
- `brain_answer(query)` — BM25 top-10 → LLM synthesis → answer + sources
|
||||
- `brain_classify(text)` — LLM classifies doc into type/title/tags
|
||||
|
||||
Primary LLM: berget.ai `gemma4:31b` (EU cloud, spend tokens while available).
|
||||
Fallback: iguana `gemma4:31b` (local Ollama). Reranker deferred to follow-up.
|
||||
Router lives in `ingestion/internal/llm.Router`; opt-in via `BRAIN_LLM_PRIMARY_URL`.
|
||||
|
||||
**Consequences:** Brain becomes a knowledge assistant, not just a search index.
|
||||
When berget.ai tokens run out, flip `BRAIN_LLM_PRIMARY_URL` to iguana.
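
The primary-then-fallback behaviour described in this decision can be sketched as below. This is not the code in `ingestion/internal/llm.Router`; `FallbackRouter`, `Completer`, `alwaysFail`, and `echo` are hypothetical names, and the sketch assumes the fallback is tried on any primary error.

```go
package main

import (
	"errors"
	"fmt"
)

// Completer is a hypothetical signature for one LLM completion call.
type Completer func(prompt string) (string, error)

// FallbackRouter tries Primary first and uses Fallback if Primary errors.
type FallbackRouter struct {
	Primary  Completer
	Fallback Completer
}

// Complete returns the first successful answer. If both backends fail,
// the returned error wraps both causes.
func (r FallbackRouter) Complete(prompt string) (string, error) {
	out, err := r.Primary(prompt)
	if err == nil {
		return out, nil
	}
	out, fbErr := r.Fallback(prompt)
	if fbErr != nil {
		return "", fmt.Errorf("complete: %w", errors.Join(err, fbErr))
	}
	return out, nil
}

// alwaysFail simulates an unreachable or exhausted primary endpoint.
func alwaysFail(string) (string, error) {
	return "", errors.New("primary unavailable")
}

// echo simulates a working local fallback model.
func echo(prompt string) (string, error) {
	return "fallback: " + prompt, nil
}

func main() {
	r := FallbackRouter{Primary: alwaysFail, Fallback: echo}
	out, err := r.Complete("summarise the top-10 BM25 hits")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Keeping both causes in the wrapped error means a total outage can be diagnosed from a single log line.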
|
||||
|
||||
---
|
||||
|
||||
## 2026-04-08 — Mistral Vibe gets its own adapter
|
||||
|
||||
**Context**: Vibe doesn't read `AGENTS.md` — it uses `~/.vibe/prompts/` and `~/.vibe/agents/` with TOML config.
|
||||
|
||||
50
Dockerfile
@@ -1,50 +0,0 @@
|
||||
# syntax=docker/dockerfile:1
|
||||
|
||||
# ── Build stage ───────────────────────────────────────────────────────────────
|
||||
FROM golang:1.26-bookworm AS builder
|
||||
|
||||
ARG VERSION=dev
|
||||
WORKDIR /src
|
||||
|
||||
COPY go.mod go.sum ./
|
||||
RUN go mod download
|
||||
|
||||
COPY . .
|
||||
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
|
||||
go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
|
||||
-o /out/supervisor ./cmd/supervisor
|
||||
|
||||
# ── Runtime stage ─────────────────────────────────────────────────────────────
|
||||
# Node.js 22 slim — needed for claude CLI subprocess
|
||||
FROM node:22-slim
|
||||
|
||||
# Install claude CLI (provides the `claude` binary the supervisor shells out to)
|
||||
RUN npm install -g @anthropic-ai/claude-code \
|
||||
&& claude --version \
|
||||
&& echo "claude CLI installed"
|
||||
|
||||
# Copy supervisor binary
|
||||
COPY --from=builder /out/supervisor /usr/local/bin/supervisor
|
||||
|
||||
# Bake in config (models.yaml + skill discipline files)
|
||||
COPY config/ /app/config/
|
||||
|
||||
# Run as non-root
|
||||
RUN groupadd -r supervisor && useradd -r -g supervisor -d /app supervisor
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# brain/ is writable state — mount a PersistentVolume here
|
||||
VOLUME /app/brain
|
||||
|
||||
ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
|
||||
ENV SUPERVISOR_MODELS_FILE=/app/config/models.yaml
|
||||
ENV SUPERVISOR_BRAIN_DIR=/app/brain
|
||||
ENV SUPERVISOR_SESSIONS_DIR=/app/brain/sessions
|
||||
ENV SUPERVISOR_PORT=3200
|
||||
|
||||
USER supervisor
|
||||
|
||||
EXPOSE 3200
|
||||
|
||||
ENTRYPOINT ["/usr/local/bin/supervisor"]
|
||||
30
Dockerfile.routing
Normal file
@@ -0,0 +1,30 @@
|
||||
# syntax=docker/dockerfile:1
|
||||
|
||||
# ── Build stage ───────────────────────────────────────────────────────────────
|
||||
FROM golang:1.26-bookworm AS builder
|
||||
|
||||
ARG VERSION=dev
|
||||
WORKDIR /src
|
||||
|
||||
COPY go.mod go.sum ./
|
||||
RUN go mod download
|
||||
|
||||
COPY . .
|
||||
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
|
||||
go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
|
||||
-o /out/routing ./cmd/routing
|
||||
|
||||
# ── Runtime stage ─────────────────────────────────────────────────────────────
|
||||
FROM gcr.io/distroless/base-debian12
|
||||
|
||||
COPY --from=builder /out/routing /usr/local/bin/routing
|
||||
COPY config/ /app/config/
|
||||
|
||||
ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
|
||||
ENV ROUTING_PORT=3210
|
||||
|
||||
EXPOSE 3210
|
||||
|
||||
USER 65532:65532
|
||||
|
||||
ENTRYPOINT ["/usr/local/bin/routing"]
|
||||
43
README.md
@@ -10,10 +10,12 @@ into a searchable brain.
|
||||
```
|
||||
Your Claude Code session (in any project)
|
||||
│
|
||||
│ MCP tools (over stdio bridge → HTTP)
|
||||
▼
|
||||
supervisor :3200 — skill workers: tdd, retrospective
|
||||
ingestion :3300 — brain HTTP API: query wiki, write notes
|
||||
│ MCP over HTTP (Tailscale)
|
||||
├──▶ supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
|
||||
├──▶ routing :3210 (NodePort 30310 on koala) — Mode 2 only: review, debug, retrospective, trainer
|
||||
└──▶ brain :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
|
||||
│
|
||||
└─ also serves the legacy REST endpoints (/query, /write, /ingest, …)
|
||||
│
|
||||
▼
|
||||
brain/
|
||||
@@ -55,18 +57,28 @@ Create `.mcp.json` in your project root:
|
||||
{
|
||||
"mcpServers": {
|
||||
"supervisor": {
|
||||
"command": "/Users/mathias/dev/AI/supervisor/bin/supervisor-bridge",
|
||||
"env": {
|
||||
"SUPERVISOR_URL": "http://localhost:3200/mcp"
|
||||
}
|
||||
"type": "http",
|
||||
"url": "http://koala:30320/mcp"
|
||||
},
|
||||
"brain": {
|
||||
"type": "http",
|
||||
"url": "http://koala:30330/mcp"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Build the bridge binary once: `task bridge:build`
|
||||
Two MCP servers are exposed today, both reachable over Tailscale:
|
||||
|
||||
Then open Claude Code in your project — run `/mcp` to confirm `supervisor` is listed.
|
||||
- **`supervisor`** at `koala:30320` — skill workers (`tdd_red/green/refactor`,
|
||||
`review`, `debug`, `spec`, `retrospective`, `trainer`, `tier`).
|
||||
- **`brain`** at `koala:30330` — knowledge access (`brain_query`, `brain_write`,
|
||||
`brain_ingest`, `brain_ingest_raw`) and `session_log`. Hosted by the ingestion
|
||||
service directly, no separate pod.
|
||||
|
||||
No local binary or stdio shim is required — Claude Code talks to both via HTTP.
|
||||
|
||||
Open Claude Code in your project — run `/mcp` to confirm both servers are listed.
|
||||
|
||||
## A typical TDD session
|
||||
|
||||
@@ -100,6 +112,17 @@ The supervisor probes connectivity at call time:
|
||||
| `SUPERVISOR_SESSIONS_DIR` | `./brain/sessions` | JSONL session logs |
| `INGEST_BASE_URL` | `http://localhost:3300` | Supervisor → ingestion |
| `LITELLM_BASE_URL` | — | LiteLLM proxy for Tier 2 model routing |
| `SUPERVISOR_MCP_TOKEN` | — | Optional bearer token for the supervisor MCP HTTP endpoint; when empty, no auth is enforced |
| `ROUTING_PORT` | `3210` | Routing pod's listen port |
| `ROUTING_MCP_TOKEN` | — | Optional bearer token for the routing MCP HTTP endpoint |
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | Routing pod → brain (in-cluster) |
| `HYPERGUILD_FAST_MODEL` | `koala/qwen35-9b-fast` | Fast model for high-pass-rate skill calls |
| `HYPERGUILD_THINKING_MODEL` | `iguana/gemma4-26b` | Thinking model for low-pass-rate skill calls |
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At or above this pass rate, route to the fast model |
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below this pass rate, route to the thinking model. Between CEIL and FLOOR is the sample band. |
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill pass-rate cache TTL |
|
||||
|
||||
> **Operator note:** LiteLLM at `LITELLM_BASE_URL` must register both `HYPERGUILD_FAST_MODEL` and `HYPERGUILD_THINKING_MODEL` for routing to do useful work. If a model is missing, LiteLLM returns 4xx, the routing pod's fast route fails, the fail-open retry on the thinking model likely also fails (since both are missing), and the only signal is `final_status: "fail"` on `_routing` entries in the brain.
|
||||
|
||||
## Phase 2 (planned)
|
||||
|
||||
|
||||
44
Taskfile.yml
@@ -12,9 +12,6 @@ tasks:
|
||||
desc: Regenerate all harness-specific context files
|
||||
cmds:
|
||||
- bash scripts/context-sync.sh
|
||||
sources:
|
||||
- .context/PROJECT.md
|
||||
- .skills/*/SKILL.md
|
||||
|
||||
context:sync:claude:
|
||||
cmds: [bash scripts/context-sync.sh claude]
|
||||
@@ -42,6 +39,22 @@ tasks:
|
||||
cmds:
|
||||
- go run ./cmd/supervisor
|
||||
|
||||
hyperguild:dev:
|
||||
desc: Run hyperguild CLI from source (e.g. task hyperguild:dev -- tier)
|
||||
cmds:
|
||||
- go run ./cmd/hyperguild {{.CLI_ARGS}}
|
||||
|
||||
hyperguild:build:
|
||||
desc: Build the hyperguild binary into ./bin/hyperguild
|
||||
cmds:
|
||||
- mkdir -p bin
|
||||
- go build -o bin/hyperguild ./cmd/hyperguild
|
||||
|
||||
hyperguild:install:
|
||||
desc: Install hyperguild into $GOBIN
|
||||
cmds:
|
||||
- go install ./cmd/hyperguild
|
||||
|
||||
ingestion:dev:
|
||||
desc: Run ingestion server in development mode
|
||||
dir: ingestion
|
||||
@@ -57,7 +70,6 @@ tasks:
|
||||
desc: Build all binaries
|
||||
cmds:
|
||||
- task: supervisor:build
|
||||
- task: bridge:build
|
||||
- task: ingestion:build
|
||||
|
||||
supervisor:build:
|
||||
@@ -65,11 +77,6 @@ tasks:
|
||||
cmds:
|
||||
- go build -trimpath -ldflags="-s -w -X main.version={{.VERSION}}" -o bin/supervisor ./cmd/supervisor
|
||||
|
||||
bridge:build:
|
||||
desc: Build stdio↔HTTP bridge for Claude Code MCP integration
|
||||
cmds:
|
||||
- go build -trimpath -ldflags="-s -w" -o bin/supervisor-bridge ./cmd/bridge
|
||||
|
||||
ingestion:build:
|
||||
desc: Build ingestion server binary
|
||||
dir: ingestion
|
||||
@@ -79,8 +86,20 @@ tasks:
|
||||
# ── Quality ────────────────────────────────────────────────────────────────
|
||||
|
||||
check:
|
||||
desc: Run all checks (lint + test + vet) across all modules
|
||||
desc: Run all checks (context freshness + lint + test + vet) across all modules
|
||||
cmds:
|
||||
- task: context:sync
|
||||
- cmd: |
|
||||
drift=$(git status --porcelain -- AGENTS.md CLAUDE.md .cursorrules .aider.conventions.md .context/system-prompt.txt 2>/dev/null)
|
||||
if [ -n "$drift" ]; then
|
||||
echo "ERROR: derived adapters drifted from canonical context." >&2
|
||||
echo "$drift" >&2
|
||||
echo "" >&2
|
||||
echo "Run: git add AGENTS.md CLAUDE.md .cursorrules .aider.conventions.md .context/system-prompt.txt" >&2
|
||||
echo " git commit -m 'chore: re-sync context adapters'" >&2
|
||||
exit 1
|
||||
fi
|
||||
echo "✓ context: canonical and adapters are in sync"
|
||||
- task: lint
|
||||
- task: test
|
||||
- task: vet
|
||||
@@ -109,6 +128,11 @@ tasks:
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | jq .
|
||||
|
||||
smoke:routing:
|
||||
desc: Boot the routing pod against live LiteLLM + brain and verify _routing logs land
|
||||
cmds:
|
||||
- bash scripts/smoke-routing.sh
|
||||
|
||||
# ── Git / Release ──────────────────────────────────────────────────────────
|
||||
|
||||
tag:
|
||||
|
||||
@@ -1,59 +0,0 @@
|
||||
// bridge is a stdio↔HTTP adapter that lets Claude Code connect to the
|
||||
// supervisor MCP server via the stdio transport.
|
||||
//
|
||||
// Claude Code spawns this binary as a subprocess and communicates over
|
||||
// stdin/stdout. Each newline-delimited JSON-RPC message from stdin is
|
||||
// forwarded to the supervisor HTTP server and the response is written back.
|
||||
//
|
||||
// Usage:
|
||||
//
|
||||
// SUPERVISOR_URL=http://localhost:3200/mcp bridge
|
||||
package main
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"bytes"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"os"
|
||||
)
|
||||
|
||||
func main() {
|
||||
url := os.Getenv("SUPERVISOR_URL")
|
||||
if url == "" {
|
||||
url = "http://localhost:3200/mcp"
|
||||
}
|
||||
|
||||
client := &http.Client{}
|
||||
scanner := bufio.NewScanner(os.Stdin)
|
||||
scanner.Buffer(make([]byte, 1024*1024), 1024*1024)
|
||||
|
||||
for scanner.Scan() {
|
||||
line := scanner.Bytes()
|
||||
if len(bytes.TrimSpace(line)) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(line))
|
||||
if err != nil {
|
||||
fmt.Fprintf(os.Stderr, "bridge: build request: %v\n", err)
|
||||
continue
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
|
||||
resp, err := client.Do(req)
|
||||
if err != nil {
|
||||
fmt.Fprintf(os.Stderr, "bridge: request failed: %v\n", err)
|
||||
continue
|
||||
}
|
||||
_, _ = io.Copy(os.Stdout, resp.Body)
|
||||
_ = resp.Body.Close()
|
||||
_, _ = os.Stdout.Write([]byte("\n"))
|
||||
}
|
||||
|
||||
if err := scanner.Err(); err != nil {
|
||||
fmt.Fprintf(os.Stderr, "bridge: scanner: %v\n", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
}
|
||||
140
cmd/hyperguild/README.md
Normal file
@@ -0,0 +1,140 @@
|
||||
# hyperguild CLI
|
||||
|
||||
A small Go binary for tier probing, brain HTTP REST access, and
|
||||
`.mcp.json` mode bootstrap. Replaces the supervisor's `tier` MCP and
|
||||
gives shell scripts a stable interface to the brain.
|
||||
|
||||
## Install
|
||||
|
||||
```bash
|
||||
task hyperguild:install
|
||||
# or: go install ./cmd/hyperguild
|
||||
```
|
||||
|
||||
The binary lands in `$(go env GOBIN)`, or in `$(go env GOPATH)/bin` when
|
||||
`GOBIN` is unset (typically `~/go/bin/hyperguild`). Make sure that directory is on your `PATH`.
|
||||
|
||||
## Subcommands
|
||||
|
||||
### `hyperguild tier`
|
||||
|
||||
Probes Anthropic and LiteLLM and reports the current operating tier.
|
||||
|
||||
```bash
|
||||
$ hyperguild tier
|
||||
tier 1 (full-online) managed_agents=true
|
||||
|
||||
$ hyperguild tier --json
|
||||
{
|
||||
"tier": 1,
|
||||
"label": "full-online",
|
||||
"available_models": null,
|
||||
"managed_agents": true
|
||||
}
|
||||
```
|
||||
|
||||
Probe URLs are read from environment:
|
||||
|
||||
| Var | Default |
|
||||
|-----------------------|-------------------------------|
|
||||
| `ANTHROPIC_PROBE_URL` | `https://api.anthropic.com` |
|
||||
| `LITELLM_BASE_URL` | (empty → falls through to the airplane tier) |
|
||||
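The probe order can be pictured with a small Go sketch. Everything in it is an assumption made for illustration, not the CLI's actual implementation: the `reachable`, `classify`, and `envOr` helpers, the tier numbers 2 and 3, and the `local-only` label are hypothetical. Only the `full-online` label and the airplane fallthrough come from the text above.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"os"
	"time"
)

// reachable reports whether url answers an HTTP request at all.
// An empty url counts as unreachable. (Hypothetical helper.)
func reachable(ctx context.Context, client *http.Client, url string) bool {
	if url == "" {
		return false
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return false
	}
	resp, err := client.Do(req)
	if err != nil {
		return false
	}
	_ = resp.Body.Close()
	return true
}

// classify maps probe results to a tier. Tier numbers 2 and 3 and the
// "local-only" label are assumptions made for this sketch.
func classify(anthropicOK, litellmOK bool) (int, string) {
	switch {
	case anthropicOK:
		return 1, "full-online"
	case litellmOK:
		return 2, "local-only"
	default:
		return 3, "airplane"
	}
}

// envOr returns the value of key, or def when the variable is empty.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	client := &http.Client{}
	anthropicOK := reachable(ctx, client, envOr("ANTHROPIC_PROBE_URL", "https://api.anthropic.com"))
	litellmOK := reachable(ctx, client, os.Getenv("LITELLM_BASE_URL"))

	tier, label := classify(anthropicOK, litellmOK)
	fmt.Printf("tier %d (%s)\n", tier, label)
}
```

Keeping the classification separate from the network calls makes the tier decision testable without any live endpoint.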
|
||||
### `hyperguild brain query <topic>`
|
||||
|
||||
BM25 search over the brain's knowledge + wiki entries.
|
||||
|
||||
```bash
|
||||
$ hyperguild brain query "find -H symlink"
|
||||
knowledge/2026-05-03-find-h-not-l-symlinked-root.md score=12 Use find -H, not find -L
|
||||
...
|
||||
```
|
||||
|
||||
Flags:
|
||||
|
||||
- `--limit N` — max results (default 5)
|
||||
- `--json` — emit the raw response envelope
|
||||
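The handler in `brain.go` parses its arguments with a standard `flag.FlagSet`, and Go's `flag` package stops parsing at the first non-flag argument. In practice that means `--limit` and `--json` need to come before the topic. A minimal sketch of the behavior, using a hypothetical `parseQuery` helper that is not part of the CLI:

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseQuery mirrors the flag set-up of a query-style subcommand.
// It is a hypothetical helper written for this sketch.
func parseQuery(args []string) (limit int, topic string, err error) {
	fs := flag.NewFlagSet("brain query", flag.ContinueOnError)
	fs.SetOutput(io.Discard)
	fs.IntVar(&limit, "limit", 5, "maximum number of results")
	if err = fs.Parse(args); err != nil {
		return 0, "", err
	}
	if fs.NArg() < 1 {
		return 0, "", fmt.Errorf("topic required")
	}
	return limit, fs.Arg(0), nil
}

func main() {
	// Flag before the topic: parsed as expected.
	limit, topic, _ := parseQuery([]string{"--limit", "12", "symlink"})
	fmt.Println(limit, topic) // 12 symlink

	// Flag after the topic: parsing stops at "symlink", so the default
	// limit applies and "--limit 12" is left as extra positional arguments.
	limit, topic, _ = parseQuery([]string{"symlink", "--limit", "12"})
	fmt.Println(limit, topic) // 5 symlink
}
```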
|
||||
### `hyperguild brain write <type> <slug>`
|
||||
|
||||
Reads markdown from stdin, writes a knowledge entry.
|
||||
|
||||
```bash
|
||||
$ cat <<EOF | hyperguild brain write knowledge example-lesson
|
||||
# Example lesson
|
||||
|
||||
## Lesson
|
||||
...
|
||||
EOF
|
||||
knowledge/example-lesson.md
|
||||
```
|
||||
|
||||
### `hyperguild brain pass-rate <skill>`
|
||||
|
||||
Returns the pass rate for a skill over a lookback window. Computed
|
||||
on-demand from `brain/sessions/*.jsonl`.
|
||||
|
||||
```bash
|
||||
$ hyperguild brain pass-rate tdd
|
||||
tdd: 47 / 50 = 94% (window: 7d)
|
||||
|
||||
$ hyperguild brain pass-rate --window 30d --json tdd
|
||||
{
|
||||
"skill": "tdd",
|
||||
"window": "30d",
|
||||
"pass": 142,
|
||||
"fail": 8,
|
||||
"skip": 5,
|
||||
"total": 155,
|
||||
"pass_rate": 0.9467
|
||||
}
|
||||
```
|
||||
|
||||
Flags:
|
||||
|
||||
- `--window` — lookback window (default `7d`; accepts `Nh`, `Nd`)
|
||||
- `--json` — emit the raw response envelope
|
||||
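Go's `time.ParseDuration` has no day unit, so a window such as `7d` needs a little custom handling. The sketch below shows one way the `Nh` and `Nd` forms could be turned into a `time.Duration`. `parseWindow` is a hypothetical name, and the brain's real parser may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseWindow converts "Nh" or "Nd" (N a positive integer) into a duration.
// Hypothetical helper; the accepted forms follow the flag description above.
func parseWindow(s string) (time.Duration, error) {
	if len(s) < 2 {
		return 0, fmt.Errorf("invalid window %q", s)
	}
	n, err := strconv.Atoi(s[:len(s)-1])
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("invalid window %q", s)
	}
	switch s[len(s)-1] {
	case 'h':
		return time.Duration(n) * time.Hour, nil
	case 'd':
		return time.Duration(n) * 24 * time.Hour, nil
	default:
		return 0, fmt.Errorf("invalid window unit in %q", s)
	}
}

func main() {
	for _, w := range []string{"24h", "7d", "soon"} {
		d, err := parseWindow(w)
		fmt.Println(w, d, err)
	}
}
```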
|
||||
Skills with no logged invocations return zero counts and `pass_rate: null`
|
||||
(indicating "no data", distinct from "always passes").
|
||||
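For Go callers, one way to keep "no data" distinct from a 0% pass rate is to decode `pass_rate` into a pointer, which is how the `PassRateResult` type in `http.go` models it. A minimal sketch, with a trimmed-down `passRate` struct and a `describe` helper that are both hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// passRate is a reduced view of the pass-rate response for this sketch.
type passRate struct {
	Skill    string   `json:"skill"`
	Pass     int      `json:"pass"`
	Total    int      `json:"total"`
	PassRate *float64 `json:"pass_rate"` // nil when the JSON value is null
}

// describe renders a one-line summary and treats a null rate as "no data".
func describe(raw string) (string, error) {
	var r passRate
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return "", fmt.Errorf("decode pass rate: %w", err)
	}
	if r.PassRate == nil {
		return r.Skill + ": no data", nil
	}
	return fmt.Sprintf("%s: %.0f%%", r.Skill, *r.PassRate*100), nil
}

func main() {
	inputs := []string{
		`{"skill":"tdd","pass":47,"total":50,"pass_rate":0.94}`,
		`{"skill":"tdd","pass":0,"total":0,"pass_rate":null}`,
	}
	for _, raw := range inputs {
		s, err := describe(raw)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Println(s)
	}
	// Prints:
	// tdd: 94%
	// tdd: no data
}
```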
|
||||
### `hyperguild mode <cloud|client-local|sovereign>`
|
||||
|
||||
Writes a `.mcp.json` template for the chosen operating mode.
|
||||
|
||||
```bash
|
||||
$ hyperguild mode cloud --out ./.mcp.json
|
||||
wrote ./.mcp.json (mode: cloud)
|
||||
```
|
||||
|
||||
Flags:
|
||||
|
||||
- `--out PATH` — output file (default `./.mcp.json`)
|
||||
- `--force` — overwrite an existing file
|
||||
|
||||
Modes:
|
||||
|
||||
- **cloud** — brain MCP only. Claude Code with no routing.
|
||||
- **client-local** — brain + routing pod. The `routing` entry points at
|
||||
`koala:30310/mcp` (the routing pod, deployed in Plan 6). The
|
||||
`X-Hyperguild-Mode: client-local` header is forward-compat for future
|
||||
modes; the pod treats absent or unknown values as `client-local`.
|
||||
- **sovereign** — brain only, with a `_mode_note` explaining that this
|
||||
mode primarily uses Crush + LiteLLM and the `.mcp.json` is a Claude
|
||||
Code fallback for emergency offline use.
|
||||
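The `--force` behavior can be sketched with an exclusive create. The JSON shape below follows Claude Code's `mcpServers` layout, but the server name, the `/mcp` path on the brain URL, and the `writeConfig` and `demo` helpers are assumptions for illustration, not the CLI's actual template or code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// server and config model a minimal .mcp.json for this sketch only.
type server struct {
	Type string `json:"type"`
	URL  string `json:"url"`
}

type config struct {
	MCPServers map[string]server `json:"mcpServers"`
}

// writeConfig writes cfg to path. Without force it refuses to replace
// an existing file by opening with O_EXCL.
func writeConfig(path string, cfg config, force bool) error {
	data, err := json.MarshalIndent(cfg, "", "  ")
	if err != nil {
		return fmt.Errorf("marshal config: %w", err)
	}
	flags := os.O_WRONLY | os.O_CREATE | os.O_TRUNC
	if !force {
		flags = os.O_WRONLY | os.O_CREATE | os.O_EXCL
	}
	f, err := os.OpenFile(path, flags, 0o644)
	if err != nil {
		return fmt.Errorf("open %s: %w", path, err)
	}
	if _, err := f.Write(append(data, '\n')); err != nil {
		_ = f.Close()
		return fmt.Errorf("write %s: %w", path, err)
	}
	return f.Close()
}

// demo runs three writes in a temporary directory and reports which succeeded.
func demo() ([3]bool, error) {
	var ok [3]bool
	dir, err := os.MkdirTemp("", "mcp-sketch")
	if err != nil {
		return ok, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, ".mcp.json")
	cfg := config{MCPServers: map[string]server{
		"brain": {Type: "http", URL: "http://koala:30330/mcp"},
	}}
	ok[0] = writeConfig(path, cfg, false) == nil // fresh file: succeeds
	ok[1] = writeConfig(path, cfg, false) == nil // exists, no force: refused
	ok[2] = writeConfig(path, cfg, true) == nil  // exists, force: overwritten
	return ok, nil
}

func main() {
	ok, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(ok) // [true false true]
}
```

Using `O_EXCL` makes the existence check and the create a single operation, which avoids a race between checking for the file and writing it.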
|
||||
## Environment
|
||||
|
||||
| Var | Default | Used by |
|
||||
|-----------------------|--------------------------|---------------------|
|
||||
| `BRAIN_URL` | `http://koala:30330` | `brain *`, `mode *` |
|
||||
| `ANTHROPIC_PROBE_URL` | `https://api.anthropic.com` | `tier` |
|
||||
| `LITELLM_BASE_URL` | (empty) | `tier` |
|
||||
|
||||
Override `BRAIN_URL` if your brain pod is at a different Tailscale name
|
||||
or port.
|
||||
|
||||
## See also
|
||||
|
||||
- `docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md` — full spec
|
||||
- `docs/superpowers/plans/2026-05-03-hyperguild-cli.md` — implementation plan
|
||||
106
cmd/hyperguild/brain.go
Normal file
@@ -0,0 +1,106 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io"
|
||||
)
|
||||
|
||||
func runBrain(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error {
|
||||
if len(args) == 0 {
|
||||
return errors.New("subcommand required (query|write|pass-rate)")
|
||||
}
|
||||
switch args[0] {
|
||||
case "query":
|
||||
return runBrainQuery(ctx, args[1:], stdin, stdout, stderr)
|
||||
case "write":
|
||||
return runBrainWrite(ctx, args[1:], stdin, stdout, stderr)
|
||||
case "pass-rate":
|
||||
return runBrainPassRate(ctx, args[1:], stdin, stdout, stderr)
|
||||
default:
|
||||
return fmt.Errorf("unknown subcommand: %s (expected query|write|pass-rate)", args[0])
|
||||
}
|
||||
}
|
||||
|
||||
func runBrainQuery(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
|
||||
fs := flag.NewFlagSet("brain query", flag.ContinueOnError)
|
||||
fs.SetOutput(stderr)
|
||||
asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
|
||||
limit := fs.Int("limit", 5, "maximum number of results")
|
||||
if err := fs.Parse(args); err != nil {
|
||||
return fmt.Errorf("parse flags: %w", err)
|
||||
}
|
||||
if fs.NArg() < 1 {
|
||||
return errors.New("topic required")
|
||||
}
|
||||
topic := fs.Arg(0)
|
||||
|
||||
res, err := newBrainClient().Query(ctx, topic, *limit)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
if *asJSON {
|
||||
enc := json.NewEncoder(stdout)
|
||||
enc.SetIndent("", " ")
|
||||
return enc.Encode(res)
|
||||
}
|
||||
for _, hit := range res.Results {
|
||||
fmt.Fprintf(stdout, "%s score=%d %s\n", hit.Path, hit.Score, hit.Title) //nolint:errcheck
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func runBrainWrite(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error {
|
||||
fs := flag.NewFlagSet("brain write", flag.ContinueOnError)
|
||||
fs.SetOutput(stderr)
|
||||
if err := fs.Parse(args); err != nil {
|
||||
return fmt.Errorf("parse flags: %w", err)
|
||||
}
|
||||
if fs.NArg() < 2 {
|
||||
return errors.New("type and slug required (e.g. brain write knowledge my-slug)")
|
||||
}
|
||||
kind := fs.Arg(0)
|
||||
slug := fs.Arg(1)
|
||||
|
||||
res, err := newBrainClient().Write(ctx, kind, slug, stdin)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
fmt.Fprintln(stdout, res.Path) //nolint:errcheck
|
||||
return nil
|
||||
}
|
||||
|
||||
func runBrainPassRate(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
|
||||
fs := flag.NewFlagSet("brain pass-rate", flag.ContinueOnError)
|
||||
fs.SetOutput(stderr)
|
||||
asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
|
||||
window := fs.String("window", "7d", "lookback window (e.g. 1h, 24h, 7d, 30d)")
|
||||
if err := fs.Parse(args); err != nil {
|
||||
return fmt.Errorf("parse flags: %w", err)
|
||||
}
|
||||
if fs.NArg() < 1 {
|
||||
return errors.New("skill required")
|
||||
}
|
||||
skill := fs.Arg(0)
|
||||
|
||||
res, err := newBrainClient().PassRate(ctx, skill, *window)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
if *asJSON {
|
||||
enc := json.NewEncoder(stdout)
|
||||
enc.SetIndent("", " ")
|
||||
return enc.Encode(res)
|
||||
}
|
||||
if res.PassRate == nil {
|
||||
fmt.Fprintf(stdout, "%s: no data (window: %s)\n", res.Skill, res.Window) //nolint:errcheck
|
||||
return nil
|
||||
}
|
||||
fmt.Fprintf(stdout, "%s: %d / %d = %.0f%% (window: %s)\n", res.Skill, res.Pass, res.Total, *res.PassRate*100, res.Window) //nolint:errcheck
|
||||
return nil
|
||||
}
|
||||
220
cmd/hyperguild/brain_test.go
Normal file
@@ -0,0 +1,220 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func brainQueryServer(t *testing.T, body string) *httptest.Server {
|
||||
t.Helper()
|
||||
return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(body))
|
||||
}))
|
||||
}
|
||||
|
||||
func TestRunBrainQuery_Human(t *testing.T) {
|
||||
srv := brainQueryServer(t, `{"results":[{"path":"knowledge/a.md","title":"A","excerpt":"...","score":9},{"path":"knowledge/b.md","title":"B","excerpt":"...","score":3}]}`)
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"query", "topic"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
got := out.String()
|
||||
assert.Contains(t, got, "knowledge/a.md")
|
||||
assert.Contains(t, got, "score=9")
|
||||
assert.Contains(t, got, "knowledge/b.md")
|
||||
}
|
||||
|
||||
func TestRunBrainQuery_JSON(t *testing.T) {
|
||||
srv := brainQueryServer(t, `{"results":[{"path":"x.md","title":"X","excerpt":"e","score":5}]}`)
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"query", "--json", "topic"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
assert.Contains(t, out.String(), `"path": "x.md"`)
|
||||
assert.Contains(t, out.String(), `"score": 5`)
|
||||
}
|
||||
|
||||
func TestRunBrainQuery_Limit(t *testing.T) {
|
||||
gotLimit := -1
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
body, _ := io.ReadAll(r.Body)
|
||||
var p struct {
|
||||
Query string `json:"query"`
|
||||
Limit int `json:"limit"`
|
||||
}
|
||||
_ = json.Unmarshal(body, &p)
|
||||
gotLimit = p.Limit
|
||||
_, _ = w.Write([]byte(`{"results":[]}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"query", "--limit", "12", "topic"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, 12, gotLimit)
|
||||
}
|
||||
|
||||
func TestRunBrainQuery_MissingTopic(t *testing.T) {
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"query"}, strings.NewReader(""), &out, &errBuf)
|
||||
assert.Error(t, err)
|
||||
}
|
||||
|
||||
func TestRunBrain_NoSubsubcommand(t *testing.T) {
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
|
||||
assert.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "subcommand required")
|
||||
}
|
||||
|
||||
func TestRunBrain_UnknownSubsubcommand(t *testing.T) {
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"bogus"}, strings.NewReader(""), &out, &errBuf)
|
||||
assert.Error(t, err)
|
||||
}
|
||||
|
||||
func TestRunBrainWrite_Success(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, http.MethodPost, r.Method)
|
||||
assert.Equal(t, "/write", r.URL.Path)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"path":"knowledge/test-slug.md"}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(
|
||||
context.Background(),
|
||||
[]string{"write", "knowledge", "test-slug"},
|
||||
strings.NewReader("# Test\n\nSome body content.\n"),
|
||||
&out, &errBuf,
|
||||
)
|
||||
require.NoError(t, err)
|
||||
assert.Contains(t, out.String(), "knowledge/test-slug.md")
|
||||
}
|
||||
|
||||
func TestRunBrainWrite_MissingArgs(t *testing.T) {
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"write", "knowledge"}, strings.NewReader("x"), &out, &errBuf)
|
||||
assert.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "type and slug required")
|
||||
}
|
||||
|
||||
func TestRunBrainWrite_BackendError(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
w.WriteHeader(http.StatusBadRequest)
|
||||
_, _ = w.Write([]byte("invalid slug"))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(
|
||||
context.Background(),
|
||||
[]string{"write", "knowledge", "bad slug"},
|
||||
strings.NewReader("body"),
|
||||
&out, &errBuf,
|
||||
)
|
||||
assert.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "400")
|
||||
}
|
||||
|
||||
func TestRunBrainWrite_EmptyStdin(t *testing.T) {
|
||||
gotLen := -1
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
body, _ := io.ReadAll(r.Body)
|
||||
var p struct {
|
||||
Content string `json:"content"`
|
||||
}
|
||||
_ = json.Unmarshal(body, &p)
|
||||
gotLen = len(p.Content)
|
||||
_, _ = w.Write([]byte(`{"path":"x.md"}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"write", "knowledge", "empty"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, 0, gotLen, "empty stdin should produce empty content payload")
|
||||
}
|
||||
|
||||
func TestRunBrainPassRate_Human(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"pass-rate", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
got := out.String()
|
||||
assert.Contains(t, got, "tdd")
|
||||
assert.Contains(t, got, "47 / 50")
|
||||
assert.Contains(t, got, "94%")
|
||||
}
|
||||
|
||||
func TestRunBrainPassRate_NoData(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"pass-rate", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
assert.Contains(t, out.String(), "no data")
|
||||
}
|
||||
|
||||
func TestRunBrainPassRate_JSON(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"pass-rate", "--json", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
assert.Contains(t, out.String(), `"pass_rate": 0.94`)
|
||||
}
|
||||
|
||||
func TestRunBrainPassRate_MissingSkill(t *testing.T) {
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"pass-rate"}, strings.NewReader(""), &out, &errBuf)
|
||||
assert.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "skill required")
|
||||
}
|
||||
|
||||
func TestRunBrainPassRate_WindowFlag(t *testing.T) {
|
||||
gotWindow := ""
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
gotWindow = r.URL.Query().Get("window")
|
||||
_, _ = w.Write([]byte(`{"skill":"tdd","window":"30d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
t.Setenv("BRAIN_URL", srv.URL)
|
||||
|
||||
var out, errBuf bytes.Buffer
|
||||
err := runBrain(context.Background(), []string{"pass-rate", "--window", "30d", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "30d", gotWindow)
|
||||
}
|
||||
159
cmd/hyperguild/http.go
Normal file
@@ -0,0 +1,159 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"os"
|
||||
"time"
|
||||
)
|
||||
|
||||
const defaultBrainURL = "http://koala:30330"
|
||||
|
||||
// brainClient calls the brain HTTP REST API exposed alongside the MCP
|
||||
// endpoint at the same host:port. /mcp serves MCP framing; /query, /write,
|
||||
// and /pass-rate serve plain REST. We use the REST surface because the CLI is a
|
||||
// shell-friendly client; MCP framing is unnecessary.
|
||||
type brainClient struct {
|
||||
baseURL string
|
||||
http *http.Client
|
||||
}
|
||||
|
||||
func newBrainClient() *brainClient {
|
||||
u := os.Getenv("BRAIN_URL")
|
||||
if u == "" {
|
||||
u = defaultBrainURL
|
||||
}
|
||||
return &brainClient{
|
||||
baseURL: u,
|
||||
http: &http.Client{Timeout: 5 * time.Second},
|
||||
}
|
||||
}
|
||||
|
||||
// QueryHit mirrors a single result from the brain's /query endpoint.
|
||||
type QueryHit struct {
|
||||
Path string `json:"path"`
|
||||
Title string `json:"title"`
|
||||
Excerpt string `json:"excerpt"`
|
||||
Score int `json:"score"`
|
||||
}
|
||||
|
||||
// QueryResult mirrors the /query response envelope.
|
||||
type QueryResult struct {
|
||||
Results []QueryHit `json:"results"`
|
||||
}
|
||||
|
||||
func (c *brainClient) Query(ctx context.Context, topic string, limit int) (*QueryResult, error) {
|
||||
payload, err := json.Marshal(struct {
|
||||
Query string `json:"query"`
|
||||
Limit int `json:"limit"`
|
||||
}{Query: topic, Limit: limit})
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("marshal payload: %w", err)
|
||||
}
|
||||
|
||||
u := c.baseURL + "/query"
|
||||
req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, bytes.NewReader(payload))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("build request: %w", err)
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
|
||||
resp, err := c.http.Do(req)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("brain POST /query: %w", err)
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
return nil, fmt.Errorf("brain POST /query: status %d: %s", resp.StatusCode, string(body))
|
||||
}
|
||||
var out QueryResult
|
||||
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
|
||||
return nil, fmt.Errorf("decode /query response: %w", err)
|
||||
}
|
||||
return &out, nil
|
||||
}
|
||||
|
||||
// WriteResult mirrors the /write response envelope.
|
||||
type WriteResult struct {
|
||||
Path string `json:"path"`
|
||||
}
|
||||
|
||||
func (c *brainClient) Write(ctx context.Context, kind, slug string, content io.Reader) (*WriteResult, error) {
|
||||
body, err := io.ReadAll(content)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("read content: %w", err)
|
||||
}
|
||||
payload, err := json.Marshal(struct {
|
||||
Type string `json:"type"`
|
||||
Slug string `json:"slug"`
|
||||
Content string `json:"content"`
|
||||
}{Type: kind, Slug: slug, Content: string(body)})
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("marshal payload: %w", err)
|
||||
}
|
||||
|
||||
u := c.baseURL + "/write"
|
||||
req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, bytes.NewReader(payload))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("build request: %w", err)
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
|
||||
resp, err := c.http.Do(req)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("brain POST /write: %w", err)
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
respBody, _ := io.ReadAll(resp.Body)
|
||||
return nil, fmt.Errorf("brain POST /write: status %d: %s", resp.StatusCode, string(respBody))
|
||||
}
|
||||
var out WriteResult
|
||||
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
|
||||
return nil, fmt.Errorf("decode /write response: %w", err)
|
||||
}
|
||||
return &out, nil
|
||||
}
|
||||
|
||||
// PassRateResult mirrors the /pass-rate response envelope.
|
||||
type PassRateResult struct {
|
||||
Skill string `json:"skill"`
|
||||
Window string `json:"window"`
|
||||
Pass int `json:"pass"`
|
||||
Fail int `json:"fail"`
|
||||
Skip int `json:"skip"`
|
||||
Total int `json:"total"`
|
||||
PassRate *float64 `json:"pass_rate"`
|
||||
}
|
||||
|
||||
func (c *brainClient) PassRate(ctx context.Context, skill, window string) (*PassRateResult, error) {
|
||||
q := url.Values{}
|
||||
q.Set("skill", skill)
|
||||
q.Set("window", window)
|
||||
u := c.baseURL + "/pass-rate?" + q.Encode()
|
||||
|
||||
req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("build request: %w", err)
|
||||
}
|
||||
resp, err := c.http.Do(req)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("brain GET /pass-rate: %w", err)
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
return nil, fmt.Errorf("brain GET /pass-rate: status %d: %s", resp.StatusCode, string(body))
|
||||
}
|
||||
var out PassRateResult
|
||||
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
|
||||
return nil, fmt.Errorf("decode /pass-rate response: %w", err)
|
||||
}
|
||||
return &out, nil
|
||||
}
|
||||
131
cmd/hyperguild/http_test.go
Normal file
@@ -0,0 +1,131 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestBrainClient_Query_Success(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, http.MethodPost, r.Method)
|
||||
assert.Equal(t, "/query", r.URL.Path)
|
||||
|
||||
body, _ := io.ReadAll(r.Body)
|
||||
var got struct {
|
||||
Query string `json:"query"`
|
||||
Limit int `json:"limit"`
|
||||
}
|
||||
require.NoError(t, json.Unmarshal(body, &got))
|
||||
assert.Equal(t, "find-h", got.Query)
|
||||
assert.Equal(t, 3, got.Limit)
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"results":[{"path":"knowledge/x.md","title":"x","excerpt":"...","score":7}]}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
|
||||
res, err := c.Query(context.Background(), "find-h", 3)
|
||||
require.NoError(t, err)
|
||||
require.Len(t, res.Results, 1)
|
||||
assert.Equal(t, "knowledge/x.md", res.Results[0].Path)
|
||||
assert.Equal(t, 7, res.Results[0].Score)
|
||||
}
|
||||
|
||||
func TestBrainClient_Query_TransportError(t *testing.T) {
|
||||
c := &brainClient{baseURL: "http://127.0.0.1:1", http: http.DefaultClient}
|
||||
_, err := c.Query(context.Background(), "x", 5)
|
||||
assert.Error(t, err)
|
||||
}
|
||||
|
||||
func TestBrainClient_Query_Non200(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
_, _ = w.Write([]byte("boom"))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
|
||||
_, err := c.Query(context.Background(), "x", 5)
|
||||
require.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "500")
|
||||
}
|
||||
|
||||
func TestBrainClient_Write_Success(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, "/write", r.URL.Path)
|
||||
assert.Equal(t, http.MethodPost, r.Method)
|
||||
body, _ := io.ReadAll(r.Body)
|
||||
var got struct {
|
||||
Type string `json:"type"`
|
||||
Slug string `json:"slug"`
|
||||
Content string `json:"content"`
|
||||
}
|
||||
require.NoError(t, json.Unmarshal(body, &got))
|
||||
assert.Equal(t, "knowledge", got.Type)
|
||||
assert.Equal(t, "find-h", got.Slug)
|
||||
assert.Equal(t, "# body\n", got.Content)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"path":"knowledge/find-h.md"}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
|
||||
res, err := c.Write(context.Background(), "knowledge", "find-h", strings.NewReader("# body\n"))
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "knowledge/find-h.md", res.Path)
|
||||
}
|
||||
|
||||
func TestBrainClient_PassRate_Success(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, http.MethodGet, r.Method)
|
||||
assert.Equal(t, "/pass-rate", r.URL.Path)
|
||||
assert.Equal(t, "tdd", r.URL.Query().Get("skill"))
|
||||
assert.Equal(t, "7d", r.URL.Query().Get("window"))
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
|
||||
res, err := c.PassRate(context.Background(), "tdd", "7d")
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "tdd", res.Skill)
|
||||
assert.Equal(t, 47, res.Pass)
|
||||
assert.Equal(t, 3, res.Fail)
|
||||
require.NotNil(t, res.PassRate)
|
||||
assert.InDelta(t, 0.94, *res.PassRate, 0.001)
|
||||
}
|
||||
|
||||
func TestBrainClient_PassRate_NullRate(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
|
||||
res, err := c.PassRate(context.Background(), "tdd", "7d")
|
||||
require.NoError(t, err)
|
||||
assert.Nil(t, res.PassRate)
|
||||
}
|
||||
|
||||
func TestNewBrainClient_DefaultURL(t *testing.T) {
|
||||
t.Setenv("BRAIN_URL", "")
|
||||
c := newBrainClient()
|
||||
assert.Equal(t, "http://koala:30330", c.baseURL)
|
||||
}
|
||||
|
||||
func TestNewBrainClient_OverrideURL(t *testing.T) {
|
||||
t.Setenv("BRAIN_URL", "http://localhost:9999")
|
||||
c := newBrainClient()
|
||||
assert.Equal(t, "http://localhost:9999", c.baseURL)
|
||||
}
|
||||
71
cmd/hyperguild/main.go
Normal file
@@ -0,0 +1,71 @@
|
||||
// Package main implements the hyperguild CLI: tier probe, brain HTTP REST
|
||||
// access, and .mcp.json mode bootstrap. See docs/superpowers/specs/.
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"io"
|
||||
"os"
|
||||
)
|
||||
|
||||
// subcommand is the contract every hyperguild subcommand satisfies.
|
||||
// Functions take an explicit context, args (without the subcommand name
|
||||
// itself), and explicit IO so tests can exercise full flows without
|
||||
// touching os.Stdin / os.Stdout / os.Exit.
|
||||
type subcommand func(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error
|
||||
|
||||
func subcommands() map[string]subcommand {
|
||||
return map[string]subcommand{
|
||||
"tier": runTier,
|
||||
"brain": runBrain,
|
||||
"mode": runMode,
|
||||
}
|
||||
}
|
||||
|
||||
const usage = `Usage: hyperguild <subcommand> [options]
|
||||
|
||||
Subcommands:
|
||||
tier Probe Anthropic + LiteLLM, print current operating tier.
|
||||
brain query <q> BM25 search the brain (HTTP REST).
|
||||
brain write <t> <s>
|
||||
Write stdin as a knowledge entry of type <t>, slug <s>.
|
||||
mode <name> Bootstrap .mcp.json for a chosen mode:
|
||||
cloud | client-local | sovereign
|
||||
|
||||
Environment:
|
||||
BRAIN_URL Brain HTTP REST + MCP base URL.
|
||||
Default: http://koala:30330
|
||||
ANTHROPIC_PROBE_URL Tier probe URL for the Anthropic API.
|
||||
Default: https://api.anthropic.com
|
||||
LITELLM_BASE_URL Tier probe URL for the LiteLLM gateway.
|
||||
Optional; if empty, falls through to airplane tier.
|
||||
`
|
||||
|
||||
// dispatch routes args to a subcommand and returns the process exit code.
|
||||
// Split from main() so tests can drive it without process exit.
|
||||
func dispatch(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) int {
|
||||
if len(args) == 0 {
|
||||
fmt.Fprint(stderr, usage) //nolint:errcheck
|
||||
return 2
|
||||
}
|
||||
switch args[0] {
|
||||
case "-h", "--help", "help":
|
||||
fmt.Fprint(stdout, usage) //nolint:errcheck
|
||||
return 0
|
||||
}
|
||||
cmd, ok := subcommands()[args[0]]
|
||||
if !ok {
|
||||
fmt.Fprintf(stderr, "hyperguild: unknown subcommand: %s\n%s", args[0], usage) //nolint:errcheck
|
||||
return 2
|
||||
}
|
||||
if err := cmd(ctx, args[1:], stdin, stdout, stderr); err != nil {
|
||||
fmt.Fprintf(stderr, "hyperguild %s: %v\n", args[0], err) //nolint:errcheck
|
||||
return 1
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func main() {
|
||||
os.Exit(dispatch(context.Background(), os.Args[1:], os.Stdin, os.Stdout, os.Stderr))
|
||||
}
|
||||
45
cmd/hyperguild/main_test.go
Normal file
@@ -0,0 +1,45 @@
package main

import (
	"bytes"
	"context"
	"strings"
	"testing"

	"github.com/stretchr/testify/assert"
)

func TestDispatch_Help_PrintsUsageAndReturnsZero(t *testing.T) {
	var out, errBuf bytes.Buffer
	code := dispatch(context.Background(), []string{"--help"}, strings.NewReader(""), &out, &errBuf)
	assert.Equal(t, 0, code)
	assert.Contains(t, out.String(), "Usage: hyperguild")
	assert.Contains(t, out.String(), "tier")
	assert.Contains(t, out.String(), "brain")
	assert.Contains(t, out.String(), "mode")
}

func TestDispatch_NoArgs_PrintsUsageAndReturnsTwo(t *testing.T) {
	var out, errBuf bytes.Buffer
	code := dispatch(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
	assert.Equal(t, 2, code)
	assert.Contains(t, errBuf.String(), "Usage: hyperguild")
}

func TestDispatch_UnknownSubcommand_ReturnsTwo(t *testing.T) {
	var out, errBuf bytes.Buffer
	code := dispatch(context.Background(), []string{"bogus"}, strings.NewReader(""), &out, &errBuf)
	assert.Equal(t, 2, code)
	assert.Contains(t, errBuf.String(), "unknown subcommand: bogus")
}

func TestDispatch_KnownSubcommand_RoutesToHandler(t *testing.T) {
	// "mode" without args fails → exit 1, message on stderr.
	// (Confirms dispatch reached the handler rather than printing "unknown
	// subcommand: mode".)
	var out, errBuf bytes.Buffer
	code := dispatch(context.Background(), []string{"mode"}, strings.NewReader(""), &out, &errBuf)
	assert.Equal(t, 1, code)
	assert.Contains(t, errBuf.String(), "name required")
	assert.NotContains(t, errBuf.String(), "unknown subcommand")
}
101
cmd/hyperguild/mode.go
Normal file
@@ -0,0 +1,101 @@
package main

import (
	"context"
	"encoding/json"
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

func runMode(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
	fs := flag.NewFlagSet("mode", flag.ContinueOnError)
	fs.SetOutput(stderr)
	out := fs.String("out", ".mcp.json", "output file path")
	force := fs.Bool("force", false, "overwrite an existing file")
	// Pull the first positional (mode name) out so flags after it still parse
	// with stdlib flag (which stops at the first non-flag arg).
	if len(args) < 1 {
		return errors.New("name required (cloud|client-local|sovereign)")
	}
	name := args[0]
	if err := fs.Parse(args[1:]); err != nil {
		return fmt.Errorf("parse flags: %w", err)
	}

	brainURL := os.Getenv("BRAIN_URL")
	if brainURL == "" {
		brainURL = defaultBrainURL
	}

	var doc map[string]any
	switch name {
	case "cloud":
		doc = modeCloud(brainURL)
	case "client-local":
		doc = modeClientLocal(brainURL)
	case "sovereign":
		doc = modeSovereign(brainURL)
	default:
		return fmt.Errorf("unknown mode: %s (expected cloud|client-local|sovereign)", name)
	}

	if !*force {
		if _, err := os.Stat(*out); err == nil {
			return fmt.Errorf("%s exists (use --force to overwrite)", *out)
		}
	}

	body, err := json.MarshalIndent(doc, "", " ")
	if err != nil {
		return fmt.Errorf("marshal mode doc: %w", err)
	}
	if err := os.WriteFile(*out, append(body, '\n'), 0o644); err != nil {
		return fmt.Errorf("write %s: %w", *out, err)
	}
	fmt.Fprintf(stdout, "wrote %s (mode: %s)\n", *out, name) //nolint:errcheck
	return nil
}

func modeCloud(brainURL string) map[string]any {
	return map[string]any{
		"mcpServers": map[string]any{
			"brain": map[string]any{
				"url":         brainURL + "/mcp",
				"description": "Brain MCP — knowledge query, write, ingestion, session log",
			},
		},
	}
}

func modeClientLocal(brainURL string) map[string]any {
	return map[string]any{
		"mcpServers": map[string]any{
			"brain": map[string]any{
				"url":         brainURL + "/mcp",
				"description": "Brain MCP — knowledge query, write, ingestion, session log",
			},
			"routing": map[string]any{
				"url":         "http://koala:30310/mcp",
				"description": "Mode 2 routing pod — routes skill calls to LiteLLM/local",
				"headers": map[string]any{
					"X-Hyperguild-Mode": "client-local",
				},
			},
		},
	}
}

func modeSovereign(brainURL string) map[string]any {
	return map[string]any{
		"_mode_note": "Sovereign mode primarily uses Crush + LiteLLM. This .mcp.json is provided as Claude Code fallback (e.g. emergency offline editing).",
		"mcpServers": map[string]any{
			"brain": map[string]any{
				"url":         brainURL + "/mcp",
				"description": "Brain MCP — knowledge query, write, ingestion, session log",
			},
		},
	}
}
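The comment in runMode about pulling the first positional out before parsing can be checked in isolation. The sketch below is illustrative only: parseModeArgs is a hypothetical helper that is not part of this change, and it keeps just the two flags shown in mode.go. It relies on the documented behaviour of the standard flag package, which stops parsing at the first non-flag argument.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// parseModeArgs is a hypothetical helper (not in the repository). It mirrors
// the pattern in runMode: take args[0] as the mode name, then hand the
// remaining arguments to a FlagSet so flags after the name still parse.
func parseModeArgs(args []string) (name, out string, force bool, err error) {
	if len(args) < 1 {
		return "", "", false, errors.New("name required")
	}
	fs := flag.NewFlagSet("mode", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	outFlag := fs.String("out", ".mcp.json", "output file path")
	forceFlag := fs.Bool("force", false, "overwrite an existing file")
	if err := fs.Parse(args[1:]); err != nil {
		return "", "", false, fmt.Errorf("parse flags: %w", err)
	}
	return args[0], *outFlag, *forceFlag, nil
}

func main() {
	name, out, force, err := parseModeArgs([]string{"cloud", "--out", "x.json", "--force"})
	fmt.Println(name, out, force, err) // cloud x.json true <nil>

	// Parsing the full slice instead stops at "cloud", so --out is never seen
	// and the default value survives.
	fs := flag.NewFlagSet("naive", flag.ContinueOnError)
	naiveOut := fs.String("out", ".mcp.json", "output file path")
	_ = fs.Parse([]string{"cloud", "--out", "x.json"})
	fmt.Println(*naiveOut, fs.Args()) // .mcp.json [cloud --out x.json]
}
```

The second half of main shows why the name is split off first: with the positional left in place, every flag after it is treated as a plain argument.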
148
cmd/hyperguild/mode_test.go
Normal file
@@ -0,0 +1,148 @@
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"os"
	"path/filepath"
	"strings"
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func readJSON(t *testing.T, path string) map[string]any {
	t.Helper()
	b, err := os.ReadFile(path)
	require.NoError(t, err)
	var out map[string]any
	require.NoError(t, json.Unmarshal(b, &out))
	return out
}

func TestRunMode_Cloud_Default(t *testing.T) {
	dir := t.TempDir()
	outPath := filepath.Join(dir, ".mcp.json")
	t.Setenv("BRAIN_URL", "http://koala:30330")

	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{"cloud", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
	require.NoError(t, err)

	got := readJSON(t, outPath)
	servers, ok := got["mcpServers"].(map[string]any)
	require.True(t, ok, "mcpServers must be a JSON object")
	assert.Contains(t, servers, "brain")
	assert.NotContains(t, servers, "routing")
	assert.NotContains(t, got, "_mode_note")
}

func TestRunMode_ClientLocal_HasRoutingEntry(t *testing.T) {
	dir := t.TempDir()
	outPath := filepath.Join(dir, ".mcp.json")
	t.Setenv("BRAIN_URL", "http://koala:30330")

	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{"client-local", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
	require.NoError(t, err)

	got := readJSON(t, outPath)
	servers := got["mcpServers"].(map[string]any)
	require.Contains(t, servers, "brain")
	require.Contains(t, servers, "routing")

	routing := servers["routing"].(map[string]any)
	assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")

	headers, ok := routing["headers"].(map[string]any)
	require.True(t, ok, "routing entry should have headers block")
	assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
}

func TestModeClientLocalHasRoutingHeader(t *testing.T) {
	tmp := t.TempDir() + "/mcp.json"
	out := &bytes.Buffer{}
	stderr := &bytes.Buffer{}
	require.NoError(t, runMode(context.Background(), []string{"client-local", "--out", tmp}, nil, out, stderr))

	body, err := os.ReadFile(tmp)
	require.NoError(t, err)
	var doc map[string]any
	require.NoError(t, json.Unmarshal(body, &doc))

	servers := doc["mcpServers"].(map[string]any)
	routing := servers["routing"].(map[string]any)
	assert.Equal(t, "http://koala:30310/mcp", routing["url"])
	assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")

	headers, ok := routing["headers"].(map[string]any)
	require.True(t, ok, "routing entry should have headers block")
	assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
}

func TestRunMode_Sovereign_HasModeNote(t *testing.T) {
	dir := t.TempDir()
	outPath := filepath.Join(dir, ".mcp.json")

	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{"sovereign", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
	require.NoError(t, err)

	got := readJSON(t, outPath)
	assert.Contains(t, got, "_mode_note")
	servers := got["mcpServers"].(map[string]any)
	assert.Contains(t, servers, "brain")
	assert.NotContains(t, servers, "routing")
}

func TestRunMode_DefaultsOutToCwd(t *testing.T) {
	dir := t.TempDir()
	t.Chdir(dir) // Go 1.24+ — replaces the older os.Chdir-with-cleanup pattern

	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{"cloud"}, strings.NewReader(""), &stdout, &stderr)
	require.NoError(t, err)
	_, statErr := os.Stat(filepath.Join(dir, ".mcp.json"))
	assert.NoError(t, statErr, ".mcp.json should exist in cwd")
}

func TestRunMode_UnknownMode(t *testing.T) {
	dir := t.TempDir()
	outPath := filepath.Join(dir, ".mcp.json")
	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{"bogus", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
	assert.Error(t, err)
	assert.Contains(t, err.Error(), "unknown mode")
}

func TestRunMode_NoArgs(t *testing.T) {
	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{}, strings.NewReader(""), &stdout, &stderr)
	assert.Error(t, err)
}

func TestRunMode_RefusesToOverwrite(t *testing.T) {
	dir := t.TempDir()
	outPath := filepath.Join(dir, ".mcp.json")
	require.NoError(t, os.WriteFile(outPath, []byte(`{"existing":"file"}`), 0o644))

	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{"cloud", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
	require.Error(t, err)
	assert.Contains(t, err.Error(), "exists")
}

func TestRunMode_Force(t *testing.T) {
	dir := t.TempDir()
	outPath := filepath.Join(dir, ".mcp.json")
	require.NoError(t, os.WriteFile(outPath, []byte(`{"existing":"file"}`), 0o644))

	var stdout, stderr bytes.Buffer
	err := runMode(context.Background(), []string{"cloud", "--out", outPath, "--force"}, strings.NewReader(""), &stdout, &stderr)
	require.NoError(t, err)
	got := readJSON(t, outPath)
	assert.Contains(t, got, "mcpServers")
	assert.NotContains(t, got, "existing")
}
42
cmd/hyperguild/tier.go
Normal file
@@ -0,0 +1,42 @@
package main

import (
	"context"
	"encoding/json"
	"flag"
	"fmt"
	"io"
	"os"

	"github.com/mathiasbq/supervisor/internal/tier"
)

const defaultAnthropicProbe = "https://api.anthropic.com"

func runTier(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
	fs := flag.NewFlagSet("tier", flag.ContinueOnError)
	fs.SetOutput(stderr)
	asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
	if err := fs.Parse(args); err != nil {
		return fmt.Errorf("parse flags: %w", err)
	}

	anthropicURL := os.Getenv("ANTHROPIC_PROBE_URL")
	if anthropicURL == "" {
		anthropicURL = defaultAnthropicProbe
	}
	liteLLMURL := os.Getenv("LITELLM_BASE_URL") // empty → tier falls through to airplane

	info := tier.Detect(ctx, anthropicURL, liteLLMURL)

	if *asJSON {
		enc := json.NewEncoder(stdout)
		enc.SetIndent("", " ")
		if err := enc.Encode(info); err != nil {
			return fmt.Errorf("encode json: %w", err)
		}
		return nil
	}
	fmt.Fprintf(stdout, "tier %d (%s) managed_agents=%t\n", int(info.Tier), info.Label, info.ManagedAgents) //nolint:errcheck
	return nil
}
77
cmd/hyperguild/tier_test.go
Normal file
@@ -0,0 +1,77 @@
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func okServer(t *testing.T) *httptest.Server {
	t.Helper()
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
}

func TestRunTier_Full_Human(t *testing.T) {
	anthropic := okServer(t)
	defer anthropic.Close()
	litellm := okServer(t)
	defer litellm.Close()

	t.Setenv("ANTHROPIC_PROBE_URL", anthropic.URL)
	t.Setenv("LITELLM_BASE_URL", litellm.URL)

	var out, errBuf bytes.Buffer
	err := runTier(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
	require.NoError(t, err)
	assert.Contains(t, out.String(), "tier 1")
	assert.Contains(t, out.String(), "full-online")
	assert.Contains(t, out.String(), "managed_agents=true")
}

func TestRunTier_LANOnly_JSON(t *testing.T) {
	litellm := okServer(t)
	defer litellm.Close()

	t.Setenv("ANTHROPIC_PROBE_URL", "http://127.0.0.1:1") // unreachable
	t.Setenv("LITELLM_BASE_URL", litellm.URL)

	var out, errBuf bytes.Buffer
	err := runTier(context.Background(), []string{"--json"}, strings.NewReader(""), &out, &errBuf)
	require.NoError(t, err)

	var got struct {
		Tier          int    `json:"tier"`
		Label         string `json:"label"`
		ManagedAgents bool   `json:"managed_agents"`
	}
	require.NoError(t, json.Unmarshal(out.Bytes(), &got))
	assert.Equal(t, 2, got.Tier)
	assert.Equal(t, "lan-only", got.Label)
	assert.False(t, got.ManagedAgents)
}

func TestRunTier_Airplane_NoLiteLLMBaseURL(t *testing.T) {
	t.Setenv("ANTHROPIC_PROBE_URL", "http://127.0.0.1:1")
	t.Setenv("LITELLM_BASE_URL", "")

	var out, errBuf bytes.Buffer
	err := runTier(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
	require.NoError(t, err)
	assert.Contains(t, out.String(), "tier 3")
	assert.Contains(t, out.String(), "airplane")
}

func TestRunTier_UnknownFlag_ReturnsError(t *testing.T) {
	var out, errBuf bytes.Buffer
	err := runTier(context.Background(), []string{"--bogus"}, strings.NewReader(""), &out, &errBuf)
	assert.Error(t, err)
}
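The three tier tests above pin down a fall-through order: Anthropic reachable gives tier 1, otherwise a reachable LiteLLM gateway gives tier 2, otherwise tier 3. The sketch below restates that as a pure function so it can be read without the network probes. It is an assumption-laden illustration, not the body of tier.Detect (which this diff does not show): the Info type and classify function are hypothetical, and the cases the tests do not cover (Anthropic reachable with LiteLLM down, and the managed_agents value in airplane tier) are guesses.

```go
package main

import "fmt"

// Info mirrors only the fields the tests read. The real tier.Info may differ.
type Info struct {
	Tier          int
	Label         string
	ManagedAgents bool
}

// classify is a hypothetical stand-in for the decision step after probing.
// The booleans stand for "the probe URL answered"; how the real code probes
// is not shown in the diff.
func classify(anthropicOK, liteLLMOK bool) Info {
	switch {
	case anthropicOK:
		// Assumption: Anthropic reachability alone is enough for tier 1.
		return Info{Tier: 1, Label: "full-online", ManagedAgents: true}
	case liteLLMOK:
		return Info{Tier: 2, Label: "lan-only"}
	default:
		// Assumption: managed agents are off when nothing is reachable.
		return Info{Tier: 3, Label: "airplane"}
	}
}

func main() {
	for _, c := range [][2]bool{{true, true}, {false, true}, {false, false}} {
		info := classify(c[0], c[1])
		fmt.Printf("tier %d (%s) managed_agents=%t\n", info.Tier, info.Label, info.ManagedAgents)
	}
}
```

Keeping the decision separate from the probing would let the mapping be table-tested without httptest servers, though whether the real package is structured this way is not visible here.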
165
cmd/routing/main.go
Normal file
@@ -0,0 +1,165 @@
package main

// The internal/skills/{debug,retrospective,review,trainer} packages imported
// below are also imported by cmd/supervisor. Plan 7 (supervisor retirement)
// MUST NOT delete these four packages — the routing pod is their second
// consumer. Plan 7 deletes only internal/skills/{tdd,spec,tier} (the skills
// that don't route to local), the supervisor binary, and supervisor manifests.
// See docs/superpowers/specs/2026-05-04-mode-2-routing-pod-design.md (Constraints).

import (
	"context"
	"log/slog"
	"net/http"
	"os"
	"time"

	"github.com/mathiasbq/supervisor/internal/auth"
	"github.com/mathiasbq/supervisor/internal/config"
	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/githubclient"
	"github.com/mathiasbq/supervisor/internal/mcp"
	"github.com/mathiasbq/supervisor/internal/mcpclient"
	"github.com/mathiasbq/supervisor/internal/registry"
	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/mathiasbq/supervisor/internal/skills/project"
	"github.com/mathiasbq/supervisor/internal/skills/retrospective"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
)

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	slog.SetDefault(logger)

	cfg, err := config.LoadRouting()
	if err != nil {
		logger.Error("config load failed", "err", err)
		os.Exit(1)
	}

	configDir := envOr("SUPERVISOR_CONFIG_DIR", "/app/config/supervisor")
	mustRead := func(path string) string {
		b, err := os.ReadFile(configDir + "/" + path)
		if err != nil {
			logger.Error("read prompt failed", "path", path, "err", err)
			os.Exit(1)
		}
		return string(b)
	}

	llm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)

	router := &routing.Router{
		Fetcher:       routing.NewFetcher(cfg.BrainURL, "7d", time.Duration(cfg.PassRateTTLSeconds)*time.Second),
		Logger:        routing.NewLogger(cfg.BrainURL),
		Policy:        routing.Policy{Floor: cfg.RouteLocalFloor, Ceil: cfg.RouteLocalCeil},
		FastModel:     cfg.FastModel,
		ThinkingModel: cfg.ThinkingModel,
		Complete:      llm.Complete,
	}

	// Skill packages call CompleteFunc(ctx, model, system, user) — no session_id
	// or project_root in the signature. Rather than modifying every skill's API
	// (and inflating Plan 6's blast radius), the routing pod logs every decision
	// under a fixed session_id "_routing". Operators query
	// `GET /pass-rate?skill=_routing&window=...` to inspect routing health.
	const routingSessionID = "_routing"
	wrap := func(skillName string) routing.CompleteFunc {
		return func(ctx context.Context, _, system, user string) (string, int64, error) {
			// The model param is ignored: the router picks the model based on policy.
			return router.Run(ctx, routing.RunInput{
				Skill:       skillName,
				System:      system,
				User:        user,
				SessionID:   routingSessionID,
				ProjectRoot: "",
			})
		}
	}

	reg := registry.New()
	reg.Register(review.New(review.Config{
		SkillPrompt:  mustRead("review.md"),
		DefaultModel: cfg.FastModel,
		CompleteFunc: review.CompleteFunc(wrap("review")),
	}))
	reg.Register(debug.New(debug.Config{
		SkillPrompt:  mustRead("debug.md"),
		DefaultModel: cfg.FastModel,
		CompleteFunc: debug.CompleteFunc(wrap("debug")),
	}))
	reg.Register(retrospective.New(retrospective.Config{
		SkillPrompt:  mustRead("retrospective.md"),
		DefaultModel: cfg.FastModel,
		CompleteFunc: retrospective.CompleteFunc(wrap("retrospective")),
	}))
	reg.Register(trainer.New(trainer.Config{
		ReaderPrompt: mustRead("trainer-reader.md"),
		WriterPrompt: mustRead("trainer-writer.md"),
		DefaultModel: cfg.FastModel,
		CompleteFunc: trainer.CompleteFunc(wrap("trainer")),
	}))

	if cfg.GiteaMCPURL != "" {
		var ghClient *githubclient.Client
		if cfg.GitHubPAT != "" {
			ghClient = githubclient.New(cfg.GitHubPAT)
		}
		reg.Register(project.New(project.Config{
			Client:      mcpclient.New(cfg.GiteaMCPURL, cfg.GiteaMCPToken),
			GitHub:      ghClient,
			GiteaOwner:  cfg.GiteaOwner,
			GitHubOwner: cfg.GitHubOwner,
			GitHubPAT:   cfg.GitHubPAT,
			InfraRepo:   cfg.InfraRepo,
		}))
		logger.Info("project_create registered", "gitea_mcp_url", cfg.GiteaMCPURL,
			"gitea_owner", cfg.GiteaOwner, "github_owner", cfg.GitHubOwner,
			"infra_repo", cfg.InfraRepo, "github_pat_set", cfg.GitHubPAT != "")
	} else {
		logger.Info("project_create skipped — GITEA_MCP_URL not set")
	}

	var validator *auth.Validator
	if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
		audience := os.Getenv("MCP_AUDIENCE")
		v, err := auth.NewValidator(dexURL, audience)
		if err != nil {
			logger.Error("build jwt validator", "err", err)
			os.Exit(1)
		}
		validator = v
		logger.Info("jwt auth enabled", "issuer", dexURL)
	}

	srv := mcp.NewServer(reg, cfg.MCPAuthToken, validator)
	mux := http.NewServeMux()
	mux.Handle("/mcp", srv)
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
		resourceURL := os.Getenv("MCP_RESOURCE_URL")
		mux.HandleFunc("GET /.well-known/oauth-protected-resource",
			auth.ProtectedResourceHandler(resourceURL, dexURL))
	}

	addr := ":" + cfg.Port
	logger.Info("routing pod starting", "addr", addr,
		"fast", cfg.FastModel, "thinking", cfg.ThinkingModel,
		"floor", cfg.RouteLocalFloor, "ceil", cfg.RouteLocalCeil)
	if err := http.ListenAndServe(addr, mux); err != nil { //nolint:gosec
		logger.Error("server stopped", "err", err)
		os.Exit(1)
	}
}

func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}
123
cmd/routing/main_test.go
Normal file
@@ -0,0 +1,123 @@
package main_test

import (
	"context"
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"os/exec"
	"strings"
	"sync/atomic"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestRoutingPodEndToEnd boots the binary against fake LiteLLM + brain servers,
// calls tools/list and one tools/call, and verifies the brain saw a session_log POST.
func TestRoutingPodEndToEnd(t *testing.T) {
	if testing.Short() {
		t.Skip("end-to-end binary boot")
	}

	// Incremented from the httptest handler goroutines and read by the test
	// goroutine, so it must be atomic to stay clean under -race.
	var brainHits atomic.Int32
	llm := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		_ = json.NewEncoder(w).Encode(map[string]any{
			"choices": []map[string]any{{"message": map[string]any{"role": "assistant", "content": "stub"}}},
		})
	}))
	defer llm.Close()

	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/pass-rate":
			brainHits.Add(1)
			_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.95})
		case "/mcp":
			brainHits.Add(1)
			_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
		}
	}))
	defer brain.Close()

	bin := buildRouting(t)
	cmd := exec.Command(bin)
	cmd.Env = append(cmd.Env,
		"ROUTING_PORT=33310",
		"LITELLM_BASE_URL="+llm.URL,
		"LITELLM_API_KEY=stub",
		"BRAIN_URL="+brain.URL,
		"SUPERVISOR_CONFIG_DIR=../../config/supervisor",
		"PATH="+osPath(),
	)
	require.NoError(t, cmd.Start())
	t.Cleanup(func() { _ = cmd.Process.Kill() })

	require.NoError(t, waitForPort(t, "127.0.0.1:33310", 5*time.Second))

	resp := mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":1,"method":"tools/list"}`)
	assert.Contains(t, resp, `"review"`)
	assert.Contains(t, resp, `"debug"`)
	assert.Contains(t, resp, `"retrospective"`)
	assert.Contains(t, resp, `"trainer"`)

	resp = mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"review","arguments":{"project_root":"/tmp","files":["README.md"]}}}`)
	_ = resp // shape varies by skill; we only need a 200

	// Wait briefly for the async session_log to land.
	deadline := time.Now().Add(2 * time.Second)
	for time.Now().Before(deadline) && brainHits.Load() < 2 {
		time.Sleep(50 * time.Millisecond)
	}
	assert.GreaterOrEqual(t, brainHits.Load(), int32(2), "expected at least one /pass-rate hit and one /mcp session_log hit")
}

func buildRouting(t *testing.T) string {
	t.Helper()
	bin := t.TempDir() + "/routing"
	out, err := exec.Command("go", "build", "-o", bin, "github.com/mathiasbq/supervisor/cmd/routing").CombinedOutput()
	require.NoError(t, err, "build failed: %s", out)
	return bin
}

func waitForPort(_ *testing.T, addr string, dur time.Duration) error {
	deadline := time.Now().Add(dur)
	for time.Now().Before(deadline) {
		c, err := http.Get("http://" + addr + "/healthz") //nolint:noctx
		if err == nil {
			_ = c.Body.Close()
			return nil
		}
		req, err := http.NewRequest(http.MethodPost, "http://"+addr+"/mcp", strings.NewReader(`{}`))
		if err == nil {
			r, err := http.DefaultClient.Do(req)
			if err == nil {
				_ = r.Body.Close()
				return nil
			}
		}
		time.Sleep(50 * time.Millisecond)
	}
	return context.DeadlineExceeded
}

func mcpCall(t *testing.T, url, body string) string {
	t.Helper()
	r, err := http.Post(url, "application/json", strings.NewReader(body)) //nolint:noctx
	require.NoError(t, err)
	defer func() { _ = r.Body.Close() }()
	raw, err := io.ReadAll(r.Body)
	require.NoError(t, err)
	return string(raw)
}

// osPath returns the test process's PATH. The earlier version ranged over
// exec.Command("env").Env, which is nil until explicitly set, so it always
// fell through to the hard-coded default.
func osPath() string {
	if p := os.Getenv("PATH"); p != "" {
		return p
	}
	return "/usr/bin:/bin"
}
@@ -1,163 +0,0 @@
package main

import (
	"context"
	"log/slog"
	"net/http"
	"os"

	"github.com/mathiasbq/supervisor/internal/config"
	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/mcp"
	"github.com/mathiasbq/supervisor/internal/registry"
	"github.com/mathiasbq/supervisor/internal/skills/brain"
	"github.com/mathiasbq/supervisor/internal/skills/org"
	"github.com/mathiasbq/supervisor/internal/skills/retrospective"
	skilldebug "github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/mathiasbq/supervisor/internal/skills/spec"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
	"github.com/mathiasbq/supervisor/internal/skills/sessionlog"
	"github.com/mathiasbq/supervisor/internal/skills/tdd"
	"github.com/mathiasbq/supervisor/internal/tier"
)

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	cfg, err := config.Load()
	if err != nil {
		logger.Error("load config", "err", err)
		os.Exit(1)
	}

	models, err := config.LoadModels(cfg.ModelsFile)
	if err != nil {
		logger.Error("load models", "err", err)
		os.Exit(1)
	}

	protocolsPrompt, err := os.ReadFile(cfg.ConfigDir + "/protocols.md")
	if err != nil {
		logger.Error("read protocols.md", "path", cfg.ConfigDir+"/protocols.md", "err", err)
		os.Exit(1)
	}

	// prependProtocols prepends the shared protocols to a skill discipline file.
	prependProtocols := func(skillPrompt []byte) string {
		return string(protocolsPrompt) + "\n---\n\n" + string(skillPrompt)
	}

	tddPrompt, err := os.ReadFile(cfg.ConfigDir + "/tdd.md")
	if err != nil {
		logger.Error("read tdd.md", "path", cfg.ConfigDir+"/tdd.md", "err", err)
		os.Exit(1)
	}

	retroPrompt, err := os.ReadFile(cfg.ConfigDir + "/retrospective.md")
	if err != nil {
		logger.Error("read retrospective.md", "path", cfg.ConfigDir+"/retrospective.md", "err", err)
		os.Exit(1)
	}

	reviewPrompt, err := os.ReadFile(cfg.ConfigDir + "/review.md")
	if err != nil {
		logger.Error("read review.md", "path", cfg.ConfigDir+"/review.md", "err", err)
		os.Exit(1)
	}

	debugPrompt, err := os.ReadFile(cfg.ConfigDir + "/debug.md")
	if err != nil {
		logger.Error("read debug.md", "path", cfg.ConfigDir+"/debug.md", "err", err)
		os.Exit(1)
	}

	specPrompt, err := os.ReadFile(cfg.ConfigDir + "/spec.md")
	if err != nil {
		logger.Error("read spec.md", "path", cfg.ConfigDir+"/spec.md", "err", err)
		os.Exit(1)
	}

	trainerReaderPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-reader.md")
	if err != nil {
		logger.Error("read trainer-reader.md", "path", cfg.ConfigDir+"/trainer-reader.md", "err", err)
		os.Exit(1)
	}
	trainerWriterPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-writer.md")
	if err != nil {
		logger.Error("read trainer-writer.md", "path", cfg.ConfigDir+"/trainer-writer.md", "err", err)
		os.Exit(1)
	}

	litellm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)

	tierFn := func(ctx context.Context) tier.Info {
		return tier.Detect(ctx, "https://api.anthropic.com", cfg.LiteLLMBaseURL)
	}

	reg := registry.New()
	reg.Register(tdd.New(tdd.Config{
		SkillPrompt:   prependProtocols(tddPrompt),
		DefaultModel:  models.ModelFor("tdd", ""),
		CompleteFunc:  litellm.Complete,
		SessionsDir:   cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(brain.New(brain.Config{
		IngestBaseURL:  cfg.IngestBaseURL,
		IngestSvcURL:   cfg.IngestSvcURL,
		KBRetrievalURL: cfg.KBRetrievalURL,
	}))
	reg.Register(org.New(org.Config{
		TierFn: tierFn,
	}))
	reg.Register(sessionlog.New(sessionlog.Config{
		SessionsDir: cfg.SessionsDir,
	}))
	reg.Register(retrospective.New(retrospective.Config{
		SkillPrompt:  prependProtocols(retroPrompt),
		DefaultModel: models.ModelFor("retrospective", ""),
		SessionsDir:  cfg.SessionsDir,
		CompleteFunc: litellm.Complete,
	}))
	reg.Register(review.New(review.Config{
		SkillPrompt:   prependProtocols(reviewPrompt),
		DefaultModel:  models.ModelFor("review", ""),
		CompleteFunc:  litellm.Complete,
		SessionsDir:   cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(skilldebug.New(skilldebug.Config{
		SkillPrompt:   prependProtocols(debugPrompt),
		DefaultModel:  models.ModelFor("debug", ""),
		CompleteFunc:  litellm.Complete,
		SessionsDir:   cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(spec.New(spec.Config{
		SkillPrompt:   prependProtocols(specPrompt),
		DefaultModel:  models.ModelFor("spec", ""),
		CompleteFunc:  litellm.Complete,
		SessionsDir:   cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(trainer.New(trainer.Config{
		ReaderPrompt: prependProtocols(trainerReaderPrompt),
		WriterPrompt: prependProtocols(trainerWriterPrompt),
		DefaultModel: models.ModelFor("trainer", ""),
		CompleteFunc: litellm.Complete,
		SessionsDir:  cfg.SessionsDir,
		BrainDir:     cfg.BrainDir,
	}))

	srv := mcp.NewServer(reg)
	mux := http.NewServeMux()
	mux.Handle("/mcp", srv)

	addr := ":" + cfg.Port
	logger.Info("supervisor starting", "addr", addr, "version", "v0.5.0")
	if err := http.ListenAndServe(addr, mux); err != nil {
		logger.Error("server stopped", "err", err)
		os.Exit(1)
	}
}
@@ -1,14 +0,0 @@
package main

import (
	"os/exec"
	"testing"
)

func TestBinaryCompiles(t *testing.T) {
	cmd := exec.Command("go", "build", "./...")
	out, err := cmd.CombinedOutput()
	if err != nil {
		t.Fatalf("build failed: %s\n%s", err, out)
	}
}
1826
docs/superpowers/plans/2026-04-29-brain-mcp-migration.md
Normal file
File diff suppressed because it is too large
1102
docs/superpowers/plans/2026-05-03-pass-rate-logging.md
Normal file
File diff suppressed because it is too large
2449
docs/superpowers/plans/2026-05-04-mode-2-routing-pod.md
Normal file
File diff suppressed because it is too large
79
docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md
Normal file
@@ -0,0 +1,79 @@
|
||||
# Spec: hyperguild CLI
|
||||
|
||||
> Plan 4 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
|
||||
|
||||
## Problem Statement
|
||||
|
||||
Three needs converge on a single small Go binary:
|
||||
|
||||
1. **Tier probing as MCP is overkill.** The supervisor's `tier` MCP runs on `koala:30320` and answers a one-shot question (which models are reachable right now?). Pulling Claude Code through MCP startup, tool listing, and a JSON-RPC call for a 2-second probe is wasteful and adds a network hop the answer doesn't need.
|
||||
2. **Brain access from shell scripts has no good front door.** The brain's HTTP REST API exists (Plan 1) at `koala:30330` for non-MCP clients, but every shell script that wants to query or write to the brain re-implements the curl invocation. A CLI gives shell pipelines, ad-hoc agent prompts, and quick-debug scenarios a stable interface.
|
||||
3. **Mode bootstrap is manual.** Each new project that wants to operate in a chosen mode (cloud / client-local / sovereign) needs a `.mcp.json` written by hand. Without automation, mode adoption is gated on remembering the right MCP server URLs.
|
||||
|
||||
**Why now:** Plans 1–3 are merged. The CLI is the next building block in shrinking the supervisor pod toward a thin Mode-2 routing layer. Plans 5 and 6 build on the CLI's tier and brain helpers.
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- [ ] `hyperguild tier` returns the same `tier.Info` that `internal/tier.Detect` produces for the same probe URLs, in < 3 s under all three tier conditions, with both human-readable and `--json` output.
|
||||
- [ ] `hyperguild brain query <topic>` returns BM25 results from the brain HTTP REST `/query` endpoint, exit 0 on success and non-zero on transport failure.
|
||||
- [ ] `hyperguild brain write <type> <slug>` reads markdown content from stdin, posts to `/write` with the type and slug, and creates `brain/knowledge/<slug>.md`. A round-trip (`hyperguild brain query <slug>` immediately after) finds the entry.
|
||||
- [ ] `hyperguild mode <cloud|client-local|sovereign>` writes a parseable JSON file at the target path with the per-mode `mcpServers` entries; `jq -e .mcpServers` succeeds on the output.
|
||||
- [ ] All commands print usage on `--help`, exit 2 on unknown flags, exit non-zero on operational errors.
|
||||
- [ ] `task check` passes (lint + test + vet) on each task and on the merged branch.
|
||||
|
||||
## Constraints
|
||||
|
||||
- **Stdlib only.** No `cobra`, `urfave/cli`, `viper`, etc. CLI router and flag parsing use `flag.NewFlagSet`.
|
||||
- **Go 1.26.1**, project default.
|
||||
- **Module:** `github.com/mathiasbq/supervisor`, peer to `cmd/supervisor/`. New code at `cmd/hyperguild/`. The module name keeps its historical `supervisor` value — renaming the module is out of scope and would touch every import.
|
||||
- **Reuse `internal/tier`** unchanged. The CLI is a thin wrapper around `tier.Detect`.
|
||||
- **Brain endpoint configurable** via `BRAIN_URL` env var (default `http://koala:30330` — Tailscale-exposed NodePort, both MCP at `/mcp` and HTTP REST at `/query`, `/write`, etc., share the port). No hostname literals embedded in the CLI body — sourced from env per the existing "logical-addresses-in-instructions" memory.
|
||||
- **Test discipline:** table-driven, testify, fakes for HTTP and tier probing. No live network in tests.
|
||||
- **Errors:** wrapped via `fmt.Errorf("op: %w", err)`. No naked returns. Stderr for errors, stdout for results.
|
||||
|
||||
## Out of Scope
|
||||
|
||||
- The Mode 2 routing pod itself (Plan 6). `mode client-local` writes a placeholder entry pointing at the future routing URL with a `_routing_pending` annotation; the CLI does not provision the pod.
|
||||
- Pass-rate logging (Plan 5) — the CLI's `brain write` does not emit `session_log` events.
|
||||
- Skill worker CLIs (`hyperguild tdd_red`, `hyperguild review`, etc.) — those stay on the supervisor MCP until Plan 7.
|
||||
- Brain HTTP server changes — the REST endpoints already exist.
|
||||
- Authentication / TLS — Tailscale provides network isolation; no auth currently.
|
||||
- Windows/Linux binaries — macOS-only per the user's setup. `go build` is portable but no cross-compilation in CI.
|
||||
- A `crush` config writer for Mode 3 — Mode 3 (sovereign) writes a Claude-Code-compatible `.mcp.json` with brain-only MCP, on the assumption that even Crush-primary users may fall back to Claude Code with brain access. Crush's own config is owned by the user manually.
|
||||
- A unified `--config` file for the CLI — env var + flags is enough today.
|
||||
|
||||
## Technical Approach
|
||||
|
||||
- **Single binary, inline subcommand router.** `cmd/hyperguild/main.go` dispatches on `os.Args[1]` to per-subcommand functions, each owning its own `flag.NewFlagSet`. Rationale: 4 top-level subcommands (`tier`, `brain`, `mode`, plus `--help`) and one nested level (`brain query`, `brain write`); ~80 lines of routing plumbing in stdlib beats pulling cobra's ~3 KLOC of dependencies for a tiny CLI. The router is testable by injecting `args []string` instead of reading `os.Args` directly.
|
||||
|
||||
- **`tier` subcommand reuses `internal/tier.Detect` verbatim.** Probe URLs (`https://api.anthropic.com` and the LiteLLM base URL) come from environment: `ANTHROPIC_PROBE_URL` (default the literal Anthropic URL) and `LITELLM_BASE_URL` (no default — error if `--mode-needs-llm` and unset). Rationale: matching the supervisor's existing wiring means the CLI cannot disagree with the supervisor about tier; a single source of truth.
|
||||
|
||||
- **`brain` subcommand calls the HTTP REST API.** Two nested subcommands:
  - `brain query <topic>` issues `POST /query` with JSON body `{query, limit}` (default `--limit 5`), prints results in human-readable form by default and with `--json` for machine consumption.
  - `brain write <type> <slug>` reads stdin, posts `POST /write` with JSON body `{type, slug, content}`, prints the resulting path on success.

  Rationale: HTTP REST is simpler than MCP framing for a CLI. Per CLAUDE.md, the REST endpoints are documented as the official non-MCP interface.
|
||||
|
||||
- **`mode <name>` writes a per-mode `.mcp.json` template.** Defaults to writing `./.mcp.json` (cwd); accepts `--out <path>`. Per-mode bodies:
  - `cloud`: `mcpServers` contains only `brain` at `http://koala:30330/mcp`.
  - `client-local`: `mcpServers` contains `brain` at `http://koala:30330/mcp` and a `routing` placeholder entry with `url` set to a marker (`http://koala:30310/mcp`) and an extra field `"_routing_pending": "Plan 6 — routing pod not deployed yet"`. Rationale: keeping strict-JSON parseable means using a placeholder field rather than a JSON comment, which the spec parser would reject.
  - `sovereign`: `mcpServers` contains only `brain`, plus a top-level `"_mode_note": "Sovereign mode primarily uses Crush + LiteLLM. This .mcp.json is provided as Claude Code fallback."`.

  All three are valid JSON and all three round-trip through `jq` for verification.
  Rationale: a single subcommand with three clearly-different outputs is easier to evolve than three nearly-duplicate subcommands. The placeholder fields are intentional documentation in the file itself, which the user actually opens and edits.
|
||||
|
||||
- **No global state.** Each subcommand is a function `(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error`, allowing table-driven tests to exercise full subcommand flows without `os.Exit` or fd capture.
|
||||
|
||||
- **HTTP client injection.** A package-level `http.Client` with 5s timeout for `brain` calls, overridable in tests via a constructor. Real client for `main`, `httptest.Server` for tests.
|
||||
|
||||
## Risks
|
||||
|
||||
- **`.mcp.json` schema may evolve.** Claude Code's MCP config format is defined by the harness, and Anthropic could change it. Mitigation: document the format in the CLI's `--help` text and in the spec; if it breaks, the fix is local to one template function.
|
||||
|
||||
- **Brain endpoint hostname drift.** If the brain moves off `koala`, the env-var override avoids breaking the CLI but the `mode` template's hardcoded `koala:30330` becomes stale. Mitigation: source the URL in the `mode` template from the same env var (`BRAIN_URL`) so all three subcommands stay in lockstep with the user's actual environment.
|
||||
|
||||
- **`tier` probe URL gap.** The CLI inherits the supervisor's hardcoded `https://api.anthropic.com` probe URL via `internal/tier`. If Anthropic changes the URL, both supervisor and CLI break together. Mitigation: env-var override `ANTHROPIC_PROBE_URL`; default unchanged.
|
||||
|
||||
- **No HTTP retry logic.** The CLI returns first-error to the user. For ad-hoc shell use this is fine; for automation a future `--retry` flag may be needed. Out of scope for this iteration.
|
||||
|
||||
- **Tests don't cover live network.** Pure-fake tests catch regression but not "does the brain pod actually answer." Mitigation: add a smoke-test `task hyperguild:smoke` in a follow-up that runs against the real brain — separate concern, not in Plan 4.
|
||||
|
||||
- **Mode 3 sovereign output may surprise users** who expect Mode 3 to skip writing a `.mcp.json` entirely (since Crush is the primary harness). Mitigation: the `_mode_note` field explains the choice; the `--out /dev/null` escape hatch lets users skip the write if they want.
|
||||
125
docs/superpowers/specs/2026-05-03-pass-rate-logging-design.md
Normal file
@@ -0,0 +1,125 @@
|
||||
# Spec: Pass-rate logging
|
||||
|
||||
> Plan 5 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
|
||||
|
||||
## Problem Statement
|
||||
|
||||
Plan 6 (Mode 2 routing pod) needs a per-skill signal to decide whether to route a call to the local model or keep it on Claude. The natural signal is recent pass rate: a skill that succeeds 95% of the time on local is safe to route; a skill that succeeds 60% is not. Today there is no such signal — the `session_log` MCP exists (shipped in Plan 1) but skills don't reliably call it, and no endpoint computes pass rate from the resulting logs.
|
||||
|
||||
Two consequences:
|
||||
1. **Plan 6 cannot be trusted without baseline data.** Routing decisions made on guesses will produce regressions that erode confidence in Mode 2 entirely.
|
||||
2. **The skill library has no observability.** When a skill regresses (model swap, prompt drift, environment change), there's no way to notice until a downstream task explicitly fails.
|
||||
|
||||
**Why now:** Plans 1–4 are merged. Plan 5 instruments the discipline that Plan 6 will consume. Several weeks of usage data between Plan 5 merge and Plan 6 deploy will mean Plan 6 lands on real numbers, not synthetic.
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- [ ] After Plan 5 merges, every invocation of `tdd` (pilot skill) calls `session_log` at the end of each phase (red, green, refactor) with `final_status` ∈ {pass, fail, skip}.
|
||||
- [ ] At least 6 of the remaining "binary-outcome" skills get the same treatment: `code-review`, `debug`, `feature-spec`, `session-retrospective`, `trainer`, `spec-driven-dev`. (Skills with no clear pass/fail — `clean-code`, `cognitive-load`, `solid`, `refactoring`, `test-design`, `problem-analysis`, `user-stories`, `planning`, `atdd`, `gitea-ci` — are out of scope.)
|
||||
- [ ] A new HTTP REST endpoint `GET /pass-rate?skill=X&window=7d` on the brain pod returns valid JSON `{skill, window, pass, fail, skip, total, pass_rate}` for any skill name. Skills with no logged invocations return zeros (not 404, not error). Pass rate is `pass / (pass + fail)`; if `pass + fail == 0`, returns `pass_rate: null`.
|
||||
- [ ] The endpoint's aggregator normalizes legacy values: `pass` ≡ `ok`, `fail` ≡ `error`, `skip` ≡ `skipped`. No data loss when scanning historical logs.
|
||||
- [ ] An optional CLI subcommand `hyperguild brain pass-rate <skill> [--window 7d] [--json]` calls the endpoint and prints either human-readable (`tdd: 47 / 50 = 94% (window: 7d)`) or JSON.
|
||||
- [ ] `task check` passes (lint + test + vet + drift + govulncheck) on each task and on the merged branch.
|
||||
- [ ] One week post-merge, `GET /pass-rate?skill=tdd&window=7d` returns non-zero counts and a real `pass_rate`.
|
||||
|
||||
## Constraints
|
||||
|
||||
- **Stdlib + existing deps only.** The endpoint adds to the existing ingestion pod's HTTP handler (Go, `net/http`). No new service, no new pod, no new persistence layer.
|
||||
- **No auth on `/pass-rate`.** Same model as the rest of the brain HTTP REST API: Tailscale-only network, no token.
|
||||
- **Schema:** the SKILL.md template uses `pass | fail | skip` for `final_status`. The aggregator treats `pass` and `ok` as equivalent, `fail` and `error` as equivalent, `skip` and `skipped` as equivalent. New writes from skills MUST use the new vocabulary; the aggregator handles both for read-back.
|
||||
- **Storage:** continues to use the existing JSONL files at `<pod>/brain/sessions/*.jsonl`. No format change. No materialized aggregates. If on-demand scans become slow (>500ms p99), revisit in a follow-up; not now.
|
||||
- **Backwards compatibility:** the existing `session_log` MCP tool's signature does not change. Its docstring should be updated to reflect the new vocabulary, but argument types stay the same.
|
||||
- **Pilot-before-rollout:** the first SKILL.md instrumentation (`tdd`) must dogfood successfully — at least one real `tdd` invocation post-instrumentation produces a session log entry — before the other six skills get their updates.
|
||||
|
||||
## Out of Scope
|
||||
|
||||
- Plan 6 routing pod itself (the consumer of `/pass-rate`).
|
||||
- Materialized rolling counters (compute on-demand for now).
|
||||
- Auth, rate limiting, or per-user filtering on `/pass-rate`.
|
||||
- Dashboards or visualization (`hyperguild brain pass-rate` text/JSON is the only UI).
|
||||
- Real-time streaming or push notifications (`/pass-rate` is poll-only).
|
||||
- Skills with no clear binary outcome (the 10 skills listed in Success Criteria).
|
||||
- Per-model or per-mode breakdown (`session_log` already records `model_used`; the endpoint aggregates across all models for now). Plan 6 may want sharper aggregation; we'll add fields when it lands.
|
||||
- Migration of the one historical entry in `2026-04-17-validate-hyperguild.jsonl`. It already uses `pass` (the new vocabulary, by coincidence), so no migration is needed.
|
||||
|
||||
## Technical Approach
|
||||
|
||||
### Component A — SKILL.md instrumentation pattern
|
||||
|
||||
Each instrumented skill gets a standardized "Logging" subsection under its existing "Brain MCP Integration" section. The subsection names the required `session_log` fields with explicit copy-paste examples:
|
||||
|
||||
```
**At each phase end:** call `session_log` with:
- `skill`: "<this-skill-name>"
- `phase`: "<the-phase>"
- `final_status`: "pass" | "fail" | "skip"
- `message`: "<one-line summary>"
- `duration_ms`: <wall clock>
- `project_root`: "<absolute path to the project under work>"
```
|
||||
|
||||
The pilot SKILL.md (`~/dev/.skills/tdd/SKILL.md`) gets instrumented first. The implementation defines the contract; the rollout commits replicate the pattern across the other six SKILL.md files.
|
||||
|
||||
Rationale: SKILL.md as the source of truth means the contract is visible to every agent that loads the skill — no hidden middleware. Mode-agnostic: the agent calls `session_log` whether it's Claude (Mode 1), Claude+routing (Mode 2), or Crush (Mode 3). The pattern is uniform; only the skill name + phase set differ.
|
||||
|
||||
### Component B — `/pass-rate` HTTP endpoint
|
||||
|
||||
New handler at the existing ingestion pod, peer to `/query`, `/write`, `/ingest`, etc.
|
||||
|
||||
```
GET /pass-rate?skill=<name>&window=<duration>
→ 200 { "skill": "tdd", "window": "7d", "pass": 47, "fail": 3, "skip": 0, "total": 50, "pass_rate": 0.94 }
```
|
||||
|
||||
Algorithm:
1. Parse `skill` (required) and `window` (default `7d`). Accept Go duration strings such as `1h` and `12h`, plus a day suffix (`7d`, `30d`); Go's `time.ParseDuration` has no `d` unit, so the handler must parse day values itself.
2. Walk `brain/sessions/*.jsonl` in the pod's volume. For each line: parse JSON, filter by `skill == query.skill` and `timestamp >= now - window`.
3. Tally `pass` (counts both `pass` and `ok`), `fail` (`fail` and `error`), `skip` (`skip` and `skipped`).
4. Compute `pass_rate = pass / (pass + fail)`; if `pass + fail == 0`, return `pass_rate: null`.
5. Return JSON.
|
||||
|
||||
Rationale for on-demand: the JSONL files are append-only and small (one entry per skill phase, kilobytes per session at most). For the first months of Plan 5 usage, scanning all sessions for a single query is fast enough. If it ever isn't, a materialized index is a follow-up — the endpoint shape doesn't change.
|
||||
|
||||
### Component C — Optional CLI subcommand
|
||||
|
||||
`hyperguild brain pass-rate <skill> [--window 7d] [--json]`. Adds a third nested verb under `brain` (sibling to `query` and `write`). Calls `GET /pass-rate?skill=<>&window=<>` via the existing `brainClient` infrastructure. Default human output: `tdd: 47 / 50 = 94% (window: 7d)`. `--json` passes through the response envelope.
|
||||
|
||||
Rationale: shell access to pass-rate without curl + jq. Optional in the strict sense — Plan 6's routing pod will call the endpoint directly, not via the CLI — but cheap to add (one new method on `brainClient`, one new dispatch case in `runBrain`).
|
||||
|
||||
### Schema and normalization
|
||||
|
||||
`session_log` JSONL line shape (unchanged today, codified by this plan):
|
||||
|
||||
```json
{
  "session_id": "<id>",
  "timestamp": "2026-05-03T20:30:00Z",
  "skill": "tdd",
  "phase": "red",
  "project_root": "/abs/path",
  "final_status": "pass",
  "duration_ms": 12345,
  "message": "Test written, function undefined, red confirmed."
}
```
|
||||
|
||||
`final_status` values:
|
||||
- New writes (this plan onward): `pass | fail | skip`
|
||||
- Read aggregator accepts both new and legacy: `pass`/`ok` → pass, `fail`/`error` → fail, `skip`/`skipped` → skip
|
||||
- Anything else → counted as `skip` for safety (don't pollute pass/fail with malformed entries)
|
||||
|
||||
### Tests
|
||||
|
||||
- Endpoint: table-driven tests with a temp `brain/sessions/` directory containing JSONL files spanning multiple skills, multiple statuses (both vocabularies), edge cases (empty file, malformed line, timestamp outside window, future timestamp). Tests run via `httptest.NewServer` against the real handler.
|
||||
- CLI: tests for `runBrainPassRate` against `httptest.Server` fake of `/pass-rate`. Human and `--json` output paths.
|
||||
- Pilot dogfood: after instrumenting `tdd/SKILL.md`, one real TDD task in this plan exercises the logging path. The corresponding session log entry verifies end-to-end.
|
||||
- `task check` per task.
|
||||
|
||||
## Risks
|
||||
|
||||
- **Skills that don't reliably log produce missing data.** The aggregator returns zero counts for those, which Plan 6 may misread as "this skill always passes" or "this skill is broken". Mitigation: the endpoint returns `pass_rate: null` when `pass + fail == 0`, signalling "no data" distinct from "always passes". Plan 6 must check for null.
|
||||
- **Agents may forget to call `session_log` mid-skill.** No way to enforce in cloud Mode 1 — Claude may skip the call if instructions are unclear. Mitigation: SKILL.md template makes the call literal and copy-pasteable. After 1 week, if instrumentation rate is < 80% of expected calls, escalate; consider a wrapper at the routing-pod layer in Plan 6 as belt-and-suspenders.
|
||||
- **Schema drift between legacy `ok` and new `pass`.** Mitigation: the aggregator's normalization rule. Documented in the endpoint's response and in the `session_log` tool docstring update.
|
||||
- **`/pass-rate` walks all session files for each request.** With ~1 file per session and tens of sessions per week, the scan cost is negligible today. At hundreds of files per day, it may need a date-bounded directory layout. Mitigation: monitor; if scan time exceeds the 500ms p99 budget named in Constraints, revisit. Not in this plan.
|
||||
- **The pilot may fail on the first dogfood.** If `tdd` instrumentation doesn't produce a log entry (e.g. agent didn't call `session_log`, JSON shape wrong, file permissions), the rollout to the other six skills is blocked until the pilot succeeds. Mitigation: explicit "pilot validates end-to-end" gate as the last step of Component A.
|
||||
- **Adding a third verb under `brain` slightly stretches the inline-router pattern.** Three verbs in a switch is still simple; if it grows to five, the CLI may want a per-verb registration map. Mitigation: deferred — three is fine.
|
||||
240
docs/superpowers/specs/2026-05-04-mode-2-routing-pod-design.md
Normal file
@@ -0,0 +1,240 @@
|
||||
# Spec: Mode 2 routing pod
|
||||
|
||||
> Plan 6 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
|
||||
|
||||
## Problem Statement
|
||||
|
||||
Mode 2 (`client-local`) is the cost-and-sovereignty mode for paid client work — keep skill calls inside Tailscale, save tokens, but stay reliable. Plans 1–5 produced everything Mode 2 needs except the consumer: the brain MCP at `:30330` is live, four skills are instrumented to log `pass | fail | skip`, and `GET /pass-rate?skill=X&window=Y` returns honest numbers (or `null` when there is no data). What is still missing is the policy layer that reads pass-rate and acts on it.
|
||||
|
||||
The supervisor pod (`:30320`) historically hosted full skill workers (`tdd_red/green/refactor`, `code_review`, `debug`, `spec`, `retrospective`, `trainer`, `tier`) but with no routing — every call ran local regardless of skill quality, and Claude Code in client-local mode silently lost access to Claude-quality work even when local was wrong. That's the regression Plan 6 fixes.
|
||||
|
||||
**Why now:** the supervisor pod is scheduled for retirement (Plan 7) and the data plumbing for routing decisions exists but has no consumer. Without Plan 6, Plan 7 cannot land.
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- [ ] A new pod `routing` is deployed via Flux at NodePort `:30310`, alongside (not replacing) the supervisor and ingestion pods. Image built by gitea CI, deployment manifest under `infra/k3s/apps/routing/`. `kubectl -n routing get deployment` shows `1/1 Ready`.
|
||||
- [ ] `POST http://koala:30310/mcp` responds to `tools/list` with exactly four tools: `code_review`, `debug`, `retrospective`, `trainer`. Each tool's name + JSON schema is byte-identical to the supervisor's current advertisement (verified by snapshot test).
|
||||
- [ ] Bearer-token auth via env var `ROUTING_MCP_TOKEN` (same opt-in pattern as `SUPERVISOR_MCP_TOKEN` shipped in `f49850d`). Empty token = no auth; populated token = `Authorization: Bearer <token>` required, otherwise HTTP 401 + JSON-RPC `-32001`.
|
||||
- [ ] On every tool call, the pod queries `${BRAIN_URL}/pass-rate?skill=<tool>&window=7d` and applies a configurable policy:
  - `pass_rate == null` → route to local (default-to-local)
  - `pass_rate ≥ HYPERGUILD_ROUTE_LOCAL_FLOOR` (default `0.90`) → route to local
  - `HYPERGUILD_ROUTE_LOCAL_CEIL ≤ pass_rate < FLOOR` (CEIL default `0.70`) → 50/50 deterministic sample (hash of canonical request body)
  - `pass_rate < CEIL` → route to Claude
|
||||
- [ ] Both routes resolve to a LiteLLM call: local route uses `HYPERGUILD_LOCAL_MODEL` (default `qwen35`), Claude route uses `HYPERGUILD_CLAUDE_MODEL` (default `claude-sonnet-4-6`). LiteLLM at `${LITELLM_BASE_URL}` (default `http://piguard:4000`) handles provider routing. The routing pod has no direct Anthropic SDK.
|
||||
- [ ] Every routing decision is logged via `session_log` to the brain pod with `{skill: "_routing", phase: "decide", final_status: "skip", message: "<tool>: <decision>", duration_ms, project_root}`. `final_status: "skip"` keeps these entries out of any skill's pass-rate aggregation.
|
||||
- [ ] LiteLLM unreachable → fail open to a Claude decision *and* log `final_status: "fail"` for `_routing`. The pod must still serve requests even if LiteLLM is down for hours.
|
||||
- [ ] `cmd/hyperguild/mode.go` updated: `mode client-local` writes the routing entry with `"headers": {"X-Hyperguild-Mode": "client-local"}` and the `_routing_pending` placeholder field is removed. The pod accepts but does not branch on the header (forward-compat only).
|
||||
- [ ] `task check` (lint + test + vet + drift + govulncheck) passes on each task and on the merged branch. The CI gate that bit Plan 1 must not bite Plan 6 (per `feedback_per_task_verification` memory).
|
||||
- [ ] A new `task smoke:routing` target boots the binary against the live LiteLLM at `piguard:4000` and the live `/pass-rate` at `koala:30330`, calls each of the four advertised tools once, and verifies a `_routing` entry appears in the brain via `GET /pass-rate?skill=_routing&window=1h`. This is the live-contract test (per `2026-05-03-fake-tests-vs-real-contract` brain entry); fake-server unit tests verify policy logic, the smoke step verifies the contract.
|
||||
- [ ] Mode 1 (`cloud`) and Mode 3 (`sovereign`) are byte-identically unchanged. Verified by `git diff` showing no changes to `mode.go`'s `modeCloud` or `modeSovereign` functions.
|
||||
|
||||
## Constraints
|
||||
|
||||
- **Stdlib + existing deps only.** The routing pod reuses `internal/exec/litellm.go` (`NewLiteLLM`, `Complete`), `internal/registry`, and `internal/skills/{review,debug,retrospective,trainer}/`. No new third-party dependency. Auth code may be duplicated from `internal/mcp/server.go` or extracted to a shared helper — implementer's call.
|
||||
- **No new persistence.** Pass-rate data lives in the brain pod's session JSONL files (Plan 5). Routing-decision logs land in the same place via `session_log`. Routing pod has no DB, no cache, no on-disk state beyond an optional in-memory pass-rate cache (TTL = 60 seconds — protects the brain from per-call hammering during an active session).
|
||||
- **MCP wire format identical to supervisor's.** Tools have the same names and JSON schemas as today. A consumer switches modes by changing only the URL in `.mcp.json` — no schema-level differences. Snapshot tests pin this.
|
||||
- **Pod must start and serve degraded.** If LiteLLM is down at startup, the pod still binds to `:3210`, advertises tools, and serves requests with the fail-open-to-Claude behavior described in success criteria.
|
||||
- **`internal/skills/{review,debug,retrospective,trainer}/` survives Plan 6.** Plan 7's note about deleting them is amended: those four packages are reused by the routing pod and must NOT be deleted in Plan 7. Plan 7 deletes only `internal/skills/{tdd,spec}/`, the supervisor binary, the supervisor manifests, and frees NodePort `:30320`. This spec calls out the change so Plan 7's author doesn't delete needed code (per `2026-05-03-implicit-cleanup-third-category` brain entry).
|
||||
- **No retries beyond fail-open.** A LiteLLM call that errors becomes a Claude decision and a `final_status: "fail"` log. No exponential backoff, no circuit breaker — that's policy for a future plan once the failure shape is observed.
|
||||
- **Determinism in sampling.** When pass-rate is in the sample band (`CEIL ≤ pr < FLOOR`), the local-vs-Claude choice for a given request is reproducible: hash a canonical JSON of the request body, low bit picks local. Same input → same decision. Avoids per-call variance confusing the operator during a debugging session.
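
The threshold policy and the deterministic sample band can be sketched as a pure function. Several details here are assumptions rather than anything the spec fixes: SHA-256 as the hash, "low bit set means local", canonicalization by re-marshalling through `encoding/json` (which sorts object keys but may lose precision on very large integers), and all names (`policy`, `decide`, `canonicalize`, `route`).

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

type route string

const (
	routeLocal  route = "local"
	routeClaude route = "claude"
)

// policy holds the two thresholds; the spec's defaults are 0.90 and 0.70.
type policy struct {
	Floor float64 // at or above this pass rate: always local
	Ceil  float64 // below this pass rate: always Claude
}

// canonicalize re-encodes a JSON body so that key order and whitespace
// differences do not change the hash.
func canonicalize(raw []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return nil, fmt.Errorf("canonicalize: %w", err)
	}
	out, err := json.Marshal(v) // map keys are emitted in sorted order
	if err != nil {
		return nil, fmt.Errorf("canonicalize: %w", err)
	}
	return out, nil
}

// decide applies the routing policy. passRate is nil when /pass-rate
// returned null (no data).
func (p policy) decide(passRate *float64, canonicalBody []byte) route {
	if passRate == nil || *passRate >= p.Floor {
		return routeLocal
	}
	if *passRate < p.Ceil {
		return routeClaude
	}
	// Sample band: the low bit of the last hash byte picks the route.
	sum := sha256.Sum256(canonicalBody)
	if sum[len(sum)-1]&1 == 1 {
		return routeLocal
	}
	return routeClaude
}

func main() {
	p := policy{Floor: 0.90, Ceil: 0.70}
	rate := 0.80 // inside the sample band
	// Same logical request, different key order and spacing.
	for _, raw := range []string{`{"a":1,"b":2}`, `{ "b": 2, "a": 1 }`} {
		body, err := canonicalize([]byte(raw))
		if err != nil {
			fmt.Println(err)
			return
		}
		fmt.Println(p.decide(&rate, body))
	}
}
```

Both iterations print the same route, which is the reproducibility property the constraint asks for.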
|
||||
|
||||
## Out of Scope
|
||||
|
||||
- **Plan 7 (supervisor retirement).** Separate plan, executed after Plan 6 stabilizes. Plan 6 leaves the supervisor pod running; nothing about supervisor changes in this plan.
|
||||
- **Routing for `tdd_red/green/refactor`, `spec`, `tier`.** Per `project_per_skill_routing.md`, these are SKILL.md or CLI, not routing-pod tools. They never appear in the routing pod's `tools/list`. If a future plan changes that decision, it adds them then.
|
||||
- **Routing for `brain_ingest`.** Already routed at the brain pod (Plan 1). No change.
|
||||
- **Per-mode policy branching.** The pod accepts `X-Hyperguild-Mode` for forward-compat but treats absent or unknown values as `client-local`. No code path differs on the header value yet.
|
||||
- **OAuth, IP allowlisting, rate limiting, audit logging.** Bearer-token only; same risk model as the supervisor MCP after `f49850d`.
|
||||
- **Decision-log read endpoints.** Routing decisions land in the brain via `session_log`. Reads happen via the existing `GET /pass-rate` endpoint and JSONL inspection. No new read API.
|
||||
- **Materialized routing-decision aggregates.** Out of scope for the same reason Plan 5 deferred materialized counters: on-demand scans are fast enough at current data volumes.
|
||||
- **Tunable per-skill thresholds.** `FLOOR` and `CEIL` are global. If the operator decides `debug` needs a different floor than `code_review`, that's a follow-up plan with real data behind the choice.
|
||||
- **Sampling beyond a 50/50 hash split.** No epsilon-decay schedules, no Thompson sampling, no per-skill exploration policies. Add when data justifies.
|
||||
- **Migration of any existing supervisor-skill `.mcp.json` registrations.** Consumers update their `.mcp.json` (via `hyperguild mode client-local`) when they want Mode 2 behavior. No silent redirect.
|
||||
- **Routing-pod-side prompt customization.** The four skill packages already own their prompts; the routing pod just calls into them via the existing `Skill` interface. Prompt edits remain a SKILL.md or `internal/skills/<x>/` concern.
|
||||
|
||||
## Technical Approach
|
||||
|
||||
### A. Binary layout: `cmd/routing/`
|
||||
|
||||
A new Go binary at `cmd/routing/main.go`. Stdlib + `internal/*`. Wires:
1. Config from env (typed struct in `internal/config/routing.go`, peer to `Config` for the supervisor; deliberately a separate type because the surfaces are different and merging would force every routing-pod field onto the supervisor and vice versa).
2. `internal/exec/litellm.NewLiteLLM(...)`: same client the supervisor uses.
3. `internal/skills/{review,debug,retrospective,trainer}.New(...)` constructors, each receiving a `CompleteFunc` that wraps the routing decision (see C below).
4. `internal/registry.New()` populated with the four skills.
5. `internal/mcp.NewServer(reg, cfg.MCPAuthToken)`: reuse the existing handler with bearer auth from `f49850d`. The handler is generic; nothing in it is supervisor-specific.
|
||||
|
||||
**Rationale:** the supervisor's runtime is already 80% of what the routing pod needs. Reusing it saves the routing pod from re-implementing skill dispatch, MCP protocol handling, and bearer auth. The only new code is the routing decision itself (C below) and the deployment manifests (G).
|
||||
|
||||
### B. Configuration via env
|
||||
|
||||
Typed struct, parsed at startup. New env vars introduced by Plan 6:
|
||||
|
||||
| Env var | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `ROUTING_PORT` | `3210` | Pod's HTTP port (NodePort `:30310` maps to this) |
|
||||
| `ROUTING_MCP_TOKEN` | — | Bearer token, opt-in (empty = no auth) |
|
||||
| `LITELLM_BASE_URL` | `http://piguard:4000` | LiteLLM proxy (reused) |
|
||||
| `LITELLM_API_KEY` | — | Reused, sourced from `routing-secrets` Secret |
|
||||
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | In-cluster brain pod for `/pass-rate` and `session_log` |
|
||||
| `HYPERGUILD_LOCAL_MODEL` | `qwen35` | Model name passed to LiteLLM for the local decision |
|
||||
| `HYPERGUILD_CLAUDE_MODEL` | `claude-sonnet-4-6` | Model name for the Claude decision |
|
||||
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At/above this pass-rate, always local |
|
||||
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below this, always Claude. Between CEIL and FLOOR is the sample band. |
|
||||
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill in-memory cache TTL |
|
||||
|
||||
**Rationale:** every value an operator might want to tune is an env var, not a hardcoded constant. Defaults are the recommendations from the kickoff and the per-skill-routing memory; sensible cluster values flow in via the Flux-managed Secret. No config file to manage.
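
The table above could be loaded roughly as follows. This is a minimal sketch, not the actual contents of `internal/config/routing.go`: the type name `RoutingConfig`, the function `loadRoutingConfig`, the injected `getenv` parameter, and most field names are assumptions made for illustration (only `MCPAuthToken`, `LocalModel`, and `ClaudeModel` appear elsewhere in this spec).

```go
package main

import (
	"fmt"
	"strconv"
)

// RoutingConfig is a hypothetical shape for the routing pod's typed config.
type RoutingConfig struct {
	Port               string
	MCPAuthToken       string // empty means bearer auth is disabled
	LiteLLMBaseURL     string
	LiteLLMAPIKey      string
	BrainURL           string
	LocalModel         string
	ClaudeModel        string
	LocalFloor         float64
	LocalCeil          float64
	PassRateTTLSeconds int
}

// loadRoutingConfig reads every variable through getenv so tests can pass a
// fake lookup; production code would pass os.Getenv.
func loadRoutingConfig(getenv func(string) string) (RoutingConfig, error) {
	// str returns the variable's value, or def when it is unset or empty.
	str := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	cfg := RoutingConfig{
		Port:           str("ROUTING_PORT", "3210"),
		MCPAuthToken:   getenv("ROUTING_MCP_TOKEN"),
		LiteLLMBaseURL: str("LITELLM_BASE_URL", "http://piguard:4000"),
		LiteLLMAPIKey:  getenv("LITELLM_API_KEY"),
		BrainURL:       str("BRAIN_URL", "http://ingestion.supervisor:3300"),
		LocalModel:     str("HYPERGUILD_LOCAL_MODEL", "qwen35"),
		ClaudeModel:    str("HYPERGUILD_CLAUDE_MODEL", "claude-sonnet-4-6"),
	}
	var err error
	if cfg.LocalFloor, err = strconv.ParseFloat(str("HYPERGUILD_ROUTE_LOCAL_FLOOR", "0.90"), 64); err != nil {
		return RoutingConfig{}, fmt.Errorf("HYPERGUILD_ROUTE_LOCAL_FLOOR: %w", err)
	}
	if cfg.LocalCeil, err = strconv.ParseFloat(str("HYPERGUILD_ROUTE_LOCAL_CEIL", "0.70"), 64); err != nil {
		return RoutingConfig{}, fmt.Errorf("HYPERGUILD_ROUTE_LOCAL_CEIL: %w", err)
	}
	if cfg.PassRateTTLSeconds, err = strconv.Atoi(str("HYPERGUILD_PASS_RATE_TTL_SECONDS", "60")); err != nil {
		return RoutingConfig{}, fmt.Errorf("HYPERGUILD_PASS_RATE_TTL_SECONDS: %w", err)
	}
	return cfg, nil
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the parser testable without mutating the process environment; whether the real loader does this is not stated in the spec.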

### C. Decision policy (`internal/routing/policy.go`)

Pure function, no I/O:

```go
type Decision int
const (
	DecideLocal Decision = iota
	DecideClaude
)

type Policy struct{ Floor, Ceil float64 }

// Decide returns the routing decision. passRate may be nil when the brain has no data.
// requestHash is a deterministic 64-bit hash of the canonical request body; it is used only
// when passRate is in the sample band; same hash → same decision.
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision { ... }
```

Rules (in order):
1. `passRate == nil` → `DecideLocal` (default-to-local)
2. `*passRate >= p.Floor` → `DecideLocal`
3. `*passRate < p.Ceil` → `DecideClaude`
4. Otherwise (sample band) → `requestHash & 1` picks local on `0`, claude on `1`

**Rationale:** no I/O in the policy means the function is trivially testable (table-driven, no fixtures, no servers). Network calls happen in a wrapping layer that calls `Decide` — same separation as `internal/skills/*/skill.go` keeps prompt strings separate from `Complete` calls. Default-to-local rule is justified in `project_per_skill_routing.md`: the four advertised skills are exactly the skills marked "MCP→local" in that target architecture.
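
The four rules can be written out directly. The following is a sketch of one possible body for `Decide`, using the types from the signature above; it is an illustration of the rule order, not the committed implementation.

```go
package main

// Decision is the routing outcome for a single request.
type Decision int

const (
	DecideLocal Decision = iota
	DecideClaude
)

// Policy holds the two global thresholds. With the defaults, Floor (0.90)
// is the upper bound and Ceil (0.70) the lower bound of the sample band.
type Policy struct{ Floor, Ceil float64 }

// Decide applies the rules in order; it performs no I/O.
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision {
	// Rule 1: no data from the brain means default-to-local.
	if passRate == nil {
		return DecideLocal
	}
	switch {
	case *passRate >= p.Floor: // Rule 2: good enough, always local.
		return DecideLocal
	case *passRate < p.Ceil: // Rule 3: too weak, always Claude.
		return DecideClaude
	}
	// Rule 4: sample band. The low bit of the request hash splits 50/50,
	// so the same request always gets the same decision.
	if requestHash&1 == 0 {
		return DecideLocal
	}
	return DecideClaude
}
```

Because the function takes only values, a table-driven test needs nothing more than a slice of `(passRate, hash, want)` rows.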

### D. Pass-rate fetcher (`internal/routing/passrate.go`)

```go
type Fetcher struct {
	BaseURL    string
	HTTPClient *http.Client // 1s timeout
	Cache      *ttlCache    // map[string]*float64 with 60s TTL, struct-internal
}

func (f *Fetcher) Get(ctx context.Context, skill string) (*float64, error)
```

Calls `GET ${BaseURL}/pass-rate?skill=<skill>&window=7d`. On success, caches the parsed `pass_rate` (which may be `null`) for `HYPERGUILD_PASS_RATE_TTL_SECONDS`. On error, returns `(nil, err)`; the dispatch wrapper treats this as `passRate == nil` and routes to local (the default-to-local fallback also covers brain-pod-down).

**Rationale:** GET is correct REST per `2026-05-03-rest-semantics-vs-precedent` (this is a pure read with query params; it shouldn't follow the legacy POST-everywhere precedent). Cache TTL of 60s prevents per-call hammering during a tight Claude Code loop while staying fresh enough that a flapping pass-rate visibly affects routing within a minute. No persistence — restart loses cache, that's fine.
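
The spec names a `ttlCache` but does not show it. A minimal sketch of what such a cache might look like follows, assuming a mutex-guarded map and an injectable clock; the method names `get`/`set`, the constructor `newTTLCache`, and the `manualClock` test helper are all hypothetical.

```go
package main

import (
	"sync"
	"time"
)

// cacheEntry stores one skill's pass-rate. A nil value is a valid cached
// answer: it means the brain replied with "pass_rate": null.
type cacheEntry struct {
	value   *float64
	expires time.Time
}

// ttlCache is a per-skill in-memory cache with a fixed TTL and no persistence.
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock; production would use time.Now
	entries map[string]cacheEntry
}

func newTTLCache(ttlSeconds int, now func() time.Time) *ttlCache {
	return &ttlCache{
		ttl:     time.Duration(ttlSeconds) * time.Second,
		now:     now,
		entries: make(map[string]cacheEntry),
	}
}

// get reports (value, true) on a fresh hit and (nil, false) on a miss or
// an expired entry. Expired entries are dropped on access.
func (c *ttlCache) get(skill string) (*float64, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[skill]
	if !ok || !c.now().Before(e.expires) {
		delete(c.entries, skill)
		return nil, false
	}
	return e.value, true
}

// set stores v for skill until now+ttl.
func (c *ttlCache) set(skill string, v *float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[skill] = cacheEntry{value: v, expires: c.now().Add(c.ttl)}
}

// manualClock is a test-only clock that moves when told to.
type manualClock struct{ t time.Time }

func (m *manualClock) Now() time.Time { return m.t }

func (m *manualClock) Advance(seconds int) {
	m.t = m.t.Add(time.Duration(seconds) * time.Second)
}
```

The separate boolean return matters here: since a cached `null` pass-rate is a legitimate value, the pointer alone cannot distinguish "cached nil" from "not cached".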

### E. Dispatch wrapper

The four skills are constructed with their existing `CompleteFunc` signature (`(ctx, model, system, user) (string, int64, error)`). The routing pod wraps it:

```go
func (r *Router) Complete(ctx context.Context, skill, model, system, user string) (string, int64, error) {
	// A fetch error is deliberately ignored: pr stays nil, which routes local (see D).
	pr, _ := r.fetcher.Get(ctx, skill)
	decision := r.policy.Decide(pr, hashCanonical(system, user))
	chosenModel := r.cfg.ClaudeModel
	if decision == DecideLocal {
		chosenModel = r.cfg.LocalModel
	}
	out, ms, err := r.litellm.Complete(ctx, chosenModel, system, user)
	r.logDecision(skill, decision, err, ms)
	if err != nil {
		// fail open: try Claude once if we routed local; if Claude also fails, return error.
		if decision == DecideLocal {
			chosenModel = r.cfg.ClaudeModel
			out, ms, err = r.litellm.Complete(ctx, chosenModel, system, user)
			r.logDecision(skill, DecideClaude, err, ms) // second log entry, marked fail if still erroring
		}
		return out, ms, err
	}
	return out, ms, nil
}
```

The skill packages don't know about routing; they receive a `CompleteFunc` and call it. The wrapper substitutes routing logic at construction time.

**Rationale:** keeps the skill packages oblivious to mode. Same `internal/skills/review/` works under the supervisor (no routing) and under the routing pod (routed) without any conditional logic in the skill itself. Plan 7's deletion of the supervisor leaves the skills' shape intact for the routing pod.
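
The wrapper calls `hashCanonical(system, user)`, which the spec does not define. One way it could be written, assuming FNV-1a from the standard library and a NUL byte as the field separator (both are illustrative choices, not something the spec prescribes):

```go
package main

import "hash/fnv"

// hashCanonical returns a deterministic 64-bit hash of the request body.
// The NUL separator keeps ("ab", "c") and ("a", "bc") from hashing the
// same byte stream.
func hashCanonical(system, user string) uint64 {
	h := fnv.New64a()
	// Writes to a hash.Hash never return an error, so the results are ignored.
	h.Write([]byte(system))
	h.Write([]byte{0})
	h.Write([]byte(user))
	return h.Sum64()
}
```

FNV-1a is not a cryptographic hash and its low bit is not guaranteed to be unbiased on similar inputs, which is the concern the Risks section raises about `hash & 1`; a different hash or a different bit-selection rule could be swapped in without changing the `Decide` signature.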

### F. Decision logging (`internal/routing/log.go`)

After every decision, POST a session log entry to `${BRAIN_URL}/write` (the brain pod's existing endpoint, which appends to `brain/sessions/<session>.jsonl`). Entry shape:

```json
{
  "skill": "_routing",
  "phase": "decide",
  "final_status": "skip",
  "message": "<original_skill>: <decision> (pass_rate=<value or 'null'>, model=<chosen>)",
  "duration_ms": <litellm_round_trip>,
  "project_root": "<path from request, or 'unknown'>",
  "timestamp": "<RFC3339>",
  "session_id": "<from request, or generated>"
}
```

`final_status: "skip"` keeps these entries out of any real skill's pass-rate aggregation (Plan 5's aggregator counts only `pass`/`fail`). Operators can still query `GET /pass-rate?skill=_routing&window=7d` for routing-failure visibility: when LiteLLM is down, the second log entry carries `final_status: "fail"`.

**Rationale:** closes the observability loop without adding a new endpoint or schema. `_routing` namespaces routing entries away from skill names. `skip` is the only honest classification — routing isn't itself a pass/fail event in the skill sense.
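
The entry shape above maps naturally onto a Go struct. The sketch below shows one way to build it; the names `routingLogEntry`, `routingMessage`, and `newRoutingLogEntry`, the two-decimal pass-rate formatting, and the `failed` flag selecting `"fail"` over `"skip"` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// routingLogEntry mirrors the JSON entry shape; the tags are what
// encoding/json would use when marshalling it.
type routingLogEntry struct {
	Skill       string `json:"skill"`
	Phase       string `json:"phase"`
	FinalStatus string `json:"final_status"`
	Message     string `json:"message"`
	DurationMS  int64  `json:"duration_ms"`
	ProjectRoot string `json:"project_root"`
	Timestamp   string `json:"timestamp"`
	SessionID   string `json:"session_id"`
}

// routingMessage renders "<skill>: <decision> (pass_rate=..., model=...)".
// A nil pass-rate is printed as the literal string "null".
func routingMessage(skill, decision string, passRate *float64, model string) string {
	pr := "null"
	if passRate != nil {
		pr = strconv.FormatFloat(*passRate, 'f', 2, 64)
	}
	return fmt.Sprintf("%s: %s (pass_rate=%s, model=%s)", skill, decision, pr, model)
}

// newRoutingLogEntry fills the fixed fields. failed selects "fail" instead
// of "skip"; an empty projectRoot falls back to "unknown".
func newRoutingLogEntry(message string, durationMS int64, failed bool, projectRoot, sessionID string) routingLogEntry {
	status := "skip"
	if failed {
		status = "fail"
	}
	if projectRoot == "" {
		projectRoot = "unknown"
	}
	return routingLogEntry{
		Skill:       "_routing",
		Phase:       "decide",
		FinalStatus: status,
		Message:     message,
		DurationMS:  durationMS,
		ProjectRoot: projectRoot,
		Timestamp:   time.Now().UTC().Format(time.RFC3339),
		SessionID:   sessionID,
	}
}
```

A caller would marshal the entry with `encoding/json` and send it to the write endpoint named above; generating a session ID when the request carries none is left out of this sketch.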

### G. Deployment

New manifest directory `infra/k3s/apps/routing/` mirroring `infra/k3s/apps/supervisor/`'s shape:

- `namespace.yaml`: namespace `routing` (peer to `supervisor`)
- `deployment.yaml`: single replica, nodeSelector koala, image from gitea registry, `envFrom: secretRef: routing-secrets`
- `service.yaml`: ClusterIP on port 3210
- `nodeport.yaml`: NodePort 30310 → service 3210
- `secrets.enc.yaml`: SOPS-encrypted, contains `LITELLM_API_KEY` and (optionally) `ROUTING_MCP_TOKEN`
- `kustomization.yaml`: bundles the above

The supervisor pod's CI image build pattern (gitea Actions → `gitea.d-ma.be/mathias/supervisor:<sha>`) is replicated for `gitea.d-ma.be/mathias/routing:<sha>`. Flux's existing image-automation will bump the manifest's image tag on each push.

**Rationale:** copying the supervisor pod's manifest shape (rather than designing from scratch) is the YAGNI move. Flux + image automation are already proven on the supervisor; same pattern, same operator mental model. Mode 2 setup is now a Flux change, not a one-off `kubectl` ritual.

### H. Live smoke test

`task smoke:routing` (in the project Taskfile) does:
1. Boot the binary locally with `LITELLM_BASE_URL=http://piguard:4000` and `BRAIN_URL=http://koala:30330`. Bind to a random localhost port (so it doesn't conflict with anything else).
2. Send `tools/list` and assert four tool names.
3. For each tool, send a minimal valid `tools/call`. Don't assert on response content; assert response shape (no error, has content).
4. After all four calls, query `GET http://koala:30330/pass-rate?skill=_routing&window=1h` and assert `total >= 4`.
5. Tear down.

Skipped automatically when LiteLLM is unreachable or when run outside Tailscale (tier 3); it emits a `SKIP` line and exits 0. `task check` does NOT include `task smoke:routing` (CI runner doesn't have Tailscale); operator runs it manually before bumping production.

**Rationale:** unit tests with `httptest.Server` fakes verify the policy and the dispatch wrapper logic. The smoke test is the only thing that will catch a contract drift between the routing pod's `Complete` calls and the actual LiteLLM API, or a schema drift between `/pass-rate` and what the fetcher expects (per `2026-05-03-fake-tests-vs-real-contract`).

### I. Mode-template update (`cmd/hyperguild/mode.go`)

`modeClientLocal` is amended:
- The `routing` entry's `url` stays at `http://koala:30310/mcp`.
- A new key `headers` is added with `{"X-Hyperguild-Mode": "client-local"}`.
- The placeholder `_routing_pending` field is **removed**, since the routing pod now exists.

Tests in `cmd/hyperguild/mode_test.go` are updated to assert the new structure. The README at `cmd/hyperguild/README.md` is updated to drop the "not deployed yet" note.

**Rationale:** Plan 4 deliberately scaffolded the placeholder for Plan 6 to fill in. This is the fill-in. Removing `_routing_pending` is the implicit cleanup the kickoff anticipates; making it explicit in the spec avoids a Plan-completeness gap (per `2026-05-03-implicit-cleanup-third-category`).

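## Risks

- **Empty pass-rate window in the first weeks.** Plans 3–5 merged on 2026-05-03; usage data has not accumulated. With default-to-local active for all four routed skills, the first weeks of Mode 2 = "everything goes local." If local quality is rough on `code_review` or `debug`, the operator's first impression of Mode 2 is bad, and confidence in Plan 6 erodes before data lands. **Mitigation:** the FLOOR / CEIL are env-tunable. If local quality is unworkable in the first week, set `HYPERGUILD_ROUTE_LOCAL_FLOOR=2.0` (impossible threshold) so that rule 2 never pins a skill to local, with no code change. This is intended as a deliberate kill switch for the early window, but as the rules in section C are written it only affects skills that already have a pass-rate: a `nil` pass-rate still hits rule 1 and routes local, and a pass-rate at or above CEIL falls into the 50/50 sample band rather than going to Claude. A true default-to-Claude switch for the empty window therefore needs rule 1 (and CEIL) to be addressed as well.
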
- **LiteLLM-as-single-dependency.** The routing pod has exactly one upstream LLM provider: `piguard:4000`. If LiteLLM is misconfigured (wrong model name routed to wrong provider, expired Anthropic key in LiteLLM's config), every routing-pod call returns garbage. **Mitigation:** the smoke test catches gross misconfig before deploy; once deployed, LiteLLM's own `/health` endpoint is the canary (the pod doesn't probe it; the operator monitors LiteLLM separately). If a deeper failure mode emerges, add a routing-pod liveness probe in a follow-up.

- **Skill-schema drift.** The routing pod's `tools/list` is asserted byte-identical to the supervisor's via snapshot test. If someone evolves the supervisor's schemas between Plan 6 merge and Plan 7 (a long window), the snapshot drifts. **Mitigation:** the spec documents that Plan 6 freezes the schemas; supervisor edits to skill schemas are out of scope until Plan 7 deletes the supervisor. This is a soft constraint enforced by the spec, not by code. If the supervisor genuinely needs a schema change before Plan 7, that's a separate plan.

- **Flux drift on `kubectl rollout restart`.** Demonstrated during the bearer-auth rollout earlier today: Flux server-side-applies the deployment every 30s and strips the `kubectl.kubernetes.io/restartedAt` annotation, which deletes the new ReplicaSet's pod. **Mitigation:** the Plan 6 implementer prompt and the README note that `kubectl delete pod -l app=routing` is the correct way to force a restart on Flux-managed deployments: the existing ReplicaSet recreates the pod without an annotation Flux can revert. (This finding is worth a brain entry; capture in retrospective.)

- **Mode header not forwarded by Claude Code.** Plan 6 assumes Claude Code propagates `headers` from `.mcp.json`. The bearer-auth rollout proved this works for `Authorization`. The same path should work for `X-Hyperguild-Mode`. **Mitigation:** the pod treats absent header as `client-local` (the only mode that registers the pod). If forwarding silently breaks, behavior is identical, since the header is forward-compat only.

- **Sample-band hash collision producing skewed routing.** Hash inputs are `(system, user)` strings. If skill prompts produce highly similar bodies (debug bug A vs debug bug B with similar wording), low-bit hash distribution might cluster on one side. **Mitigation:** at the volumes Plan 6 expects (single operator, tens of routed calls/hour at peak), bias is statistically invisible. If volume ever rises, swap `hash & 1` for a stronger split. Not the first failure mode worth pre-engineering.

## Cross-references

- Spec for Plan 5 (consumer of `/pass-rate`): `docs/superpowers/specs/2026-05-03-pass-rate-logging-design.md`
- Spec for Plan 4 (which scaffolded the `:30310` placeholder): `docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md`
- Auto-memory entries `project_three_modes`, `project_skill_migration_plans`, `project_per_skill_routing`, `feedback_per_task_verification`, `feedback_sudo`
- Brain entries `2026-05-03-rest-semantics-vs-precedent`, `2026-05-03-aggregator-normalization-backwards-compat`, `2026-05-03-fake-tests-vs-real-contract`, `2026-05-03-implicit-cleanup-third-category`, `2026-05-03-code-reviewer-output-as-candidates`, `2026-05-03-done-with-concerns-vs-blocked`, `2026-05-03-verification-depth-formula`, `2026-05-03-plan-canonical-dispatch-ephemeral`

17
go.mod
@@ -2,10 +2,23 @@ module github.com/mathiasbq/supervisor
|
||||
|
||||
go 1.26.1
|
||||
|
||||
require github.com/stretchr/testify v1.11.1
|
||||
require (
|
||||
github.com/lestrrat-go/jwx/v2 v2.1.6
|
||||
github.com/stretchr/testify v1.11.1
|
||||
gopkg.in/yaml.v3 v3.0.1
|
||||
)
|
||||
|
||||
require (
|
||||
github.com/davecgh/go-spew v1.1.1 // indirect
|
||||
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 // indirect
|
||||
github.com/goccy/go-json v0.10.3 // indirect
|
||||
github.com/lestrrat-go/blackmagic v1.0.3 // indirect
|
||||
github.com/lestrrat-go/httpcc v1.0.1 // indirect
|
||||
github.com/lestrrat-go/httprc v1.0.6 // indirect
|
||||
github.com/lestrrat-go/iter v1.0.2 // indirect
|
||||
github.com/lestrrat-go/option v1.0.1 // indirect
|
||||
github.com/pmezard/go-difflib v1.0.0 // indirect
|
||||
gopkg.in/yaml.v3 v3.0.1 // indirect
|
||||
github.com/segmentio/asm v1.2.0 // indirect
|
||||
golang.org/x/crypto v0.32.0 // indirect
|
||||
golang.org/x/sys v0.31.0 // indirect
|
||||
)
|
||||
|
||||
27
go.sum
@@ -1,10 +1,37 @@
|
||||
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
|
||||
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
|
||||
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
|
||||
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 h1:NMZiJj8QnKe1LgsbDayM4UoHwbvwDRwnI3hwNaAHRnc=
|
||||
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
|
||||
github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA=
|
||||
github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
|
||||
github.com/lestrrat-go/blackmagic v1.0.3 h1:94HXkVLxkZO9vJI/w2u1T0DAoprShFd13xtnSINtDWs=
|
||||
github.com/lestrrat-go/blackmagic v1.0.3/go.mod h1:6AWFyKNNj0zEXQYfTMPfZrAXUWUfTIZ5ECEUEJaijtw=
|
||||
github.com/lestrrat-go/httpcc v1.0.1 h1:ydWCStUeJLkpYyjLDHihupbn2tYmZ7m22BGkcvZZrIE=
|
||||
github.com/lestrrat-go/httpcc v1.0.1/go.mod h1:qiltp3Mt56+55GPVCbTdM9MlqhvzyuL6W/NMDA8vA5E=
|
||||
github.com/lestrrat-go/httprc v1.0.6 h1:qgmgIRhpvBqexMJjA/PmwSvhNk679oqD1RbovdCGW8k=
|
||||
github.com/lestrrat-go/httprc v1.0.6/go.mod h1:mwwz3JMTPBjHUkkDv/IGJ39aALInZLrhBp0X7KGUZlo=
|
||||
github.com/lestrrat-go/iter v1.0.2 h1:gMXo1q4c2pHmC3dn8LzRhJfP1ceCbgSiT9lUydIzltI=
|
||||
github.com/lestrrat-go/iter v1.0.2/go.mod h1:Momfcq3AnRlRjI5b5O8/G5/BvpzrhoFTZcn06fEOPt4=
|
||||
github.com/lestrrat-go/jwx/v2 v2.1.6 h1:hxM1gfDILk/l5ylers6BX/Eq1m/pnxe9NBwW6lVfecA=
|
||||
github.com/lestrrat-go/jwx/v2 v2.1.6/go.mod h1:Y722kU5r/8mV7fYDifjug0r8FK8mZdw0K0GpJw/l8pU=
|
||||
github.com/lestrrat-go/option v1.0.1 h1:oAzP2fvZGQKWkvHa1/SAcFolBEca1oN+mQ7eooNBEYU=
|
||||
github.com/lestrrat-go/option v1.0.1/go.mod h1:5ZHFbivi4xwXxhxY9XHDe2FHo6/Z7WWmtT7T5nBBp3I=
|
||||
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
|
||||
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
|
||||
github.com/segmentio/asm v1.2.0 h1:9BQrFxC+YOHJlTlHGkTrFWf59nbL3XnCoFLTwDCI7ys=
|
||||
github.com/segmentio/asm v1.2.0/go.mod h1:BqMnlJP91P8d+4ibuonYZw9mfnzI9HfxselHZr5aAcs=
|
||||
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
|
||||
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
|
||||
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
|
||||
golang.org/x/crypto v0.32.0 h1:euUpcYgM8WcP71gNpTqQCn6rC2t6ULUPiOzfWaXVVfc=
|
||||
golang.org/x/crypto v0.32.0/go.mod h1:ZnnJkOaASj8g0AjIduWNlq2NRxL0PlBrbKVyZ6V/Ugc=
|
||||
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
|
||||
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
|
||||
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
|
||||
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
|
||||
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
|
||||
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
|
||||
@@ -11,7 +11,9 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/api"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/llm"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/watcher"
|
||||
)
|
||||
@@ -54,6 +56,32 @@ func main() {
|
||||
|
||||
h := api.NewHandler(brainDir, logger, pipelineCfg)
|
||||
|
||||
var answerComplete pipeline.CompleteFunc
|
||||
if primaryURL := os.Getenv("BRAIN_LLM_PRIMARY_URL"); primaryURL != "" {
|
||||
primaryModel := envOr("BRAIN_LLM_PRIMARY_MODEL", "gemma4:31b")
|
||||
primaryKey := os.Getenv("BERGET_API_KEY")
|
||||
timeoutMS := envInt("BRAIN_LLM_TIMEOUT_MS", 10000)
|
||||
timeout := time.Duration(timeoutMS) * time.Millisecond
|
||||
|
||||
primary := llm.New(primaryURL, primaryKey, primaryModel, timeout)
|
||||
router := &llm.Router{Primary: primary}
|
||||
|
||||
if fallbackURL := os.Getenv("BRAIN_LLM_FALLBACK_URL"); fallbackURL != "" {
|
||||
fallbackModel := envOr("BRAIN_LLM_FALLBACK_MODEL", "gemma4:31b")
|
||||
router.Fallback = llm.New(fallbackURL, "", fallbackModel, timeout)
|
||||
}
|
||||
answerComplete = router.Complete
|
||||
logger.Info("brain answer LLM configured", "primary", primaryURL, "model", primaryModel)
|
||||
}
|
||||
|
||||
mcpSrv := mcp.NewServer(brainDir, &pipelineCfg, llmClient.Complete, answerComplete)
|
||||
|
||||
mcpToken := os.Getenv("BRAIN_MCP_TOKEN")
|
||||
if mcpToken == "" {
|
||||
logger.Error("BRAIN_MCP_TOKEN not set")
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
ctx := context.Background()
|
||||
if watchInterval > 0 {
|
||||
watcher.Start(ctx, watcher.Config{
|
||||
@@ -68,7 +96,28 @@ func main() {
|
||||
mux.HandleFunc("POST /write", h.Write)
|
||||
mux.HandleFunc("POST /ingest", h.Ingest)
|
||||
mux.HandleFunc("POST /ingest-path", h.IngestPath)
|
||||
mux.HandleFunc("POST /ingest-raw", h.IngestRaw)
|
||||
mux.HandleFunc("POST /backfill-refs", h.BackfillRefs)
|
||||
mux.HandleFunc("GET /pass-rate", h.PassRate)
|
||||
var jwtValidator *auth.Validator
|
||||
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
|
||||
audience := os.Getenv("MCP_AUDIENCE")
|
||||
v, err := auth.NewValidator(dexURL, audience)
|
||||
if err != nil {
|
||||
logger.Error("build jwt validator", "err", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
jwtValidator = v
|
||||
logger.Info("jwt auth enabled", "issuer", dexURL)
|
||||
}
|
||||
|
||||
mux.Handle("/mcp", mcp.BearerAuth(mcpToken, jwtValidator, mcpSrv))
|
||||
|
||||
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
|
||||
resourceURL := os.Getenv("MCP_RESOURCE_URL")
|
||||
mux.HandleFunc("GET /.well-known/oauth-protected-resource",
|
||||
auth.ProtectedResourceHandler(resourceURL, os.Getenv("DEX_ISSUER_URL")))
|
||||
}
|
||||
|
||||
addr := ":" + port
|
||||
watchIntervalLog := "disabled"
|
||||
@@ -82,6 +131,7 @@ func main() {
|
||||
"llm_model", llmModel,
|
||||
"chunk_size", chunkSize,
|
||||
"watch_interval", watchIntervalLog,
|
||||
"mcp_enabled", true,
|
||||
)
|
||||
if err := http.ListenAndServe(addr, mux); err != nil {
|
||||
logger.Error("server stopped", "err", err)
|
||||
|
||||
@@ -2,10 +2,23 @@ module github.com/mathiasbq/hyperguild/ingestion
|
||||
|
||||
go 1.26.1
|
||||
|
||||
require github.com/stretchr/testify v1.11.1
|
||||
require (
|
||||
github.com/lestrrat-go/jwx/v2 v2.1.6
|
||||
github.com/stretchr/testify v1.11.1
|
||||
)
|
||||
|
||||
require (
|
||||
github.com/davecgh/go-spew v1.1.1 // indirect
|
||||
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 // indirect
|
||||
github.com/goccy/go-json v0.10.3 // indirect
|
||||
github.com/lestrrat-go/blackmagic v1.0.3 // indirect
|
||||
github.com/lestrrat-go/httpcc v1.0.1 // indirect
|
||||
github.com/lestrrat-go/httprc v1.0.6 // indirect
|
||||
github.com/lestrrat-go/iter v1.0.2 // indirect
|
||||
github.com/lestrrat-go/option v1.0.1 // indirect
|
||||
github.com/pmezard/go-difflib v1.0.0 // indirect
|
||||
github.com/segmentio/asm v1.2.0 // indirect
|
||||
golang.org/x/crypto v0.32.0 // indirect
|
||||
golang.org/x/sys v0.31.0 // indirect
|
||||
gopkg.in/yaml.v3 v3.0.1 // indirect
|
||||
)
|
||||
|
||||
@@ -1,9 +1,37 @@
|
||||
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
|
||||
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
|
||||
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
|
||||
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 h1:NMZiJj8QnKe1LgsbDayM4UoHwbvwDRwnI3hwNaAHRnc=
|
||||
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
|
||||
github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA=
|
||||
github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
|
||||
github.com/lestrrat-go/blackmagic v1.0.3 h1:94HXkVLxkZO9vJI/w2u1T0DAoprShFd13xtnSINtDWs=
|
||||
github.com/lestrrat-go/blackmagic v1.0.3/go.mod h1:6AWFyKNNj0zEXQYfTMPfZrAXUWUfTIZ5ECEUEJaijtw=
|
||||
github.com/lestrrat-go/httpcc v1.0.1 h1:ydWCStUeJLkpYyjLDHihupbn2tYmZ7m22BGkcvZZrIE=
|
||||
github.com/lestrrat-go/httpcc v1.0.1/go.mod h1:qiltp3Mt56+55GPVCbTdM9MlqhvzyuL6W/NMDA8vA5E=
|
||||
github.com/lestrrat-go/httprc v1.0.6 h1:qgmgIRhpvBqexMJjA/PmwSvhNk679oqD1RbovdCGW8k=
|
||||
github.com/lestrrat-go/httprc v1.0.6/go.mod h1:mwwz3JMTPBjHUkkDv/IGJ39aALInZLrhBp0X7KGUZlo=
|
||||
github.com/lestrrat-go/iter v1.0.2 h1:gMXo1q4c2pHmC3dn8LzRhJfP1ceCbgSiT9lUydIzltI=
|
||||
github.com/lestrrat-go/iter v1.0.2/go.mod h1:Momfcq3AnRlRjI5b5O8/G5/BvpzrhoFTZcn06fEOPt4=
|
||||
github.com/lestrrat-go/jwx/v2 v2.1.6 h1:hxM1gfDILk/l5ylers6BX/Eq1m/pnxe9NBwW6lVfecA=
|
||||
github.com/lestrrat-go/jwx/v2 v2.1.6/go.mod h1:Y722kU5r/8mV7fYDifjug0r8FK8mZdw0K0GpJw/l8pU=
|
||||
github.com/lestrrat-go/option v1.0.1 h1:oAzP2fvZGQKWkvHa1/SAcFolBEca1oN+mQ7eooNBEYU=
|
||||
github.com/lestrrat-go/option v1.0.1/go.mod h1:5ZHFbivi4xwXxhxY9XHDe2FHo6/Z7WWmtT7T5nBBp3I=
|
||||
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
|
||||
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
|
||||
github.com/segmentio/asm v1.2.0 h1:9BQrFxC+YOHJlTlHGkTrFWf59nbL3XnCoFLTwDCI7ys=
|
||||
github.com/segmentio/asm v1.2.0/go.mod h1:BqMnlJP91P8d+4ibuonYZw9mfnzI9HfxselHZr5aAcs=
|
||||
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
|
||||
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
|
||||
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
|
||||
golang.org/x/crypto v0.32.0 h1:euUpcYgM8WcP71gNpTqQCn6rC2t6ULUPiOzfWaXVVfc=
|
||||
golang.org/x/crypto v0.32.0/go.mod h1:ZnnJkOaASj8g0AjIduWNlq2NRxL0PlBrbKVyZ6V/Ugc=
|
||||
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
|
||||
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
|
||||
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
|
||||
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
|
||||
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
|
||||
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
|
||||
@@ -85,6 +85,57 @@ func (h *Handler) Query(w http.ResponseWriter, r *http.Request) {
|
||||
writeJSON(w, map[string]any{"results": results})
|
||||
}
|
||||
|
||||
// WriteNote writes a markdown file to brainDir/knowledge/<filename>, optionally
|
||||
// prefixed with YAML frontmatter built from typ and domain. Returns the path
|
||||
// relative to brainDir (forward-slashed). Filename traversal is rejected.
|
||||
func WriteNote(brainDir, content, filename, typ, domain string) (string, error) {
|
||||
if content == "" {
|
||||
return "", fmt.Errorf("content is required")
|
||||
}
|
||||
if filename == "" {
|
||||
filename = fmt.Sprintf("%s-auto.md", time.Now().UTC().Format("2006-01-02-150405"))
|
||||
}
|
||||
|
||||
rawDir := filepath.Join(brainDir, "knowledge")
|
||||
if err := os.MkdirAll(rawDir, 0o755); err != nil {
|
||||
return "", fmt.Errorf("create raw dir: %w", err)
|
||||
}
|
||||
|
||||
finalContent := content
|
||||
if typ != "" || domain != "" {
|
||||
var fm strings.Builder
|
||||
fm.WriteString("---\n")
|
||||
if typ != "" {
|
||||
fmt.Fprintf(&fm, "type: %s\n", typ)
|
||||
}
|
||||
if domain != "" {
|
||||
fmt.Fprintf(&fm, "domain: %s\n", domain)
|
||||
}
|
||||
fm.WriteString("---\n")
|
||||
finalContent = fm.String() + content
|
||||
}
|
||||
|
||||
// Reject path separators outright; any non-flat filename is misuse.
|
||||
if strings.ContainsAny(filename, `/\`) {
|
||||
return "", fmt.Errorf("invalid filename")
|
||||
}
|
||||
base := filepath.Base(filename)
|
||||
// After Base, "." and ".." remain. Reject those before adding .md.
|
||||
if base == "." || base == ".." || base == "" {
|
||||
return "", fmt.Errorf("invalid filename")
|
||||
}
|
||||
if !strings.HasSuffix(base, ".md") {
|
||||
base += ".md"
|
||||
}
|
||||
dest := filepath.Join(rawDir, base)
|
||||
if err := os.WriteFile(dest, []byte(finalContent), 0o644); err != nil {
|
||||
return "", fmt.Errorf("write: %w", err)
|
||||
}
|
||||
|
||||
rel, _ := filepath.Rel(brainDir, dest)
|
||||
return filepath.ToSlash(rel), nil
|
||||
}
|
||||
|
||||
// Write handles POST /write — write raw content to brain/knowledge/.
|
||||
func (h *Handler) Write(w http.ResponseWriter, r *http.Request) {
|
||||
var req writeRequest
|
||||
@@ -92,53 +143,13 @@ func (h *Handler) Write(w http.ResponseWriter, r *http.Request) {
|
||||
writeError(w, http.StatusBadRequest, "invalid JSON")
|
||||
return
|
||||
}
|
||||
if req.Content == "" {
|
||||
writeError(w, http.StatusBadRequest, "content is required")
|
||||
return
|
||||
}
|
||||
|
||||
filename := req.Filename
|
||||
if filename == "" {
|
||||
filename = fmt.Sprintf("%s-auto.md", time.Now().UTC().Format("2006-01-02-150405"))
|
||||
}
|
||||
|
||||
rawDir := filepath.Join(h.brainDir, "knowledge")
|
||||
if err := os.MkdirAll(rawDir, 0o755); err != nil {
|
||||
writeError(w, http.StatusInternalServerError, "failed to create raw dir")
|
||||
return
|
||||
}
|
||||
|
||||
finalContent := req.Content
|
||||
if req.Type != "" || req.Domain != "" {
|
||||
var fm strings.Builder
|
||||
fm.WriteString("---\n")
|
||||
if req.Type != "" {
|
||||
fmt.Fprintf(&fm, "type: %s\n", req.Type)
|
||||
}
|
||||
if req.Domain != "" {
|
||||
fmt.Fprintf(&fm, "domain: %s\n", req.Domain)
|
||||
}
|
||||
fm.WriteString("---\n")
|
||||
finalContent = fm.String() + req.Content
|
||||
}
|
||||
|
||||
base := filepath.Base(filename)
|
||||
if !strings.HasSuffix(base, ".md") {
|
||||
base += ".md"
|
||||
}
|
||||
dest := filepath.Join(rawDir, base)
|
||||
if !strings.HasPrefix(filepath.Clean(dest)+string(os.PathSeparator), filepath.Clean(rawDir)+string(os.PathSeparator)) {
|
||||
writeError(w, http.StatusBadRequest, "invalid filename")
|
||||
return
|
||||
}
|
||||
if err := os.WriteFile(dest, []byte(finalContent), 0o644); err != nil {
|
||||
relPath, err := WriteNote(h.brainDir, req.Content, req.Filename, req.Type, req.Domain)
|
||||
if err != nil {
|
||||
h.logger.Error("write failed", "err", err)
|
||||
writeError(w, http.StatusInternalServerError, "write error")
|
||||
writeError(w, http.StatusBadRequest, err.Error())
|
||||
return
|
||||
}
|
||||
|
||||
rel, _ := filepath.Rel(h.brainDir, dest)
|
||||
writeJSON(w, map[string]string{"path": filepath.ToSlash(rel)})
|
||||
writeJSON(w, map[string]string{"path": relPath})
|
||||
}
|
||||
|
||||
// Ingest handles POST /ingest — run the pipeline on provided content.
|
||||
@@ -272,6 +283,48 @@ func (h *Handler) IngestPath(w http.ResponseWriter, r *http.Request) {
	writeJSON(w, ingestResponse{Pages: allPages, Warnings: allWarnings})
}

type ingestRawRequest struct {
	Source string             `json:"source"`
	Pages  []pipeline.RawPage `json:"pages"`
	DryRun bool               `json:"dry_run"`
}

// IngestRaw handles POST /ingest-raw: it runs the pipeline on pre-parsed RawPages,
// skipping the LLM extraction step. Use when the caller has already produced
// structured page data (e.g. from a more capable model or manual curation).
func (h *Handler) IngestRaw(w http.ResponseWriter, r *http.Request) {
	var req ingestRawRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		writeError(w, http.StatusBadRequest, "invalid JSON")
		return
	}
	if strings.TrimSpace(req.Source) == "" {
		writeError(w, http.StatusBadRequest, "source is required")
		return
	}
	if len(req.Pages) == 0 {
		writeError(w, http.StatusBadRequest, "pages is required and must be non-empty")
		return
	}

	result, err := pipeline.RunRaw(h.brainDir, req.Source, req.Pages, req.DryRun)
	if err != nil {
		h.logger.Error("ingest-raw failed", "source", req.Source, "err", err)
		writeError(w, http.StatusInternalServerError, "ingest error")
		return
	}

	pages := result.Pages
	if pages == nil {
		pages = []string{}
	}
	warnings := result.Warnings
	if warnings == nil {
		warnings = []string{}
	}
	writeJSON(w, ingestResponse{Pages: pages, Warnings: warnings})
}
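To illustrate the request shape that IngestRaw decodes, here is a minimal client-side sketch that builds the JSON body. The `rawPage` struct is a hypothetical stand-in for `pipeline.RawPage`: its fields are inferred from the JSON keys used in the tests of this change, not from the real type definition, and `buildIngestRawBody` is an illustrative helper that does not exist in the repository.

```go
package main

import (
	"encoding/json"
	"errors"
	"strings"
)

// rawPage is a hypothetical mirror of pipeline.RawPage. The field names are
// assumptions taken from the JSON keys seen in the test fixtures.
type rawPage struct {
	Title   string `json:"title"`
	Type    string `json:"type"`
	Subtype string `json:"subtype,omitempty"`
	Domain  string `json:"domain,omitempty"`
	Content string `json:"content"`
}

// ingestRawBody mirrors the JSON shape of ingestRawRequest.
type ingestRawBody struct {
	Source string    `json:"source"`
	Pages  []rawPage `json:"pages"`
	DryRun bool      `json:"dry_run"`
}

// buildIngestRawBody repeats the handler's two validation rules (non-blank
// source, at least one page) before encoding, so obvious mistakes fail
// locally instead of producing a 400 response.
func buildIngestRawBody(source string, pages []rawPage, dryRun bool) ([]byte, error) {
	if strings.TrimSpace(source) == "" {
		return nil, errors.New("source is required")
	}
	if len(pages) == 0 {
		return nil, errors.New("pages is required and must be non-empty")
	}
	return json.Marshal(ingestRawBody{Source: source, Pages: pages, DryRun: dryRun})
}
```

The resulting bytes can be sent as the body of a POST to /ingest-raw. Setting `dryRun` to true should return the page list without writing files, as the dry-run test in this change suggests.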

// BackfillRefs handles POST /backfill-refs: it injects source back-references
// into all concept and entity pages based on existing wiki/sources/ pages.
func (h *Handler) BackfillRefs(w http.ResponseWriter, r *http.Request) {

@@ -226,6 +226,85 @@ func TestIngestPath_File(t *testing.T) {
	assert.NotEmpty(t, pagesSlice)
}

// ---------------------------------------------------------------------------
// POST /ingest-raw
// ---------------------------------------------------------------------------

func TestIngestRaw_Validation(t *testing.T) {
	cases := []struct {
		name string
		body map[string]any
	}{
		{"missing source", map[string]any{"pages": []any{map[string]any{"title": "X", "type": "concept", "content": "x"}}}},
		{"missing pages", map[string]any{"source": "test-source"}},
		{"empty pages", map[string]any{"source": "test-source", "pages": []any{}}},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			_, h := setup(t)
			body, _ := json.Marshal(tc.body)
			req := httptest.NewRequest(http.MethodPost, "/ingest-raw", bytes.NewReader(body))
			rec := httptest.NewRecorder()

			h.IngestRaw(rec, req)

			assert.Equal(t, http.StatusBadRequest, rec.Code)
		})
	}
}

func TestIngestRaw_Success(t *testing.T) {
	dir, h := setup(t)
	body, _ := json.Marshal(map[string]any{
		"source": "test-article",
		"pages": []any{
			map[string]any{"title": "Test Article", "type": "source", "subtype": "article", "domain": "Testing", "content": "## Summary\n\nThis is a test article about [[Test Concept]].\n"},
			map[string]any{"title": "Test Concept", "type": "concept", "domain": "Testing", "content": "A concept for testing.\n"},
		},
	})
	req := httptest.NewRequest(http.MethodPost, "/ingest-raw", bytes.NewReader(body))
	rec := httptest.NewRecorder()

	h.IngestRaw(rec, req)

	require.Equal(t, http.StatusOK, rec.Code)
	var resp map[string]any
	require.NoError(t, json.Unmarshal(rec.Body.Bytes(), &resp))
	pages := resp["pages"].([]any)
	assert.Len(t, pages, 2)

	// Verify files were written
	sourcePath := filepath.Join(dir, "wiki", "sources", "test-article.md")
	assert.FileExists(t, sourcePath)
	conceptPath := filepath.Join(dir, "wiki", "concepts", "test-concept.md")
	assert.FileExists(t, conceptPath)
}

func TestIngestRaw_DryRun(t *testing.T) {
	dir, h := setup(t)
	body, _ := json.Marshal(map[string]any{
		"source": "dry-run-test",
		"pages": []any{
			map[string]any{"title": "Dry Run Source", "type": "source", "subtype": "article", "content": "Content."},
		},
		"dry_run": true,
	})
	req := httptest.NewRequest(http.MethodPost, "/ingest-raw", bytes.NewReader(body))
	rec := httptest.NewRecorder()

	h.IngestRaw(rec, req)

	require.Equal(t, http.StatusOK, rec.Code)
	var resp map[string]any
	require.NoError(t, json.Unmarshal(rec.Body.Bytes(), &resp))
	pages := resp["pages"].([]any)
	assert.NotEmpty(t, pages)

	// Verify no files were written
	sourcePath := filepath.Join(dir, "wiki", "sources", "dry-run-test.md")
	assert.NoFileExists(t, sourcePath)
}

func TestIngestPath_Directory(t *testing.T) {
	_, h := setup(t)

140
ingestion/internal/api/passrate.go
Normal file
@@ -0,0 +1,140 @@
package api

import (
	"encoding/json"
	"net/http"
	"os"
	"path/filepath"
	"strings"
	"time"
)

type passRateResponse struct {
	Skill    string   `json:"skill"`
	Window   string   `json:"window"`
	Pass     int      `json:"pass"`
	Fail     int      `json:"fail"`
	Skip     int      `json:"skip"`
	Total    int      `json:"total"`
	PassRate *float64 `json:"pass_rate"`
}

// PassRate handles GET /pass-rate?skill=X&window=Y.
// It walks brainDir/sessions/*.jsonl, filters records by skill name and
// timestamp, and returns the aggregated counts and the pass rate.
func (h *Handler) PassRate(w http.ResponseWriter, r *http.Request) {
	skill := r.URL.Query().Get("skill")
	if skill == "" {
		writeError(w, http.StatusBadRequest, "skill is required")
		return
	}

	windowStr := r.URL.Query().Get("window")
	if windowStr == "" {
		windowStr = "7d"
	}
	window, err := parseWindow(windowStr)
	if err != nil {
		writeError(w, http.StatusBadRequest, "invalid window: "+err.Error())
		return
	}

	cutoff := time.Now().UTC().Add(-window)
	pass, fail, skip := 0, 0, 0

	sessionsDir := filepath.Join(h.brainDir, "sessions")
	entries, err := os.ReadDir(sessionsDir)
	if err != nil && !os.IsNotExist(err) {
		writeError(w, http.StatusInternalServerError, "read sessions dir: "+err.Error())
		return
	}

	for _, entry := range entries {
		if entry.IsDir() || !strings.HasSuffix(entry.Name(), ".jsonl") {
			continue
		}
		body, err := os.ReadFile(filepath.Join(sessionsDir, entry.Name()))
		if err != nil {
			continue // skip unreadable files
		}
		for _, line := range strings.Split(string(body), "\n") {
			line = strings.TrimSpace(line)
			if line == "" {
				continue
			}
			var rec struct {
				Timestamp   string `json:"timestamp"`
				Skill       string `json:"skill"`
				FinalStatus string `json:"final_status"`
			}
			if err := json.Unmarshal([]byte(line), &rec); err != nil {
				continue // malformed line, skip
			}
			if rec.Skill != skill {
				continue
			}
			ts, err := time.Parse(time.RFC3339, rec.Timestamp)
			if err != nil {
				continue
			}
			if ts.Before(cutoff) {
				continue
			}
			switch normalizeStatus(rec.FinalStatus) {
			case "pass":
				pass++
			case "fail":
				fail++
			case "skip":
				skip++
			}
		}
	}

	total := pass + fail + skip
	resp := passRateResponse{
		Skill:  skill,
		Window: windowStr,
		Pass:   pass,
		Fail:   fail,
		Skip:   skip,
		Total:  total,
	}
	if pass+fail > 0 {
		rate := float64(pass) / float64(pass+fail)
		resp.PassRate = &rate
	}

	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(resp)
}
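The handler's rate rule is easy to miss inside the loop: skips count toward the total but not toward the denominator, and the rate stays null when nothing passed or failed. The following is a minimal sketch of that rule in isolation. The `tally` type and its methods are hypothetical names used only for illustration.

```go
package main

// tally holds normalized outcome counts for one skill.
type tally struct {
	Pass, Fail, Skip int
}

// add counts one status. Anything other than "pass" or "fail" is counted as
// a skip, matching how normalizeStatus treats unknown values.
func (t *tally) add(status string) {
	switch status {
	case "pass":
		t.Pass++
	case "fail":
		t.Fail++
	default:
		t.Skip++
	}
}

// rate returns pass/(pass+fail). It returns nil when no pass or fail was
// recorded, so a JSON encoder emits null rather than a misleading 0.
func (t tally) rate() *float64 {
	decided := t.Pass + t.Fail
	if decided == 0 {
		return nil
	}
	r := float64(t.Pass) / float64(decided)
	return &r
}
```

Using a pointer for the rate is one way to distinguish "no data" from "0% passing". A separate boolean field would work as well.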

// normalizeStatus maps both new (pass/fail/skip) and legacy (ok/error/skipped)
// vocabularies to the canonical pass/fail/skip set. Unknown values are treated
// as skip for safety.
func normalizeStatus(s string) string {
	switch s {
	case "pass", "ok":
		return "pass"
	case "fail", "error":
		return "fail"
	case "skip", "skipped":
		return "skip"
	default:
		return "skip"
	}
}

// parseWindow accepts Go-style durations plus "Nd" for days.
func parseWindow(s string) (time.Duration, error) {
	if strings.HasSuffix(s, "d") {
		// Parse the numeric part as hours, then multiply by 24 to get days.
		days := strings.TrimSuffix(s, "d")
		d, err := time.ParseDuration(days + "h")
		if err != nil {
			return 0, err
		}
		return d * 24, nil
	}
	return time.ParseDuration(s)
}
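parseWindow handles the "Nd" form by reusing time.ParseDuration on the number, so it also accepts negative values such as "-7d". As a hedged alternative, the sketch below parses the day count explicitly and rejects negative windows. The name `parseWindowStrict`, the whole-days-only rule, and the upper bound are all assumptions made for illustration, not behavior of the change.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// maxDays is the largest whole number of days that still fits in a
// time.Duration (about 292 years).
const maxDays = 106751

// parseWindowStrict accepts Go-style durations plus "Nd" for whole days and
// rejects negative windows. Unlike parseWindow, fractional days such as
// "1.5d" are not accepted here.
func parseWindowStrict(s string) (time.Duration, error) {
	if days, ok := strings.CutSuffix(s, "d"); ok {
		n, err := strconv.Atoi(days)
		if err != nil {
			return 0, fmt.Errorf("invalid day count %q: %w", days, err)
		}
		if n < 0 || n > maxDays {
			return 0, fmt.Errorf("day count out of range: %d", n)
		}
		return time.Duration(n) * 24 * time.Hour, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, err
	}
	if d < 0 {
		return 0, fmt.Errorf("window must not be negative: %q", s)
	}
	return d, nil
}
```

Whether a negative window should be an error is a design choice. With the original function a negative window moves the cutoff into the future, so every record is filtered out.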
172
ingestion/internal/api/passrate_test.go
Normal file
@@ -0,0 +1,172 @@
package api

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// writeSession writes one or more JSONL entries to <dir>/sessions/<sessionID>.jsonl.
// The handler scans <brainDir>/sessions/, so test fixtures must mirror that layout.
func writeSession(t *testing.T, dir, sessionID string, entries ...string) {
	t.Helper()
	sessionsDir := filepath.Join(dir, "sessions")
	require.NoError(t, os.MkdirAll(sessionsDir, 0o755))
	path := filepath.Join(sessionsDir, sessionID+".jsonl")
	body := ""
	for _, e := range entries {
		body += e + "\n"
	}
	require.NoError(t, os.WriteFile(path, []byte(body), 0o644))
}

func TestPassRate_HappyPath(t *testing.T) {
	dir := t.TempDir()
	now := time.Now().UTC()
	recent := now.Add(-1 * time.Hour).Format(time.RFC3339)

	writeSession(t, dir, "s1",
		`{"timestamp":"`+recent+`","skill":"tdd","phase":"red","final_status":"pass"}`,
		`{"timestamp":"`+recent+`","skill":"tdd","phase":"green","final_status":"pass"}`,
		`{"timestamp":"`+recent+`","skill":"tdd","phase":"refactor","final_status":"fail"}`,
	)
	writeSession(t, dir, "s2",
		`{"timestamp":"`+recent+`","skill":"code-review","phase":"review","final_status":"pass"}`,
	)

	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
	w := httptest.NewRecorder()
	h.PassRate(w, req)

	resp := w.Result()
	require.Equal(t, http.StatusOK, resp.StatusCode)

	var got passRateResponse
	require.NoError(t, json.NewDecoder(resp.Body).Decode(&got))
	assert.Equal(t, "tdd", got.Skill)
	assert.Equal(t, "24h", got.Window)
	assert.Equal(t, 2, got.Pass)
	assert.Equal(t, 1, got.Fail)
	assert.Equal(t, 0, got.Skip)
	assert.Equal(t, 3, got.Total)
	require.NotNil(t, got.PassRate)
	assert.InDelta(t, 0.6667, *got.PassRate, 0.001)
}

func TestPassRate_LegacyVocabulary(t *testing.T) {
	dir := t.TempDir()
	now := time.Now().UTC().Format(time.RFC3339)
	writeSession(t, dir, "s1",
		`{"timestamp":"`+now+`","skill":"tdd","final_status":"ok"}`,
		`{"timestamp":"`+now+`","skill":"tdd","final_status":"error"}`,
		`{"timestamp":"`+now+`","skill":"tdd","final_status":"skipped"}`,
	)

	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
	w := httptest.NewRecorder()
	h.PassRate(w, req)

	var got passRateResponse
	require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
	assert.Equal(t, 1, got.Pass, "ok→pass")
	assert.Equal(t, 1, got.Fail, "error→fail")
	assert.Equal(t, 1, got.Skip, "skipped→skip")
}

func TestPassRate_OutsideWindow_Excluded(t *testing.T) {
	dir := t.TempDir()
	old := time.Now().UTC().Add(-30 * 24 * time.Hour).Format(time.RFC3339)
	recent := time.Now().UTC().Add(-1 * time.Hour).Format(time.RFC3339)
	writeSession(t, dir, "s1",
		`{"timestamp":"`+old+`","skill":"tdd","final_status":"pass"}`,
		`{"timestamp":"`+recent+`","skill":"tdd","final_status":"pass"}`,
	)

	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
	w := httptest.NewRecorder()
	h.PassRate(w, req)

	var got passRateResponse
	require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
	assert.Equal(t, 1, got.Pass)
	assert.Equal(t, 1, got.Total)
}

func TestPassRate_NoData_ReturnsZerosAndNullRate(t *testing.T) {
	dir := t.TempDir()
	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
	w := httptest.NewRecorder()
	h.PassRate(w, req)

	var got passRateResponse
	require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
	assert.Equal(t, 0, got.Pass)
	assert.Equal(t, 0, got.Fail)
	assert.Equal(t, 0, got.Skip)
	assert.Equal(t, 0, got.Total)
	assert.Nil(t, got.PassRate, "pass_rate must be null when pass+fail == 0")
}

func TestPassRate_DefaultsTo7d(t *testing.T) {
	dir := t.TempDir()
	now := time.Now().UTC().Format(time.RFC3339)
	writeSession(t, dir, "s1", `{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`)

	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd", nil) // no window
	w := httptest.NewRecorder()
	h.PassRate(w, req)

	var got passRateResponse
	require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
	assert.Equal(t, "7d", got.Window)
	assert.Equal(t, 1, got.Pass)
}

func TestPassRate_MissingSkill_ReturnsBadRequest(t *testing.T) {
	dir := t.TempDir()
	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate", nil)
	w := httptest.NewRecorder()
	h.PassRate(w, req)
	assert.Equal(t, http.StatusBadRequest, w.Result().StatusCode)
}

func TestPassRate_BadWindow_ReturnsBadRequest(t *testing.T) {
	dir := t.TempDir()
	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=foo", nil)
	w := httptest.NewRecorder()
	h.PassRate(w, req)
	assert.Equal(t, http.StatusBadRequest, w.Result().StatusCode)
}

func TestPassRate_MalformedLine_Skipped(t *testing.T) {
	dir := t.TempDir()
	now := time.Now().UTC().Format(time.RFC3339)
	writeSession(t, dir, "s1",
		`{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`,
		`not valid json`,
		`{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`,
	)

	h := &Handler{brainDir: dir}
	req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
	w := httptest.NewRecorder()
	h.PassRate(w, req)

	var got passRateResponse
	require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
	assert.Equal(t, 2, got.Pass, "the malformed line is silently skipped")
}
84
ingestion/internal/auth/jwt.go
Normal file
@@ -0,0 +1,84 @@
package auth

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"

	"github.com/lestrrat-go/jwx/v2/jwk"
	"github.com/lestrrat-go/jwx/v2/jwt"
)

// Validator validates Bearer JWTs issued by a Dex (OIDC) authorization server.
// Audience is optional; leave empty to skip audience validation.
type Validator struct {
	issuer   string
	audience string
	jwksURI  string
	cache    *jwk.Cache
}

// NewValidator fetches the OIDC discovery document from issuerURL, extracts
// jwks_uri, seeds the JWKS cache, and returns a ready Validator.
// If DEX_ISSUER_URL is not set, the caller should skip construction rather
// than call NewValidator with an empty issuerURL.
func NewValidator(issuerURL, audience string) (*Validator, error) {
	resp, err := http.Get(issuerURL + "/.well-known/openid-configuration") //nolint:noctx
	if err != nil {
		return nil, fmt.Errorf("fetch oidc discovery: %w", err)
	}
	defer resp.Body.Close() //nolint:errcheck
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("oidc discovery: status %d", resp.StatusCode)
	}

	var doc struct {
		JWKSURI string `json:"jwks_uri"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
		return nil, fmt.Errorf("decode oidc discovery: %w", err)
	}
	if doc.JWKSURI == "" {
		return nil, fmt.Errorf("oidc discovery: empty jwks_uri")
	}

	ctx := context.Background()
	cache := jwk.NewCache(ctx)
	if err := cache.Register(doc.JWKSURI, jwk.WithMinRefreshInterval(time.Hour)); err != nil {
		return nil, fmt.Errorf("register jwks cache: %w", err)
	}
	if _, err := cache.Refresh(ctx, doc.JWKSURI); err != nil {
		return nil, fmt.Errorf("initial jwks fetch: %w", err)
	}

	return &Validator{
		issuer:   issuerURL,
		audience: audience,
		jwksURI:  doc.JWKSURI,
		cache:    cache,
	}, nil
}
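NewValidator builds the discovery URL by plain concatenation and fetches it with http.Get, which has no request-scoped timeout. The sketch below shows one possible variation using only the standard library: it tolerates a trailing slash on the issuer when building the URL and bounds the fetch with a context timeout. `discoveryURL`, `fetchJWKSURI`, and the `timeout` parameter are hypothetical and not part of the change.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// discoveryURL joins the issuer and the well-known path without producing
// "//" when the issuer is configured with a trailing slash.
func discoveryURL(issuer string) string {
	return strings.TrimRight(issuer, "/") + "/.well-known/openid-configuration"
}

// fetchJWKSURI fetches the discovery document with a bounded timeout and
// returns its jwks_uri field.
func fetchJWKSURI(ctx context.Context, issuer string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, discoveryURL(issuer), nil)
	if err != nil {
		return "", fmt.Errorf("build discovery request: %w", err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", fmt.Errorf("fetch oidc discovery: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("oidc discovery: status %d", resp.StatusCode)
	}

	var doc struct {
		JWKSURI string `json:"jwks_uri"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
		return "", fmt.Errorf("decode oidc discovery: %w", err)
	}
	if doc.JWKSURI == "" {
		return "", fmt.Errorf("oidc discovery: empty jwks_uri")
	}
	return doc.JWKSURI, nil
}
```

The trimming applies only to URL construction. The issuer value compared against the token's `iss` claim should stay exactly as configured, since that comparison is normally an exact string match.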

// Validate parses and validates rawToken. Returns the subject claim on success.
func (v *Validator) Validate(ctx context.Context, rawToken string) (string, error) {
	keySet, err := v.cache.Get(ctx, v.jwksURI)
	if err != nil {
		return "", fmt.Errorf("get jwks: %w", err)
	}

	opts := []jwt.ParseOption{
		jwt.WithKeySet(keySet),
		jwt.WithValidate(true),
		jwt.WithIssuer(v.issuer),
	}
	if v.audience != "" {
		opts = append(opts, jwt.WithAudience(v.audience))
	}

	tok, err := jwt.ParseString(rawToken, opts...)
	if err != nil {
		return "", fmt.Errorf("validate jwt: %w", err)
	}
	return tok.Subject(), nil
}
169
ingestion/internal/auth/jwt_test.go
Normal file
@@ -0,0 +1,169 @@
package auth_test

import (
	"context"
	"crypto/rand"
	"crypto/rsa"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"
	"time"

	"github.com/lestrrat-go/jwx/v2/jwa"
	"github.com/lestrrat-go/jwx/v2/jwk"
	"github.com/lestrrat-go/jwx/v2/jwt"
	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

type testKeys struct {
	priv jwk.Key
	pub  jwk.Key
}

func generateRSAKeys(t *testing.T) testKeys {
	t.Helper()
	raw, err := rsa.GenerateKey(rand.Reader, 2048)
	require.NoError(t, err)

	priv, err := jwk.FromRaw(raw)
	require.NoError(t, err)
	require.NoError(t, priv.Set(jwk.KeyIDKey, "test-kid"))
	require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))

	pub, err := jwk.PublicKeyOf(priv)
	require.NoError(t, err)

	return testKeys{priv: priv, pub: pub}
}

func mockOIDCServer(t *testing.T, keys testKeys) *httptest.Server {
	t.Helper()
	set := jwk.NewSet()
	require.NoError(t, set.AddKey(keys.pub))
	jwksBytes, err := json.Marshal(set)
	require.NoError(t, err)

	mux := http.NewServeMux()
	var srv *httptest.Server
	mux.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]string{
			"issuer":   srv.URL,
			"jwks_uri": srv.URL + "/jwks",
		})
	})
	mux.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write(jwksBytes)
	})
	srv = httptest.NewServer(mux)
	t.Cleanup(srv.Close)
	return srv
}

func signToken(t *testing.T, keys testKeys, issuer, audience, subject string, exp time.Time) string {
	t.Helper()
	b := jwt.NewBuilder().
		Issuer(issuer).
		Subject(subject).
		Expiration(exp)
	if audience != "" {
		b = b.Audience([]string{audience})
	}
	tok, err := b.Build()
	require.NoError(t, err)
	signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
	require.NoError(t, err)
	return string(signed)
}

func TestValidator(t *testing.T) {
	keys := generateRSAKeys(t)
	srv := mockOIDCServer(t, keys)
	ctx := context.Background()

	v, err := auth.NewValidator(srv.URL, "brain")
	require.NoError(t, err)

	tests := []struct {
		name    string
		token   string
		wantSub string
		wantErr bool
	}{
		{
			name:    "valid jwt",
			token:   signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)),
			wantSub: "test-user",
		},
		{
			name:    "expired jwt",
			token:   signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(-time.Hour)),
			wantErr: true,
		},
		{
			name:    "wrong issuer",
			token:   signToken(t, keys, "https://evil.example.com", "brain", "test-user", time.Now().Add(time.Hour)),
			wantErr: true,
		},
		{
			name:    "wrong audience",
			token:   signToken(t, keys, srv.URL, "other-service", "test-user", time.Now().Add(time.Hour)),
			wantErr: true,
		},
		{
			name:    "tampered token",
			token:   signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)) + "tampered",
			wantErr: true,
		},
		{
			name:    "not a jwt",
			token:   "not-a-jwt",
			wantErr: true,
		},
	}

	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			sub, err := v.Validate(ctx, tc.token)
			if tc.wantErr {
				assert.Error(t, err)
				assert.Empty(t, sub)
			} else {
				require.NoError(t, err)
				assert.Equal(t, tc.wantSub, sub)
			}
		})
	}
}

func TestNewValidator_NoAudience(t *testing.T) {
	keys := generateRSAKeys(t)
	srv := mockOIDCServer(t, keys)
	ctx := context.Background()

	v, err := auth.NewValidator(srv.URL, "")
	require.NoError(t, err)

	// Token without audience passes when audience validation is disabled.
	tok, err := jwt.NewBuilder().
		Issuer(srv.URL).
		Subject("sub").
		Expiration(time.Now().Add(time.Hour)).
		Build()
	require.NoError(t, err)
	signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
	require.NoError(t, err)

	sub, err := v.Validate(ctx, string(signed))
	require.NoError(t, err)
	assert.Equal(t, "sub", sub)
}

func TestNewValidator_BadDiscoveryURL(t *testing.T) {
	_, err := auth.NewValidator("http://127.0.0.1:1", "brain")
	assert.Error(t, err)
}
23
ingestion/internal/auth/protected_resource.go
Normal file
@@ -0,0 +1,23 @@
package auth

import (
	"encoding/json"
	"net/http"
)

// ProtectedResourceHandler returns an RFC 9728 oauth-protected-resource metadata
// handler. Mount at GET /.well-known/oauth-protected-resource (no auth required).
func ProtectedResourceHandler(resourceURL, issuerURL string) http.HandlerFunc {
	type metadata struct {
		Resource             string   `json:"resource"`
		AuthorizationServers []string `json:"authorization_servers"`
	}
	body, _ := json.Marshal(metadata{
		Resource:             resourceURL,
		AuthorizationServers: []string{issuerURL},
	})
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write(body)
	}
}
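The doc comment says the metadata handler should be mounted without authentication. As a rough sketch of what that wiring might look like, the code below registers a local copy of the handler on an `http.ServeMux` and probes it in memory. `protectedResource` stands in for `auth.ProtectedResourceHandler` so the example compiles on its own; `newMux`, `probe`, and the example.com URLs are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
)

// protectedResource is a local stand-in for auth.ProtectedResourceHandler.
func protectedResource(resourceURL, issuerURL string) http.HandlerFunc {
	type metadata struct {
		Resource             string   `json:"resource"`
		AuthorizationServers []string `json:"authorization_servers"`
	}
	// The body is marshalled once, since it never changes per request.
	body, _ := json.Marshal(metadata{
		Resource:             resourceURL,
		AuthorizationServers: []string{issuerURL},
	})
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write(body)
	}
}

// newMux mounts the metadata endpoint directly on the mux, outside any
// authentication middleware, so clients can discover the issuer first.
func newMux(resourceURL, issuerURL string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/.well-known/oauth-protected-resource", protectedResource(resourceURL, issuerURL))
	return mux
}

// probe sends one in-memory request and returns the status code and body.
func probe(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}
```

In a real server the authenticated routes would be registered on the same mux, wrapped in the bearer middleware, while this path stays unwrapped.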
28
ingestion/internal/auth/protected_resource_test.go
Normal file
@@ -0,0 +1,28 @@
package auth_test

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"

	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestProtectedResourceHandler(t *testing.T) {
	h := auth.ProtectedResourceHandler("https://brain-mcp.d-ma.be", "https://auth.d-ma.be")
	req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-protected-resource", nil)
	rr := httptest.NewRecorder()
	h(rr, req)

	assert.Equal(t, http.StatusOK, rr.Code)
	assert.Equal(t, "application/json", rr.Header().Get("Content-Type"))

	var body map[string]any
	require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
	assert.Equal(t, "https://brain-mcp.d-ma.be", body["resource"])
	servers := body["authorization_servers"].([]any)
	assert.Equal(t, "https://auth.d-ma.be", servers[0])
}
29
ingestion/internal/llm/router.go
Normal file
@@ -0,0 +1,29 @@
package llm

import (
	"context"
	"fmt"
)

// Router calls Primary first; on any error falls back to Fallback.
// Fallback may be nil, in which case primary errors are returned directly.
type Router struct {
	Primary  *Client
	Fallback *Client
}

// Complete implements pipeline.CompleteFunc, routing through Primary then Fallback.
func (r *Router) Complete(ctx context.Context, system, user string) (string, error) {
	out, err := r.Primary.Complete(ctx, system, user)
	if err == nil {
		return out, nil
	}
	if r.Fallback == nil {
		return "", fmt.Errorf("primary llm: %w", err)
	}
	out, err2 := r.Fallback.Complete(ctx, system, user)
	if err2 != nil {
		return "", fmt.Errorf("primary llm: %w; fallback llm: %v", err, err2)
	}
	return out, nil
}
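The primary-then-fallback pattern in Router can also be expressed over plain function values, which makes it testable without HTTP servers. The sketch below is one such variation, not the repository's implementation. It differs from Router in two ways that are assumptions of this example: it skips the fallback when the caller's context is already done, and it joins both errors with `errors.Join` so either can be matched with `errors.Is`. The names `completeFunc`, `withFallback`, `stub`, and `run` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// completeFunc has the same shape as Router.Complete.
type completeFunc func(ctx context.Context, system, user string) (string, error)

// Sentinel errors used by the stubs below.
var (
	errPrimary  = errors.New("primary down")
	errFallback = errors.New("fallback down")
)

// withFallback tries primary first and fallback second. A nil fallback means
// primary errors are returned directly.
func withFallback(primary, fallback completeFunc) completeFunc {
	return func(ctx context.Context, system, user string) (string, error) {
		out, err := primary(ctx, system, user)
		if err == nil {
			return out, nil
		}
		// Skip the fallback when there is none, or when the caller has
		// already cancelled or timed out.
		if fallback == nil || ctx.Err() != nil {
			return "", fmt.Errorf("primary llm: %w", err)
		}
		out, err2 := fallback(ctx, system, user)
		if err2 != nil {
			return "", errors.Join(
				fmt.Errorf("primary llm: %w", err),
				fmt.Errorf("fallback llm: %w", err2),
			)
		}
		return out, nil
	}
}

// stub returns a completeFunc that always answers with out and err.
func stub(out string, err error) completeFunc {
	return func(context.Context, string, string) (string, error) {
		return out, err
	}
}

// run invokes f with a background context and fixed prompts.
func run(f completeFunc) (string, error) {
	return f(context.Background(), "sys", "user")
}
```

For example, `run(withFallback(stub("", errPrimary), stub("from-fallback", nil)))` yields the fallback's answer, while two failing stubs yield a joined error that wraps both sentinels.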
71
ingestion/internal/llm/router_test.go
Normal file
@@ -0,0 +1,71 @@
package llm

import (
	"context"
	"net/http"
	"net/http/httptest"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestRouter_PrimarySucceeds(t *testing.T) {
	primary := mockServer(t, "from-primary")
	defer primary.Close()
	fallback := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		t.Error("fallback must not be called when primary succeeds")
	}))
	defer fallback.Close()

	r := &Router{
		Primary:  New(primary.URL, "", "m", time.Second),
		Fallback: New(fallback.URL, "", "m", time.Second),
	}
	out, err := r.Complete(context.Background(), "sys", "user")
	require.NoError(t, err)
	assert.Equal(t, "from-primary", out)
}

func TestRouter_FallsBackOnPrimaryError(t *testing.T) {
	primary := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "unavailable", http.StatusServiceUnavailable)
	}))
	defer primary.Close()
	fallback := mockServer(t, "from-fallback")
	defer fallback.Close()

	r := &Router{
		Primary:  New(primary.URL, "", "m", time.Second),
		Fallback: New(fallback.URL, "", "m", time.Second),
	}
	out, err := r.Complete(context.Background(), "sys", "user")
	require.NoError(t, err)
	assert.Equal(t, "from-fallback", out)
}

func TestRouter_BothFail(t *testing.T) {
	fail := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "err", http.StatusBadGateway)
	}))
	defer fail.Close()

	r := &Router{
		Primary:  New(fail.URL, "", "m", time.Second),
		Fallback: New(fail.URL, "", "m", time.Second),
	}
	_, err := r.Complete(context.Background(), "sys", "user")
	assert.Error(t, err)
}

func TestRouter_NilFallback(t *testing.T) {
	fail := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "err", http.StatusBadGateway)
	}))
	defer fail.Close()

	r := &Router{Primary: New(fail.URL, "", "m", time.Second)}
	_, err := r.Complete(context.Background(), "sys", "user")
	assert.Error(t, err)
}
36
ingestion/internal/mcp/auth.go
Normal file
@@ -0,0 +1,36 @@
package mcp

import (
	"crypto/subtle"
	"net/http"
	"strings"

	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
)

// BearerAuth returns a middleware that enforces authentication on every request.
// It first tries to validate the bearer value as a Dex JWT (when v is non-nil),
// then falls back to comparing it against the static token. A request is
// rejected when no valid JWT is presented and the static token is empty or
// does not match.
func BearerAuth(token string, v *auth.Validator, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		rawToken, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		if !ok {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}

		if v != nil {
			if _, err := v.Validate(r.Context(), rawToken); err == nil {
				next.ServeHTTP(w, r)
				return
			}
		}

		if token != "" && subtle.ConstantTimeCompare([]byte(rawToken), []byte(token)) == 1 {
			next.ServeHTTP(w, r)
			return
		}

		http.Error(w, "unauthorized", http.StatusUnauthorized)
	})
}
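The static-token branch of BearerAuth rests on two small steps: extracting the credential from the Authorization header and comparing it in constant time. The sketch below isolates those steps as pure functions so they can be reasoned about separately. `bearerToken` and `staticTokenOK` are hypothetical helpers; unlike the middleware, `bearerToken` also rejects an empty credential up front, which is an assumption of this example.

```go
package main

import (
	"crypto/subtle"
	"strings"
)

// bearerToken extracts the credential from an Authorization header value.
// It reports false when the value does not start with "Bearer " or when the
// credential after the prefix is empty.
func bearerToken(header string) (string, bool) {
	tok, ok := strings.CutPrefix(header, "Bearer ")
	if !ok || tok == "" {
		return "", false
	}
	return tok, true
}

// staticTokenOK compares presented against configured in constant time.
// An empty configured token never matches, so a missing configuration cannot
// authorize a request by accident.
func staticTokenOK(presented, configured string) bool {
	if configured == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(presented), []byte(configured)) == 1
}
```

Note that `subtle.ConstantTimeCompare` returns immediately when the two lengths differ, so the token length is not hidden. If that matters, one option is to compare fixed-size hashes of both values.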
161
ingestion/internal/mcp/auth_test.go
Normal file
@@ -0,0 +1,161 @@
package mcp_test

import (
	"context"
	"crypto/rand"
	"crypto/rsa"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"
	"time"

	"github.com/lestrrat-go/jwx/v2/jwa"
	"github.com/lestrrat-go/jwx/v2/jwk"
	"github.com/lestrrat-go/jwx/v2/jwt"
	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
	"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
}

func TestBearerAuth_MissingHeader(t *testing.T) {
	handler := mcp.BearerAuth("secret", nil, okHandler())
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	rr := httptest.NewRecorder()
	handler.ServeHTTP(rr, req)
	assert.Equal(t, http.StatusUnauthorized, rr.Code)
}

func TestBearerAuth_WrongToken(t *testing.T) {
	handler := mcp.BearerAuth("secret", nil, okHandler())
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	req.Header.Set("Authorization", "Bearer wrong")
	rr := httptest.NewRecorder()
	handler.ServeHTTP(rr, req)
	assert.Equal(t, http.StatusUnauthorized, rr.Code)
}

func TestBearerAuth_CorrectToken(t *testing.T) {
	called := false
	handler := mcp.BearerAuth("secret", nil, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		called = true
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	req.Header.Set("Authorization", "Bearer secret")
	rr := httptest.NewRecorder()
	handler.ServeHTTP(rr, req)
	assert.Equal(t, http.StatusOK, rr.Code)
	assert.True(t, called)
}

func TestBearerAuth_EmptyConfiguredToken(t *testing.T) {
	handler := mcp.BearerAuth("", nil, okHandler())
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	rr := httptest.NewRecorder()
	handler.ServeHTTP(rr, req)
	assert.Equal(t, http.StatusUnauthorized, rr.Code)
}

// JWT auth tests

func buildOIDCServer(t *testing.T) (*httptest.Server, jwk.Key) {
	t.Helper()
	raw, err := rsa.GenerateKey(rand.Reader, 2048)
|
||||
require.NoError(t, err)
|
||||
priv, err := jwk.FromRaw(raw)
|
||||
require.NoError(t, err)
|
||||
require.NoError(t, priv.Set(jwk.KeyIDKey, "k1"))
|
||||
require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
|
||||
pub, err := jwk.PublicKeyOf(priv)
|
||||
require.NoError(t, err)
|
||||
|
||||
set := jwk.NewSet()
|
||||
require.NoError(t, set.AddKey(pub))
|
||||
jwksBytes, err := json.Marshal(set)
|
||||
require.NoError(t, err)
|
||||
|
||||
muxSrv := http.NewServeMux()
|
||||
var srv *httptest.Server
|
||||
muxSrv.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
|
||||
_ = json.NewEncoder(w).Encode(map[string]string{
|
||||
"issuer": srv.URL,
|
||||
"jwks_uri": srv.URL + "/jwks",
|
||||
})
|
||||
})
|
||||
muxSrv.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
|
||||
_, _ = w.Write(jwksBytes)
|
||||
})
|
||||
srv = httptest.NewServer(muxSrv)
|
||||
t.Cleanup(srv.Close)
|
||||
return srv, priv
|
||||
}
|
||||
|
||||
func signJWT(t *testing.T, priv jwk.Key, issuer, audience string, exp time.Time) string {
|
||||
t.Helper()
|
||||
tok, err := jwt.NewBuilder().
|
||||
Issuer(issuer).Audience([]string{audience}).
|
||||
Subject("s").Expiration(exp).
|
||||
Build()
|
||||
require.NoError(t, err)
|
||||
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, priv))
|
||||
require.NoError(t, err)
|
||||
return string(signed)
|
||||
}
|
||||
|
||||
func TestBearerAuth_ValidJWT(t *testing.T) {
|
||||
oidcSrv, priv := buildOIDCServer(t)
|
||||
v, err := auth.NewValidator(oidcSrv.URL, "brain")
|
||||
require.NoError(t, err)
|
||||
|
||||
called := false
|
||||
handler := mcp.BearerAuth("static-secret", v, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
called = true
|
||||
w.WriteHeader(http.StatusOK)
|
||||
}))
|
||||
|
||||
token := signJWT(t, priv, oidcSrv.URL, "brain", time.Now().Add(time.Hour))
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
|
||||
req.Header.Set("Authorization", "Bearer "+token)
|
||||
rr := httptest.NewRecorder()
|
||||
handler.ServeHTTP(rr, req)
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
assert.True(t, called)
|
||||
}
|
||||
|
||||
func TestBearerAuth_InvalidJWT_FallsBackToStaticToken(t *testing.T) {
|
||||
oidcSrv, _ := buildOIDCServer(t)
|
||||
v, err := auth.NewValidator(oidcSrv.URL, "brain")
|
||||
require.NoError(t, err)
|
||||
|
||||
handler := mcp.BearerAuth("static-secret", v, okHandler())
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
|
||||
req.Header.Set("Authorization", "Bearer static-secret")
|
||||
rr := httptest.NewRecorder()
|
||||
handler.ServeHTTP(rr, req)
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
}
|
||||
|
||||
func TestBearerAuth_InvalidJWT_WrongStaticToken(t *testing.T) {
|
||||
oidcSrv, priv := buildOIDCServer(t)
|
||||
v, err := auth.NewValidator(oidcSrv.URL, "brain")
|
||||
require.NoError(t, err)
|
||||
|
||||
handler := mcp.BearerAuth("static-secret", v, okHandler())
|
||||
// Expired JWT — JWT fails, static token doesn't match either
|
||||
token := signJWT(t, priv, oidcSrv.URL, "brain", time.Now().Add(-time.Hour))
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
|
||||
req.Header.Set("Authorization", "Bearer "+token)
|
||||
|
||||
_ = context.Background() // satisfies import
|
||||
rr := httptest.NewRecorder()
|
||||
handler.ServeHTTP(rr, req)
|
||||
assert.Equal(t, http.StatusUnauthorized, rr.Code)
|
||||
}
|
||||
270
ingestion/internal/mcp/handlers.go
Normal file
@@ -0,0 +1,270 @@
|
||||
package mcp
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/api"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/extract"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/session"
|
||||
)
|
||||
|
||||
// tools returns the MCP tool descriptors advertised by tools/list. Calls to
// these tools are dispatched by handleCall in server.go.
|
||||
func (s *Server) tools() []map[string]any {
|
||||
str := func(desc string) map[string]any {
|
||||
return map[string]any{"type": "string", "description": desc}
|
||||
}
|
||||
int_ := func(desc string) map[string]any {
|
||||
return map[string]any{"type": "integer", "description": desc}
|
||||
}
|
||||
schema := func(required []string, props map[string]any) json.RawMessage {
|
||||
b, _ := json.Marshal(map[string]any{
|
||||
"type": "object", "required": required, "properties": props,
|
||||
})
|
||||
return b
|
||||
}
|
||||
|
||||
return []map[string]any{
|
||||
{
|
||||
"name": "brain_query",
|
||||
"description": "BM25 full-text search across brain/knowledge/ and brain/wiki/ markdown files.",
|
||||
"inputSchema": schema([]string{"query"}, map[string]any{
|
||||
"query": str("search terms"),
|
||||
"limit": int_("max results, default 5"),
|
||||
}),
|
||||
},
|
||||
{
|
||||
"name": "brain_write",
|
||||
"description": "Write a raw knowledge note to brain/knowledge/.",
|
||||
"inputSchema": schema([]string{"content"}, map[string]any{
|
||||
"content": str("markdown content"),
|
||||
"filename": str("optional filename"),
|
||||
"type": str("optional frontmatter type"),
|
||||
"domain": str("optional frontmatter domain"),
|
||||
}),
|
||||
},
|
||||
{
|
||||
"name": "brain_ingest_raw",
|
||||
"description": "Ingest pre-structured pages into the brain wiki, bypassing the LLM extraction step.",
|
||||
"inputSchema": schema([]string{"source", "pages"}, map[string]any{
|
||||
"source": str("source name"),
|
||||
"pages": map[string]any{"type": "array"},
|
||||
"dry_run": map[string]any{"type": "boolean"},
|
||||
}),
|
||||
},
|
||||
{
|
||||
"name": "brain_ingest",
|
||||
"description": "Ingest content into the brain wiki via the LLM extraction pipeline.",
|
||||
"inputSchema": schema([]string{}, map[string]any{
|
||||
"content": str("raw content; required when path is empty"),
|
||||
"source": str("source name; required when path is empty"),
|
||||
"path": str("file path; mutually exclusive with content+source"),
|
||||
"dry_run": map[string]any{"type": "boolean"},
|
||||
}),
|
||||
},
|
||||
{
|
||||
"name": "brain_answer",
|
||||
"description": "Retrieve relevant brain content via BM25 and synthesize a coherent answer using an LLM.",
|
||||
"inputSchema": schema([]string{"query"}, map[string]any{
|
||||
"query": str("question to answer"),
|
||||
}),
|
||||
},
|
||||
{
|
||||
"name": "brain_classify",
|
||||
"description": "Classify raw text into doc type, title, and tags using an LLM.",
|
||||
"inputSchema": schema([]string{"text"}, map[string]any{
|
||||
"text": str("raw document text to classify (first 3000 chars used)"),
|
||||
}),
|
||||
},
|
||||
{
|
||||
"name": "session_log",
|
||||
"description": "Append a structured entry to brain/sessions/<session_id>.jsonl.",
|
||||
"inputSchema": schema([]string{"session_id"}, map[string]any{
|
||||
"session_id": str("session identifier"),
|
||||
"skill": str("skill name"),
|
||||
"phase": str("phase within the skill"),
|
||||
"project_root": str("absolute project root"),
|
||||
"final_status": str("pass | fail | skip (legacy: ok | error | skipped also accepted)"),
|
||||
"file_path": str("optional file produced"),
|
||||
"model_used": str("optional model identifier"),
|
||||
"duration_ms": int_("optional duration in ms"),
|
||||
"message": str("optional free-text"),
|
||||
}),
|
||||
},
|
||||
}
|
||||
}
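To see what the schema closure in tools() emits, here is a small sketch under stated assumptions: `objectSchema` and `schemaString` are hypothetical names, and the example only reproduces the brain_query descriptor's input schema. It also shows why the descriptors pass `[]string{}` rather than nil when nothing is required.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectSchema mirrors the schema closure in tools(): it wraps the required
// list and the property map in a JSON Schema object.
func objectSchema(required []string, props map[string]any) (json.RawMessage, error) {
	return json.Marshal(map[string]any{
		"type": "object", "required": required, "properties": props,
	})
}

// schemaString renders a schema as text; marshalling errors are folded into
// the returned string to keep the sketch short.
func schemaString(required []string, props map[string]any) string {
	b, err := objectSchema(required, props)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// encoding/json sorts map keys, so the output order is deterministic.
	fmt.Println(schemaString([]string{"query"}, map[string]any{
		"query": map[string]any{"type": "string", "description": "search terms"},
		"limit": map[string]any{"type": "integer", "description": "max results, default 5"},
	}))
	// An empty slice marshals as [], a nil slice as null.
	fmt.Println(schemaString([]string{}, map[string]any{}))
	fmt.Println(schemaString(nil, map[string]any{}))
}
```

A `"required": null` entry may be rejected by stricter JSON Schema consumers, so the empty slice is the safer choice for tools such as brain_ingest that have no unconditional requirements.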
|
||||
|
||||
type brainQueryArgs struct {
|
||||
Query string `json:"query"`
|
||||
Limit int `json:"limit,omitempty"`
|
||||
}
|
||||
|
||||
func (s *Server) brainQuery(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
var a brainQueryArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if a.Query == "" {
|
||||
return nil, fmt.Errorf("query is required")
|
||||
}
|
||||
if a.Limit <= 0 { // treat missing or non-positive limits as "use the default"
|
||||
a.Limit = 5
|
||||
}
|
||||
results, err := search.Query(s.brainDir, a.Query, a.Limit)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("search: %w", err)
|
||||
}
|
||||
return json.Marshal(map[string]any{"results": results})
|
||||
}
|
||||
|
||||
type brainWriteArgs struct {
|
||||
Content string `json:"content"`
|
||||
Filename string `json:"filename,omitempty"`
|
||||
Type string `json:"type,omitempty"`
|
||||
Domain string `json:"domain,omitempty"`
|
||||
}
|
||||
|
||||
func (s *Server) brainWrite(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
var a brainWriteArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
relPath, err := api.WriteNote(s.brainDir, a.Content, a.Filename, a.Type, a.Domain)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return json.Marshal(map[string]string{"path": relPath})
|
||||
}
|
||||
|
||||
type brainIngestRawArgs struct {
|
||||
Source string `json:"source"`
|
||||
Pages []pipeline.RawPage `json:"pages"`
|
||||
DryRun bool `json:"dry_run,omitempty"`
|
||||
}
|
||||
|
||||
func (s *Server) brainIngestRaw(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
var a brainIngestRawArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if a.Source == "" {
|
||||
return nil, fmt.Errorf("source is required")
|
||||
}
|
||||
if len(a.Pages) == 0 {
|
||||
return nil, fmt.Errorf("pages must be non-empty")
|
||||
}
|
||||
result, err := pipeline.RunRaw(s.brainDir, a.Source, a.Pages, a.DryRun)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("ingest: %w", err)
|
||||
}
|
||||
pages := result.Pages
|
||||
if pages == nil {
|
||||
pages = []string{}
|
||||
}
|
||||
warnings := result.Warnings
|
||||
if warnings == nil {
|
||||
warnings = []string{}
|
||||
}
|
||||
return json.Marshal(map[string]any{"pages": pages, "warnings": warnings})
|
||||
}
|
||||
|
||||
type brainIngestArgs struct {
|
||||
Content string `json:"content,omitempty"`
|
||||
Source string `json:"source,omitempty"`
|
||||
Path string `json:"path,omitempty"`
|
||||
DryRun bool `json:"dry_run,omitempty"`
|
||||
}
|
||||
|
||||
func (s *Server) brainIngest(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
var a brainIngestArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if a.Path != "" && a.Content != "" {
|
||||
return nil, fmt.Errorf("path and content+source are mutually exclusive")
|
||||
}
|
||||
if a.Path == "" && a.Content == "" {
|
||||
return nil, fmt.Errorf("either path or content+source is required")
|
||||
}
|
||||
if s.pipeline.Complete == nil {
|
||||
return nil, fmt.Errorf("LLM not configured: set INGEST_LLM_URL")
|
||||
}
|
||||
|
||||
if a.Path != "" {
|
||||
text, err := extract.Text(a.Path)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("extract: %w", err)
|
||||
}
|
||||
source := a.Source
|
||||
if source == "" {
|
||||
source = filepath.Base(strings.TrimSuffix(a.Path, filepath.Ext(a.Path)))
|
||||
}
|
||||
return s.runIngest(ctx, text, source, a.DryRun)
|
||||
}
|
||||
if a.Source == "" {
|
||||
return nil, fmt.Errorf("source is required when content is provided")
|
||||
}
|
||||
return s.runIngest(ctx, a.Content, a.Source, a.DryRun)
|
||||
}
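The argument rules in brainIngest (path versus content+source, with the source defaulting to the file's base name) can be read as one pure function. The sketch below is illustrative only: `ingestArgs` and `resolveSource` are hypothetical names, the LLM-configured check and the text extraction step are omitted, and the paths are made up.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// ingestArgs holds only the fields that take part in validation.
type ingestArgs struct {
	Content, Source, Path string
}

// resolveSource applies the same checks as brainIngest and returns the source
// label that would be handed to the pipeline.
func resolveSource(a ingestArgs) (string, error) {
	switch {
	case a.Path != "" && a.Content != "":
		return "", errors.New("path and content+source are mutually exclusive")
	case a.Path == "" && a.Content == "":
		return "", errors.New("either path or content+source is required")
	case a.Path != "":
		if a.Source != "" {
			return a.Source, nil
		}
		// Default: file name without directory or extension.
		return filepath.Base(strings.TrimSuffix(a.Path, filepath.Ext(a.Path))), nil
	case a.Source == "":
		return "", errors.New("source is required when content is provided")
	default:
		return a.Source, nil
	}
}

func main() {
	for _, a := range []ingestArgs{
		{Path: "/tmp/notes/tdd.md"},
		{Content: "x", Source: "chat"},
		{Content: "x"},
		{Content: "x", Source: "y", Path: "/tmp/foo.md"},
	} {
		src, err := resolveSource(a)
		fmt.Printf("%q %v\n", src, err)
	}
}
```

Only the last extension is trimmed, so a path such as `notes.tar.gz` would yield the source `notes.tar`.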
|
||||
|
||||
type sessionLogArgs struct {
|
||||
SessionID string `json:"session_id"`
|
||||
Skill string `json:"skill,omitempty"`
|
||||
Phase string `json:"phase,omitempty"`
|
||||
ProjectRoot string `json:"project_root,omitempty"`
|
||||
FinalStatus string `json:"final_status,omitempty"`
|
||||
FilePath string `json:"file_path,omitempty"`
|
||||
ModelUsed string `json:"model_used,omitempty"`
|
||||
DurationMs int64 `json:"duration_ms,omitempty"`
|
||||
Message string `json:"message,omitempty"`
|
||||
}
|
||||
|
||||
func (s *Server) sessionLog(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
var a sessionLogArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if a.SessionID == "" {
|
||||
return nil, fmt.Errorf("session_id is required")
|
||||
}
|
||||
entry := session.Entry{
|
||||
SessionID: a.SessionID,
|
||||
Timestamp: time.Now().UTC(),
|
||||
Skill: a.Skill,
|
||||
Phase: a.Phase,
|
||||
ProjectRoot: a.ProjectRoot,
|
||||
FinalStatus: a.FinalStatus,
|
||||
FilePath: a.FilePath,
|
||||
ModelUsed: a.ModelUsed,
|
||||
DurationMs: a.DurationMs,
|
||||
Message: a.Message,
|
||||
}
|
||||
dir := filepath.Join(s.brainDir, "sessions")
|
||||
if err := session.Append(dir, a.SessionID, entry); err != nil {
|
||||
return nil, fmt.Errorf("append: %w", err)
|
||||
}
|
||||
return json.Marshal(map[string]string{"status": "ok", "session_id": a.SessionID})
|
||||
}
|
||||
|
||||
func (s *Server) runIngest(ctx context.Context, content, source string, dryRun bool) (json.RawMessage, error) {
|
||||
result, err := pipeline.Run(ctx, s.pipeline, s.brainDir, content, source, dryRun)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("ingest: %w", err)
|
||||
}
|
||||
pages := result.Pages
|
||||
if pages == nil {
|
||||
pages = []string{}
|
||||
}
|
||||
warnings := result.Warnings
|
||||
if warnings == nil {
|
||||
warnings = []string{}
|
||||
}
|
||||
return json.Marshal(map[string]any{"pages": pages, "warnings": warnings})
|
||||
}
|
||||
196
ingestion/internal/mcp/handlers_test.go
Normal file
@@ -0,0 +1,196 @@
|
||||
package mcp_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func toolCall(t *testing.T, srv http.Handler, name string, args map[string]any) map[string]any {
|
||||
t.Helper()
|
||||
bodyBytes, err := json.Marshal(map[string]any{
|
||||
"jsonrpc": "2.0", "id": 1, "method": "tools/call",
|
||||
"params": map[string]any{"name": name, "arguments": args},
|
||||
})
|
||||
require.NoError(t, err)
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", bytes.NewReader(bodyBytes))
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
require.Equal(t, http.StatusOK, rr.Code)
|
||||
var resp map[string]any
|
||||
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
|
||||
return resp
|
||||
}
|
||||
|
||||
func TestBrainQueryReturnsResults(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
knowledge := filepath.Join(brainDir, "knowledge")
|
||||
require.NoError(t, os.MkdirAll(knowledge, 0o755))
|
||||
require.NoError(t, os.WriteFile(
|
||||
filepath.Join(knowledge, "tdd.md"),
|
||||
[]byte("# TDD\n\nTest-driven development is iterative.\n"),
|
||||
0o644,
|
||||
))
|
||||
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
resp := toolCall(t, srv, "brain_query", map[string]any{"query": "tdd"})
|
||||
|
||||
require.Nil(t, resp["error"])
|
||||
result := resp["result"].(map[string]any)
|
||||
content := result["content"].([]any)
|
||||
require.NotEmpty(t, content)
|
||||
text := content[0].(map[string]any)["text"].(string)
|
||||
assert.Contains(t, text, "tdd.md")
|
||||
}
|
||||
|
||||
func TestBrainWriteCreatesFile(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "brain_write", map[string]any{
|
||||
"content": "# Test\n\nbody",
|
||||
"filename": "test.md",
|
||||
"type": "note",
|
||||
"domain": "personal",
|
||||
})
|
||||
require.Nil(t, resp["error"])
|
||||
|
||||
got, err := os.ReadFile(filepath.Join(brainDir, "knowledge", "test.md"))
|
||||
require.NoError(t, err)
|
||||
assert.Contains(t, string(got), "type: note")
|
||||
assert.Contains(t, string(got), "domain: personal")
|
||||
assert.Contains(t, string(got), "# Test")
|
||||
}
|
||||
|
||||
func TestBrainWriteRejectsTraversal(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "brain_write", map[string]any{
|
||||
"content": "x",
|
||||
"filename": "../escape.md",
|
||||
})
|
||||
require.NotNil(t, resp["error"])
|
||||
}
|
||||
|
||||
func TestBrainWriteAcceptsDoubleDotInName(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "brain_write", map[string]any{
|
||||
"content": "x",
|
||||
"filename": "notes..draft.md",
|
||||
})
|
||||
require.Nil(t, resp["error"])
|
||||
|
||||
_, err := os.Stat(filepath.Join(brainDir, "knowledge", "notes..draft.md"))
|
||||
require.NoError(t, err, "filename with embedded .. should be allowed")
|
||||
}
|
||||
|
||||
func TestBrainIngestRawDryRun(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
require.NoError(t, os.MkdirAll(filepath.Join(brainDir, "wiki", "concepts"), 0o755))
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "brain_ingest_raw", map[string]any{
|
||||
"source": "test-source",
|
||||
"dry_run": true,
|
||||
"pages": []map[string]any{
|
||||
{
|
||||
"title": "Test Concept",
|
||||
"type": "concept",
|
||||
"content": "## Definition\nA test concept.",
|
||||
},
|
||||
},
|
||||
})
|
||||
require.Nil(t, resp["error"])
|
||||
result := resp["result"].(map[string]any)
|
||||
content := result["content"].([]any)
|
||||
text := content[0].(map[string]any)["text"].(string)
|
||||
|
||||
var parsed struct {
|
||||
Pages []string `json:"pages"`
|
||||
}
|
||||
require.NoError(t, json.Unmarshal([]byte(text), &parsed))
|
||||
require.NotEmpty(t, parsed.Pages, "expected at least one page path")
|
||||
assert.Contains(t, parsed.Pages[0], "wiki/concepts/test-concept.md")
|
||||
|
||||
// dry_run: no file should exist
|
||||
_, err := os.Stat(filepath.Join(brainDir, "wiki", "concepts", "test-concept.md"))
|
||||
assert.True(t, os.IsNotExist(err))
|
||||
}
|
||||
|
||||
func TestBrainIngestRejectsBoth(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "brain_ingest", map[string]any{
|
||||
"content": "x",
|
||||
"source": "y",
|
||||
"path": "/tmp/foo.md",
|
||||
})
|
||||
require.NotNil(t, resp["error"])
|
||||
}
|
||||
|
||||
func TestBrainIngestRequiresOne(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "brain_ingest", map[string]any{})
|
||||
require.NotNil(t, resp["error"])
|
||||
}
|
||||
|
||||
func TestBrainIngestRejectsContentWithoutSource(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "brain_ingest", map[string]any{
|
||||
"content": "x",
|
||||
})
|
||||
require.NotNil(t, resp["error"])
|
||||
}
|
||||
|
||||
func TestBrainIngestRequiresLLMConfigured(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil) // nil pipelineCfg → no LLM
|
||||
|
||||
resp := toolCall(t, srv, "brain_ingest", map[string]any{
|
||||
"content": "some content",
|
||||
"source": "test",
|
||||
})
|
||||
require.NotNil(t, resp["error"])
|
||||
errObj := resp["error"].(map[string]any)
|
||||
assert.Contains(t, errObj["message"].(string), "LLM not configured")
|
||||
}
|
||||
|
||||
func TestSessionLogAppends(t *testing.T) {
|
||||
brainDir := t.TempDir()
|
||||
srv := mcp.NewServer(brainDir, nil, nil, nil)
|
||||
|
||||
resp := toolCall(t, srv, "session_log", map[string]any{
|
||||
"session_id": "session-x",
|
||||
"skill": "tdd",
|
||||
"phase": "red",
|
||||
"final_status": "ok",
|
||||
})
|
||||
require.Nil(t, resp["error"])
|
||||
|
||||
got, err := os.ReadFile(filepath.Join(brainDir, "sessions", "session-x.jsonl"))
|
||||
require.NoError(t, err)
|
||||
assert.Contains(t, string(got), `"skill":"tdd"`)
|
||||
assert.Contains(t, string(got), `"phase":"red"`)
|
||||
}
|
||||
|
||||
func TestSessionLogRequiresSessionID(t *testing.T) {
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
|
||||
resp := toolCall(t, srv, "session_log", map[string]any{"skill": "tdd"})
|
||||
require.NotNil(t, resp["error"])
|
||||
}
|
||||
35
ingestion/internal/mcp/integration_test.go
Normal file
@@ -0,0 +1,35 @@
|
||||
package mcp_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestMCPMountedHandler(t *testing.T) {
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
|
||||
mux := http.NewServeMux()
|
||||
mux.Handle("POST /mcp", srv)
|
||||
|
||||
ts := httptest.NewServer(mux)
|
||||
defer ts.Close()
|
||||
|
||||
body, err := json.Marshal(map[string]any{
|
||||
"jsonrpc": "2.0", "id": 1, "method": "tools/list",
|
||||
})
|
||||
require.NoError(t, err)
|
||||
resp, err := http.Post(ts.URL+"/mcp", "application/json", bytes.NewReader(body))
|
||||
require.NoError(t, err)
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
assert.Equal(t, http.StatusOK, resp.StatusCode)
|
||||
|
||||
out, _ := io.ReadAll(resp.Body)
|
||||
assert.Contains(t, string(out), `"brain_query"`)
|
||||
}
|
||||
152
ingestion/internal/mcp/server.go
Normal file
@@ -0,0 +1,152 @@
|
||||
// Package mcp implements an MCP HTTP handler for the ingestion service.
// Exposed tools: brain_query, brain_write, brain_ingest, brain_ingest_raw,
// brain_answer, brain_classify, session_log.
|
||||
package mcp
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"net/http"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
|
||||
)
|
||||
|
||||
type request struct {
|
||||
JSONRPC string `json:"jsonrpc"`
|
||||
ID any `json:"id"`
|
||||
Method string `json:"method"`
|
||||
Params json.RawMessage `json:"params"`
|
||||
}
|
||||
|
||||
type response struct {
|
||||
JSONRPC string `json:"jsonrpc"`
|
||||
ID any `json:"id,omitempty"`
|
||||
Result any `json:"result,omitempty"`
|
||||
Error *rpcError `json:"error,omitempty"`
|
||||
}
|
||||
|
||||
type rpcError struct {
|
||||
Code int `json:"code"`
|
||||
Message string `json:"message"`
|
||||
}
|
||||
|
||||
// Server handles MCP JSON-RPC over HTTP for the ingestion service.
|
||||
type Server struct {
|
||||
brainDir string
|
||||
pipeline pipeline.Config
|
||||
llm pipeline.CompleteFunc
|
||||
answerLLM pipeline.CompleteFunc // nil = brain_answer and brain_classify unavailable
|
||||
}
|
||||
|
||||
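// NewServer constructs a Server bound to brainDir. pipelineCfg supplies the
// LLM-backed ingestion pipeline; a nil pipelineCfg yields a zero Config, in
// which case brain_ingest reports that the LLM is not configured. llm may be
// nil when only non-LLM tools are used. answerLLM drives brain_answer and
// brain_classify; nil disables those tools.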
|
||||
func NewServer(brainDir string, pipelineCfg *pipeline.Config, llm pipeline.CompleteFunc, answerLLM pipeline.CompleteFunc) *Server {
|
||||
cfg := pipeline.Config{}
|
||||
if pipelineCfg != nil {
|
||||
cfg = *pipelineCfg
|
||||
}
|
||||
return &Server{brainDir: brainDir, pipeline: cfg, llm: llm, answerLLM: answerLLM}
|
||||
}
|
||||
|
||||
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
|
||||
// MCP streamable HTTP: GET establishes the SSE stream for server-to-client events.
|
||||
if r.Method == http.MethodGet {
|
||||
w.Header().Set("Content-Type", "text/event-stream")
|
||||
w.Header().Set("Cache-Control", "no-cache")
|
||||
w.Header().Set("Connection", "keep-alive")
|
||||
w.Header().Set("X-Accel-Buffering", "no")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
if f, ok := w.(http.Flusher); ok {
|
||||
f.Flush()
|
||||
}
|
||||
<-r.Context().Done()
|
||||
return
|
||||
}
|
||||
|
||||
var req request
|
||||
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
|
||||
writeError(w, nil, -32700, "parse error")
|
||||
return
|
||||
}
|
||||
|
||||
// JSON-RPC 2.0 notifications (no id) must not receive a response.
|
||||
if req.ID == nil {
|
||||
return
|
||||
}
|
||||
|
||||
var result any
|
||||
var rpcErr *rpcError
|
||||
|
||||
switch req.Method {
|
||||
case "initialize":
|
||||
result = map[string]any{
|
||||
"protocolVersion": "2024-11-05",
|
||||
"capabilities": map[string]any{"tools": map[string]any{}},
|
||||
"serverInfo": map[string]any{"name": "ingestion-brain", "version": "0.1.0"},
|
||||
}
|
||||
|
||||
case "tools/list":
|
||||
result = map[string]any{"tools": s.tools()}
|
||||
|
||||
case "tools/call":
|
||||
var p struct {
|
||||
Name string `json:"name"`
|
||||
Arguments json.RawMessage `json:"arguments"`
|
||||
}
|
||||
if err := json.Unmarshal(req.Params, &p); err != nil {
|
||||
rpcErr = &rpcError{Code: -32602, Message: "invalid params"}
|
||||
break
|
||||
}
|
||||
out, err := s.handleCall(r.Context(), p.Name, p.Arguments)
|
||||
if err != nil {
|
||||
rpcErr = &rpcError{Code: -32000, Message: err.Error()}
|
||||
break
|
||||
}
|
||||
result = map[string]any{
|
||||
"content": []map[string]any{{"type": "text", "text": string(out)}},
|
||||
}
|
||||
|
||||
default:
|
||||
rpcErr = &rpcError{Code: -32601, Message: "method not found: " + req.Method}
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_ = json.NewEncoder(w).Encode(response{
|
||||
JSONRPC: "2.0",
|
||||
ID: req.ID,
|
||||
Result: result,
|
||||
Error: rpcErr,
|
||||
})
|
||||
}
|
||||
|
||||
func writeError(w http.ResponseWriter, id any, code int, msg string) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_ = json.NewEncoder(w).Encode(response{
|
||||
JSONRPC: "2.0",
|
||||
ID: id,
|
||||
Error: &rpcError{Code: code, Message: msg},
|
||||
})
|
||||
}
|
||||
|
||||
// handleCall dispatches a tools/call to the appropriate tool handler.
|
||||
func (s *Server) handleCall(ctx context.Context, name string, args json.RawMessage) (json.RawMessage, error) {
|
||||
switch name {
|
||||
case "brain_query":
|
||||
return s.brainQuery(ctx, args)
|
||||
case "brain_write":
|
||||
return s.brainWrite(ctx, args)
|
||||
case "brain_ingest_raw":
|
||||
return s.brainIngestRaw(ctx, args)
|
||||
case "brain_ingest":
|
||||
return s.brainIngest(ctx, args)
|
||||
case "session_log":
|
||||
return s.sessionLog(ctx, args)
|
||||
case "brain_answer":
|
||||
return s.brainAnswer(ctx, args)
|
||||
case "brain_classify":
|
||||
return s.brainClassify(ctx, args)
|
||||
default:
|
||||
return nil, fmt.Errorf("unknown tool: %s", name)
|
||||
}
|
||||
}
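From a client's point of view, the envelope handling in ServeHTTP comes down to three cases: a result, an error object, or no body at all for a notification. The sketch below illustrates this against a stand-in handler, not the real Server: `stubServer`, `rpcRequest`, `rpcResponse` and `call` are hypothetical names, and the stub answers only tools/list (with an empty tool list).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// rpcRequest is the client-side envelope; omitempty drops "id" for notifications.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      any    `json:"id,omitempty"`
	Method  string `json:"method"`
}

// rpcResponse keeps only the fields this sketch inspects.
type rpcResponse struct {
	Error *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// stubServer is a stand-in with the same envelope rules as the handler above.
func stubServer() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req rpcRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.ID == nil {
			return // stub: parse errors ignored; notifications get no body
		}
		resp := map[string]any{"jsonrpc": "2.0", "id": req.ID}
		if req.Method == "tools/list" {
			resp["result"] = map[string]any{"tools": []any{}}
		} else {
			resp["error"] = map[string]any{"code": -32601, "message": "method not found: " + req.Method}
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(resp)
	})
}

// call posts one message to h and returns the JSON-RPC error code (0 if none)
// and whether the server wrote a response body.
func call(h http.Handler, method string, id any) (int, bool, error) {
	b, err := json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: method})
	if err != nil {
		return 0, false, err
	}
	rr := httptest.NewRecorder()
	h.ServeHTTP(rr, httptest.NewRequest(http.MethodPost, "/mcp", bytes.NewReader(b)))
	if rr.Body.Len() == 0 {
		return 0, false, nil
	}
	var resp rpcResponse
	if err := json.Unmarshal(rr.Body.Bytes(), &resp); err != nil {
		return 0, true, err
	}
	if resp.Error != nil {
		return resp.Error.Code, true, nil
	}
	return 0, true, nil
}

func main() {
	h := stubServer()
	for _, c := range []struct {
		method string
		id     any
	}{
		{"tools/list", 1},
		{"unknown/method", 2},
		{"notifications/initialized", nil},
	} {
		code, hasBody, err := call(h, c.method, c.id)
		fmt.Println(c.method, code, hasBody, err)
	}
}
```

Note that a request carrying an explicit `"id": null` decodes to a nil ID as well, so a handler written this way treats it like a notification.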
|
||||
92
ingestion/internal/mcp/server_test.go
Normal file
@@ -0,0 +1,92 @@
|
||||
package mcp_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func body(t *testing.T, v any) *bytes.Buffer {
|
||||
t.Helper()
|
||||
b, err := json.Marshal(v)
|
||||
require.NoError(t, err)
|
||||
return bytes.NewBuffer(b)
|
||||
}
|
||||
|
||||
func TestServerInitialize(t *testing.T) {
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
|
||||
"jsonrpc": "2.0", "id": 1, "method": "initialize",
|
||||
"params": map[string]any{},
|
||||
}))
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
var resp map[string]any
|
||||
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
|
||||
result := resp["result"].(map[string]any)
|
||||
assert.Equal(t, "2024-11-05", result["protocolVersion"])
|
||||
}
|
||||
|
||||
func TestServerToolsList(t *testing.T) {
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
|
||||
"jsonrpc": "2.0", "id": 2, "method": "tools/list",
|
||||
}))
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
var resp map[string]any
|
||||
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
|
||||
tools := resp["result"].(map[string]any)["tools"].([]any)
|
||||
names := make([]string, 0, len(tools))
|
||||
for _, tool := range tools { // renamed so the loop variable does not shadow t *testing.T
names = append(names, tool.(map[string]any)["name"].(string))
|
||||
}
|
||||
assert.ElementsMatch(t, []string{
|
||||
"brain_query", "brain_write", "brain_ingest_raw", "brain_ingest",
|
||||
"brain_answer", "brain_classify", "session_log",
|
||||
}, names)
|
||||
}
|
||||
|
||||
func TestServerNotificationGetsNoBody(t *testing.T) {
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
|
||||
"jsonrpc": "2.0", "method": "notifications/initialized",
|
||||
}))
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
assert.Empty(t, strings.TrimSpace(rr.Body.String()))
|
||||
}
|
||||
|
||||
func TestServerUnknownMethodReturnsError(t *testing.T) {
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
|
||||
"jsonrpc": "2.0", "id": 3, "method": "unknown/method",
|
||||
}))
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
var resp map[string]any
|
||||
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
|
||||
require.NotNil(t, resp["error"])
|
||||
errObj := resp["error"].(map[string]any)
|
||||
assert.Equal(t, float64(-32601), errObj["code"])
|
||||
assert.Contains(t, errObj["message"].(string), "unknown/method")
|
||||
}
|
||||
114
ingestion/internal/mcp/tools_answer.go
Normal file
@@ -0,0 +1,114 @@
|
||||
package mcp
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"strings"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
|
||||
)
|
||||
|
||||
const (
|
||||
answerSystemPrompt = `You are a knowledge assistant. Answer the question using ONLY the provided sources.
|
||||
Cite source file paths inline when referencing specific content.
|
||||
If the context does not contain enough information to answer, say so clearly.`
|
||||
|
||||
classifySystemPrompt = `Classify the document. Respond with JSON only, no markdown fences.
|
||||
{"type":"...","title":"...","tags":["..."]}
|
||||
Valid types: spec, plan, decision, note, wiki, log, code, unknown.`
|
||||
)
|
||||
|
||||
type brainAnswerArgs struct {
|
||||
Query string `json:"query"`
|
||||
}
|
||||
|
||||
func (s *Server) brainAnswer(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
if s.answerLLM == nil {
|
||||
return nil, fmt.Errorf("answer LLM not configured: set BRAIN_LLM_PRIMARY_URL")
|
||||
}
|
||||
var a brainAnswerArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if a.Query == "" {
|
||||
return nil, fmt.Errorf("query is required")
|
||||
}
|
||||
|
||||
results, err := search.Query(s.brainDir, a.Query, 10)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("search: %w", err)
|
||||
}
|
||||
if len(results) == 0 {
|
||||
return json.Marshal(map[string]any{
|
||||
"answer": "No relevant content found in brain.",
|
||||
"sources": []string{},
|
||||
})
|
||||
}
|
||||
|
||||
var sb strings.Builder
|
||||
sources := make([]string, 0, len(results))
|
||||
for _, r := range results {
|
||||
fmt.Fprintf(&sb, "<source path=%q>\n%s\n</source>\n\n", r.Path, r.Excerpt)
|
||||
sources = append(sources, r.Path)
|
||||
}
|
||||
|
||||
answer, err := s.answerLLM(ctx, answerSystemPrompt, sb.String()+"Question: "+a.Query)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("llm: %w", err)
|
||||
}
|
||||
|
||||
return json.Marshal(map[string]any{
|
||||
"answer": answer,
|
||||
"sources": sources,
|
||||
})
|
||||
}
|
||||
|
||||
type brainClassifyArgs struct {
|
||||
Text string `json:"text"`
|
||||
}
|
||||
|
||||
type classifyResult struct {
|
||||
Type string `json:"type"`
|
||||
Title string `json:"title"`
|
||||
Tags []string `json:"tags"`
|
||||
}
|
||||
|
||||
func (s *Server) brainClassify(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
if s.answerLLM == nil {
|
||||
return nil, fmt.Errorf("answer LLM not configured: set BRAIN_LLM_PRIMARY_URL")
|
||||
}
|
||||
var a brainClassifyArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if a.Text == "" {
|
||||
return nil, fmt.Errorf("text is required")
|
||||
}
|
||||
|
||||
text := a.Text
|
||||
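	if len(text) > 3000 {
|
||||
		// Back up to a rune boundary so the cut never splits a multi-byte UTF-8 character.
|
||||
		cut := 3000
|
||||
		for cut > 0 && text[cut]&0xC0 == 0x80 {
|
||||
			cut--
|
||||
		}
|
||||
		text = text[:cut]
|
||||
	}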
|
||||
|
||||
raw, err := s.answerLLM(ctx, classifySystemPrompt, text)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("llm: %w", err)
|
||||
}
|
||||
|
||||
	// Strip markdown fences if the model adds them despite the instruction.
|
||||
raw = strings.TrimSpace(raw)
|
||||
raw = strings.TrimPrefix(raw, "```json")
|
||||
raw = strings.TrimPrefix(raw, "```")
|
||||
raw = strings.TrimSuffix(raw, "```")
|
||||
raw = strings.TrimSpace(raw)
|
||||
|
||||
var cr classifyResult
|
||||
if err := json.Unmarshal([]byte(raw), &cr); err != nil {
|
||||
return nil, fmt.Errorf("parse classify response %q: %w", raw, err)
|
||||
}
|
||||
if cr.Tags == nil {
|
||||
cr.Tags = []string{}
|
||||
}
|
||||
return json.Marshal(cr)
|
||||
}
|
||||
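The post-processing in `brainClassify` (trim, drop an optional Markdown fence, unmarshal, default `tags` to an empty slice) can be tried in isolation. The sketch below copies that sequence into a hypothetical helper named `parseClassify`; the helper and the `fence` variable are illustrative only and do not appear in the diff. The fence is built with `strings.Repeat` purely to keep literal backtick runs out of the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// classifyResult mirrors the struct of the same name in tools_answer.go.
type classifyResult struct {
	Type  string   `json:"type"`
	Title string   `json:"title"`
	Tags  []string `json:"tags"`
}

// fence is three backticks, the Markdown code-fence marker.
var fence = strings.Repeat("`", 3)

// parseClassify is a hypothetical helper that repeats the cleanup steps
// brainClassify applies to the raw LLM response.
func parseClassify(raw string) (classifyResult, error) {
	raw = strings.TrimSpace(raw)
	raw = strings.TrimPrefix(raw, fence+"json")
	raw = strings.TrimPrefix(raw, fence)
	raw = strings.TrimSuffix(raw, fence)
	raw = strings.TrimSpace(raw)

	var cr classifyResult
	if err := json.Unmarshal([]byte(raw), &cr); err != nil {
		return classifyResult{}, fmt.Errorf("parse classify response %q: %w", raw, err)
	}
	if cr.Tags == nil {
		cr.Tags = []string{} // marshal as [] rather than null
	}
	return cr, nil
}

func main() {
	fenced := fence + "json\n{\"type\":\"note\",\"title\":\"T\"}\n" + fence
	cr, err := parseClassify(fenced)
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(cr)
	fmt.Println(string(out)) // {"type":"note","title":"T","tags":[]}
}
```

Defaulting `Tags` to an empty slice keeps the response shape stable for clients that expect an array even when the model omits the field.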
103
ingestion/internal/mcp/tools_answer_test.go
Normal file
@@ -0,0 +1,103 @@
|
||||
package mcp_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func mockAnswerLLM(response string) pipeline.CompleteFunc {
|
||||
return func(_ context.Context, _, _ string) (string, error) {
|
||||
return response, nil
|
||||
}
|
||||
}
|
||||
|
||||
func brainDirWithContent(t *testing.T) string {
|
||||
t.Helper()
|
||||
dir := t.TempDir()
|
||||
wikiDir := filepath.Join(dir, "wiki")
|
||||
require.NoError(t, os.MkdirAll(wikiDir, 0o755))
|
||||
require.NoError(t, os.WriteFile(filepath.Join(wikiDir, "test.md"), []byte(
|
||||
"---\ntitle: Pass-rate Logging\ntype: spec\n---\n\nPass-rate logging tracks skill invocations.",
|
||||
), 0o644))
|
||||
return dir
|
||||
}
|
||||
|
||||
func callTool(t *testing.T, ts *httptest.Server, name string, arguments map[string]any) map[string]any {
|
||||
t.Helper()
|
||||
req := map[string]any{
|
||||
"jsonrpc": "2.0", "id": 1, "method": "tools/call",
|
||||
"params": map[string]any{"name": name, "arguments": arguments},
|
||||
}
|
||||
resp, err := http.Post(ts.URL, "application/json", body(t, req))
|
||||
require.NoError(t, err)
|
||||
defer resp.Body.Close() //nolint:errcheck
|
||||
var out map[string]any
|
||||
require.NoError(t, json.NewDecoder(resp.Body).Decode(&out))
|
||||
return out
|
||||
}
|
||||
|
||||
func TestBrainAnswer_NoLLM(t *testing.T) {
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
|
||||
ts := httptest.NewServer(srv)
|
||||
defer ts.Close()
|
||||
|
||||
rpc := callTool(t, ts, "brain_answer", map[string]any{"query": "test"})
|
||||
assert.NotNil(t, rpc["error"], "expected error when answerLLM is nil")
|
||||
}
|
||||
|
||||
func TestBrainAnswer_Synthesizes(t *testing.T) {
|
||||
brainDir := brainDirWithContent(t)
|
||||
srv := mcp.NewServer(brainDir, nil, nil, mockAnswerLLM("Pass-rate logging is described in spec."))
|
||||
ts := httptest.NewServer(srv)
|
||||
defer ts.Close()
|
||||
|
||||
rpc := callTool(t, ts, "brain_answer", map[string]any{"query": "pass-rate logging"})
|
||||
require.Nil(t, rpc["error"])
|
||||
|
||||
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
var result map[string]any
|
||||
require.NoError(t, json.Unmarshal([]byte(content), &result))
|
||||
assert.Equal(t, "Pass-rate logging is described in spec.", result["answer"])
|
||||
assert.NotEmpty(t, result["sources"])
|
||||
}
|
||||
|
||||
func TestBrainClassify_ReturnsJSON(t *testing.T) {
|
||||
llmResp := `{"type":"spec","title":"My Spec","tags":["go","mcp"]}`
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, mockAnswerLLM(llmResp))
|
||||
ts := httptest.NewServer(srv)
|
||||
defer ts.Close()
|
||||
|
||||
rpc := callTool(t, ts, "brain_classify", map[string]any{"text": "# My Spec\n\nThis is a Go MCP spec."})
|
||||
require.Nil(t, rpc["error"])
|
||||
|
||||
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
var result map[string]any
|
||||
require.NoError(t, json.Unmarshal([]byte(content), &result))
|
||||
assert.Equal(t, "spec", result["type"])
|
||||
assert.Equal(t, "My Spec", result["title"])
|
||||
}
|
||||
|
||||
func TestBrainClassify_StripsFences(t *testing.T) {
|
||||
llmResp := "```json\n{\"type\":\"note\",\"title\":\"T\",\"tags\":[]}\n```"
|
||||
srv := mcp.NewServer(t.TempDir(), nil, nil, mockAnswerLLM(llmResp))
|
||||
ts := httptest.NewServer(srv)
|
||||
defer ts.Close()
|
||||
|
||||
rpc := callTool(t, ts, "brain_classify", map[string]any{"text": "some text"})
|
||||
require.Nil(t, rpc["error"])
|
||||
|
||||
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
var result map[string]any
|
||||
require.NoError(t, json.Unmarshal([]byte(content), &result))
|
||||
assert.Equal(t, "note", result["type"])
|
||||
}
|
||||
@@ -18,7 +18,8 @@ type RawPage struct {
|
||||
}
|
||||
|
||||
// ParseRawPages parses LLM output as a JSON array of RawPage objects.
|
||||
// If the array is truncated mid-object (token limit), it salvages all complete objects.
|
||||
// If the output contains invalid JSON escape sequences (e.g. \. from Markdown),
|
||||
// it attempts repair before falling back to truncation recovery.
|
||||
func ParseRawPages(output string) ([]RawPage, []string) {
|
||||
output = strings.TrimSpace(output)
|
||||
if output == "" {
|
||||
@@ -27,23 +28,30 @@ func ParseRawPages(output string) ([]RawPage, []string) {
|
||||
|
||||
output = stripFences(output)
|
||||
|
||||
// Fast path: valid JSON.
|
||||
var pages []RawPage
|
||||
if err := json.Unmarshal([]byte(output), &pages); err == nil {
|
||||
return pages, nil
|
||||
}
|
||||
|
||||
// Repair pass: fix invalid escape sequences (e.g. \. \d from Markdown content).
|
||||
repaired := repairJSON(output)
|
||||
if err := json.Unmarshal([]byte(repaired), &pages); err == nil {
|
||||
return pages, []string{"repaired invalid JSON escape sequences in LLM output"}
|
||||
}
|
||||
|
||||
// Truncation recovery: find last `}` that closes a complete object.
|
||||
idx := strings.LastIndex(output, "}")
|
||||
idx := strings.LastIndex(repaired, "}")
|
||||
if idx < 0 {
|
||||
return nil, []string{"LLM output contained no complete JSON objects"}
|
||||
}
|
||||
|
||||
start := strings.Index(output, "[")
|
||||
start := strings.Index(repaired, "[")
|
||||
if start < 0 {
|
||||
return nil, []string{"LLM output contained no JSON array opening bracket"}
|
||||
}
|
||||
|
||||
candidate := output[start:idx+1] + "]"
|
||||
candidate := repaired[start:idx+1] + "]"
|
||||
if err := json.Unmarshal([]byte(candidate), &pages); err != nil {
|
||||
return nil, []string{fmt.Sprintf("truncation recovery failed: %v", err)}
|
||||
}
|
||||
@@ -51,6 +59,45 @@ func ParseRawPages(output string) ([]RawPage, []string) {
|
||||
return pages, []string{fmt.Sprintf("LLM output was truncated; recovered %d page(s)", len(pages))}
|
||||
}
|
||||
|
||||
// repairJSON replaces invalid JSON escape sequences (e.g. \. \d \p) with
|
||||
// a properly escaped backslash followed by the same character.
|
||||
// It iterates byte-by-byte to correctly skip already-valid escape sequences
|
||||
// (including \\) without requiring lookbehind support.
|
||||
func repairJSON(s string) string {
|
||||
var b strings.Builder
|
||||
b.Grow(len(s))
|
||||
i := 0
|
||||
for i < len(s) {
|
||||
if s[i] != '\\' {
|
||||
b.WriteByte(s[i])
|
||||
i++
|
||||
continue
|
||||
}
|
||||
// We have a backslash. Peek at the next character.
|
||||
if i+1 >= len(s) {
|
||||
// Trailing backslash — emit as-is.
|
||||
b.WriteByte(s[i])
|
||||
i++
|
||||
continue
|
||||
}
|
||||
next := s[i+1]
|
||||
switch next {
|
||||
case '"', '\\', '/', 'b', 'f', 'n', 'r', 't', 'u':
|
||||
// Valid JSON escape sequence — emit both characters as-is.
|
||||
b.WriteByte(s[i])
|
||||
b.WriteByte(next)
|
||||
i += 2
|
||||
default:
|
||||
// Invalid escape — double the backslash.
|
||||
b.WriteByte('\\')
|
||||
b.WriteByte('\\')
|
||||
b.WriteByte(next)
|
||||
i += 2
|
||||
}
|
||||
}
|
||||
return b.String()
|
||||
}
|
||||
|
||||
func stripFences(s string) string {
|
||||
for _, prefix := range []string{"```json\n", "```json\r\n", "```\n", "```\r\n"} {
|
||||
if strings.HasPrefix(s, prefix) {
|
||||
|
||||
@@ -59,3 +59,29 @@ func TestParseRawPages_MissingTitle(t *testing.T) {
|
||||
assert.Empty(t, warnings)
|
||||
assert.Empty(t, pages[0].Title)
|
||||
}
|
||||
|
||||
func TestParseRawPages_InvalidEscapeRepaired(t *testing.T) {
|
||||
// LLM copied markdown escaped list numbers (\.) into JSON — invalid escape
|
||||
raw := "[{\"title\":\"Foo\",\"type\":\"concept\",\"content\":\"Step 4\\. Do it.\"}]"
|
||||
pages, warnings := ParseRawPages(raw)
|
||||
require.Len(t, pages, 1)
|
||||
assert.Equal(t, "Foo", pages[0].Title)
|
||||
assert.Contains(t, pages[0].Content, `4\.`)
|
||||
assert.NotEmpty(t, warnings) // repair warning
|
||||
}
|
||||
|
||||
func TestRepairJSON_FixesInvalidEscapes(t *testing.T) {
|
||||
cases := []struct {
|
||||
in string
|
||||
want string
|
||||
}{
|
||||
{`{"a":"foo\.bar"}`, `{"a":"foo\\.bar"}`},
|
||||
		{`{"a":"\\n is fine"}`, `{"a":"\\n is fine"}`}, // valid \\ escape followed by a literal n: untouched
|
||||
{`{"a":"\d+ items"}`, `{"a":"\\d+ items"}`},
|
||||
{`{"a":"already \\ escaped"}`, `{"a":"already \\ escaped"}`}, // valid \\ untouched
|
||||
}
|
||||
for _, tc := range cases {
|
||||
got := repairJSON(tc.in)
|
||||
assert.Equal(t, tc.want, got, "input: %s", tc.in)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -59,11 +59,31 @@ func Run(ctx context.Context, cfg Config, brainDir, content, source string, dryR
|
||||
allWarnings = append(allWarnings, warnings...)
|
||||
}
|
||||
|
||||
pages, buildWarnings := BuildPages(allRaw, sourceSlug, date)
|
||||
allWarnings = append(allWarnings, buildWarnings...)
|
||||
return buildAndWrite(allRaw, sourceSlug, date, brainDir, source, inventory, allWarnings, dryRun)
|
||||
}
|
||||
|
||||
// RunRaw runs the pipeline on pre-parsed RawPages, skipping the LLM extraction
|
||||
// step. Use this when the caller has already produced the structured RawPage data
|
||||
// (e.g. from a more capable model or manual curation).
|
||||
func RunRaw(brainDir, source string, rawPages []RawPage, dryRun bool) (Result, error) {
|
||||
inventory, err := wiki.LoadInventory(brainDir)
|
||||
if err != nil {
|
||||
return Result{}, fmt.Errorf("load inventory: %w", err)
|
||||
}
|
||||
|
||||
sourceSlug := wiki.Slug(source)
|
||||
date := time.Now().UTC().Format("2006-01-02")
|
||||
|
||||
return buildAndWrite(rawPages, sourceSlug, date, brainDir, source, inventory, nil, dryRun)
|
||||
}
|
||||
|
||||
// buildAndWrite runs the stages shared by Run and RunRaw, from BuildPages through the write step.
|
||||
func buildAndWrite(rawPages []RawPage, sourceSlug, date, brainDir, source string, inventory map[wiki.PageType][]wiki.Entry, warnings []string, dryRun bool) (Result, error) {
|
||||
pages, buildWarnings := BuildPages(rawPages, sourceSlug, date)
|
||||
warnings = append(warnings, buildWarnings...)
|
||||
resolved := Resolve(pages, inventory)
|
||||
canonicalized, linkWarnings := CanonicalizeLinks(resolved, inventory)
|
||||
allWarnings = append(allWarnings, linkWarnings...)
|
||||
warnings = append(warnings, linkWarnings...)
|
||||
withRefs := injectSourceRefs(canonicalized, inventory, brainDir)
|
||||
merged := mergeAll(withRefs)
|
||||
|
||||
@@ -83,14 +103,14 @@ func Run(ctx context.Context, cfg Config, brainDir, content, source string, dryR
|
||||
|
||||
if !dryRun {
|
||||
if err := wiki.RebuildIndex(brainDir, date); err != nil {
|
||||
allWarnings = append(allWarnings, fmt.Sprintf("rebuild index: %v", err))
|
||||
warnings = append(warnings, fmt.Sprintf("rebuild index: %v", err))
|
||||
}
|
||||
if err := wiki.AppendLog(brainDir, source, written, allWarnings, date); err != nil {
|
||||
allWarnings = append(allWarnings, fmt.Sprintf("append log: %v", err))
|
||||
if err := wiki.AppendLog(brainDir, source, written, warnings, date); err != nil {
|
||||
warnings = append(warnings, fmt.Sprintf("append log: %v", err))
|
||||
}
|
||||
}
|
||||
|
||||
return Result{Pages: written, Warnings: allWarnings}, nil
|
||||
return Result{Pages: written, Warnings: warnings}, nil
|
||||
}
|
||||
|
||||
// mergeAll deduplicates pages by path, merging content from later occurrences.
|
||||
|
||||
98
ingestion/internal/session/session.go
Normal file
@@ -0,0 +1,98 @@
|
||||
// ingestion/internal/session/session.go
|
||||
package session
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"fmt"
|
||||
"io/fs"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"time"
|
||||
)
|
||||
|
||||
// Entry is one skill invocation record, appended to the session JSONL log.
|
||||
type Entry struct {
|
||||
SessionID string `json:"session_id"`
|
||||
Timestamp time.Time `json:"timestamp"`
|
||||
Skill string `json:"skill"`
|
||||
Phase string `json:"phase,omitempty"`
|
||||
ProjectRoot string `json:"project_root,omitempty"`
|
||||
Input json.RawMessage `json:"input,omitempty"`
|
||||
Attempts []Attempt `json:"attempts,omitempty"`
|
||||
FinalStatus string `json:"final_status"`
|
||||
FilePath string `json:"file_path,omitempty"`
|
||||
ModelUsed string `json:"model_used,omitempty"`
|
||||
DurationMs int64 `json:"duration_ms,omitempty"`
|
||||
Message string `json:"message,omitempty"`
|
||||
}
|
||||
|
||||
// Attempt represents one subprocess invocation within a skill call.
|
||||
type Attempt struct {
|
||||
Attempt int `json:"attempt"`
|
||||
Model string `json:"model"`
|
||||
Tier string `json:"tier"` // local | subagent | managed
|
||||
DurationMs int64 `json:"duration_ms"`
|
||||
WarmStart bool `json:"warm_start"` // model already loaded in llama-swap
|
||||
Verified bool `json:"verified"`
|
||||
Verdict string `json:"verdict,omitempty"` // accept | escalate | error
|
||||
Feedback string `json:"feedback,omitempty"` // verifier feedback on escalation
|
||||
OutputSummary string `json:"output_summary,omitempty"`
|
||||
RunnerOutput string `json:"runner_output,omitempty"`
|
||||
}
|
||||
|
||||
// Append writes entry as a single JSON line to sessionsDir/{sessionID}.jsonl.
|
||||
func Append(sessionsDir, sessionID string, entry Entry) error {
|
||||
if err := os.MkdirAll(sessionsDir, 0o755); err != nil {
|
||||
return fmt.Errorf("create sessions dir: %w", err)
|
||||
}
|
||||
path := filepath.Join(sessionsDir, sessionID+".jsonl")
|
||||
f, err := os.OpenFile(path, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
|
||||
if err != nil {
|
||||
return fmt.Errorf("open session log: %w", err)
|
||||
}
|
||||
|
||||
line, err := json.Marshal(entry)
|
||||
if err != nil {
|
||||
_ = f.Close()
|
||||
return fmt.Errorf("marshal entry: %w", err)
|
||||
}
|
||||
if _, err = fmt.Fprintf(f, "%s\n", line); err != nil {
|
||||
_ = f.Close()
|
||||
return fmt.Errorf("write entry: %w", err)
|
||||
}
|
||||
if err = f.Close(); err != nil {
|
||||
return fmt.Errorf("close session log: %w", err)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// Read returns all entries for sessionID. Returns empty slice if no log exists.
|
||||
func Read(sessionsDir, sessionID string) ([]Entry, error) {
|
||||
path := filepath.Join(sessionsDir, sessionID+".jsonl")
|
||||
f, err := os.Open(path)
|
||||
if errors.Is(err, fs.ErrNotExist) {
|
||||
return []Entry{}, nil
|
||||
}
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("open session log: %w", err)
|
||||
}
|
||||
defer f.Close() //nolint:errcheck
|
||||
|
||||
var entries []Entry
|
||||
scanner := bufio.NewScanner(f)
|
||||
scanner.Buffer(make([]byte, 0, 256*1024), 1<<20) // up to 1 MB per line
|
||||
for scanner.Scan() {
|
||||
line := scanner.Bytes()
|
||||
if len(line) == 0 {
|
||||
continue
|
||||
}
|
||||
var e Entry
|
||||
if err := json.Unmarshal(line, &e); err != nil {
|
||||
return nil, fmt.Errorf("parse entry: %w", err)
|
||||
}
|
||||
entries = append(entries, e)
|
||||
}
|
||||
return entries, scanner.Err()
|
||||
}
|
||||
50
ingestion/internal/session/session_test.go
Normal file
@@ -0,0 +1,50 @@
|
||||
package session_test
|
||||
|
||||
import (
|
||||
"os"
|
||||
"path/filepath"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/hyperguild/ingestion/internal/session"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestAppendAndRead(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
sid := "test-session"
|
||||
|
||||
e1 := session.Entry{
|
||||
SessionID: sid,
|
||||
Timestamp: time.Now().UTC().Truncate(time.Second),
|
||||
Skill: "tdd",
|
||||
Phase: "red",
|
||||
FinalStatus: "ok",
|
||||
}
|
||||
e2 := session.Entry{
|
||||
SessionID: sid,
|
||||
Timestamp: time.Now().UTC().Truncate(time.Second),
|
||||
Skill: "tdd",
|
||||
Phase: "green",
|
||||
FinalStatus: "ok",
|
||||
}
|
||||
|
||||
require.NoError(t, session.Append(dir, sid, e1))
|
||||
require.NoError(t, session.Append(dir, sid, e2))
|
||||
|
||||
got, err := session.Read(dir, sid)
|
||||
require.NoError(t, err)
|
||||
require.Len(t, got, 2)
|
||||
assert.Equal(t, "red", got[0].Phase)
|
||||
assert.Equal(t, "green", got[1].Phase)
|
||||
|
||||
_, statErr := os.Stat(filepath.Join(dir, sid+".jsonl"))
|
||||
require.NoError(t, statErr, "session file should exist on disk")
|
||||
}
|
||||
|
||||
func TestReadMissingReturnsEmpty(t *testing.T) {
|
||||
got, err := session.Read(t.TempDir(), "nope")
|
||||
require.NoError(t, err)
|
||||
assert.Empty(t, got)
|
||||
}
|
||||
84
internal/auth/jwt.go
Normal file
@@ -0,0 +1,84 @@
|
||||
package auth
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"time"
|
||||
|
||||
"github.com/lestrrat-go/jwx/v2/jwk"
|
||||
"github.com/lestrrat-go/jwx/v2/jwt"
|
||||
)
|
||||
|
||||
// Validator validates Bearer JWTs issued by a Dex (OIDC) authorization server.
|
||||
// Audience is optional; leave empty to skip audience validation.
|
||||
type Validator struct {
|
||||
issuer string
|
||||
audience string
|
||||
jwksURI string
|
||||
cache *jwk.Cache
|
||||
}
|
||||
|
||||
// NewValidator fetches the OIDC discovery document from issuerURL, extracts
|
||||
// jwks_uri, seeds the JWKS cache, and returns a ready Validator.
|
||||
// If DEX_ISSUER_URL is not set, the caller should skip construction entirely:
// an empty issuerURL makes the discovery fetch fail.
|
||||
func NewValidator(issuerURL, audience string) (*Validator, error) {
|
||||
resp, err := http.Get(issuerURL + "/.well-known/openid-configuration") //nolint:noctx
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("fetch oidc discovery: %w", err)
|
||||
}
|
||||
defer resp.Body.Close() //nolint:errcheck
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
return nil, fmt.Errorf("oidc discovery: status %d", resp.StatusCode)
|
||||
}
|
||||
|
||||
var doc struct {
|
||||
JWKSURI string `json:"jwks_uri"`
|
||||
}
|
||||
if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
|
||||
return nil, fmt.Errorf("decode oidc discovery: %w", err)
|
||||
}
|
||||
if doc.JWKSURI == "" {
|
||||
return nil, fmt.Errorf("oidc discovery: empty jwks_uri")
|
||||
}
|
||||
|
||||
ctx := context.Background()
|
||||
cache := jwk.NewCache(ctx)
|
||||
if err := cache.Register(doc.JWKSURI, jwk.WithMinRefreshInterval(time.Hour)); err != nil {
|
||||
return nil, fmt.Errorf("register jwks cache: %w", err)
|
||||
}
|
||||
if _, err := cache.Refresh(ctx, doc.JWKSURI); err != nil {
|
||||
return nil, fmt.Errorf("initial jwks fetch: %w", err)
|
||||
}
|
||||
|
||||
return &Validator{
|
||||
issuer: issuerURL,
|
||||
audience: audience,
|
||||
jwksURI: doc.JWKSURI,
|
||||
cache: cache,
|
||||
}, nil
|
||||
}
|
||||
|
||||
// Validate parses and validates rawToken. Returns the subject claim on success.
|
||||
func (v *Validator) Validate(ctx context.Context, rawToken string) (string, error) {
|
||||
keySet, err := v.cache.Get(ctx, v.jwksURI)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("get jwks: %w", err)
|
||||
}
|
||||
|
||||
opts := []jwt.ParseOption{
|
||||
jwt.WithKeySet(keySet),
|
||||
jwt.WithValidate(true),
|
||||
jwt.WithIssuer(v.issuer),
|
||||
}
|
||||
if v.audience != "" {
|
||||
opts = append(opts, jwt.WithAudience(v.audience))
|
||||
}
|
||||
|
||||
tok, err := jwt.ParseString(rawToken, opts...)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("validate jwt: %w", err)
|
||||
}
|
||||
return tok.Subject(), nil
|
||||
}
|
||||
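`Validate` takes a raw token and returns the subject, so something upstream has to pull the token out of the `Authorization` header. How the project wires this in is not shown in this excerpt. The following is a minimal sketch of one way to do it, assuming a hypothetical `requireBearer` middleware and a `validateFunc` type with the same shape as `(*Validator).Validate`; `fakeValidate` stands in for a real validator so the example runs without an OIDC server.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// validateFunc has the same shape as (*Validator).Validate in jwt.go.
type validateFunc func(ctx context.Context, rawToken string) (subject string, err error)

// requireBearer is a hypothetical middleware: it rejects requests without a
// valid Bearer token and passes everything else to next.
func requireBearer(validate validateFunc, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		if !strings.HasPrefix(header, "Bearer ") {
			w.Header().Set("WWW-Authenticate", "Bearer")
			http.Error(w, "missing bearer token", http.StatusUnauthorized)
			return
		}
		if _, err := validate(r.Context(), strings.TrimPrefix(header, "Bearer ")); err != nil {
			w.Header().Set("WWW-Authenticate", `Bearer error="invalid_token"`)
			http.Error(w, "invalid token", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// fakeValidate accepts exactly one token; a real setup would call Validate.
func fakeValidate(_ context.Context, rawToken string) (string, error) {
	if rawToken == "good" {
		return "test-user", nil
	}
	return "", errors.New("validate jwt: bad token")
}

// statusFor runs one request through the middleware and returns the status code.
func statusFor(authHeader string) int {
	h := requireBearer(fakeValidate, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rr := httptest.NewRecorder()
	h.ServeHTTP(rr, req)
	return rr.Code
}

func main() {
	fmt.Println(statusFor(""), statusFor("Bearer bad"), statusFor("Bearer good")) // 401 401 200
}
```

Taking a function type instead of `*Validator` keeps the middleware testable without generating keys or starting a discovery endpoint.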
169
internal/auth/jwt_test.go
Normal file
@@ -0,0 +1,169 @@
|
||||
package auth_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/rand"
|
||||
"crypto/rsa"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/lestrrat-go/jwx/v2/jwa"
|
||||
"github.com/lestrrat-go/jwx/v2/jwk"
|
||||
"github.com/lestrrat-go/jwx/v2/jwt"
|
||||
"github.com/mathiasbq/supervisor/internal/auth"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
type testKeys struct {
|
||||
priv jwk.Key
|
||||
pub jwk.Key
|
||||
}
|
||||
|
||||
func generateRSAKeys(t *testing.T) testKeys {
|
||||
t.Helper()
|
||||
raw, err := rsa.GenerateKey(rand.Reader, 2048)
|
||||
require.NoError(t, err)
|
||||
|
||||
priv, err := jwk.FromRaw(raw)
|
||||
require.NoError(t, err)
|
||||
require.NoError(t, priv.Set(jwk.KeyIDKey, "test-kid"))
|
||||
require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
|
||||
|
||||
pub, err := jwk.PublicKeyOf(priv)
|
||||
require.NoError(t, err)
|
||||
|
||||
return testKeys{priv: priv, pub: pub}
|
||||
}
|
||||
|
||||
func mockOIDCServer(t *testing.T, keys testKeys) *httptest.Server {
|
||||
t.Helper()
|
||||
set := jwk.NewSet()
|
||||
require.NoError(t, set.AddKey(keys.pub))
|
||||
jwksBytes, err := json.Marshal(set)
|
||||
require.NoError(t, err)
|
||||
|
||||
mux := http.NewServeMux()
|
||||
var srv *httptest.Server
|
||||
mux.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_ = json.NewEncoder(w).Encode(map[string]string{
|
||||
"issuer": srv.URL,
|
||||
"jwks_uri": srv.URL + "/jwks",
|
||||
})
|
||||
})
|
||||
mux.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write(jwksBytes)
|
||||
})
|
||||
srv = httptest.NewServer(mux)
|
||||
t.Cleanup(srv.Close)
|
||||
return srv
|
||||
}
|
||||
|
||||
func signToken(t *testing.T, keys testKeys, issuer, audience, subject string, exp time.Time) string {
|
||||
t.Helper()
|
||||
b := jwt.NewBuilder().
|
||||
Issuer(issuer).
|
||||
Subject(subject).
|
||||
Expiration(exp)
|
||||
if audience != "" {
|
||||
b = b.Audience([]string{audience})
|
||||
}
|
||||
tok, err := b.Build()
|
||||
require.NoError(t, err)
|
||||
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
|
||||
require.NoError(t, err)
|
||||
return string(signed)
|
||||
}
|
||||
|
||||
func TestValidator(t *testing.T) {
|
||||
keys := generateRSAKeys(t)
|
||||
srv := mockOIDCServer(t, keys)
|
||||
ctx := context.Background()
|
||||
|
||||
v, err := auth.NewValidator(srv.URL, "brain")
|
||||
require.NoError(t, err)
|
||||
|
||||
tests := []struct {
|
||||
name string
|
||||
token string
|
||||
wantSub string
|
||||
wantErr bool
|
||||
}{
|
||||
{
|
||||
name: "valid jwt",
|
||||
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)),
|
||||
wantSub: "test-user",
|
||||
},
|
||||
{
|
||||
name: "expired jwt",
|
||||
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(-time.Hour)),
|
||||
wantErr: true,
|
||||
},
|
||||
{
|
||||
name: "wrong issuer",
|
||||
token: signToken(t, keys, "https://evil.example.com", "brain", "test-user", time.Now().Add(time.Hour)),
|
||||
wantErr: true,
|
||||
},
|
||||
{
|
||||
name: "wrong audience",
|
||||
token: signToken(t, keys, srv.URL, "other-service", "test-user", time.Now().Add(time.Hour)),
|
||||
wantErr: true,
|
||||
},
|
||||
{
|
||||
name: "tampered token",
|
||||
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)) + "tampered",
|
||||
wantErr: true,
|
||||
},
|
||||
{
|
||||
name: "not a jwt",
|
||||
token: "not-a-jwt",
|
||||
wantErr: true,
|
||||
},
|
||||
}
|
||||
|
||||
for _, tc := range tests {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
sub, err := v.Validate(ctx, tc.token)
|
||||
if tc.wantErr {
|
||||
assert.Error(t, err)
|
||||
assert.Empty(t, sub)
|
||||
} else {
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, tc.wantSub, sub)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewValidator_NoAudience(t *testing.T) {
|
||||
keys := generateRSAKeys(t)
|
||||
srv := mockOIDCServer(t, keys)
|
||||
ctx := context.Background()
|
||||
|
||||
v, err := auth.NewValidator(srv.URL, "")
|
||||
require.NoError(t, err)
|
||||
|
||||
// Token without audience passes when audience validation is disabled.
|
||||
tok, err := jwt.NewBuilder().
|
||||
Issuer(srv.URL).
|
||||
Subject("sub").
|
||||
Expiration(time.Now().Add(time.Hour)).
|
||||
Build()
|
||||
require.NoError(t, err)
|
||||
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
|
||||
require.NoError(t, err)
|
||||
|
||||
sub, err := v.Validate(ctx, string(signed))
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "sub", sub)
|
||||
}
|
||||
|
||||
func TestNewValidator_BadDiscoveryURL(t *testing.T) {
|
||||
_, err := auth.NewValidator("http://127.0.0.1:1", "brain")
|
||||
assert.Error(t, err)
|
||||
}
|
||||
23
internal/auth/protected_resource.go
Normal file
@@ -0,0 +1,23 @@
|
||||
package auth
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
)
|
||||
|
||||
// ProtectedResourceHandler returns an RFC 9728 oauth-protected-resource metadata
|
||||
// handler. Mount at GET /.well-known/oauth-protected-resource (no auth required).
|
||||
func ProtectedResourceHandler(resourceURL, issuerURL string) http.HandlerFunc {
|
||||
type metadata struct {
|
||||
Resource string `json:"resource"`
|
||||
AuthorizationServers []string `json:"authorization_servers"`
|
||||
}
|
||||
body, _ := json.Marshal(metadata{
|
||||
Resource: resourceURL,
|
||||
AuthorizationServers: []string{issuerURL},
|
||||
})
|
||||
return func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write(body)
|
||||
}
|
||||
}
|
||||
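To see the document a client receives, the handler can be mounted on a mux at the well-known path and queried in-process. This is a sketch only: `metadataHandler` re-creates `ProtectedResourceHandler` so the example is self-contained, `fetchMetadata` is a hypothetical helper, and the `example.com` URLs are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// metadataHandler mirrors ProtectedResourceHandler from protected_resource.go.
func metadataHandler(resourceURL, issuerURL string) http.HandlerFunc {
	body, _ := json.Marshal(struct {
		Resource             string   `json:"resource"`
		AuthorizationServers []string `json:"authorization_servers"`
	}{resourceURL, []string{issuerURL}})
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write(body)
	}
}

// fetchMetadata mounts the handler at the well-known path and returns the
// response body for a GET request, without opening a network listener.
func fetchMetadata(resourceURL, issuerURL string) string {
	mux := http.NewServeMux()
	mux.Handle("/.well-known/oauth-protected-resource", metadataHandler(resourceURL, issuerURL))

	rr := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-protected-resource", nil)
	mux.ServeHTTP(rr, req)
	return rr.Body.String()
}

func main() {
	fmt.Println(fetchMetadata("https://brain.example.com", "https://auth.example.com"))
	// {"resource":"https://brain.example.com","authorization_servers":["https://auth.example.com"]}
}
```

Because the body is marshalled once when the handler is built, every request serves the same bytes with no per-request encoding work.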
28
internal/auth/protected_resource_test.go
Normal file
@@ -0,0 +1,28 @@
|
||||
package auth_test
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/auth"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestProtectedResourceHandler(t *testing.T) {
|
||||
h := auth.ProtectedResourceHandler("https://brain-mcp.d-ma.be", "https://auth.d-ma.be")
|
||||
req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-protected-resource", nil)
|
||||
rr := httptest.NewRecorder()
|
||||
h(rr, req)
|
||||
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
assert.Equal(t, "application/json", rr.Header().Get("Content-Type"))
|
||||
|
||||
var body map[string]any
|
||||
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
|
||||
assert.Equal(t, "https://brain-mcp.d-ma.be", body["resource"])
|
||||
servers := body["authorization_servers"].([]any)
|
||||
assert.Equal(t, "https://auth.d-ma.be", servers[0])
|
||||
}
|
||||
@@ -13,6 +13,7 @@ type Config struct {
|
||||
KBRetrievalURL string // KB_RETRIEVAL_URL — base URL for brain_search
|
||||
SessionsDir string // SUPERVISOR_SESSIONS_DIR, default ./brain/sessions
|
||||
BrainDir string // SUPERVISOR_BRAIN_DIR, default ./brain
|
||||
MCPAuthToken string // SUPERVISOR_MCP_TOKEN — optional bearer token for MCP HTTP; empty disables auth
|
||||
}
|
||||
|
||||
func Load() (Config, error) {
|
||||
@@ -28,6 +29,7 @@ func Load() (Config, error) {
|
||||
cfg.KBRetrievalURL = envOr("KB_RETRIEVAL_URL", "")
|
||||
cfg.SessionsDir = envOr("SUPERVISOR_SESSIONS_DIR", "./brain/sessions")
|
||||
cfg.BrainDir = envOr("SUPERVISOR_BRAIN_DIR", "./brain")
|
||||
cfg.MCPAuthToken = os.Getenv("SUPERVISOR_MCP_TOKEN")
|
||||
return cfg, nil
|
||||
}
|
||||
|
||||
|
||||
@@ -16,6 +16,7 @@ func TestLoadDefaults(t *testing.T) {
|
||||
t.Setenv("INGEST_BASE_URL", "")
|
||||
t.Setenv("SUPERVISOR_SESSIONS_DIR", "")
|
||||
t.Setenv("SUPERVISOR_BRAIN_DIR", "")
|
||||
t.Setenv("SUPERVISOR_MCP_TOKEN", "")
|
||||
|
||||
cfg, err := config.Load()
|
||||
require.NoError(t, err)
|
||||
@@ -25,6 +26,7 @@ func TestLoadDefaults(t *testing.T) {
|
||||
assert.Equal(t, "http://localhost:3300", cfg.IngestBaseURL)
|
||||
assert.Equal(t, "./brain/sessions", cfg.SessionsDir)
|
||||
assert.Equal(t, "./brain", cfg.BrainDir)
|
||||
assert.Equal(t, "", cfg.MCPAuthToken)
|
||||
}
|
||||
|
||||
func TestLoadFromEnv(t *testing.T) {
|
||||
@@ -32,6 +34,7 @@ func TestLoadFromEnv(t *testing.T) {
|
||||
t.Setenv("LITELLM_BASE_URL", "http://localhost:4000")
|
||||
t.Setenv("LITELLM_API_KEY", "test-key")
|
||||
t.Setenv("SUPERVISOR_CONFIG_DIR", "/etc/supervisor")
|
||||
t.Setenv("SUPERVISOR_MCP_TOKEN", "secret-token")
|
||||
|
||||
cfg, err := config.Load()
|
||||
require.NoError(t, err)
|
||||
@@ -39,4 +42,5 @@ func TestLoadFromEnv(t *testing.T) {
|
||||
assert.Equal(t, "http://localhost:4000", cfg.LiteLLMBaseURL)
|
||||
assert.Equal(t, "test-key", cfg.LiteLLMAPIKey)
|
||||
assert.Equal(t, "/etc/supervisor", cfg.ConfigDir)
|
||||
assert.Equal(t, "secret-token", cfg.MCPAuthToken)
|
||||
}
|
||||
|
||||
101
internal/config/routing.go
Normal file
@@ -0,0 +1,101 @@
|
||||
package config
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"os"
|
||||
"strconv"
|
||||
)
|
||||
|
||||
// RoutingConfig holds the runtime configuration for the routing pod.
|
||||
// Separate from Config because the routing pod's surface differs from the supervisor's.
|
||||
type RoutingConfig struct {
|
||||
Port string // ROUTING_PORT, default 3210
|
||||
MCPAuthToken string // ROUTING_MCP_TOKEN, optional bearer token
|
||||
LiteLLMBaseURL string // LITELLM_BASE_URL, default http://piguard:4000
|
||||
LiteLLMAPIKey string // LITELLM_API_KEY
|
||||
BrainURL string // BRAIN_URL, default http://ingestion.supervisor:3300
|
||||
FastModel string // HYPERGUILD_FAST_MODEL, default koala/qwen35-9b-fast
|
||||
ThinkingModel string // HYPERGUILD_THINKING_MODEL, default iguana/gemma4-26b
|
||||
// RouteLocalFloor and RouteLocalCeil intentionally invert the usual
|
||||
// floor < ceil mathematical convention: Floor (default 0.90) is the
|
||||
// UPPER boundary — at/above it, always route local; Ceil (default 0.70)
|
||||
// is the LOWER boundary — below it, always route Claude. The band in
|
||||
// between is the 50/50 sample zone. The naming follows the spec's policy
|
||||
// vocabulary; see internal/routing/policy.go for the consumer.
|
||||
RouteLocalFloor float64 // HYPERGUILD_ROUTE_LOCAL_FLOOR, default 0.90
|
||||
RouteLocalCeil float64 // HYPERGUILD_ROUTE_LOCAL_CEIL, default 0.70
|
||||
PassRateTTLSeconds int // HYPERGUILD_PASS_RATE_TTL_SECONDS, default 60
|
||||
|
||||
// project_create configuration. Empty GiteaMCPURL disables the
|
||||
// project_create tool registration so the routing pod still starts
|
||||
// in environments where it's not wired up.
|
||||
GiteaMCPURL string // GITEA_MCP_URL, e.g. http://koala:30340/mcp
|
||||
GiteaMCPToken string // GITEA_MCP_TOKEN, bearer for gitea-mcp
|
||||
GiteaOwner string // GITEA_OWNER, default mathias
|
||||
GitHubOwner string // GITHUB_OWNER, default mathiasb
|
||||
InfraRepo string // INFRA_REPO, default infra
|
||||
GitHubPAT string // GITHUB_PAT, repo scope; never logged
|
||||
}
|
||||
|
||||
func LoadRouting() (RoutingConfig, error) {
|
||||
cfg := RoutingConfig{
|
||||
Port: envOr("ROUTING_PORT", "3210"),
|
||||
MCPAuthToken: os.Getenv("ROUTING_MCP_TOKEN"),
|
||||
LiteLLMBaseURL: envOr("LITELLM_BASE_URL", "http://piguard:4000"),
|
||||
LiteLLMAPIKey: os.Getenv("LITELLM_API_KEY"),
|
||||
BrainURL: envOr("BRAIN_URL", "http://ingestion.supervisor:3300"),
|
||||
FastModel: envOr("HYPERGUILD_FAST_MODEL", "koala/qwen35-9b-fast"),
|
||||
ThinkingModel: envOr("HYPERGUILD_THINKING_MODEL", "iguana/gemma4-26b"),
|
||||
}
|
||||
|
||||
floor, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_FLOOR", 0.90)
|
||||
if err != nil {
|
||||
return RoutingConfig{}, err
|
||||
}
|
||||
cfg.RouteLocalFloor = floor
|
||||
|
||||
ceil, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_CEIL", 0.70)
|
||||
if err != nil {
|
||||
return RoutingConfig{}, err
|
||||
}
|
||||
cfg.RouteLocalCeil = ceil
|
||||
|
||||
ttl, err := parseIntEnv("HYPERGUILD_PASS_RATE_TTL_SECONDS", 60)
|
||||
if err != nil {
|
||||
return RoutingConfig{}, err
|
||||
}
|
||||
cfg.PassRateTTLSeconds = ttl
|
||||
|
||||
cfg.GiteaMCPURL = os.Getenv("GITEA_MCP_URL")
|
||||
cfg.GiteaMCPToken = os.Getenv("GITEA_MCP_TOKEN")
|
||||
cfg.GiteaOwner = envOr("GITEA_OWNER", "mathias")
|
||||
cfg.GitHubOwner = envOr("GITHUB_OWNER", "mathiasb")
|
||||
cfg.InfraRepo = envOr("INFRA_REPO", "infra")
|
||||
cfg.GitHubPAT = os.Getenv("GITHUB_PAT")
|
||||
|
||||
return cfg, nil
|
||||
}
|
||||
|
||||
func parseFloatEnv(key string, def float64) (float64, error) {
|
||||
v := os.Getenv(key)
|
||||
if v == "" {
|
||||
return def, nil
|
||||
}
|
||||
f, err := strconv.ParseFloat(v, 64)
|
||||
if err != nil {
|
||||
return 0, fmt.Errorf("config: %s: %w", key, err)
|
||||
}
|
||||
return f, nil
|
||||
}
|
||||
|
||||
func parseIntEnv(key string, def int) (int, error) {
|
||||
v := os.Getenv(key)
|
||||
if v == "" {
|
||||
return def, nil
|
||||
}
|
||||
n, err := strconv.Atoi(v)
|
||||
if err != nil {
|
||||
return 0, fmt.Errorf("config: %s: %w", key, err)
|
||||
}
|
||||
return n, nil
|
||||
}
|
||||
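The comment on `RouteLocalFloor` and `RouteLocalCeil` describes three zones: at or above the floor, always local; below the ceil, always Claude; in between, a 50/50 sample. A minimal sketch of how a consumer might apply those thresholds follows. The function `decide`, the target names, and the use of the hash's low bit to split the sample band are assumptions for illustration; the real consumer lives in `internal/routing/policy.go`, which is not shown in this section.

```go
package main

import "fmt"

// Hypothetical target labels; the real policy may use different names.
const (
	targetLocal  = "local"
	targetClaude = "claude"
)

// decide is a hypothetical consumer of the two thresholds.
// floor is the UPPER boundary (default 0.90), ceil the LOWER one (default 0.70).
func decide(passRate, floor, ceil float64, hash uint64) string {
	switch {
	case passRate >= floor:
		// At or above the floor: always route local.
		return targetLocal
	case passRate < ceil:
		// Below the ceil: always route Claude.
		return targetClaude
	default:
		// Sample band: assume the low bit of a deterministic hash picks the
		// side, so identical inputs always land on the same target.
		if hash%2 == 0 {
			return targetLocal
		}
		return targetClaude
	}
}

func main() {
	for _, rate := range []float64{0.95, 0.90, 0.80, 0.70, 0.50} {
		fmt.Printf("pass rate %.2f, even hash: %s, odd hash: %s\n",
			rate, decide(rate, 0.90, 0.70, 2), decide(rate, 0.90, 0.70, 3))
	}
}
```

Note that a pass rate exactly equal to the ceil falls inside the sample band under this reading, since only values strictly below it are forced to Claude.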
73
internal/config/routing_test.go
Normal file
@@ -0,0 +1,73 @@
|
||||
package config_test
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/config"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestLoadRoutingDefaults(t *testing.T) {
|
||||
for _, k := range []string{
|
||||
"ROUTING_PORT", "ROUTING_MCP_TOKEN", "LITELLM_BASE_URL", "LITELLM_API_KEY",
|
||||
"BRAIN_URL", "HYPERGUILD_FAST_MODEL", "HYPERGUILD_THINKING_MODEL",
|
||||
"HYPERGUILD_ROUTE_LOCAL_FLOOR", "HYPERGUILD_ROUTE_LOCAL_CEIL",
|
||||
"HYPERGUILD_PASS_RATE_TTL_SECONDS",
|
||||
} {
|
||||
t.Setenv(k, "")
|
||||
}
|
||||
|
||||
cfg, err := config.LoadRouting()
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "3210", cfg.Port)
|
||||
assert.Equal(t, "", cfg.MCPAuthToken)
|
||||
assert.Equal(t, "http://piguard:4000", cfg.LiteLLMBaseURL)
|
||||
assert.Equal(t, "http://ingestion.supervisor:3300", cfg.BrainURL)
|
||||
assert.Equal(t, "koala/qwen35-9b-fast", cfg.FastModel)
|
||||
assert.Equal(t, "iguana/gemma4-26b", cfg.ThinkingModel)
|
||||
assert.InDelta(t, 0.90, cfg.RouteLocalFloor, 1e-9)
|
||||
assert.InDelta(t, 0.70, cfg.RouteLocalCeil, 1e-9)
|
||||
assert.Equal(t, 60, cfg.PassRateTTLSeconds)
|
||||
assert.Equal(t, "", cfg.LiteLLMAPIKey)
|
||||
}
|
||||
|
||||
func TestLoadRoutingFromEnv(t *testing.T) {
|
||||
t.Setenv("ROUTING_PORT", "3250")
|
||||
t.Setenv("ROUTING_MCP_TOKEN", "tok-xyz")
|
||||
t.Setenv("LITELLM_BASE_URL", "http://localhost:4000")
|
||||
t.Setenv("LITELLM_API_KEY", "lk")
|
||||
t.Setenv("BRAIN_URL", "http://localhost:3300")
|
||||
t.Setenv("HYPERGUILD_FAST_MODEL", "koala/phi4-14b")
|
||||
t.Setenv("HYPERGUILD_THINKING_MODEL", "iguana/qwen3-14b-think")
|
||||
t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "0.85")
|
||||
t.Setenv("HYPERGUILD_ROUTE_LOCAL_CEIL", "0.65")
|
||||
t.Setenv("HYPERGUILD_PASS_RATE_TTL_SECONDS", "30")
|
||||
|
||||
cfg, err := config.LoadRouting()
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "3250", cfg.Port)
|
||||
assert.Equal(t, "tok-xyz", cfg.MCPAuthToken)
|
||||
assert.Equal(t, "http://localhost:4000", cfg.LiteLLMBaseURL)
|
||||
assert.Equal(t, "lk", cfg.LiteLLMAPIKey)
|
||||
assert.Equal(t, "http://localhost:3300", cfg.BrainURL)
|
||||
assert.Equal(t, "koala/phi4-14b", cfg.FastModel)
|
||||
assert.Equal(t, "iguana/qwen3-14b-think", cfg.ThinkingModel)
|
||||
assert.InDelta(t, 0.85, cfg.RouteLocalFloor, 1e-9)
|
||||
assert.InDelta(t, 0.65, cfg.RouteLocalCeil, 1e-9)
|
||||
assert.Equal(t, 30, cfg.PassRateTTLSeconds)
|
||||
}
|
||||
|
||||
func TestLoadRoutingRejectsBadFloat(t *testing.T) {
|
||||
t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "not-a-number")
|
||||
_, err := config.LoadRouting()
|
||||
require.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "HYPERGUILD_ROUTE_LOCAL_FLOOR")
|
||||
}
|
||||
|
||||
func TestLoadRoutingRejectsBadInt(t *testing.T) {
|
||||
t.Setenv("HYPERGUILD_PASS_RATE_TTL_SECONDS", "not-a-number")
|
||||
_, err := config.LoadRouting()
|
||||
require.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "HYPERGUILD_PASS_RATE_TTL_SECONDS")
|
||||
}
|
||||
108
internal/githubclient/client.go
Normal file
@@ -0,0 +1,108 @@
|
||||
// Package githubclient is a minimal GitHub REST API client. The hyperguild
|
||||
// project_create flow is gitea-first; this client exists only to create an
|
||||
// empty repo on GitHub before the gitea→github push-mirror is configured,
|
||||
// since the mirror cannot push to a non-existent remote.
|
||||
package githubclient
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"time"
|
||||
)
|
||||
|
||||
const defaultAPI = "https://api.github.com"
|
||||
|
||||
type Client struct {
|
||||
api string
|
||||
token string
|
||||
http *http.Client
|
||||
}
|
||||
|
||||
// New returns a Client with the given personal access token (repo scope).
|
||||
func New(token string) *Client {
|
||||
return &Client{
|
||||
api: defaultAPI,
|
||||
token: token,
|
||||
http: &http.Client{Timeout: 30 * time.Second},
|
||||
}
|
||||
}
|
||||
|
||||
// WithBaseURL overrides the API base (test injection).
|
||||
func (c *Client) WithBaseURL(u string) *Client {
|
||||
c.api = u
|
||||
return c
|
||||
}
|
||||
|
||||
// Repo is the subset of GitHub's repo response we surface upstream.
|
||||
type Repo struct {
|
||||
FullName string `json:"full_name"`
|
||||
HTMLURL string `json:"html_url"`
|
||||
CloneURL string `json:"clone_url"`
|
||||
Private bool `json:"private"`
|
||||
}
|
||||
|
||||
type createRepoArgs struct {
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description,omitempty"`
|
||||
Private bool `json:"private"`
|
||||
AutoInit bool `json:"auto_init"`
|
||||
}
|
||||
|
||||
// ErrAlreadyExists is returned by CreateRepo when GitHub responds 422 with
|
||||
// "name already exists". Callers treat it as idempotent success.
|
||||
var ErrAlreadyExists = fmt.Errorf("github repo already exists")
|
||||
|
||||
// CreateRepo creates a repo under the authenticated user's account.
|
||||
// auto_init is always false — the push-mirror will populate the repo from
|
||||
// gitea, so an auto-generated README would conflict on first push.
|
||||
func (c *Client) CreateRepo(ctx context.Context, name, description string, private bool) (*Repo, error) {
|
||||
if c.token == "" {
|
||||
return nil, fmt.Errorf("github pat not configured")
|
||||
}
|
||||
body, _ := json.Marshal(createRepoArgs{
|
||||
Name: name,
|
||||
Description: description,
|
||||
Private: private,
|
||||
AutoInit: false,
|
||||
})
|
||||
req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.api+"/user/repos", bytes.NewReader(body))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("new request: %w", err)
|
||||
}
|
||||
req.Header.Set("Authorization", "token "+c.token)
|
||||
req.Header.Set("Accept", "application/vnd.github+json")
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
req.Header.Set("X-GitHub-Api-Version", "2022-11-28")
|
||||
|
||||
resp, err := c.http.Do(req)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("http: %w", err)
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
|
||||
raw, _ := io.ReadAll(resp.Body)
|
||||
switch resp.StatusCode {
|
||||
case http.StatusCreated:
|
||||
var r Repo
|
||||
if err := json.Unmarshal(raw, &r); err != nil {
|
||||
return nil, fmt.Errorf("decode response: %w", err)
|
||||
}
|
||||
return &r, nil
|
||||
case http.StatusUnprocessableEntity:
|
||||
// 422 covers "name already exists" + a handful of other validation
|
||||
// errors. Treat any 422 that mentions "already exists" as idempotent
|
||||
// success; everything else surfaces verbatim.
|
||||
if bytes.Contains(raw, []byte("already exists")) {
|
||||
return nil, ErrAlreadyExists
|
||||
}
|
||||
return nil, fmt.Errorf("github 422: %s", string(raw))
|
||||
case http.StatusUnauthorized, http.StatusForbidden:
|
||||
return nil, fmt.Errorf("github auth %d: PAT missing repo scope or invalid", resp.StatusCode)
|
||||
default:
|
||||
return nil, fmt.Errorf("github %d: %s", resp.StatusCode, string(raw))
|
||||
}
|
||||
}
|
||||
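`CreateRepo` documents `ErrAlreadyExists` as something callers treat as idempotent success. The sketch below shows one way a caller could do that with `errors.Is`. To stay self-contained it uses a local sentinel `errAlreadyExists` and an injected `create` function in place of the real client; `ensureRepo` is a hypothetical helper, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for githubclient.ErrAlreadyExists.
var errAlreadyExists = errors.New("github repo already exists")

// ensureRepo is a hypothetical wrapper: it reports whether the repo was newly
// created and swallows the "already exists" sentinel as success.
func ensureRepo(create func(name string) error, name string) (created bool, err error) {
	err = create(name)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, errAlreadyExists):
		// Idempotent success: the repo is there, which is all the mirror needs.
		return false, nil
	default:
		return false, fmt.Errorf("ensure repo %q: %w", name, err)
	}
}

func main() {
	// In-memory fake of the remote so the sketch runs without network access.
	existing := map[string]bool{"infra": true}
	create := func(name string) error {
		if existing[name] {
			return errAlreadyExists
		}
		existing[name] = true
		return nil
	}

	for _, name := range []string{"infra", "new-project", "new-project"} {
		created, err := ensureRepo(create, name)
		fmt.Printf("%s: created=%v err=%v\n", name, created, err)
	}
}
```

Matching with `errors.Is` rather than `==` keeps the check working if an intermediate layer later wraps the sentinel with `%w`.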
71
internal/githubclient/client_test.go
Normal file
@@ -0,0 +1,71 @@
|
||||
package githubclient_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/githubclient"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestCreateRepo_Success(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, http.MethodPost, r.Method)
|
||||
assert.Equal(t, "/user/repos", r.URL.Path)
|
||||
assert.Equal(t, "token ghp_test", r.Header.Get("Authorization"))
|
||||
var args map[string]any
|
||||
b, _ := io.ReadAll(r.Body)
|
||||
_ = json.Unmarshal(b, &args)
|
||||
assert.Equal(t, "test-repo", args["name"])
|
||||
assert.Equal(t, true, args["private"])
|
||||
assert.Equal(t, false, args["auto_init"])
|
||||
w.WriteHeader(http.StatusCreated)
|
||||
_, _ = w.Write([]byte(`{"full_name":"mathiasb/test-repo","html_url":"https://github.com/mathiasb/test-repo","clone_url":"https://github.com/mathiasb/test-repo.git","private":true}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := githubclient.New("ghp_test").WithBaseURL(srv.URL)
|
||||
r, err := c.CreateRepo(context.Background(), "test-repo", "desc", true)
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "mathiasb/test-repo", r.FullName)
|
||||
assert.True(t, r.Private)
|
||||
}
|
||||
|
||||
func TestCreateRepo_AlreadyExists(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.WriteHeader(http.StatusUnprocessableEntity)
|
||||
_, _ = w.Write([]byte(`{"message":"Validation Failed","errors":[{"resource":"Repository","code":"custom","field":"name","message":"name already exists on this account"}]}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := githubclient.New("ghp_test").WithBaseURL(srv.URL)
|
||||
_, err := c.CreateRepo(context.Background(), "x", "", false)
|
||||
require.Error(t, err)
|
||||
assert.True(t, errors.Is(err, githubclient.ErrAlreadyExists))
|
||||
}
|
||||
|
||||
func TestCreateRepo_Unauthorized(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.WriteHeader(http.StatusUnauthorized)
|
||||
_, _ = w.Write([]byte(`{"message":"Bad credentials"}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := githubclient.New("ghp_test").WithBaseURL(srv.URL)
|
||||
_, err := c.CreateRepo(context.Background(), "x", "", false)
|
||||
require.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "PAT missing repo scope")
|
||||
}
|
||||
|
||||
func TestCreateRepo_NoToken(t *testing.T) {
|
||||
c := githubclient.New("")
|
||||
_, err := c.CreateRepo(context.Background(), "x", "", false)
|
||||
require.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "github pat not configured")
|
||||
}
|
||||
@@ -2,9 +2,13 @@ package mcp
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/subtle"
|
||||
"encoding/json"
|
||||
"log/slog"
|
||||
"net/http"
|
||||
"strings"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/auth"
|
||||
"github.com/mathiasbq/supervisor/internal/registry"
|
||||
)
|
||||
|
||||
@@ -29,20 +33,50 @@ type rpcError struct {
|
||||
|
||||
// Server is an HTTP handler implementing the MCP JSON-RPC protocol.
|
||||
type Server struct {
|
||||
- reg *registry.Registry
+ reg       *registry.Registry
+ token     string
+ validator *auth.Validator
|
||||
}
|
||||
|
||||
- func NewServer(reg *registry.Registry) *Server {
- return &Server{reg: reg}
+ // NewServer constructs an MCP HTTP handler. token is the static bearer token
+ // (empty disables static auth). validator is optional; when non-nil, a valid
+ // JWT from Dex is accepted in addition to the static token.
+ func NewServer(reg *registry.Registry, token string, validator *auth.Validator) *Server {
+ return &Server{reg: reg, token: token, validator: validator}
|
||||
}
|
||||
|
||||
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
|
||||
if !s.checkAuth(w, r) {
|
||||
return
|
||||
}
|
||||
|
||||
// GET opens the SSE stream for server-to-client events (MCP streamable HTTP).
|
||||
// claude.ai probes with GET before sending initialize, so accept without a session.
|
||||
if r.Method == http.MethodGet {
|
||||
w.Header().Set("Content-Type", "text/event-stream")
|
||||
w.Header().Set("Cache-Control", "no-cache")
|
||||
w.Header().Set("Connection", "keep-alive")
|
||||
w.Header().Set("X-Accel-Buffering", "no")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
if f, ok := w.(http.Flusher); ok {
|
||||
_, _ = w.Write([]byte(": stream open\n\n"))
|
||||
f.Flush()
|
||||
}
|
||||
<-r.Context().Done()
|
||||
return
|
||||
}
|
||||
|
||||
var req request
|
||||
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
|
||||
writeError(w, nil, -32700, "parse error")
|
||||
return
|
||||
}
|
||||
|
||||
// JSON-RPC 2.0 notifications (no id) must not receive a response.
|
||||
if req.ID == nil {
|
||||
return
|
||||
}
|
||||
|
||||
var result any
|
||||
var rpcErr *rpcError
|
||||
|
||||
@@ -88,6 +122,44 @@ func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
|
||||
})
|
||||
}
|
||||
|
||||
// checkAuth verifies the bearer token. Accepts a valid Dex JWT (when validator
|
||||
// is configured) or the static token. Returns true if the request may proceed.
|
||||
// When neither token nor validator is configured, auth is disabled (default).
|
||||
func (s *Server) checkAuth(w http.ResponseWriter, r *http.Request) bool {
|
||||
if s.token == "" && s.validator == nil {
|
||||
return true
|
||||
}
|
||||
|
||||
rawToken, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
|
||||
if !ok {
|
||||
s.rejectAuth(w, r)
|
||||
return false
|
||||
}
|
||||
|
||||
if s.validator != nil {
|
||||
if _, err := s.validator.Validate(r.Context(), rawToken); err == nil {
|
||||
return true
|
||||
}
|
||||
}
|
||||
|
||||
if s.token != "" && subtle.ConstantTimeCompare([]byte(rawToken), []byte(s.token)) == 1 {
|
||||
return true
|
||||
}
|
||||
|
||||
s.rejectAuth(w, r)
|
||||
return false
|
||||
}
|
||||
|
||||
func (s *Server) rejectAuth(w http.ResponseWriter, r *http.Request) {
|
||||
slog.Warn("mcp auth rejected", "remote", r.RemoteAddr, "method", r.Method)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusUnauthorized)
|
||||
_ = json.NewEncoder(w).Encode(response{
|
||||
JSONRPC: "2.0",
|
||||
Error: &rpcError{Code: -32001, Message: "unauthorized"},
|
||||
})
|
||||
}
|
||||
|
||||
func writeError(w http.ResponseWriter, id any, code int, msg string) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_ = json.NewEncoder(w).Encode(response{
|
||||
|
||||
@@ -5,6 +5,7 @@ import (
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/mcp"
|
||||
@@ -22,7 +23,7 @@ func jsonBody(t *testing.T, v any) *bytes.Buffer {
|
||||
|
||||
func TestMCPInitialize(t *testing.T) {
|
||||
reg := registry.New()
|
||||
- srv := mcp.NewServer(reg)
+ srv := mcp.NewServer(reg, "", nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
|
||||
"jsonrpc": "2.0",
|
||||
@@ -44,7 +45,7 @@ func TestMCPInitialize(t *testing.T) {
|
||||
|
||||
func TestMCPToolsList(t *testing.T) {
|
||||
reg := registry.New()
|
||||
- srv := mcp.NewServer(reg)
+ srv := mcp.NewServer(reg, "", nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
|
||||
"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": map[string]any{},
|
||||
@@ -62,7 +63,7 @@ func TestMCPToolsList(t *testing.T) {
|
||||
|
||||
func TestMCPUnknownMethod(t *testing.T) {
|
||||
reg := registry.New()
|
||||
- srv := mcp.NewServer(reg)
+ srv := mcp.NewServer(reg, "", nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
|
||||
"jsonrpc": "2.0", "id": 3, "method": "unknown/method", "params": map[string]any{},
|
||||
@@ -76,3 +77,82 @@ func TestMCPUnknownMethod(t *testing.T) {
|
||||
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
|
||||
assert.NotNil(t, resp["error"])
|
||||
}
|
||||
|
||||
func TestMCPNotificationKnownMethodGetsNoResponseBody(t *testing.T) {
|
||||
reg := registry.New()
|
||||
srv := mcp.NewServer(reg, "", nil)
|
||||
|
||||
// JSON-RPC 2.0 notification: "id" field absent. Per spec, server MUST NOT
|
||||
// reply. notifications/initialized is part of the standard MCP handshake.
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
|
||||
"jsonrpc": "2.0",
|
||||
"method": "notifications/initialized",
|
||||
}))
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
assert.Empty(t, strings.TrimSpace(rr.Body.String()),
|
||||
"notifications must not receive a response body")
|
||||
}
|
||||
|
||||
func TestMCPAuth(t *testing.T) {
|
||||
const token = "s3cr3t"
|
||||
|
||||
cases := []struct {
|
||||
name string
|
||||
token string
|
||||
authHeader string
|
||||
wantStatus int
|
||||
}{
|
||||
{"no token configured passes without header", "", "", http.StatusOK},
|
||||
{"correct bearer passes", token, "Bearer " + token, http.StatusOK},
|
||||
{"wrong bearer rejected", token, "Bearer wrong", http.StatusUnauthorized},
|
||||
{"missing header rejected", token, "", http.StatusUnauthorized},
|
||||
{"wrong scheme rejected", token, "Basic " + token, http.StatusUnauthorized},
|
||||
}
|
||||
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
reg := registry.New()
|
||||
srv := mcp.NewServer(reg, tc.token, nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
|
||||
"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": map[string]any{},
|
||||
}))
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
if tc.authHeader != "" {
|
||||
req.Header.Set("Authorization", tc.authHeader)
|
||||
}
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
|
||||
assert.Equal(t, tc.wantStatus, rr.Code)
|
||||
if tc.wantStatus == http.StatusUnauthorized {
|
||||
var resp map[string]any
|
||||
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
|
||||
rpcErr, ok := resp["error"].(map[string]any)
|
||||
require.True(t, ok, "expected error object in response")
|
||||
assert.Equal(t, float64(-32001), rpcErr["code"])
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCPNotificationUnknownMethodGetsNoResponseBody(t *testing.T) {
|
||||
reg := registry.New()
|
||||
srv := mcp.NewServer(reg, "", nil)
|
||||
|
||||
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
|
||||
"jsonrpc": "2.0",
|
||||
"method": "notifications/totally-unknown",
|
||||
}))
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
rr := httptest.NewRecorder()
|
||||
srv.ServeHTTP(rr, req)
|
||||
|
||||
assert.Equal(t, http.StatusOK, rr.Code)
|
||||
assert.Empty(t, strings.TrimSpace(rr.Body.String()),
|
||||
"unknown notifications must also receive no response body")
|
||||
}
|
||||
|
||||
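The static-token branch of `checkAuth`, as exercised by `TestMCPAuth`, can be sketched in isolation. The helper `bearerOK` is hypothetical and deliberately ignores the optional JWT validator; it covers only the header parsing and the constant-time comparison. `strings.CutPrefix` requires Go 1.20 or later.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"strings"
)

// bearerOK is a hypothetical helper mirroring the static-token path:
// an empty configured token disables auth, otherwise the header must be
// "Bearer <token>" and the token must match in constant time.
func bearerOK(authHeader, want string) bool {
	if want == "" {
		return true // auth disabled
	}
	got, ok := strings.CutPrefix(authHeader, "Bearer ")
	if !ok {
		return false // missing header or wrong scheme
	}
	// ConstantTimeCompare does not leak the position of the first mismatching
	// byte; it returns 0 immediately when the lengths differ.
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

func main() {
	const token = "s3cr3t"
	cases := []struct {
		name, configured, header string
	}{
		{"no token configured", "", ""},
		{"correct bearer", token, "Bearer " + token},
		{"wrong bearer", token, "Bearer wrong"},
		{"missing header", token, ""},
		{"wrong scheme", token, "Basic " + token},
	}
	for _, tc := range cases {
		fmt.Printf("%-20s allowed=%v\n", tc.name, bearerOK(tc.header, tc.configured))
	}
}
```

The constant-time comparison matters because a plain `==` on strings can return early at the first differing byte, which in principle lets a remote caller learn the token prefix from response timing.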
135
internal/mcpclient/client.go
Normal file
@@ -0,0 +1,135 @@
|
||||
// Package mcpclient is a minimal JSON-RPC over HTTP client for talking to
|
||||
// MCP servers from inside hyperguild components. It only implements
|
||||
// `tools/call` because that's all consumer skills need today.
|
||||
package mcpclient
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"time"
|
||||
)
|
||||
|
||||
// Client calls an MCP server over Streamable HTTP / JSON-RPC.
|
||||
type Client struct {
|
||||
url string
|
||||
token string
|
||||
http *http.Client
|
||||
}
|
||||
|
||||
// New returns a Client. token may be empty for unauthenticated servers.
|
||||
func New(url, token string) *Client {
|
||||
return &Client{
|
||||
url: url,
|
||||
token: token,
|
||||
http: &http.Client{Timeout: 60 * time.Second},
|
||||
}
|
||||
}
|
||||
|
||||
// WithHTTPClient overrides the underlying HTTP client (test injection).
|
||||
func (c *Client) WithHTTPClient(h *http.Client) *Client {
|
||||
c.http = h
|
||||
return c
|
||||
}
|
||||
|
||||
type rpcRequest struct {
|
||||
JSONRPC string `json:"jsonrpc"`
|
||||
ID int `json:"id"`
|
||||
Method string `json:"method"`
|
||||
Params map[string]any `json:"params"`
|
||||
}
|
||||
|
||||
type rpcError struct {
|
||||
Code int `json:"code"`
|
||||
Message string `json:"message"`
|
||||
}
|
||||
|
||||
type rpcResponse struct {
|
||||
JSONRPC string `json:"jsonrpc"`
|
||||
ID int `json:"id"`
|
||||
Result json.RawMessage `json:"result,omitempty"`
|
||||
Error *rpcError `json:"error,omitempty"`
|
||||
}
|
||||
|
||||
// Error is returned when the remote MCP server signals a typed failure.
|
||||
// Code follows JSON-RPC conventions; see gitea-mcp internal/mcp/jsonrpc.go
|
||||
// for the codes the server uses (e.g. -32002 NotFound, -32003 Conflict).
|
||||
type Error struct {
|
||||
Code int
|
||||
Message string
|
||||
}
|
||||
|
||||
func (e *Error) Error() string { return fmt.Sprintf("mcp error %d: %s", e.Code, e.Message) }
|
||||
|
||||
// CallTool issues `tools/call`. result is JSON-unmarshalled from the
|
||||
// server's content[0].text field; pass nil to discard.
|
||||
func (c *Client) CallTool(ctx context.Context, name string, args any, result any) error {
|
||||
body, err := json.Marshal(rpcRequest{
|
||||
JSONRPC: "2.0",
|
||||
ID: 1,
|
||||
Method: "tools/call",
|
||||
Params: map[string]any{
|
||||
"name": name,
|
||||
"arguments": args,
|
||||
},
|
||||
})
|
||||
if err != nil {
|
||||
return fmt.Errorf("marshal request: %w", err)
|
||||
}
|
||||
|
||||
req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.url, bytes.NewReader(body))
|
||||
if err != nil {
|
||||
return fmt.Errorf("new request: %w", err)
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
if c.token != "" {
|
||||
req.Header.Set("Authorization", "Bearer "+c.token)
|
||||
}
|
||||
|
||||
resp, err := c.http.Do(req)
|
||||
if err != nil {
|
||||
return fmt.Errorf("http: %w", err)
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
|
||||
raw, err := io.ReadAll(resp.Body)
|
||||
if err != nil {
|
||||
return fmt.Errorf("read body: %w", err)
|
||||
}
|
||||
if resp.StatusCode >= 400 {
|
||||
return fmt.Errorf("mcp http %d: %s", resp.StatusCode, string(raw))
|
||||
}
|
||||
|
||||
var rpc rpcResponse
|
||||
if err := json.Unmarshal(raw, &rpc); err != nil {
|
||||
return fmt.Errorf("decode response: %w (body=%s)", err, string(raw))
|
||||
}
|
||||
if rpc.Error != nil {
|
||||
return &Error{Code: rpc.Error.Code, Message: rpc.Error.Message}
|
||||
}
|
||||
|
||||
if result == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
// MCP success result shape: { content: [{type:"text", text:"<json>"}] }
|
||||
var wrap struct {
|
||||
Content []struct {
|
||||
Type string `json:"type"`
|
||||
Text string `json:"text"`
|
||||
} `json:"content"`
|
||||
}
|
||||
if err := json.Unmarshal(rpc.Result, &wrap); err != nil {
|
||||
return fmt.Errorf("decode wrap: %w (result=%s)", err, string(rpc.Result))
|
||||
}
|
||||
if len(wrap.Content) == 0 {
|
||||
return fmt.Errorf("empty content in tool response")
|
||||
}
|
||||
if err := json.Unmarshal([]byte(wrap.Content[0].Text), result); err != nil {
|
||||
return fmt.Errorf("decode tool result text: %w (text=%s)", err, wrap.Content[0].Text)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
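The last step of `CallTool` decodes JSON twice: once to unwrap the MCP `content` array, and once more to parse the JSON string carried in `content[0].text`. A minimal sketch of just that step follows. `decodeToolText` is a hypothetical helper extracted for illustration; like the original, it reads the first content item without checking its `type`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// decodeToolText is a hypothetical helper: it unwraps
// { "content": [ { "type": "text", "text": "<json>" } ] }
// and unmarshals the inner JSON string into out.
func decodeToolText(result []byte, out any) error {
	var wrap struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	}
	if err := json.Unmarshal(result, &wrap); err != nil {
		return fmt.Errorf("decode wrap: %w", err)
	}
	if len(wrap.Content) == 0 {
		return errors.New("empty content in tool response")
	}
	// Second pass: the text field is itself a JSON document.
	if err := json.Unmarshal([]byte(wrap.Content[0].Text), out); err != nil {
		return fmt.Errorf("decode tool result text: %w", err)
	}
	return nil
}

func main() {
	raw := []byte(`{"content":[{"type":"text","text":"{\"ok\":true,\"n\":7}"}]}`)
	var out struct {
		OK bool `json:"ok"`
		N  int  `json:"n"`
	}
	if err := decodeToolText(raw, &out); err != nil {
		panic(err)
	}
	fmt.Println(out.OK, out.N)
}
```

A server that returns a non-JSON text payload would fail at the second pass, so callers that only need to know whether the call succeeded can pass a nil result to `CallTool` and skip the unwrap entirely.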
82
internal/mcpclient/client_test.go
Normal file
@@ -0,0 +1,82 @@
|
||||
package mcpclient_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/mcpclient"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestCallTool_Success(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, http.MethodPost, r.Method)
|
||||
assert.Equal(t, "Bearer tok", r.Header.Get("Authorization"))
|
||||
b, _ := io.ReadAll(r.Body)
|
||||
var got map[string]any
|
||||
_ = json.Unmarshal(b, &got)
|
||||
assert.Equal(t, "tools/call", got["method"])
|
||||
params := got["params"].(map[string]any)
|
||||
assert.Equal(t, "x_y", params["name"])
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{\"ok\":true,\"n\":7}"}]}}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := mcpclient.New(srv.URL, "tok")
|
||||
var out struct {
|
||||
OK bool `json:"ok"`
|
||||
N int `json:"n"`
|
||||
}
|
||||
err := c.CallTool(context.Background(), "x_y", map[string]any{"a": 1}, &out)
|
||||
require.NoError(t, err)
|
||||
assert.True(t, out.OK)
|
||||
assert.Equal(t, 7, out.N)
|
||||
}
|
||||
|
||||
func TestCallTool_RPCError(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"error":{"code":-32003,"message":"already exists"}}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := mcpclient.New(srv.URL, "")
|
||||
err := c.CallTool(context.Background(), "x", nil, nil)
|
||||
require.Error(t, err)
|
||||
var me *mcpclient.Error
|
||||
require.True(t, errors.As(err, &me))
|
||||
assert.Equal(t, -32003, me.Code)
|
||||
assert.Contains(t, me.Message, "already exists")
|
||||
}
|
||||
|
||||
func TestCallTool_HTTPError(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.WriteHeader(http.StatusUnauthorized)
|
||||
_, _ = w.Write([]byte(`unauthorized`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := mcpclient.New(srv.URL, "")
|
||||
err := c.CallTool(context.Background(), "x", nil, nil)
|
||||
require.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "401")
|
||||
}
|
||||
|
||||
func TestCallTool_NilResult(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{}"}]}}`))
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := mcpclient.New(srv.URL, "")
|
||||
require.NoError(t, c.CallTool(context.Background(), "x", nil, nil))
|
||||
}
|
||||
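The tests above rely on the tool result arriving as a JSON string nested inside `result.content[].text`, so the payload is decoded twice. A minimal sketch of that decoding step follows, assuming the envelope shapes used in the test fixtures; `decodeToolResult`, `rpcError`, and `rpcResponse` are hypothetical names for illustration and are not the `mcpclient` implementation.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// rpcError mirrors the JSON-RPC 2.0 error object seen in the fixtures.
type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

func (e *rpcError) Error() string { return fmt.Sprintf("rpc error %d: %s", e.Code, e.Message) }

// rpcResponse covers only the fields the tests exercise.
type rpcResponse struct {
	Result *struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	} `json:"result"`
	Error *rpcError `json:"error"`
}

// decodeToolResult (hypothetical helper) unwraps the envelope, then decodes
// the first text item into out. A nil out skips the second decode.
func decodeToolResult(raw []byte, out any) error {
	var resp rpcResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return fmt.Errorf("decode envelope: %w", err)
	}
	if resp.Error != nil {
		return resp.Error
	}
	if out == nil || resp.Result == nil {
		return nil
	}
	for _, c := range resp.Result.Content {
		if c.Type == "text" {
			if err := json.Unmarshal([]byte(c.Text), out); err != nil {
				return fmt.Errorf("decode text payload: %w", err)
			}
			return nil
		}
	}
	return errors.New("no text content in result")
}

func main() {
	var out struct {
		OK bool `json:"ok"`
		N  int  `json:"n"`
	}
	raw := []byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{\"ok\":true,\"n\":7}"}]}}`)
	if err := decodeToolResult(raw, &out); err != nil {
		panic(err)
	}
	fmt.Println(out.OK, out.N) // prints "true 7"
}
```

Returning the `*rpcError` unwrapped keeps it reachable through `errors.As`, which is what `TestCallTool_RPCError` checks on the real client.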
21
internal/routing/hash.go
Normal file
@@ -0,0 +1,21 @@
|
||||
package routing
|
||||
|
||||
import (
|
||||
"crypto/sha256"
|
||||
"encoding/binary"
|
||||
)
|
||||
|
||||
// CanonicalHash returns a deterministic 64-bit hash of (system, user).
|
||||
// Used to make sample-band routing decisions reproducible: identical input
|
||||
// strings produce the same hash on every call, independent of process state.
|
||||
//
|
||||
// Inputs are joined with a 0x00 byte separator before hashing, which distinguishes
|
||||
// (system="ab", user="cd") from (system="abcd", user=""). Inputs that themselves contain 0x00 can still collide.
|
||||
func CanonicalHash(system, user string) uint64 {
|
||||
h := sha256.New()
|
||||
h.Write([]byte(system))
|
||||
h.Write([]byte{0})
|
||||
h.Write([]byte(user))
|
||||
sum := h.Sum(nil)
|
||||
return binary.BigEndian.Uint64(sum[:8])
|
||||
}
|
||||
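The separator scheme in `CanonicalHash` can be checked with a short sketch. `separatorHash` below reproduces the hashing shown in the diff; `lengthPrefixedHash` is a hypothetical variant, not part of this change, that prefixes each field with its byte length so that inputs containing 0x00 bytes do not produce the same byte stream.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// separatorHash mirrors the scheme in the diff: system, 0x00, user.
func separatorHash(system, user string) uint64 {
	h := sha256.New()
	h.Write([]byte(system))
	h.Write([]byte{0})
	h.Write([]byte(user))
	return binary.BigEndian.Uint64(h.Sum(nil)[:8])
}

// lengthPrefixedHash is a hypothetical alternative: every field is preceded
// by its length as a big-endian uint64, so field boundaries are unambiguous.
func lengthPrefixedHash(system, user string) uint64 {
	h := sha256.New()
	var n [8]byte
	for _, s := range []string{system, user} {
		binary.BigEndian.PutUint64(n[:], uint64(len(s)))
		h.Write(n[:])
		h.Write([]byte(s))
	}
	return binary.BigEndian.Uint64(h.Sum(nil)[:8])
}

func main() {
	// Different byte streams: "ab\x00cd" versus "abcd\x00".
	fmt.Println(separatorHash("ab", "cd") != separatorHash("abcd", ""))
	// Identical byte stream "a\x00\x00b", so the hashes are equal.
	fmt.Println(separatorHash("a\x00", "b") == separatorHash("a", "\x00b"))
	// Length prefixes make the two byte streams differ.
	fmt.Println(lengthPrefixedHash("a\x00", "b") != lengthPrefixedHash("a", "\x00b"))
}
```

Whether the 0x00 edge case matters depends on whether prompts can contain NUL bytes; for sample-band routing a rare collision only means two requests share a route.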
46
internal/routing/hash_test.go
Normal file
@@ -0,0 +1,46 @@
|
||||
package routing_test
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/routing"
|
||||
"github.com/stretchr/testify/assert"
|
||||
)
|
||||
|
||||
func TestCanonicalHashDeterministic(t *testing.T) {
|
||||
a := routing.CanonicalHash("system one", "user one")
|
||||
b := routing.CanonicalHash("system one", "user one")
|
||||
assert.Equal(t, a, b, "same inputs must produce same hash")
|
||||
}
|
||||
|
||||
func TestCanonicalHashDistinguishesInputs(t *testing.T) {
|
||||
cases := [][2]string{
|
||||
{"sys", "user"},
|
||||
{"sys", "user2"},
|
||||
{"sys2", "user"},
|
||||
{"", "system\x00user"}, // separator collision attempt
|
||||
{"system\x00user", ""},
|
||||
}
|
||||
seen := make(map[uint64]bool)
|
||||
for _, c := range cases {
|
||||
h := routing.CanonicalHash(c[0], c[1])
|
||||
assert.False(t, seen[h], "collision on %v", c)
|
||||
seen[h] = true
|
||||
}
|
||||
}
|
||||
|
||||
func TestCanonicalHashLowBitDistribution(t *testing.T) {
|
||||
// Sanity check: across 1000 distinct inputs, low-bit split is roughly even.
|
||||
zeros, ones := 0, 0
|
||||
for i := 0; i < 1000; i++ {
|
||||
h := routing.CanonicalHash("sys", string(rune('a'+(i%26)))+string(rune(i)))
|
||||
if h&1 == 0 {
|
||||
zeros++
|
||||
} else {
|
||||
ones++
|
||||
}
|
||||
}
|
||||
// Allow ±150 (15% of the 1000 samples) deviation from 500/500. Tighter would be flaky on real data.
|
||||
assert.InDelta(t, 500, zeros, 150)
|
||||
assert.InDelta(t, 500, ones, 150)
|
||||
}
|
||||
79
internal/routing/log.go
Normal file
@@ -0,0 +1,79 @@
|
||||
package routing
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"time"
|
||||
)
|
||||
|
||||
// LogEntry describes a single routing decision to log via the brain MCP.
|
||||
type LogEntry struct {
|
||||
SessionID string
|
||||
Skill string // the original skill the call routed (e.g., "review")
|
||||
Decision string // "local", "claude" (the Decision.String values), or "thinking_fallback"
|
||||
Message string // free-form, e.g. "model=qwen35, pass_rate=0.94"
|
||||
ProjectRoot string
|
||||
DurationMs int64
|
||||
Failed bool // true → final_status: "fail"; false → "skip"
|
||||
}
|
||||
|
||||
// Logger posts session_log entries to a brain MCP at BrainURL + /mcp.
|
||||
type Logger struct {
|
||||
BrainURL string
|
||||
HTTP *http.Client
|
||||
}
|
||||
|
||||
// NewLogger creates a Logger with a 2-second HTTP timeout.
|
||||
func NewLogger(brainURL string) *Logger {
|
||||
return &Logger{
|
||||
BrainURL: brainURL,
|
||||
HTTP: &http.Client{Timeout: 2 * time.Second},
|
||||
}
|
||||
}
|
||||
|
||||
// LogDecision posts a session_log MCP call. Errors are returned but the caller
|
||||
// MUST NOT block real work on them — logging is best-effort.
|
||||
func (l *Logger) LogDecision(ctx context.Context, e LogEntry) error {
|
||||
status := "skip"
|
||||
if e.Failed {
|
||||
status = "fail"
|
||||
}
|
||||
payload := map[string]any{
|
||||
"jsonrpc": "2.0",
|
||||
"id": 1,
|
||||
"method": "tools/call",
|
||||
"params": map[string]any{
|
||||
"name": "session_log",
|
||||
"arguments": map[string]any{
|
||||
"session_id": e.SessionID,
|
||||
"skill": "_routing",
|
||||
"phase": "decide",
|
||||
"final_status": status,
|
||||
"message": fmt.Sprintf("%s: %s — %s", e.Skill, e.Decision, e.Message),
|
||||
"duration_ms": e.DurationMs,
|
||||
"project_root": e.ProjectRoot,
|
||||
},
|
||||
},
|
||||
}
|
||||
body, err := json.Marshal(payload)
|
||||
if err != nil {
|
||||
return fmt.Errorf("log: marshal: %w", err)
|
||||
}
|
||||
req, err := http.NewRequestWithContext(ctx, http.MethodPost, l.BrainURL+"/mcp", bytes.NewReader(body))
|
||||
if err != nil {
|
||||
return fmt.Errorf("log: build request: %w", err)
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
resp, err := l.HTTP.Do(req)
|
||||
if err != nil {
|
||||
return fmt.Errorf("log: request: %w", err)
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
return fmt.Errorf("log: server returned status %d", resp.StatusCode)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
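The `LogDecision` comment asks callers not to block real work on logging. One way a caller might honour that is sketched below, assuming Go 1.21+ for `context.WithoutCancel`; `bestEffort` and `logAfterCancel` are hypothetical helpers, and the router in this diff calls `LogDecision` synchronously instead.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// bestEffort runs log in its own goroutine with a bounded timeout. The
// parent's values are kept but its cancellation is dropped, so a log call
// can still complete after the request context has ended.
func bestEffort(parent context.Context, timeout time.Duration, log func(context.Context) error) <-chan error {
	done := make(chan error, 1) // buffered: the goroutine never blocks on send
	go func() {
		ctx, cancel := context.WithTimeout(context.WithoutCancel(parent), timeout)
		defer cancel()
		done <- log(ctx)
	}()
	return done
}

// logAfterCancel shows that the log function sees a live context even
// though the parent was cancelled before the call started.
func logAfterCancel() error {
	parent, cancel := context.WithCancel(context.Background())
	cancel() // the request is already finished
	return <-bestEffort(parent, time.Second, func(ctx context.Context) error {
		return ctx.Err() // nil: the parent's cancellation is not inherited
	})
}

func main() {
	fmt.Println("log error after parent cancel:", logAfterCancel())
}
```

A caller that does not care about the outcome can ignore the returned channel; the buffer of one lets the goroutine exit either way.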
81
internal/routing/log_test.go
Normal file
@@ -0,0 +1,81 @@
|
||||
package routing_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"io"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/routing"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestLoggerLogDecision(t *testing.T) {
|
||||
var captured map[string]any
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, http.MethodPost, r.Method)
|
||||
assert.Equal(t, "/mcp", r.URL.Path)
|
||||
body, _ := io.ReadAll(r.Body)
|
||||
assert.NoError(t, json.Unmarshal(body, &captured)) // assert, not require: FailNow must not run on the handler goroutine
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{"content": []map[string]any{{"type": "text", "text": "ok"}}}})
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
l := routing.NewLogger(srv.URL)
|
||||
err := l.LogDecision(context.Background(), routing.LogEntry{
|
||||
SessionID: "sess-1",
|
||||
Skill: "review",
|
||||
Decision: "local",
|
||||
Message: "model=qwen35, pass_rate=0.94",
|
||||
ProjectRoot: "/home/x/proj",
|
||||
DurationMs: 1234,
|
||||
Failed: false,
|
||||
})
|
||||
require.NoError(t, err)
|
||||
|
||||
params := captured["params"].(map[string]any)
|
||||
assert.Equal(t, "tools/call", captured["method"])
|
||||
assert.Equal(t, "session_log", params["name"])
|
||||
|
||||
args := params["arguments"].(map[string]any)
|
||||
assert.Equal(t, "_routing", args["skill"])
|
||||
assert.Equal(t, "decide", args["phase"])
|
||||
assert.Equal(t, "skip", args["final_status"])
|
||||
assert.Contains(t, args["message"].(string), "review: local")
|
||||
assert.Equal(t, "sess-1", args["session_id"])
|
||||
assert.Equal(t, "/home/x/proj", args["project_root"])
|
||||
assert.Equal(t, float64(1234), args["duration_ms"])
|
||||
}
|
||||
|
||||
func TestLoggerLogFailure(t *testing.T) {
|
||||
var captured map[string]any
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
body, _ := io.ReadAll(r.Body)
|
||||
_ = json.Unmarshal(body, &captured)
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
l := routing.NewLogger(srv.URL)
|
||||
err := l.LogDecision(context.Background(), routing.LogEntry{
|
||||
SessionID: "s", Skill: "debug", Decision: "local", Message: "litellm down", Failed: true,
|
||||
})
|
||||
require.NoError(t, err)
|
||||
|
||||
args := captured["params"].(map[string]any)["arguments"].(map[string]any)
|
||||
assert.Equal(t, "fail", args["final_status"])
|
||||
}
|
||||
|
||||
func TestLoggerSurfacesUpstreamError(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
http.Error(w, "down", http.StatusBadGateway)
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
l := routing.NewLogger(srv.URL)
|
||||
err := l.LogDecision(context.Background(), routing.LogEntry{Skill: "x", SessionID: "y", Decision: "local"})
|
||||
require.Error(t, err)
|
||||
}
|
||||
85
internal/routing/passrate.go
Normal file
@@ -0,0 +1,85 @@
|
||||
package routing
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
// Fetcher reads /pass-rate from the brain pod with a per-skill TTL cache.
|
||||
type Fetcher struct {
|
||||
BaseURL string
|
||||
Window string
|
||||
TTL time.Duration
|
||||
HTTP *http.Client
|
||||
|
||||
mu sync.Mutex
|
||||
cache map[string]cachedRate
|
||||
}
|
||||
|
||||
type cachedRate struct {
|
||||
value *float64
|
||||
at time.Time
|
||||
}
|
||||
|
||||
type passRateResponse struct {
|
||||
PassRate *float64 `json:"pass_rate"`
|
||||
}
|
||||
|
||||
// NewFetcher returns a Fetcher that calls baseURL + /pass-rate with the
|
||||
// given window string. If ttl is zero, defaults to 60 seconds. The HTTP
|
||||
// client uses a 1-second total timeout.
|
||||
func NewFetcher(baseURL, window string, ttl time.Duration) *Fetcher {
|
||||
if ttl == 0 {
|
||||
ttl = 60 * time.Second
|
||||
}
|
||||
return &Fetcher{
|
||||
BaseURL: baseURL,
|
||||
Window: window,
|
||||
TTL: ttl,
|
||||
HTTP: &http.Client{Timeout: time.Second},
|
||||
cache: make(map[string]cachedRate),
|
||||
}
|
||||
}
|
||||
|
||||
// Get returns the pass rate for the named skill, or nil if no data exists,
|
||||
// or an error if the brain is unreachable. Caches successful results.
|
||||
func (f *Fetcher) Get(ctx context.Context, skill string) (*float64, error) {
|
||||
f.mu.Lock()
|
||||
if c, ok := f.cache[skill]; ok && time.Since(c.at) < f.TTL {
|
||||
v := c.value
|
||||
f.mu.Unlock()
|
||||
return v, nil
|
||||
}
|
||||
f.mu.Unlock()
|
||||
|
||||
u := fmt.Sprintf("%s/pass-rate?skill=%s&window=%s",
|
||||
f.BaseURL, url.QueryEscape(skill), url.QueryEscape(f.Window))
|
||||
req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("passrate: build request: %w", err)
|
||||
}
|
||||
resp, err := f.HTTP.Do(req)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("passrate: request: %w", err)
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
return nil, fmt.Errorf("passrate: server returned status %d", resp.StatusCode)
|
||||
}
|
||||
|
||||
var body passRateResponse
|
||||
if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
|
||||
return nil, fmt.Errorf("passrate: decode: %w", err)
|
||||
}
|
||||
|
||||
f.mu.Lock()
|
||||
f.cache[skill] = cachedRate{value: body.PassRate, at: time.Now()}
|
||||
f.mu.Unlock()
|
||||
|
||||
return body.PassRate, nil
|
||||
}
|
||||
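The TTL check in `Fetcher.Get` can be exercised without sleeping if the clock is injectable. The sketch below follows the same lock, check, fetch, store sequence as the code above, but `ttlCache`, its `now` field, and `upstreamCalls` are hypothetical names introduced for illustration; the `Fetcher` in this diff uses `time.Since` and `time.Now` directly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	value *float64
	at    time.Time
}

// ttlCache is a hypothetical stand-in for the Fetcher's cache with a
// replaceable clock.
type ttlCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	now   func() time.Time
	items map[string]entry
}

// get returns a fresh cached value or calls fetch and stores the result.
// The lock is released during fetch, as in Fetcher.Get, so concurrent
// misses may each call upstream.
func (c *ttlCache) get(key string, fetch func() (*float64, error)) (*float64, error) {
	c.mu.Lock()
	if e, ok := c.items[key]; ok && c.now().Sub(e.at) < c.ttl {
		c.mu.Unlock()
		return e.value, nil
	}
	c.mu.Unlock()

	v, err := fetch()
	if err != nil {
		return nil, err // errors are not cached
	}
	c.mu.Lock()
	c.items[key] = entry{value: v, at: c.now()}
	c.mu.Unlock()
	return v, nil
}

// upstreamCalls advances a fake clock by each step (in seconds) before a
// lookup and reports how many lookups reached upstream.
func upstreamCalls(ttlSec int, advanceSec ...int) int {
	clock := time.Unix(0, 0)
	c := &ttlCache{
		ttl:   time.Duration(ttlSec) * time.Second,
		now:   func() time.Time { return clock },
		items: make(map[string]entry),
	}
	calls := 0
	fetch := func() (*float64, error) {
		calls++
		v := 0.5
		return &v, nil
	}
	for _, s := range advanceSec {
		clock = clock.Add(time.Duration(s) * time.Second)
		_, _ = c.get("tdd", fetch)
	}
	return calls
}

func main() {
	fmt.Println(upstreamCalls(60, 0, 10, 20), upstreamCalls(60, 0, 61)) // prints "1 2"
}
```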
94
internal/routing/passrate_test.go
Normal file
@@ -0,0 +1,94 @@
|
||||
package routing_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"sync/atomic"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/routing"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestFetcherGetReturnsPassRate(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
assert.Equal(t, http.MethodGet, r.Method)
|
||||
assert.Equal(t, "/pass-rate", r.URL.Path)
|
||||
assert.Equal(t, "tdd", r.URL.Query().Get("skill"))
|
||||
assert.Equal(t, "7d", r.URL.Query().Get("window"))
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"skill": "tdd", "pass_rate": 0.94})
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||
pr, err := f.Get(context.Background(), "tdd")
|
||||
require.NoError(t, err)
|
||||
require.NotNil(t, pr)
|
||||
assert.InDelta(t, 0.94, *pr, 1e-9)
|
||||
}
|
||||
|
||||
func TestFetcherGetReturnsNilWhenNoData(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"skill": "novel", "pass_rate": nil})
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||
pr, err := f.Get(context.Background(), "novel")
|
||||
require.NoError(t, err)
|
||||
assert.Nil(t, pr)
|
||||
}
|
||||
|
||||
func TestFetcherCachesWithinTTL(t *testing.T) {
|
||||
var calls int32
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
atomic.AddInt32(&calls, 1)
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.5})
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||
for i := 0; i < 5; i++ {
|
||||
_, err := f.Get(context.Background(), "tdd")
|
||||
require.NoError(t, err)
|
||||
}
|
||||
assert.Equal(t, int32(1), atomic.LoadInt32(&calls), "should hit upstream once and serve four times from cache")
|
||||
}
|
||||
|
||||
func TestFetcherFetchesAgainAfterTTLExpires(t *testing.T) {
|
||||
var calls int32
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
atomic.AddInt32(&calls, 1)
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.5})
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
// Tight TTL so the test stays fast.
|
||||
f := routing.NewFetcher(srv.URL, "7d", 5*time.Millisecond)
|
||||
_, err := f.Get(context.Background(), "tdd")
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, int32(1), atomic.LoadInt32(&calls))
|
||||
|
||||
// Sleep past TTL, then a second Get should hit upstream again.
|
||||
time.Sleep(15 * time.Millisecond)
|
||||
_, err = f.Get(context.Background(), "tdd")
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, int32(2), atomic.LoadInt32(&calls), "expected fresh upstream call after TTL expiry")
|
||||
}
|
||||
|
||||
func TestFetcherSurfacesUpstreamError(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
http.Error(w, "boom", http.StatusInternalServerError)
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||
pr, err := f.Get(context.Background(), "tdd")
|
||||
require.Error(t, err)
|
||||
assert.Nil(t, pr)
|
||||
}
|
||||
47
internal/routing/policy.go
Normal file
@@ -0,0 +1,47 @@
|
||||
package routing
|
||||
|
||||
// Decision is the route picked for a single skill call.
|
||||
type Decision int
|
||||
|
||||
const (
|
||||
DecideLocal Decision = iota
|
||||
DecideClaude
|
||||
)
|
||||
|
||||
func (d Decision) String() string {
|
||||
if d == DecideLocal {
|
||||
return "local"
|
||||
}
|
||||
return "claude"
|
||||
}
|
||||
|
||||
// Policy holds the floor/ceil thresholds for routing decisions.
|
||||
//
|
||||
// Rules (in order):
|
||||
//
|
||||
// 1. passRate == nil → DecideLocal (default-to-local for cost-routable skills)
|
||||
// 2. *passRate >= Floor → DecideLocal (trust local)
|
||||
// 3. *passRate < Ceil → DecideClaude (don't trust local)
|
||||
// 4. otherwise (sample band) → requestHash low bit picks: 0=local, 1=claude
|
||||
type Policy struct {
|
||||
Floor float64
|
||||
Ceil float64
|
||||
}
|
||||
|
||||
// Decide returns the routing decision for a single call.
|
||||
// requestHash is consulted only when passRate is in the sample band [Ceil, Floor).
|
||||
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision {
|
||||
if passRate == nil {
|
||||
return DecideLocal
|
||||
}
|
||||
if *passRate >= p.Floor {
|
||||
return DecideLocal
|
||||
}
|
||||
if *passRate < p.Ceil {
|
||||
return DecideClaude
|
||||
}
|
||||
if requestHash&1 == 0 {
|
||||
return DecideLocal
|
||||
}
|
||||
return DecideClaude
|
||||
}
|
||||
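The four rules read most easily as a worked table. The sketch below restates them as a standalone function using the thresholds from the tests (Floor 0.9, Ceil 0.7); `decide` and `rate` are illustrative names and return strings rather than the `Decision` type.

```go
package main

import "fmt"

// decide restates the Policy rules in order. The switch stops at the first
// true case, so *passRate is only dereferenced after the nil check.
func decide(passRate *float64, floor, ceil float64, requestHash uint64) string {
	switch {
	case passRate == nil:
		return "local" // rule 1: no data, default to local
	case *passRate >= floor:
		return "local" // rule 2: trusted
	case *passRate < ceil:
		return "claude" // rule 3: not trusted
	case requestHash&1 == 0:
		return "local" // rule 4: sample band, even hash
	default:
		return "claude" // rule 4: sample band, odd hash
	}
}

func rate(f float64) *float64 { return &f }

func main() {
	for _, pr := range []float64{0.95, 0.90, 0.80, 0.70, 0.50} {
		fmt.Printf("pass_rate=%.2f even-hash=%s odd-hash=%s\n",
			pr, decide(rate(pr), 0.9, 0.7, 2), decide(rate(pr), 0.9, 0.7, 3))
	}
	fmt.Println("no data:", decide(nil, 0.9, 0.7, 3))
}
```

Note the naming: Floor is the higher threshold and Ceil the lower one, so the sample band is the half-open interval [Ceil, Floor), and a pass rate exactly at Ceil is sampled rather than sent straight to Claude.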
36
internal/routing/policy_test.go
Normal file
@@ -0,0 +1,36 @@
|
||||
package routing_test
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/routing"
|
||||
"github.com/stretchr/testify/assert"
|
||||
)
|
||||
|
||||
func ptr(f float64) *float64 { return &f }
|
||||
|
||||
func TestPolicyDecide(t *testing.T) {
|
||||
p := routing.Policy{Floor: 0.9, Ceil: 0.7}
|
||||
|
||||
cases := []struct {
|
||||
name string
|
||||
passRate *float64
|
||||
hash uint64
|
||||
want routing.Decision
|
||||
}{
|
||||
{"null pass rate → local", nil, 0, routing.DecideLocal},
|
||||
{"null pass rate, hash irrelevant → local", nil, 0xDEADBEEF, routing.DecideLocal},
|
||||
{"at floor → local", ptr(0.9), 0, routing.DecideLocal},
|
||||
{"above floor → local", ptr(0.95), 0, routing.DecideLocal},
|
||||
{"below ceil → claude", ptr(0.5), 0, routing.DecideClaude},
|
||||
{"at ceil → sample-band even-hash → local", ptr(0.7), 0, routing.DecideLocal},
|
||||
{"sample band, even hash → local", ptr(0.8), 2, routing.DecideLocal},
|
||||
{"sample band, odd hash → claude", ptr(0.8), 3, routing.DecideClaude},
|
||||
}
|
||||
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
assert.Equal(t, tc.want, p.Decide(tc.passRate, tc.hash))
|
||||
})
|
||||
}
|
||||
}
|
||||
84
internal/routing/router.go
Normal file
@@ -0,0 +1,84 @@
|
||||
package routing
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"log/slog"
|
||||
)
|
||||
|
||||
// CompleteFunc matches the signature used by every skill package's Config.
|
||||
type CompleteFunc func(ctx context.Context, model, system, user string) (string, int64, error)
|
||||
|
||||
// RunInput captures the per-call inputs the dispatch wrapper needs.
|
||||
type RunInput struct {
|
||||
Skill string
|
||||
System string
|
||||
User string
|
||||
SessionID string
|
||||
ProjectRoot string
|
||||
}
|
||||
|
||||
// Router composes a pass-rate fetcher, a decision policy, a session logger,
|
||||
// and a LiteLLM client. Router.Run takes a RunInput, so skill packages reach it through a dispatch wrapper rather than as a CompleteFunc directly.
|
||||
type Router struct {
|
||||
Fetcher *Fetcher
|
||||
Logger *Logger
|
||||
Policy Policy
|
||||
FastModel string
|
||||
ThinkingModel string
|
||||
Complete CompleteFunc
|
||||
}
|
||||
|
||||
// Run executes one skill call: decides local vs claude, calls LiteLLM, logs the
|
||||
// decision. On local-side error, falls open by retrying once on ThinkingModel.
|
||||
func (r *Router) Run(ctx context.Context, in RunInput) (string, int64, error) {
|
||||
pr, ferr := r.Fetcher.Get(ctx, in.Skill)
|
||||
if ferr != nil {
|
||||
slog.Warn("router: pass-rate unreachable, defaulting to local", "skill", in.Skill, "err", ferr)
|
||||
pr = nil
|
||||
}
|
||||
hash := CanonicalHash(in.System, in.User)
|
||||
decision := r.Policy.Decide(pr, hash)
|
||||
|
||||
model := r.ThinkingModel
|
||||
if decision == DecideLocal {
|
||||
model = r.FastModel
|
||||
}
|
||||
|
||||
out, ms, err := r.Complete(ctx, model, in.System, in.User)
|
||||
if lerr := r.Logger.LogDecision(ctx, LogEntry{
|
||||
SessionID: in.SessionID,
|
||||
Skill: in.Skill,
|
||||
Decision: decision.String(),
|
||||
Message: fmt.Sprintf("model=%s, pass_rate=%s", model, formatPassRate(pr)),
|
||||
ProjectRoot: in.ProjectRoot,
|
||||
DurationMs: ms,
|
||||
Failed: err != nil,
|
||||
}); lerr != nil {
|
||||
slog.Warn("router: log decision failed", "skill", in.Skill, "err", lerr)
|
||||
}
|
||||
|
||||
if err != nil && decision == DecideLocal {
|
||||
slog.Warn("router: fast failed, falling open to thinking model", "skill", in.Skill, "err", err)
|
||||
out, ms, err = r.Complete(ctx, r.ThinkingModel, in.System, in.User)
|
||||
if lerr := r.Logger.LogDecision(ctx, LogEntry{
|
||||
SessionID: in.SessionID,
|
||||
Skill: in.Skill,
|
||||
Decision: "thinking_fallback",
|
||||
Message: fmt.Sprintf("model=%s, after-fast-error", r.ThinkingModel),
|
||||
ProjectRoot: in.ProjectRoot,
|
||||
DurationMs: ms,
|
||||
Failed: err != nil,
|
||||
}); lerr != nil {
|
||||
slog.Warn("router: log decision failed", "skill", in.Skill, "err", lerr)
|
||||
}
|
||||
}
|
||||
return out, ms, err
|
||||
}
|
||||
|
||||
func formatPassRate(pr *float64) string {
|
||||
if pr == nil {
|
||||
return "null"
|
||||
}
|
||||
return fmt.Sprintf("%.2f", *pr)
|
||||
}
|
||||
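The fail-open behaviour of `Router.Run` can be isolated from fetching and logging. The following is a reduced sketch under that assumption; `completeWithFallback`, `completeFunc`, and `modelUsed` are hypothetical names, and the model strings are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// completeFunc is a trimmed-down stand-in for routing.CompleteFunc.
type completeFunc func(model string) (string, error)

// completeWithFallback tries the preferred model. Only when the fast model
// was chosen and failed does it retry once on the thinking model; an error
// from the thinking model is returned as-is.
func completeWithFallback(fast, thinking string, preferFast bool, complete completeFunc) (out, used string, err error) {
	used = thinking
	if preferFast {
		used = fast
	}
	out, err = complete(used)
	if err != nil && preferFast {
		used = thinking
		out, err = complete(used)
	}
	return out, used, err
}

// modelUsed reports which model produced the final answer.
func modelUsed(fastFails bool) string {
	complete := func(model string) (string, error) {
		if fastFails && model == "fast" {
			return "", errors.New("fast boom")
		}
		return "ok", nil
	}
	_, used, _ := completeWithFallback("fast", "thinking", true, complete)
	return used
}

func main() {
	fmt.Println(modelUsed(false), modelUsed(true)) // prints "fast thinking"
}
```

The retry is one-directional: a request routed to the thinking model never falls back to the fast one, which matches the `decision == DecideLocal` guard in `Run`.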
136
internal/routing/router_test.go
Normal file
@@ -0,0 +1,136 @@
|
||||
package routing_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"sync"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/routing"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
type fakeLLM struct {
|
||||
mu sync.Mutex
|
||||
calls []struct{ Model, System, User string }
|
||||
resp string
|
||||
err error
|
||||
errOn string // if non-empty, only the named model errors
|
||||
}
|
||||
|
||||
func (f *fakeLLM) Complete(_ context.Context, model, system, user string) (string, int64, error) {
|
||||
f.mu.Lock()
|
||||
defer f.mu.Unlock()
|
||||
f.calls = append(f.calls, struct{ Model, System, User string }{model, system, user})
|
||||
if f.errOn == model {
|
||||
return "", 0, f.err
|
||||
}
|
||||
if f.err != nil && f.errOn == "" {
|
||||
return "", 0, f.err
|
||||
}
|
||||
return f.resp, 100, nil
|
||||
}
|
||||
|
||||
func newRouter(t *testing.T, llm *fakeLLM, passRate float64) (*routing.Router, *httptest.Server, *httptest.Server) {
|
||||
t.Helper()
|
||||
brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
switch r.URL.Path {
|
||||
case "/pass-rate":
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": passRate})
|
||||
case "/mcp":
|
||||
_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
|
||||
}
|
||||
}))
|
||||
t.Cleanup(brain.Close)
|
||||
|
||||
r := &routing.Router{
|
||||
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
|
||||
Logger: routing.NewLogger(brain.URL),
|
||||
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
|
||||
FastModel: "koala/qwen35-9b-fast",
|
||||
ThinkingModel: "iguana/gemma4-26b",
|
||||
Complete: llm.Complete,
|
||||
}
|
||||
return r, brain, brain
|
||||
}
|
||||
|
||||
func TestRouterRoutesLocalAtHighPassRate(t *testing.T) {
|
||||
llm := &fakeLLM{resp: "ok"}
|
||||
r, _, _ := newRouter(t, llm, 0.95)
|
||||
|
||||
out, _, err := r.Run(context.Background(), routing.RunInput{
|
||||
Skill: "review", System: "sys", User: "user", SessionID: "s1", ProjectRoot: "/p",
|
||||
})
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "ok", out)
|
||||
|
||||
llm.mu.Lock()
|
||||
defer llm.mu.Unlock()
|
||||
require.Len(t, llm.calls, 1)
|
||||
assert.Equal(t, "koala/qwen35-9b-fast", llm.calls[0].Model)
|
||||
}
|
||||
|
||||
func TestRouterRoutesThinkingAtLowPassRate(t *testing.T) {
|
||||
llm := &fakeLLM{resp: "ok"}
|
||||
r, _, _ := newRouter(t, llm, 0.3)
|
||||
|
||||
_, _, err := r.Run(context.Background(), routing.RunInput{
|
||||
Skill: "review", System: "sys", User: "user", SessionID: "s2",
|
||||
})
|
||||
require.NoError(t, err)
|
||||
|
||||
llm.mu.Lock()
|
||||
defer llm.mu.Unlock()
|
||||
require.Len(t, llm.calls, 1)
|
||||
assert.Equal(t, "iguana/gemma4-26b", llm.calls[0].Model)
|
||||
}
|
||||
|
||||
func TestRouterFailsOpenFastErrorToThinking(t *testing.T) {
|
||||
llm := &fakeLLM{resp: "ok-after-fallback", err: errors.New("fast boom"), errOn: "koala/qwen35-9b-fast"}
|
||||
r, _, _ := newRouter(t, llm, 0.95) // would route fast
|
||||
|
||||
out, _, err := r.Run(context.Background(), routing.RunInput{
|
||||
Skill: "review", System: "sys", User: "user", SessionID: "s3",
|
||||
})
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, "ok-after-fallback", out)
|
||||
|
||||
llm.mu.Lock()
|
||||
defer llm.mu.Unlock()
|
||||
require.Len(t, llm.calls, 2)
|
||||
assert.Equal(t, "koala/qwen35-9b-fast", llm.calls[0].Model)
|
||||
assert.Equal(t, "iguana/gemma4-26b", llm.calls[1].Model)
|
||||
}
|
||||
|
||||
func TestRouterDefaultsToFastWhenBrainUnreachable(t *testing.T) {
|
||||
// Brain returns 500 → fetcher errors → router treats pass rate as nil → fast.
|
||||
brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
http.Error(w, "down", http.StatusInternalServerError)
|
||||
}))
|
||||
defer brain.Close()
|
||||
|
||||
llm := &fakeLLM{resp: "ok"}
|
||||
r := &routing.Router{
|
||||
Fetcher: routing.NewFetcher(brain.URL, "7d", time.Minute),
|
||||
Logger: routing.NewLogger(brain.URL),
|
||||
Policy: routing.Policy{Floor: 0.9, Ceil: 0.7},
|
||||
FastModel: "koala/qwen35-9b-fast",
|
||||
ThinkingModel: "iguana/gemma4-26b",
|
||||
Complete: llm.Complete,
|
||||
}
|
||||
|
||||
_, _, err := r.Run(context.Background(), routing.RunInput{
|
||||
Skill: "review", System: "sys", User: "user", SessionID: "s4",
|
||||
})
|
||||
require.NoError(t, err)
|
||||
|
||||
llm.mu.Lock()
|
||||
defer llm.mu.Unlock()
|
||||
require.Len(t, llm.calls, 1)
|
||||
assert.Equal(t, "koala/qwen35-9b-fast", llm.calls[0].Model)
|
||||
}
|
||||
80
internal/routing/snapshot_test.go
Normal file
@@ -0,0 +1,80 @@
|
||||
package routing_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"os"
|
||||
"sort"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/registry"
|
||||
"github.com/mathiasbq/supervisor/internal/skills/debug"
|
||||
"github.com/mathiasbq/supervisor/internal/skills/retrospective"
|
||||
"github.com/mathiasbq/supervisor/internal/skills/review"
|
||||
"github.com/mathiasbq/supervisor/internal/skills/trainer"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
// TestToolsListMatchesSupervisorSnapshot pins the four routed skills' tool
|
||||
// definitions to the supervisor's current advertisement. A deliberate schema
|
||||
// change must be reflected here by updating testdata/tools_list.snapshot.json.
|
||||
func TestToolsListMatchesSupervisorSnapshot(t *testing.T) {
|
||||
complete := func(_ context.Context, _, _, _ string) (string, int64, error) {
|
||||
return "", 0, nil
|
||||
}
|
||||
|
||||
reg := registry.New()
|
||||
reg.Register(review.New(review.Config{
|
||||
SkillPrompt: "stub",
|
||||
DefaultModel: "stub",
|
||||
CompleteFunc: complete,
|
||||
}))
|
||||
reg.Register(debug.New(debug.Config{
|
||||
SkillPrompt: "stub",
|
||||
DefaultModel: "stub",
|
||||
CompleteFunc: complete,
|
||||
}))
|
||||
reg.Register(retrospective.New(retrospective.Config{
|
||||
SkillPrompt: "stub",
|
||||
DefaultModel: "stub",
|
||||
CompleteFunc: complete,
|
||||
}))
|
||||
reg.Register(trainer.New(trainer.Config{
|
||||
ReaderPrompt: "stub",
|
||||
WriterPrompt: "stub",
|
||||
DefaultModel: "stub",
|
||||
CompleteFunc: complete,
|
||||
}))
|
||||
|
||||
wanted := map[string]bool{
|
||||
"review": true,
|
||||
"debug": true,
|
||||
"retrospective": true,
|
||||
"trainer": true,
|
||||
}
|
||||
var routed []registry.ToolDef
|
||||
for _, td := range reg.Tools() {
|
||||
if wanted[td.Name] {
|
||||
routed = append(routed, td)
|
||||
}
|
||||
}
|
||||
sort.Slice(routed, func(i, j int) bool { return routed[i].Name < routed[j].Name })
|
||||
|
||||
got, err := json.MarshalIndent(routed, "", " ")
|
||||
require.NoError(t, err)
|
||||
|
||||
want, err := os.ReadFile("testdata/tools_list.snapshot.json")
|
||||
require.NoError(t, err)
|
||||
|
||||
// Normalize both via re-encode so whitespace differences don't dominate.
|
||||
var gotV, wantV any
|
||||
require.NoError(t, json.Unmarshal(got, &gotV))
|
||||
require.NoError(t, json.Unmarshal(want, &wantV))
|
||||
|
||||
gotN, _ := json.MarshalIndent(gotV, "", " ")
|
||||
wantN, _ := json.MarshalIndent(wantV, "", " ")
|
||||
|
||||
assert.Equal(t, string(wantN), string(gotN),
|
||||
"tool advertisement drifted from supervisor snapshot — update testdata/tools_list.snapshot.json deliberately if the schema change is intentional")
|
||||
}
|
||||
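The snapshot test normalizes both sides by decoding into `any` and re-encoding. That works because `encoding/json` writes map keys in sorted order, so key order and whitespace in the snapshot file stop mattering while array order still does. A small sketch of the idea, with `normalize` and `jsonEqual` as hypothetical helper names:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// normalize decodes raw JSON into generic values and re-encodes it.
// Objects become map[string]any, which json.Marshal emits with sorted keys.
func normalize(raw []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return nil, err
	}
	return json.Marshal(v)
}

// jsonEqual reports whether two documents are equal after normalization.
func jsonEqual(a, b []byte) (bool, error) {
	na, err := normalize(a)
	if err != nil {
		return false, err
	}
	nb, err := normalize(b)
	if err != nil {
		return false, err
	}
	return bytes.Equal(na, nb), nil
}

func main() {
	same, _ := jsonEqual(
		[]byte(`{"type":"object","required":["session_id"]}`),
		[]byte(`{ "required": ["session_id"], "type": "object" }`),
	)
	reordered, _ := jsonEqual([]byte(`["project_root","files"]`), []byte(`["files","project_root"]`))
	fmt.Println(same, reordered) // prints "true false"
}
```

This is why the snapshot can list `properties` before `type` for one tool and after it for another without failing, whereas reordering a `required` array would be reported as drift.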
97
internal/routing/testdata/tools_list.snapshot.json
vendored
Normal file
@@ -0,0 +1,97 @@
|
||||
[
|
||||
{
|
||||
"name": "debug",
|
||||
"description": "Consult a local model to analyse an error and return hypotheses ordered by likelihood, each with a concrete verification step.",
|
||||
"inputSchema": {
|
||||
"properties": {
|
||||
"context": {
|
||||
"type": "string"
|
||||
},
|
||||
"error": {
|
||||
"type": "string"
|
||||
},
|
||||
"model": {
|
||||
"type": "string"
|
||||
},
|
||||
"project_root": {
|
||||
"type": "string"
|
||||
},
|
||||
"session_id": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"project_root",
|
||||
"error"
|
||||
],
|
||||
"type": "object"
|
||||
}
|
||||
},
|
||||
{
|
||||
"name": "retrospective",
|
||||
"description": "Consult a local model to analyse a completed session and identify what is novel or worth preserving as organizational knowledge.",
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"required": [
|
||||
"session_id"
|
||||
],
|
||||
"properties": {
|
||||
"session_id": {
|
||||
"type": "string"
|
||||
},
|
||||
"model": {
|
||||
"type": "string"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"name": "review",
|
||||
"description": "Consult a local model for a structured code review of the specified files. Returns findings with severity levels.",
|
||||
"inputSchema": {
|
||||
"properties": {
|
||||
"context": {
|
||||
"type": "string"
|
||||
},
|
||||
"files": {
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": "array"
|
||||
},
|
||||
"model": {
|
||||
"type": "string"
|
||||
},
|
||||
"project_root": {
|
||||
"type": "string"
|
||||
},
|
||||
"session_id": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"project_root",
|
||||
"files"
|
||||
],
|
||||
"type": "object"
|
||||
}
|
||||
},
|
||||
{
|
||||
"name": "trainer",
|
||||
"description": "Consult a local model to identify learning moments from a session log and suggest knowledge to preserve in the brain.",
|
||||
"inputSchema": {
|
||||
"properties": {
|
||||
"model": {
|
||||
"type": "string"
|
||||
},
|
||||
"session_id": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"session_id"
|
||||
],
|
||||
"type": "object"
|
||||
}
|
||||
}
|
||||
]
|
||||
@@ -17,6 +17,8 @@ func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (
|
||||
return s.query(ctx, args)
|
||||
case "brain_write":
|
||||
return s.write(ctx, args)
|
||||
case "brain_ingest_raw":
|
||||
return s.ingestRaw(ctx, args)
|
||||
case "brain_ingest":
|
||||
return s.ingest(ctx, args)
|
||||
case "brain_search":
|
||||
@@ -98,6 +100,33 @@ func (s *Skill) ingest(ctx context.Context, args json.RawMessage) (json.RawMessa
|
||||
return nil, fmt.Errorf("either content+source or path is required")
|
||||
}
|
||||
|
||||
type ingestRawArgs struct {
|
||||
Source string `json:"source"`
|
||||
Pages []any `json:"pages"`
|
||||
DryRun bool `json:"dry_run,omitempty"`
|
||||
}
|
||||
|
||||
func (s *Skill) ingestRaw(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
|
||||
var a ingestRawArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if s.cfg.IngestSvcURL == "" {
|
||||
return nil, fmt.Errorf("brain_ingest_raw: INGEST_SVC_URL not configured")
|
||||
}
|
||||
if a.Source == "" {
|
||||
return nil, fmt.Errorf("source is required")
|
||||
}
|
||||
if len(a.Pages) == 0 {
|
||||
return nil, fmt.Errorf("pages is required and must be non-empty")
|
||||
}
|
||||
return s.postTo(ctx, s.cfg.IngestSvcURL+"/ingest-raw", map[string]any{
|
||||
"source": a.Source,
|
||||
"pages": a.Pages,
|
||||
"dry_run": a.DryRun,
|
||||
})
|
||||
}
|
||||
|
||||
type searchArgs struct {
|
||||
Query string `json:"query"`
|
||||
Collection string `json:"collection,omitempty"`
|
||||
|
||||
@@ -55,6 +55,32 @@ func (s *Skill) Tools() []registry.ToolDef {
|
||||
},
|
||||
}
|
||||
if s.cfg.IngestSvcURL != "" {
|
||||
tools = append(tools, registry.ToolDef{
|
||||
Name: "brain_ingest_raw",
|
||||
Description: "Ingest pre-structured pages into the brain wiki, bypassing the LLM extraction step. " +
|
||||
"Use when you (the calling agent) have already extracted entities, concepts, and content from a source. " +
|
||||
"Provide source (human-readable name) and pages (array of {title, type, subtype, domain, content} objects). " +
|
||||
"The pipeline computes slugs, paths, frontmatter, wikilink canonicalization, and source back-references. " +
|
||||
"Returns the list of wiki pages written.",
|
||||
InputSchema: schema([]string{"source", "pages"}, map[string]any{
|
||||
"source": map[string]any{"type": "string", "description": "human-readable name for the source, e.g. 'shape-up-book'"},
|
||||
"pages": map[string]any{
|
||||
"type": "array",
|
||||
"items": map[string]any{
|
||||
"type": "object",
|
||||
"required": []string{"title", "type", "content"},
|
||||
"properties": map[string]any{
|
||||
"title": map[string]any{"type": "string", "description": "page title, e.g. 'Hash Encoding'"},
|
||||
"type": map[string]any{"type": "string", "enum": []string{"source", "concept", "entity"}, "description": "page type"},
|
||||
"subtype": map[string]any{"type": "string", "description": "entity: person|company|tool|model|framework|technology; source: article|pdf|book|video|note|project"},
|
||||
"domain": map[string]any{"type": "string", "description": "knowledge domain, e.g. 'Machine Learning'"},
|
||||
"content": map[string]any{"type": "string", "description": "markdown body — no frontmatter, use [[Display Name]] for wikilinks"},
|
||||
},
|
||||
},
|
||||
},
|
||||
"dry_run": map[string]any{"type": "boolean"},
|
||||
}),
|
||||
})
|
||||
tools = append(tools, registry.ToolDef{
|
||||
Name: "brain_ingest",
|
||||
Description: "Ingest content into the brain wiki (brain/wiki/). Calls an LLM to produce structured wiki pages. " +
|
||||
|
||||
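The `brain_ingest_raw` schema and the `ingestRaw` handler above suggest the request shape a calling agent would send. The following is a minimal sketch of building such a payload on the caller side, assuming hypothetical names (`rawPage`, `ingestRawRequest`, `buildIngestRawArgs`) that do not appear in the diff; only the JSON field names and the two validation rules are taken from it.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// rawPage is a hypothetical caller-side type mirroring the page object
// described by the brain_ingest_raw input schema.
type rawPage struct {
	Title   string `json:"title"`
	Type    string `json:"type"` // "source", "concept", or "entity"
	Subtype string `json:"subtype,omitempty"`
	Domain  string `json:"domain,omitempty"`
	Content string `json:"content"` // markdown body, no frontmatter
}

// ingestRawRequest is a hypothetical name for the top-level arguments.
type ingestRawRequest struct {
	Source string    `json:"source"`
	Pages  []rawPage `json:"pages"`
	DryRun bool      `json:"dry_run,omitempty"`
}

// buildIngestRawArgs applies the same two checks the handler performs
// (non-empty source, non-empty pages) before encoding the arguments.
func buildIngestRawArgs(source string, pages []rawPage, dryRun bool) (json.RawMessage, error) {
	if source == "" {
		return nil, errors.New("source is required")
	}
	if len(pages) == 0 {
		return nil, errors.New("pages is required and must be non-empty")
	}
	b, err := json.Marshal(ingestRawRequest{Source: source, Pages: pages, DryRun: dryRun})
	if err != nil {
		return nil, fmt.Errorf("marshal args: %w", err)
	}
	return b, nil
}

func main() {
	args, err := buildIngestRawArgs("shape-up-book", []rawPage{{
		Title:   "Hash Encoding",
		Type:    "concept",
		Domain:  "Machine Learning",
		Content: "Body text with a [[Display Name]] wikilink.",
	}}, true)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(args))
}
```

Checking the arguments before the call is optional, since the handler repeats both checks; it only saves a round trip when the payload is obviously incomplete.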
286
internal/skills/project/handlers.go
Normal file
@@ -0,0 +1,286 @@
|
||||
package project
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"fmt"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/githubclient"
|
||||
"github.com/mathiasbq/supervisor/internal/mcpclient"
|
||||
)
|
||||
|
||||
type createArgs struct {
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description"`
|
||||
Hypothesis string `json:"hypothesis"`
|
||||
Folder string `json:"folder"`
|
||||
Stack string `json:"stack"`
|
||||
Private bool `json:"private"`
|
||||
}
|
||||
|
||||
type createResult struct {
|
||||
GiteaURL string `json:"gitea_url"`
|
||||
GitHubURL string `json:"github_url"`
|
||||
IssueURL string `json:"issue_url"`
|
||||
NextSteps string `json:"next_steps"`
|
||||
|
||||
// Reached records the steps that completed. Populated on partial failure
|
||||
// so callers can resume manually instead of guessing what already ran.
|
||||
Reached []string `json:"reached,omitempty"`
|
||||
|
||||
// FailedStep is non-empty when a downstream gitea-mcp call returned an
|
||||
// error; the error itself is surfaced via the JSON-RPC error response,
|
||||
// this field tells the operator which step it happened in.
|
||||
FailedStep string `json:"failed_step,omitempty"`
|
||||
}
|
||||
|
||||
func errUnknownTool(name string) error { return fmt.Errorf("unknown tool: %s", name) }
|
||||
|
||||
// step names — must match what we surface in failed_step / reached.
|
||||
const (
|
||||
stepCreateRepo = "create_repo"
|
||||
stepCreateGitHub = "create_github_repo"
|
||||
stepMirror = "mirror"
|
||||
stepInfraCommit = "infra_commit"
|
||||
stepIssue = "issue"
|
||||
)
|
||||
|
||||
func (s *Skill) handleCreate(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
	var args createArgs
	if err := json.Unmarshal(raw, &args); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if err := validate(args); err != nil {
		return nil, err
	}

	tmpl := templateFor(args.Stack)
	giteaURL := fmt.Sprintf("http://gitea.d-ma.be/%s/%s", s.cfg.GiteaOwner, args.Name)
	githubURL := fmt.Sprintf("https://github.com/%s/%s", s.cfg.GitHubOwner, args.Name)

	res := createResult{
		GiteaURL:  giteaURL,
		GitHubURL: githubURL,
	}

	// Step 1: create_project_from_template. If the repo already exists,
	// gitea-mcp returns -32003 Conflict; we treat that as idempotent success
	// and continue to the next steps so re-running self-heals partial runs.
	existed, err := s.callCreateRepo(ctx, args, tmpl)
	if err != nil {
		return marshalPartial(res, stepCreateRepo, err)
	}
	res.Reached = append(res.Reached, stepCreateRepo)

	// Step 2: create empty GitHub repo. Gitea's push-mirror cannot push
	// to a non-existent remote, so the destination must exist before
	// step 3 configures the mirror. Skipped when GitHub client is unset
	// (degraded mode; see Config.GitHub doc).
	if s.cfg.GitHub != nil {
		if err := s.callCreateGitHubRepo(ctx, args); err != nil && !errors.Is(err, githubclient.ErrAlreadyExists) {
			return marshalPartial(res, stepCreateGitHub, err)
		}
		res.Reached = append(res.Reached, stepCreateGitHub)
	}

	// Step 3: configure push mirror to GitHub. Idempotent: if a mirror with
	// the same remote already exists, gitea-mcp returns Conflict; we swallow it.
	if err := s.callMirror(ctx, args.Name); err != nil {
		if !isConflict(err) {
			return marshalPartial(res, stepMirror, err)
		}
	}
	res.Reached = append(res.Reached, stepMirror)

	// Step 4: commit staging namespace manifest to infra repo. Done before
	// the issue so the staging env is reconciling by the time the issue lands.
	branch := fmt.Sprintf("staging/%s", args.Name)
	if err := s.callInfraCommit(ctx, args.Name, branch); err != nil {
		if !isConflict(err) {
			return marshalPartial(res, stepInfraCommit, err)
		}
	}
	res.Reached = append(res.Reached, stepInfraCommit)

	// Step 5: open the experiment-brief issue on the new repo.
	issueURL, err := s.callIssue(ctx, args, existed)
	if err != nil {
		return marshalPartial(res, stepIssue, err)
	}
	res.IssueURL = issueURL
	res.Reached = append(res.Reached, stepIssue)

	folder := args.Folder
	if folder == "" {
		folder = "."
	}
	res.NextSteps = fmt.Sprintf(
		"cd ~/dev/%s/%s && task new-project -- %s personal %s %s && git remote add origin http://gitea.d-ma.be/%s/%s.git && git push -u origin main",
		folder, args.Name, args.Name, folder, args.Stack, s.cfg.GiteaOwner, args.Name,
	)

	return marshalResult(res)
}
|
||||
|
||||
// callCreateRepo invokes create_project_from_template. Returns (existed, err)
|
||||
// where existed=true means the destination was already present and we should
|
||||
// treat it as a no-op success (idempotency).
|
||||
func (s *Skill) callCreateRepo(ctx context.Context, args createArgs, template string) (bool, error) {
|
||||
var out struct {
|
||||
HTMLURL string `json:"html_url"`
|
||||
}
|
||||
err := s.cfg.Client.CallTool(ctx, "create_project_from_template", map[string]any{
|
||||
"owner": s.cfg.GiteaOwner,
|
||||
"name": args.Name,
|
||||
"description": args.Description,
|
||||
"private": args.Private,
|
||||
"template_name": template,
|
||||
}, &out)
|
||||
if err == nil {
|
||||
return false, nil
|
||||
}
|
||||
if isConflict(err) {
|
||||
return true, nil
|
||||
}
|
||||
return false, err
|
||||
}
|
||||
|
||||
// callCreateGitHubRepo creates the empty destination repo on GitHub.
|
||||
// auto_init=false in githubclient so first push from gitea doesn't conflict
|
||||
// with an auto-generated README.
|
||||
func (s *Skill) callCreateGitHubRepo(ctx context.Context, args createArgs) error {
|
||||
_, err := s.cfg.GitHub.CreateRepo(ctx, args.Name, args.Description, args.Private)
|
||||
return err
|
||||
}
|
||||
|
||||
// callMirror configures the push mirror to GitHub.
|
||||
func (s *Skill) callMirror(ctx context.Context, name string) error {
|
||||
remote := fmt.Sprintf("https://github.com/%s/%s.git", s.cfg.GitHubOwner, name)
|
||||
return s.cfg.Client.CallTool(ctx, "repo_mirror_push", map[string]any{
|
||||
"owner": s.cfg.GiteaOwner,
|
||||
"name": name,
|
||||
"action": "add",
|
||||
"remote_address": remote,
|
||||
"remote_username": s.cfg.GitHubOwner,
|
||||
"remote_password": s.cfg.GitHubPAT,
|
||||
"interval": "8h0m0s",
|
||||
"sync_on_commit": true,
|
||||
}, nil)
|
||||
}
|
||||
|
||||
// callInfraCommit writes the staging namespace manifest into the infra repo
|
||||
// on a dedicated branch. Flux picks it up after merge.
|
||||
func (s *Skill) callInfraCommit(ctx context.Context, name, branch string) error {
|
||||
manifest := stagingNamespaceManifest(name, time.Now().UTC().Format(time.RFC3339))
|
||||
return s.cfg.Client.CallTool(ctx, "file_write_branch", map[string]any{
|
||||
"owner": s.cfg.GiteaOwner,
|
||||
"name": s.cfg.InfraRepo,
|
||||
"path": fmt.Sprintf("k3s/staging/%s/namespace.yaml", name),
|
||||
"content": manifest,
|
||||
"branch": branch,
|
||||
"base": "main",
|
||||
"message": fmt.Sprintf("feat(staging): add namespace for %s\n\nGenerated by hyperguild project_create.", name),
|
||||
}, nil)
|
||||
}
|
||||
|
||||
// callIssue opens the experiment-brief issue on the newly-created repo.
|
||||
// existed=true (repo pre-existed) still posts a new brief — repeated runs
|
||||
// can intentionally restate intent without colliding.
|
||||
func (s *Skill) callIssue(ctx context.Context, args createArgs, existed bool) (string, error) {
|
||||
body := experimentBrief(args, existed)
|
||||
var out struct {
|
||||
HTMLURL string `json:"html_url"`
|
||||
}
|
||||
err := s.cfg.Client.CallTool(ctx, "issue_create", map[string]any{
|
||||
"owner": s.cfg.GiteaOwner,
|
||||
"name": args.Name,
|
||||
"title": "experiment brief: " + args.Description,
|
||||
"body": body,
|
||||
}, &out)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
return out.HTMLURL, nil
|
||||
}
|
||||
|
||||
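// stagingNamespaceManifest renders the Namespace manifest for a project's
// staging environment. Note: created-at carries an RFC 3339 timestamp, and
// Kubernetes label values may not contain ':', so the API server would
// likely reject it as a label; it may belong under metadata.annotations.
func stagingNamespaceManifest(name, createdAt string) string {
	return fmt.Sprintf(`apiVersion: v1
kind: Namespace
metadata:
  name: staging-%s
  labels:
    managed-by: hyperguild
    project: %s
    created-at: "%s"
`, name, name, createdAt)
}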
|
||||
|
||||
func experimentBrief(args createArgs, existed bool) string {
|
||||
var b strings.Builder
|
||||
b.WriteString("## Hypothesis\n\n")
|
||||
b.WriteString(args.Hypothesis)
|
||||
b.WriteString("\n\n## Description\n\n")
|
||||
b.WriteString(args.Description)
|
||||
b.WriteString("\n\n## Stack\n\n`")
|
||||
b.WriteString(args.Stack)
|
||||
b.WriteString("`\n\n## Provisioning\n\n")
|
||||
b.WriteString("- Repo created from `template-")
|
||||
b.WriteString(args.Stack)
|
||||
b.WriteString("` on Gitea.\n")
|
||||
b.WriteString("- Push-mirror configured to GitHub.\n")
|
||||
b.WriteString("- Staging namespace manifest committed to infra repo.\n\n")
|
||||
if existed {
|
||||
b.WriteString("> Note: this repo already existed when `project_create` ran — provisioning steps were re-applied idempotently.\n")
|
||||
}
|
||||
return b.String()
|
||||
}
|
||||
|
||||
func validate(args createArgs) error {
|
||||
if args.Name == "" {
|
||||
return errors.New("name is required")
|
||||
}
|
||||
if args.Description == "" {
|
||||
return errors.New("description is required")
|
||||
}
|
||||
if args.Hypothesis == "" {
|
||||
return errors.New("hypothesis is required")
|
||||
}
|
||||
if args.Stack != "go-agent" && args.Stack != "go-web" {
|
||||
return fmt.Errorf("stack must be go-agent or go-web, got %q", args.Stack)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func templateFor(stack string) string {
|
||||
switch stack {
|
||||
case "go-agent":
|
||||
return "template-go-agent"
|
||||
default:
|
||||
return "template-go-web"
|
||||
}
|
||||
}
|
||||
|
||||
func isConflict(err error) bool {
|
||||
var me *mcpclient.Error
|
||||
if errors.As(err, &me) && me.Code == -32003 {
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func marshalResult(r createResult) (json.RawMessage, error) {
|
||||
b, err := json.Marshal(r)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("marshal result: %w", err)
|
||||
}
|
||||
return b, nil
|
||||
}
|
||||
|
||||
func marshalPartial(r createResult, step string, inner error) (json.RawMessage, error) {
|
||||
r.FailedStep = step
|
||||
b, _ := json.Marshal(r)
|
||||
return b, fmt.Errorf("project_create step %q failed: %w", step, inner)
|
||||
}
|
||||
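The partial-failure contract above (a JSON body carrying `reached` and `failed_step`, returned alongside the error) lends itself to a small resume helper on the caller side. This is a sketch under stated assumptions: `partialResult`, `stepOrder`, and `remainingSteps` are hypothetical names, and the step order is copied from the step constants in `handlers.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// partialResult mirrors only the resume-related fields of createResult.
type partialResult struct {
	Reached    []string `json:"reached,omitempty"`
	FailedStep string   `json:"failed_step,omitempty"`
}

// stepOrder lists the step names in the order handleCreate runs them.
var stepOrder = []string{
	"create_repo",
	"create_github_repo",
	"mirror",
	"infra_commit",
	"issue",
}

// remainingSteps decodes a project_create result and returns, in pipeline
// order, every step that is not recorded in "reached".
func remainingSteps(raw []byte) ([]string, error) {
	var p partialResult
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, fmt.Errorf("decode partial result: %w", err)
	}
	done := make(map[string]bool, len(p.Reached))
	for _, s := range p.Reached {
		done[s] = true
	}
	var todo []string
	for _, s := range stepOrder {
		if !done[s] {
			todo = append(todo, s)
		}
	}
	return todo, nil
}

func main() {
	raw := []byte(`{"reached":["create_repo","create_github_repo"],"failed_step":"mirror"}`)
	todo, err := remainingSteps(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(todo) // prints [mirror infra_commit issue]
}
```

In degraded mode (no GitHub client) `create_github_repo` never appears in `reached`, so a helper like this would list it as outstanding even though it was skipped on purpose; a caller would need to account for that.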
349
internal/skills/project/handlers_test.go
Normal file
@@ -0,0 +1,349 @@
|
||||
package project_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"strings"
|
||||
"sync"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/githubclient"
|
||||
"github.com/mathiasbq/supervisor/internal/mcpclient"
|
||||
"github.com/mathiasbq/supervisor/internal/skills/project"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
// fakeGitHub captures POST /user/repos calls.
|
||||
type fakeGitHub struct {
|
||||
mu sync.Mutex
|
||||
Calls []map[string]any
|
||||
ReturnError int // 0 = 201 Created, 422 = already exists, etc.
|
||||
}
|
||||
|
||||
func (g *fakeGitHub) handler() http.Handler {
|
||||
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
var args map[string]any
|
||||
_ = json.NewDecoder(r.Body).Decode(&args)
|
||||
g.mu.Lock()
|
||||
g.Calls = append(g.Calls, args)
|
||||
code := g.ReturnError
|
||||
g.mu.Unlock()
|
||||
switch code {
|
||||
case 0:
|
||||
w.WriteHeader(http.StatusCreated)
|
||||
_, _ = w.Write([]byte(`{"full_name":"mathiasb/x","html_url":"https://github.com/mathiasb/x","clone_url":"https://github.com/mathiasb/x.git"}`))
|
||||
case 422:
|
||||
w.WriteHeader(http.StatusUnprocessableEntity)
|
||||
_, _ = w.Write([]byte(`{"errors":[{"message":"name already exists on this account"}]}`))
|
||||
default:
|
||||
w.WriteHeader(code)
|
||||
_, _ = w.Write([]byte(`{"message":"boom"}`))
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
// fakeGiteaMCP implements just enough of the JSON-RPC tools/call surface
|
||||
// to drive project_create end-to-end without an actual gitea-mcp server.
|
||||
type fakeGiteaMCP struct {
|
||||
mu sync.Mutex
|
||||
// Recorded calls in order.
|
||||
Calls []recordedCall
|
||||
// Per-tool response. Default is a generic success object.
|
||||
Responses map[string]any
|
||||
// Per-tool error response, takes precedence over Responses.
|
||||
Errors map[string]rpcErr
|
||||
}
|
||||
|
||||
type rpcErr struct {
|
||||
Code int
|
||||
Message string
|
||||
}
|
||||
|
||||
type recordedCall struct {
|
||||
Tool string
|
||||
Args map[string]any
|
||||
}
|
||||
|
||||
func (f *fakeGiteaMCP) handler() http.Handler {
|
||||
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
var req struct {
|
||||
ID int `json:"id"`
|
||||
Params json.RawMessage `json:"params"`
|
||||
}
|
||||
_ = json.NewDecoder(r.Body).Decode(&req)
|
||||
var p struct {
|
||||
Name string `json:"name"`
|
||||
Arguments json.RawMessage `json:"arguments"`
|
||||
}
|
||||
_ = json.Unmarshal(req.Params, &p)
|
||||
var args map[string]any
|
||||
_ = json.Unmarshal(p.Arguments, &args)
|
||||
|
||||
f.mu.Lock()
|
||||
f.Calls = append(f.Calls, recordedCall{Tool: p.Name, Args: args})
|
||||
errResp, hasErr := f.Errors[p.Name]
|
||||
var resp any
|
||||
if v, ok := f.Responses[p.Name]; ok { // v rather than r, to avoid shadowing the *http.Request
resp = v
|
||||
} else {
|
||||
resp = map[string]any{"html_url": "http://gitea.example/" + p.Name}
|
||||
}
|
||||
f.mu.Unlock()
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
if hasErr {
|
||||
body, _ := json.Marshal(map[string]any{
|
||||
"jsonrpc": "2.0",
|
||||
"id": req.ID,
|
||||
"error": map[string]any{"code": errResp.Code, "message": errResp.Message},
|
||||
})
|
||||
_, _ = w.Write(body)
|
||||
return
|
||||
}
|
||||
respText, _ := json.Marshal(resp)
|
||||
body, _ := json.Marshal(map[string]any{
|
||||
"jsonrpc": "2.0",
|
||||
"id": req.ID,
|
||||
"result": map[string]any{
|
||||
"content": []map[string]any{{"type": "text", "text": string(respText)}},
|
||||
},
|
||||
})
|
||||
_, _ = w.Write(body)
|
||||
})
|
||||
}
|
||||
|
||||
func newSkill(t *testing.T, f *fakeGiteaMCP) (*project.Skill, *fakeGitHub) {
|
||||
t.Helper()
|
||||
srv := httptest.NewServer(f.handler())
|
||||
t.Cleanup(srv.Close)
|
||||
|
||||
gh := &fakeGitHub{}
|
||||
ghSrv := httptest.NewServer(gh.handler())
|
||||
t.Cleanup(ghSrv.Close)
|
||||
|
||||
return project.New(project.Config{
|
||||
Client: mcpclient.New(srv.URL, ""),
|
||||
GitHub: githubclient.New("ghp_test").WithBaseURL(ghSrv.URL),
|
||||
GiteaOwner: "mathias",
|
||||
GitHubOwner: "mathiasb",
|
||||
GitHubPAT: "ghp_test",
|
||||
InfraRepo: "infra",
|
||||
}), gh
|
||||
}
|
||||
|
||||
// newSkillNoGitHub builds a skill with the GitHub client unset — degraded
|
||||
// mode where the github-repo-creation step is skipped.
|
||||
func newSkillNoGitHub(t *testing.T, f *fakeGiteaMCP) *project.Skill {
|
||||
t.Helper()
|
||||
srv := httptest.NewServer(f.handler())
|
||||
t.Cleanup(srv.Close)
|
||||
return project.New(project.Config{
|
||||
Client: mcpclient.New(srv.URL, ""),
|
||||
GiteaOwner: "mathias",
|
||||
GitHubOwner: "mathiasb",
|
||||
InfraRepo: "infra",
|
||||
})
|
||||
}
|
||||
|
||||
func happyArgs() json.RawMessage {
|
||||
return json.RawMessage(`{
|
||||
"name":"my-experiment",
|
||||
"description":"One-line desc",
|
||||
"hypothesis":"We believe X produces Y",
|
||||
"folder":"AGENTS",
|
||||
"stack":"go-agent",
|
||||
"private":true
|
||||
}`)
|
||||
}
|
||||
|
||||
func TestProjectCreate_HappyPath(t *testing.T) {
|
||||
f := &fakeGiteaMCP{
|
||||
Responses: map[string]any{
|
||||
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
|
||||
},
|
||||
}
|
||||
skill, gh := newSkill(t, f)
|
||||
|
||||
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
|
||||
require.NoError(t, err)
|
||||
|
||||
var res map[string]any
|
||||
require.NoError(t, json.Unmarshal(out, &res))
|
||||
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment", res["gitea_url"])
|
||||
assert.Equal(t, "https://github.com/mathiasb/my-experiment", res["github_url"])
|
||||
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment/issues/1", res["issue_url"])
|
||||
assert.Contains(t, res["next_steps"], "cd ~/dev/AGENTS/my-experiment")
|
||||
assert.Contains(t, res["next_steps"], "git remote add origin")
|
||||
|
||||
// All 4 gitea-mcp calls in order.
|
||||
require.Len(t, f.Calls, 4)
|
||||
assert.Equal(t, "create_project_from_template", f.Calls[0].Tool)
|
||||
assert.Equal(t, "repo_mirror_push", f.Calls[1].Tool)
|
||||
assert.Equal(t, "file_write_branch", f.Calls[2].Tool)
|
||||
assert.Equal(t, "issue_create", f.Calls[3].Tool)
|
||||
|
||||
// GitHub repo created between create_project_from_template and mirror.
|
||||
require.Len(t, gh.Calls, 1)
|
||||
assert.Equal(t, "my-experiment", gh.Calls[0]["name"])
|
||||
assert.Equal(t, true, gh.Calls[0]["private"])
|
||||
assert.Equal(t, false, gh.Calls[0]["auto_init"])
|
||||
|
||||
// template selection wired from stack
|
||||
assert.Equal(t, "template-go-agent", f.Calls[0].Args["template_name"])
|
||||
// mirror config
|
||||
assert.Equal(t, "add", f.Calls[1].Args["action"])
|
||||
assert.Equal(t, "https://github.com/mathiasb/my-experiment.git", f.Calls[1].Args["remote_address"])
|
||||
assert.Equal(t, "ghp_test", f.Calls[1].Args["remote_password"])
|
||||
// infra commit path
|
||||
assert.Equal(t, "k3s/staging/my-experiment/namespace.yaml", f.Calls[2].Args["path"])
|
||||
assert.Contains(t, f.Calls[2].Args["content"], "name: staging-my-experiment")
|
||||
assert.Contains(t, f.Calls[2].Args["content"], "managed-by: hyperguild")
|
||||
// PAT must NOT appear in the response
|
||||
assert.NotContains(t, string(out), "ghp_test")
|
||||
|
||||
// reached records the github step too.
|
||||
reached := res["reached"].([]any)
|
||||
assert.Equal(t, []any{"create_repo", "create_github_repo", "mirror", "infra_commit", "issue"}, reached)
|
||||
}
|
||||
|
||||
func TestProjectCreate_GitHubExists_Idempotent(t *testing.T) {
|
||||
f := &fakeGiteaMCP{
|
||||
Responses: map[string]any{
|
||||
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
|
||||
},
|
||||
}
|
||||
skill, gh := newSkill(t, f)
|
||||
gh.ReturnError = 422 // already exists
|
||||
|
||||
_, err := skill.Handle(context.Background(), "project_create", happyArgs())
|
||||
require.NoError(t, err, "422 already-exists should be idempotent")
|
||||
require.Len(t, f.Calls, 4, "all gitea steps still run despite github 422")
|
||||
}
|
||||
|
||||
func TestProjectCreate_GitHubFails(t *testing.T) {
|
||||
f := &fakeGiteaMCP{}
|
||||
skill, gh := newSkill(t, f)
|
||||
gh.ReturnError = 401 // bad PAT
|
||||
|
||||
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
|
||||
require.Error(t, err)
|
||||
var res map[string]any
|
||||
require.NoError(t, json.Unmarshal(out, &res))
|
||||
assert.Equal(t, "create_github_repo", res["failed_step"])
|
||||
assert.Equal(t, []any{"create_repo"}, res["reached"])
|
||||
require.Len(t, f.Calls, 1, "mirror + later steps must not run when github creation fails")
|
||||
}
|
||||
|
||||
func TestProjectCreate_NoGitHubClient_DegradedMode(t *testing.T) {
|
||||
f := &fakeGiteaMCP{
|
||||
Responses: map[string]any{
|
||||
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
|
||||
},
|
||||
}
|
||||
skill := newSkillNoGitHub(t, f)
|
||||
|
||||
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
|
||||
require.NoError(t, err)
|
||||
var res map[string]any
|
||||
require.NoError(t, json.Unmarshal(out, &res))
|
||||
// reached does NOT include create_github_repo when client is nil.
|
||||
reached := res["reached"].([]any)
|
||||
assert.Equal(t, []any{"create_repo", "mirror", "infra_commit", "issue"}, reached)
|
||||
}
|
||||
|
||||
func TestProjectCreate_Idempotent_RepoExists(t *testing.T) {
|
||||
f := &fakeGiteaMCP{
|
||||
Errors: map[string]rpcErr{
|
||||
"create_project_from_template": {Code: -32003, Message: "already exists"},
|
||||
},
|
||||
Responses: map[string]any{
|
||||
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
|
||||
},
|
||||
}
|
||||
skill, _ := newSkill(t, f)
|
||||
|
||||
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
|
||||
require.NoError(t, err)
|
||||
|
||||
var res map[string]any
|
||||
require.NoError(t, json.Unmarshal(out, &res))
|
||||
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment", res["gitea_url"])
|
||||
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment/issues/1", res["issue_url"])
|
||||
|
||||
// Still ran all 4 gitea-mcp steps; idempotent flow falls through.
|
||||
require.Len(t, f.Calls, 4)
|
||||
}
|
||||
|
||||
func TestProjectCreate_MirrorFails(t *testing.T) {
|
||||
f := &fakeGiteaMCP{
|
||||
Errors: map[string]rpcErr{
|
||||
"repo_mirror_push": {Code: -32000, Message: "github unreachable"},
|
||||
},
|
||||
}
|
||||
skill, _ := newSkill(t, f)
|
||||
|
||||
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
|
||||
require.Error(t, err)
|
||||
assert.Contains(t, err.Error(), `"mirror" failed`)
|
||||
|
||||
var res map[string]any
|
||||
require.NoError(t, json.Unmarshal(out, &res))
|
||||
assert.Equal(t, "mirror", res["failed_step"])
|
||||
reached := res["reached"].([]any)
|
||||
assert.Equal(t, []any{"create_repo", "create_github_repo"}, reached)
|
||||
|
||||
// Two gitea-mcp calls were made: create_project_from_template, then the failed repo_mirror_push. The GitHub repo creation went to the fake GitHub server instead.
|
||||
require.Len(t, f.Calls, 2)
|
||||
}
|
||||
|
||||
func TestProjectCreate_InfraCommitFails(t *testing.T) {
|
||||
f := &fakeGiteaMCP{
|
||||
Errors: map[string]rpcErr{
|
||||
"file_write_branch": {Code: -32000, Message: "write rejected"},
|
||||
},
|
||||
}
|
||||
skill, _ := newSkill(t, f)
|
||||
|
||||
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
|
||||
require.Error(t, err)
|
||||
|
||||
var res map[string]any
|
||||
require.NoError(t, json.Unmarshal(out, &res))
|
||||
assert.Equal(t, "infra_commit", res["failed_step"])
|
||||
reached := res["reached"].([]any)
|
||||
assert.Equal(t, []any{"create_repo", "create_github_repo", "mirror"}, reached)
|
||||
require.Len(t, f.Calls, 3)
|
||||
}
|
||||
|
||||
func TestProjectCreate_ValidationErrors(t *testing.T) {
|
||||
f := &fakeGiteaMCP{}
|
||||
skill, _ := newSkill(t, f)
|
||||
cases := []struct {
|
||||
name string
|
||||
body string
|
||||
want string
|
||||
}{
|
||||
{"missing name", `{"description":"d","hypothesis":"h","stack":"go-agent"}`, "name"},
|
||||
{"missing description", `{"name":"x","hypothesis":"h","stack":"go-agent"}`, "description"},
|
||||
{"missing hypothesis", `{"name":"x","description":"d","stack":"go-agent"}`, "hypothesis"},
|
||||
{"bad stack", `{"name":"x","description":"d","hypothesis":"h","stack":"python"}`, "stack"},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
_, err := skill.Handle(context.Background(), "project_create", json.RawMessage(tc.body))
|
||||
require.Error(t, err)
|
||||
assert.True(t, strings.Contains(err.Error(), tc.want), "want %q in %v", tc.want, err)
|
||||
})
|
||||
}
|
||||
assert.Empty(t, f.Calls, "no upstream calls should occur on validation failure")
|
||||
}
|
||||
|
||||
func TestProjectCreate_UnknownTool(t *testing.T) {
|
||||
f := &fakeGiteaMCP{}
|
||||
skill, _ := newSkill(t, f)
|
||||
_, err := skill.Handle(context.Background(), "nope", happyArgs())
|
||||
require.Error(t, err)
|
||||
}
|
||||
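The idempotency tests above depend on `isConflict` recognising JSON-RPC code -32003 through `errors.As`. The sketch below illustrates why that check keeps working when an intermediate layer wraps the error with `%w`. `rpcError` is a hypothetical stand-in for `mcpclient.Error`, whose definition is not part of this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// rpcError is a hypothetical stand-in for mcpclient.Error.
type rpcError struct {
	Code    int
	Message string
}

func (e *rpcError) Error() string {
	return fmt.Sprintf("rpc error %d: %s", e.Code, e.Message)
}

// codeConflict is the code the handlers treat as "already exists".
const codeConflict = -32003

// isConflict reports whether err, or any error it wraps, is an rpcError
// carrying the Conflict code.
func isConflict(err error) bool {
	var re *rpcError
	return errors.As(err, &re) && re.Code == codeConflict
}

func main() {
	base := &rpcError{Code: codeConflict, Message: "already exists"}
	wrapped := fmt.Errorf("call create_project_from_template: %w", base)

	fmt.Println(isConflict(wrapped))                     // true: errors.As unwraps the chain
	fmt.Println(isConflict(errors.New("network down"))) // false: not an rpcError
}
```

Comparing with `==` or a type assertion would miss the wrapped case, which is presumably why the handler uses `errors.As`.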
100
internal/skills/project/skill.go
Normal file
@@ -0,0 +1,100 @@
|
||||
// Package project implements the `project_create` MCP tool: a single-call
|
||||
// pipeline that creates a Gitea repo from a template, configures push-mirror
|
||||
// to GitHub, commits a staging namespace manifest to the infra repo, and
|
||||
// opens an experiment-brief issue on the new repo. See hyperguild gitea
|
||||
// issue #10 for the design.
|
||||
package project
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/githubclient"
|
||||
"github.com/mathiasbq/supervisor/internal/mcpclient"
|
||||
"github.com/mathiasbq/supervisor/internal/registry"
|
||||
)
|
||||
|
||||
// Config holds the orchestration dependencies for the project skill.
|
||||
type Config struct {
|
||||
// Client talks to the gitea-mcp server. project_create makes
|
||||
// sequential calls (create_project_from_template, repo_mirror_push,
|
||||
// file_write_branch, issue_create) through this client.
|
||||
Client *mcpclient.Client
|
||||
|
||||
// GitHub is the client used to create the empty destination repo on
|
||||
// GitHub before the push-mirror is configured. Gitea's push-mirror
|
||||
// cannot push to a non-existent remote, so this step is mandatory
|
||||
// when GitHubPAT is set. Pass nil to skip github repo creation
|
||||
// entirely (degraded mode — mirror config will land but the actual
|
||||
// sync to github will fail until the repo exists).
|
||||
GitHub *githubclient.Client
|
||||
|
||||
// GiteaOwner is the org/user that owns the new repo and the infra repo
|
||||
// the namespace manifest is committed to (typically "mathias").
|
||||
GiteaOwner string
|
||||
|
||||
// GitHubOwner is the GitHub org/user the push-mirror targets
|
||||
// (typically "mathiasb").
|
||||
GitHubOwner string
|
||||
|
||||
// GitHubPAT is the personal access token used as the push-mirror
|
||||
// password and to create the destination repo on GitHub. Must have
|
||||
// `repo` scope. Never logged.
|
||||
GitHubPAT string
|
||||
|
||||
// InfraRepo is the name of the infra repo on Gitea where the
|
||||
// k3s/staging/<name>/namespace.yaml manifest gets committed
|
||||
// (typically "infra").
|
||||
InfraRepo string
|
||||
}
|
||||
|
||||
// Skill exposes project_create as an MCP tool.
|
||||
type Skill struct{ cfg Config }
|
||||
|
||||
// New constructs the project Skill.
|
||||
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
|
||||
|
||||
// Name returns the skill identifier.
|
||||
func (s *Skill) Name() string { return "project" }
|
||||
|
||||
// Tools returns the MCP tool definitions for this skill.
|
||||
func (s *Skill) Tools() []registry.ToolDef {
|
||||
schema, _ := json.Marshal(map[string]any{
|
||||
"type": "object",
|
||||
"properties": map[string]any{
|
||||
"name": map[string]any{
|
||||
"type": "string",
|
||||
"pattern": `^[a-z][a-z0-9-]{1,38}[a-z0-9]$`,
|
||||
"description": "Lowercase repo name. 3-40 chars, must start with a letter.",
|
||||
},
|
||||
"description": map[string]any{"type": "string"},
|
||||
"hypothesis": map[string]any{"type": "string"},
|
||||
"folder": map[string]any{
|
||||
"type": "string",
|
||||
"description": "Informational only — appears in next_steps. Example: AGENTS, AI, QKX.",
|
||||
},
|
||||
"stack": map[string]any{
|
||||
"type": "string",
|
||||
"enum": []string{"go-agent", "go-web"},
|
||||
"description": "Selects template-go-agent or template-go-web.",
|
||||
},
|
||||
"private": map[string]any{"type": "boolean"},
|
||||
},
|
||||
"required": []string{"name", "description", "hypothesis", "stack"},
|
||||
})
|
||||
return []registry.ToolDef{
|
||||
{
|
||||
Name: "project_create",
|
||||
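Description: "Bootstrap a new project: Gitea repo from template, GitHub push-mirror, staging namespace manifest, experiment-brief issue. Repo, mirror, and manifest steps are idempotent: re-running against an existing repo re-applies them and returns the same repo URLs, while a new brief issue is opened on each run.",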
|
||||
InputSchema: schema,
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
// Handle dispatches the tool call.
|
||||
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
|
||||
if tool != "project_create" {
|
||||
return nil, errUnknownTool(tool)
|
||||
}
|
||||
return s.handleCreate(ctx, args)
|
||||
}
|
||||
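The input schema above declares a `pattern` for `name`, while `validate` in `handlers.go` only rejects an empty name. If the MCP client does not enforce the schema, a server-side check along these lines could close that gap. This is a sketch, not part of the change: `namePattern` and `validateName` are hypothetical names, and only the regular expression is taken from the schema.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern is the same expression as the "pattern" in the tool schema:
// a lowercase letter, 1-38 letters/digits/hyphens, then a letter or digit.
var namePattern = regexp.MustCompile(`^[a-z][a-z0-9-]{1,38}[a-z0-9]$`)

// validateName returns an error when name does not match the schema pattern.
func validateName(name string) error {
	if !namePattern.MatchString(name) {
		return fmt.Errorf("name %q must match %s", name, namePattern)
	}
	return nil
}

func main() {
	for _, n := range []string{"my-experiment", "ab", "My-Repo", "trailing-"} {
		fmt.Printf("%-14s valid=%v\n", n, validateName(n) == nil)
	}
}
```

Compiling the expression once at package level with `regexp.MustCompile` avoids recompiling it on every call and surfaces a malformed pattern at startup.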
@@ -1,87 +0,0 @@
|
||||
// internal/skills/spec/handlers.go
|
||||
package spec
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/brain"
|
||||
"github.com/mathiasbq/supervisor/internal/session"
|
||||
)
|
||||
|
||||
type specArgs struct {
|
||||
ProjectRoot string `json:"project_root"`
|
||||
Requirements string `json:"requirements"`
|
||||
OutputPath string `json:"output_path"`
|
||||
Context string `json:"context"`
|
||||
Model string `json:"model"`
|
||||
SessionID string `json:"session_id"`
|
||||
}
|
||||
|
||||
// Handle dispatches the MCP tool call to the appropriate handler.
|
||||
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
|
||||
if tool != "spec" {
|
||||
return nil, fmt.Errorf("unknown tool: %s", tool)
|
||||
}
|
||||
var a specArgs
|
||||
if err := json.Unmarshal(args, &a); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if a.ProjectRoot == "" {
|
||||
return nil, fmt.Errorf("project_root is required")
|
||||
}
|
||||
if a.Requirements == "" {
|
||||
return nil, fmt.Errorf("requirements is required")
|
||||
}
|
||||
outputPath := a.OutputPath
|
||||
if outputPath == "" {
|
||||
outputPath = "docs/spec.md"
|
||||
}
|
||||
|
||||
model := a.Model
|
||||
if model == "" {
|
||||
model = s.cfg.DefaultModel
|
||||
}
|
||||
|
||||
brainCtx, _ := brain.Query(ctx, s.cfg.IngestBaseURL, a.Requirements+" "+a.Context, 3)
|
||||
|
||||
task := fmt.Sprintf(
|
||||
"phase: spec\nproject_root: %s\nrequirements: %s\noutput_path: %s\ncontext: %s\nmodel: %s",
|
||||
a.ProjectRoot, a.Requirements, outputPath, a.Context, model,
|
||||
)
|
||||
task = session.PrependHistory(s.cfg.SessionsDir, a.SessionID, "spec", task)
|
||||
if brainCtx != "" {
|
||||
task = brainCtx + "\n---\n\n" + task
|
||||
}
|
||||
|
||||
if s.cfg.CompleteFunc == nil {
|
||||
return nil, fmt.Errorf("no executor configured")
|
||||
}
|
||||
t0 := time.Now()
|
||||
text, dur, err := s.cfg.CompleteFunc(ctx, model, s.cfg.SkillPrompt, task)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
if a.SessionID != "" && s.cfg.SessionsDir != "" {
|
||||
msg := text
|
||||
if len(msg) > 200 {
|
||||
msg = msg[:200]
|
||||
}
|
||||
_ = session.Append(s.cfg.SessionsDir, a.SessionID, session.Entry{
|
||||
SessionID: a.SessionID,
|
||||
Timestamp: time.Now(),
|
||||
Skill: "spec",
|
||||
Phase: "spec",
|
||||
ProjectRoot: a.ProjectRoot,
|
||||
FinalStatus: "ok",
|
||||
ModelUsed: model,
|
||||
DurationMs: time.Since(t0).Milliseconds(),
|
||||
Message: msg,
|
||||
})
|
||||
}
|
||||
|
||||
return json.Marshal(map[string]any{"text": text, "model": model, "duration_ms": dur})
|
||||
}
|
||||
@@ -1,53 +0,0 @@
|
||||
// internal/skills/spec/handlers_test.go
|
||||
package spec_test
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"testing"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/skills/spec"
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
)
|
||||
|
||||
func TestSpecToolRegistered(t *testing.T) {
|
||||
sk := spec.New(spec.Config{SkillPrompt: "spec rules"})
|
||||
names := make([]string, 0)
|
||||
for _, tool := range sk.Tools() {
|
||||
names = append(names, tool.Name)
|
||||
}
|
||||
assert.Contains(t, names, "spec")
|
||||
}
|
||||
|
||||
func TestSpecRequiresProjectRoot(t *testing.T) {
|
||||
sk := spec.New(spec.Config{SkillPrompt: "s"})
|
||||
_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"requirements":"add login"}`))
|
||||
assert.ErrorContains(t, err, "project_root")
|
||||
}
|
||||
|
||||
func TestSpecRequiresRequirements(t *testing.T) {
|
||||
sk := spec.New(spec.Config{SkillPrompt: "s"})
|
||||
_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"project_root":"/tmp"}`))
|
||||
assert.ErrorContains(t, err, "requirements")
|
||||
}
|
||||
|
||||
func TestSpecCallsCompleteFunc(t *testing.T) {
|
||||
var capturedTask string
|
||||
fakeFn := func(_ context.Context, _, _, user string) (string, int64, error) {
|
||||
capturedTask = user
|
||||
return "# OAuth2 Login Spec\n\n## Overview\nImplement OAuth2 login flow.", 110, nil
|
||||
}
|
||||
|
||||
sk := spec.New(spec.Config{SkillPrompt: "spec rules", CompleteFunc: fakeFn, SessionsDir: t.TempDir()})
|
||||
out, err := sk.Handle(context.Background(), "spec", json.RawMessage(
|
||||
`{"project_root":"/tmp/proj","requirements":"add OAuth2 login","output_path":"docs/login-spec.md"}`,
|
||||
))
|
||||
require.NoError(t, err)
|
||||
assert.Contains(t, capturedTask, "OAuth2 login")
|
||||
assert.Contains(t, capturedTask, "docs/login-spec.md")
|
||||
|
||||
var result map[string]any
|
||||
require.NoError(t, json.Unmarshal(out, &result))
|
||||
assert.Contains(t, result["text"], "OAuth2 Login Spec")
|
||||
}
|
||||
@@ -1,56 +0,0 @@
|
||||
// internal/skills/spec/skill.go
|
||||
package spec
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/registry"
|
||||
)
|
||||
|
||||
// CompleteFunc is the function used to call a local model.
|
||||
type CompleteFunc func(ctx context.Context, model, system, user string) (string, int64, error)
|
||||
|
||||
// Config holds dependencies for the spec skill.
|
||||
type Config struct {
|
||||
SkillPrompt string
|
||||
DefaultModel string
|
||||
CompleteFunc CompleteFunc
|
||||
SessionsDir string
|
||||
IngestBaseURL string
|
||||
}
|
||||
|
||||
// Skill implements the spec MCP tool.
|
||||
type Skill struct{ cfg Config }
|
||||
|
||||
// New creates a new spec Skill.
|
||||
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
|
||||
|
||||
// Name returns the skill identifier.
|
||||
func (s *Skill) Name() string { return "spec" }
|
||||
|
||||
// Tools returns the MCP tool definitions for this skill.
|
||||
func (s *Skill) Tools() []registry.ToolDef {
|
||||
schema := func(required []string, props map[string]any) json.RawMessage {
|
||||
b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
|
||||
return b
|
||||
}
|
||||
str := map[string]any{"type": "string"}
|
||||
return []registry.ToolDef{
|
||||
{
|
||||
Name: "spec",
|
||||
Description: "Consult a local model to draft a structured implementation spec from requirements. Returns the spec text.",
|
||||
InputSchema: schema(
|
||||
[]string{"project_root", "requirements"},
|
||||
map[string]any{
|
||||
"project_root": str,
|
||||
"requirements": str,
|
||||
"output_path": str,
|
||||
"context": str,
|
||||
"model": str,
|
||||
"session_id": str,
|
||||
},
|
||||
),
|
||||
},
|
||||
}
|
||||
}
|
||||
@@ -1,173 +0,0 @@
|
||||
package tdd
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"time"
|
||||
|
||||
"github.com/mathiasbq/supervisor/internal/brain"
|
||||
"github.com/mathiasbq/supervisor/internal/session"
|
||||
)
|
||||
|
||||
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
|
||||
switch tool {
|
||||
case "tdd_red":
|
||||
return s.handleRed(ctx, args)
|
||||
case "tdd_green":
|
||||
return s.handleGreen(ctx, args)
|
||||
case "tdd_refactor":
|
||||
return s.handleRefactor(ctx, args)
|
||||
default:
|
||||
return nil, fmt.Errorf("unknown tool: %s", tool)
|
||||
}
|
||||
}
|
||||
|
||||
type redArgs struct {
|
||||
ProjectRoot string `json:"project_root"`
|
||||
Spec string `json:"spec"`
|
||||
Model string `json:"model"`
|
||||
TestCmd string `json:"test_cmd"`
|
||||
}
|
||||
|
||||
func (s *Skill) handleRed(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
|
||||
var args redArgs
|
||||
if err := json.Unmarshal(raw, &args); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if args.ProjectRoot == "" {
|
||||
return nil, fmt.Errorf("project_root is required")
|
||||
}
|
||||
if args.Spec == "" {
|
||||
return nil, fmt.Errorf("spec is required")
|
||||
}
|
||||
brainCtx, _ := brain.Query(ctx, s.cfg.IngestBaseURL, args.Spec, 3)
|
||||
|
||||
task := fmt.Sprintf(
|
||||
"phase: red\nproject_root: %s\nspec: %s\nmodel: %s\ntest_cmd: %s",
|
||||
args.ProjectRoot, args.Spec, s.resolveModel(args.Model), args.TestCmd,
|
||||
)
|
||||
if brainCtx != "" {
|
||||
task = brainCtx + "\n---\n\n" + task
|
||||
}
|
||||
return s.complete(ctx, s.resolveModel(args.Model), task)
|
||||
}
|
||||
|
||||
type greenArgs struct {
|
||||
ProjectRoot string `json:"project_root"`
|
||||
TestPath string `json:"test_path"`
|
||||
Model string `json:"model"`
|
||||
TestCmd string `json:"test_cmd"`
|
||||
SessionID string `json:"session_id"`
|
||||
}
|
||||
|
||||
func (s *Skill) handleGreen(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
|
||||
var args greenArgs
|
||||
if err := json.Unmarshal(raw, &args); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if args.ProjectRoot == "" {
|
||||
return nil, fmt.Errorf("project_root is required")
|
||||
}
|
||||
if args.TestPath == "" {
|
||||
return nil, fmt.Errorf("test_path is required")
|
||||
}
|
||||
task := fmt.Sprintf(
|
||||
"phase: green\nproject_root: %s\ntest_path: %s\nmodel: %s\ntest_cmd: %s",
|
||||
args.ProjectRoot, args.TestPath, s.resolveModel(args.Model), args.TestCmd,
|
||||
)
|
||||
task = session.PrependHistory(s.cfg.SessionsDir, args.SessionID, "green", task)
|
||||
|
||||
t0 := time.Now()
|
||||
result, err := s.complete(ctx, s.resolveModel(args.Model), task)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
s.logEntry(args.SessionID, args.ProjectRoot, "tdd", "green", s.resolveModel(args.Model), t0, result)
|
||||
return result, nil
|
||||
}
|
||||
|
||||
type refactorArgs struct {
|
||||
ProjectRoot string `json:"project_root"`
|
||||
TestPath string `json:"test_path"`
|
||||
ImplPath string `json:"impl_path"`
|
||||
Model string `json:"model"`
|
||||
TestCmd string `json:"test_cmd"`
|
||||
SessionID string `json:"session_id"`
|
||||
}
|
||||
|
||||
func (s *Skill) handleRefactor(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
|
||||
var args refactorArgs
|
||||
if err := json.Unmarshal(raw, &args); err != nil {
|
||||
return nil, fmt.Errorf("parse args: %w", err)
|
||||
}
|
||||
if args.ProjectRoot == "" {
|
||||
return nil, fmt.Errorf("project_root is required")
|
||||
}
|
||||
if args.TestPath == "" {
|
||||
return nil, fmt.Errorf("test_path is required")
|
||||
}
|
||||
if args.ImplPath == "" {
|
||||
return nil, fmt.Errorf("impl_path is required")
|
||||
}
|
||||
task := fmt.Sprintf(
|
||||
"phase: refactor\nproject_root: %s\ntest_path: %s\nimpl_path: %s\nmodel: %s\ntest_cmd: %s",
|
||||
args.ProjectRoot, args.TestPath, args.ImplPath, s.resolveModel(args.Model), args.TestCmd,
|
||||
)
|
||||
task = session.PrependHistory(s.cfg.SessionsDir, args.SessionID, "refactor", task)
|
||||
|
||||
t0 := time.Now()
|
||||
result, err := s.complete(ctx, s.resolveModel(args.Model), task)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
s.logEntry(args.SessionID, args.ProjectRoot, "tdd", "refactor", s.resolveModel(args.Model), t0, result)
|
||||
return result, nil
|
||||
}
|
||||
|
||||
func (s *Skill) resolveModel(override string) string {
|
||||
if override != "" {
|
||||
return override
|
||||
}
|
||||
return s.cfg.DefaultModel
|
||||
}
|
||||
|
||||
// complete calls CompleteFunc and returns the text as JSON.
|
||||
func (s *Skill) complete(ctx context.Context, model, task string) (json.RawMessage, error) {
|
||||
if s.cfg.CompleteFunc == nil {
|
||||
return nil, fmt.Errorf("no executor configured")
|
||||
}
|
||||
text, dur, err := s.cfg.CompleteFunc(ctx, model, s.cfg.SkillPrompt, task)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return json.Marshal(map[string]any{"text": text, "model": model, "duration_ms": dur})
|
||||
}
|
||||
|
||||
// logEntry writes a session.Entry for a completed phase if session_id is set.
|
||||
func (s *Skill) logEntry(sessionID, projectRoot, skill, phase, model string, t0 time.Time, raw json.RawMessage) {
|
||||
if sessionID == "" || s.cfg.SessionsDir == "" {
|
||||
return
|
||||
}
|
||||
var msg string
|
||||
var result struct {
|
||||
Text string `json:"text"`
|
||||
}
|
||||
if err := json.Unmarshal(raw, &result); err == nil && len(result.Text) > 0 {
|
||||
msg = result.Text
|
||||
if len(msg) > 200 {
|
||||
msg = msg[:200]
|
||||
}
|
||||
}
|
||||
_ = session.Append(s.cfg.SessionsDir, sessionID, session.Entry{
|
||||
SessionID: sessionID,
|
||||
Timestamp: time.Now(),
|
||||
Skill: skill,
|
||||
Phase: phase,
|
||||
ProjectRoot: projectRoot,
|
||||
FinalStatus: "ok",
|
||||
ModelUsed: model,
|
||||
DurationMs: time.Since(t0).Milliseconds(),
|
||||
Message: msg,
|
||||
})
|
||||
}
|
||||
Some files were not shown because too many files have changed in this diff