38 Commits

Author SHA1 Message Date
Mathias Bergqvist
cbf5cab5e7 docs(plans): pass-rate logging implementation plan — Plan 5
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
Seven-task plan spanning two worktrees: Phase 1 (local-dev) for
SKILL.md instrumentation (pilot tdd, then rollout to 6 binary-outcome
skills), Phase 2 (hyperguild) for the /pass-rate endpoint + CLI
subcommand. Uses GET for the pure-read endpoint (REST semantics)
despite the brain API's POST-everywhere precedent for legacy
endpoints.

Pilot-then-rollout structure: tdd SKILL.md lands first; the
endpoint TDD task naturally exercises the new logging contract
as the dogfood validation step before the other 6 skills get
instrumented.
2026-05-03 22:29:56 +02:00
Mathias Bergqvist
af52f501fe docs(specs): pass-rate logging design — Plan 5 of hyperguild migration
Skill instrumentation pattern, brain /pass-rate HTTP endpoint, and
optional hyperguild CLI subcommand for shell access. Pilot with tdd
SKILL.md, then roll out to 6 binary-outcome skills (code-review,
debug, feature-spec, session-retrospective, trainer, spec-driven-dev).

Decisions: SKILL.md as source of truth for the logging contract;
on-demand aggregation from JSONL (no materialized counters until
proven necessary); pass|fail|skip vocabulary forward, with
ok|error|skipped accepted by the read-side aggregator for backwards
compat.

Seven success criteria, ten out-of-scope items, six risks.
2026-05-03 22:23:28 +02:00
Mathias Bergqvist
b3b1fde825 feat: hyperguild CLI — Plan 4 of hyperguild migration
All checks were successful
CI / Lint / Test / Vet (push) Successful in 15s
CI / Mirror to GitHub (push) Successful in 3s
A small Go binary at cmd/hyperguild/ providing four subcommands:

  - tier              probe Anthropic + LiteLLM, print operating tier
  - brain query       BM25 search the brain via HTTP REST
  - brain write       write a knowledge entry from stdin
  - mode <name>       bootstrap .mcp.json (cloud|client-local|sovereign)

Reuses internal/tier verbatim. Brain access uses the HTTP REST API
at ${BRAIN_URL:-http://koala:30330} with POST /query and POST /write.
Mode templates include a Plan 6 routing-pod placeholder (client-local)
and a Crush-fallback note (sovereign).

Stdlib only — flag, net/http, encoding/json. 32 unit tests via
httptest.Server fakes; smoke-tested end-to-end against the live brain.

Replaces the supervisor's tier MCP. Plans 5 (pass-rate logging) and 6
(routing pod) build on the brain client and tier subcommand.

Plan: docs/superpowers/plans/2026-05-03-hyperguild-cli.md
Spec: docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md
2026-05-03 22:06:48 +02:00
Mathias Bergqvist
ab4cfaaeb7 fix(hyperguild): remove redundant subcommand prefixes from error messages
dispatch() already prefixes errors with 'hyperguild <subcmd>: ', so
handlers re-prefixing with their own name produced stuttered output
like 'hyperguild brain: brain query: topic required'. Strip the
redundant prefixes from the seven affected errors.New / fmt.Errorf
calls in brain.go and mode.go.

Also fix the LITELLM_BASE_URL usage text — it's optional (empty
falls through to airplane tier), not required, matching the
README.
2026-05-03 22:06:33 +02:00
Mathias Bergqvist
eb844edb29 docs(specs): correct brain Query method (POST not GET)
The brain HTTP REST /query endpoint accepts POST with JSON body
{query, limit}, not GET with URL query string. Surfaced empirically
during Plan 4 Task 7 smoke testing. Spec text now matches the
implementation.
2026-05-03 22:00:05 +02:00
Mathias Bergqvist
317ec20392 fix(hyperguild): brain Query uses POST /query with JSON body
The brain HTTP REST /query endpoint accepts POST with JSON
{query, limit}, not GET with URL query string. Surfaced by
Task 7 smoke testing — GET returned 405 Method Not Allowed.

The response shape ({results:[...]}) is unchanged; only the
request side flips to POST + JSON body. brainClient.Write was
already using POST + JSON body and is unaffected.

Tests updated to assert POST + JSON body on the Query path.
2026-05-03 21:59:17 +02:00
Mathias Bergqvist
eab8775f5f feat(hyperguild): README + Taskfile integration
Adds cmd/hyperguild/README.md (subcommands, env vars, install path)
and three Taskfile targets:

  task hyperguild:dev     — go run from source
  task hyperguild:build   — build into ./bin/hyperguild
  task hyperguild:install — go install into $GOBIN

Concludes Plan 4 of the hyperguild migration. The binary replaces
the supervisor's tier MCP and surfaces brain HTTP REST access plus
mode bootstrap to shell pipelines and ad-hoc agent prompts.
2026-05-03 21:56:20 +02:00
Mathias Bergqvist
a0d0914a85 feat(hyperguild): mode subcommand
'hyperguild mode <cloud|client-local|sovereign>' writes a per-mode
.mcp.json template:

  - cloud:        brain MCP only
  - client-local: brain + routing placeholder with _routing_pending
                  pointer to Plan 6
  - sovereign:    brain only + top-level _mode_note explaining Crush
                  is primary; .mcp.json is Claude Code fallback

Default output is ./.mcp.json; --out overrides; --force overwrites.
Brain URL sourced from BRAIN_URL (default http://koala:30330) so the
template stays in lockstep with the user's brain host.

All three subcommands now wired; notYet/errNotImplemented removed
from main.go.
2026-05-03 21:50:05 +02:00
Mathias Bergqvist
8f9642df69 feat(hyperguild): brain write subcommand
Reads markdown from stdin, POSTs to the brain's /write endpoint with
type + slug, prints the resulting path. Pairs with 'brain query' for
shell-friendly read/write access to the brain HTTP REST API.

Tests cover success, missing args, backend error propagation, and
empty stdin (which produces an empty content payload — the brain
server's responsibility to validate).
2026-05-03 21:42:36 +02:00
Mathias Bergqvist
cd5f3c0175 feat(hyperguild): brain query subcommand
Adds 'hyperguild brain query <topic>' against the brain HTTP REST
/query endpoint. Default human output prints path + score + title;
--json passes through the response envelope. --limit overrides the
default 5-result cap.

runBrainWrite remains a stub for Task 5.
2026-05-03 21:37:39 +02:00
Mathias Bergqvist
ed4966927c feat(hyperguild): brain HTTP REST client
Adds brainClient with Query and Write methods against the brain's
HTTP REST endpoints (/query, /write). Constructor reads BRAIN_URL env
var, defaulting to http://koala:30330 — the Tailscale-exposed
NodePort that serves both MCP and REST.

Tests cover success, transport error, and non-200 cases via
httptest fakes; URL override is verified via t.Setenv.
2026-05-03 21:32:48 +02:00
Mathias Bergqvist
3c4e8e8bb8 feat(hyperguild): tier subcommand
Adds the tier subcommand to the hyperguild CLI. Reuses
internal/tier.Detect verbatim, sources probe URLs from
ANTHROPIC_PROBE_URL (default https://api.anthropic.com) and
LITELLM_BASE_URL (no default — empty triggers airplane).

Human-readable output by default; --json emits the same Info struct
as the supervisor's tier MCP returns. Tests cover all three tier
states via httptest fakes.
2026-05-03 21:27:33 +02:00
Mathias Bergqvist
5c88eff46f feat(hyperguild): subcommand router skeleton
Lays down the cmd/hyperguild/ entry point. Defines the subcommand
contract (ctx, args, stdin, stdout, stderr) error, the dispatch()
function that's testable without os.Exit, and stubs for tier / brain /
mode that return errNotImplemented. Subsequent commits replace each
stub.

Part of Plan 4 (hyperguild CLI) of the hyperguild migration.
2026-05-03 21:21:08 +02:00
Mathias Bergqvist
646a86f2c3 docs(specs): fix brain URL port (3300 → 30330)
Pod-internal port is 3300; Tailscale-exposed NodePort is 30330.
External clients including the planned hyperguild CLI hit 30330.
2026-05-03 21:01:51 +02:00
Mathias Bergqvist
adf0504116 docs(specs): hyperguild CLI design — Plan 4 of hyperguild migration
Implementation-level spec for the hyperguild CLI: a stdlib Go binary
at cmd/hyperguild/ with subcommands tier, brain query/write, and mode
(cloud|client-local|sovereign). Replaces the supervisor's tier MCP and
provides shell-friendly access to the brain HTTP REST API.

Six measurable success criteria, seven out-of-scope items, six risks.
Decisions logged: stdlib flag + inline router (no cobra), reuse
internal/tier verbatim, BRAIN_URL env override, mode subcommand writes
.mcp.json with per-mode template plus placeholder for the Plan 6
routing pod.
2026-05-03 20:59:45 +02:00
Mathias Bergqvist
d44427e71f docs: document brain MCP endpoint at koala:30330
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
- README architecture diagram now shows two MCP servers (supervisor +
  brain) with the brain hosted by ingestion directly.
- Connect-a-project example includes both servers.
- .context/PROJECT.md replaces the boilerplate "Knowledge base access"
  block with the actual hyperguild MCP endpoints.
- Adapters regenerated via task context:sync.

Captures the transitional state where two MCPs coexist; the supervisor
MCP will shrink as skill workers move to SKILL.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 23:03:45 +02:00
Mathias Bergqvist
2635cdcaa7 chore: add brain MCP server alongside supervisor
The brain MCP at koala:30330 hosts the brain_* and session_log tools
formerly on supervisor. Supervisor stays connected during the
transition; its skill workers and the brain duplication will be
removed in a later plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 23:02:28 +02:00
Mathias Bergqvist
e922471229 fix(context-sync): short-circuit when root AGENT.md is unreachable
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
In CI's clean checkout the tree-walk for ~/dev/.context/AGENT.md
finds nothing, leaving ROOT_CONTEXT empty. The script previously
proceeded to regenerate AGENTS.md, .cursorrules,
.aider.conventions.md, and .context/system-prompt.txt as
project-only — but the committed versions are root+project, so
the drift gate added in cc401d9 fails CI on every push.

When no root context is reachable, only regenerate CLAUDE.md
(which is project-only by design — Claude Code walks up the tree
itself to find the root). The root-bearing adapters are left
untouched, eliminating the false-positive drift.

Local runs (with root context reachable) are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:55:43 +02:00
Mathias Bergqvist
87ff1f907c fix(ingestion): silence errcheck on resp.Body.Close in integration test
Some checks failed
CI / Lint / Test / Vet (push) Failing after 3s
CI / Mirror to GitHub (push) Has been skipped
CI's golangci-lint flagged the un-checked deferred Close. Match the
existing project pattern (defer func() { _ = ... }()).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:55:29 +02:00
Mathias Bergqvist
9cc179dec6 feat: brain MCP migration — extract brain_* + session_log into ingestion pod
Some checks failed
CI / Lint / Test / Vet (push) Failing after 2s
CI / Mirror to GitHub (push) Has been skipped
Slice 1 of the larger SKILL.md + routing-MCP architecture migration.
Adds an MCP HTTP handler to the ingestion service at POST /mcp,
exposing 5 tools (brain_query, brain_write, brain_ingest_raw,
brain_ingest, session_log). Plan and 9 implementation commits
preserved on the feat/brain-mcp-migration branch.

NodePort 30330 wired in infra repo separately (commit 008548e).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:05:08 +02:00
Mathias Bergqvist
370d30e376 feat(ingestion): mount MCP handler at POST /mcp
The ingestion server now exposes both REST and MCP on the same port
(3300). MCP shares brainDir, pipeline config, and LLM client with the
REST handlers — single source of process state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 19:41:05 +02:00
Mathias Bergqvist
bd0c1d75fd feat(ingestion): implement session_log MCP tool
Appends a JSON line to brainDir/sessions/<session_id>.jsonl using the
session package copied in Task 2. Required for upcoming pass-rate
logging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:04:40 +02:00
Mathias Bergqvist
8c87460bff feat(ingestion): implement brain_ingest MCP tool
Wraps pipeline.Run with the existing LLM client. Mirrors the HTTP
/ingest and /ingest-path semantics — accepts either path or
content+source, validates mutual exclusion, surfaces an explicit error
when the LLM client is not configured (test-mode).

ctx is threaded through to pipeline.Run for cancellation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:02:02 +02:00
Mathias Bergqvist
809d435480 feat(ingestion): implement brain_ingest_raw MCP tool
Wraps pipeline.RunRaw directly. Same dry-run semantics as the HTTP
/ingest-raw endpoint. Test exercises a single concept page; asserts
returned path and that no file is written under dry_run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:28:05 +02:00
Mathias Bergqvist
e4a94df4fc feat(ingestion): extract WriteNote helper and add brain_write MCP tool
api.WriteNote captures the file-write logic that was previously inline
in Handler.Write. The existing HTTP endpoint now delegates to it; the
new MCP brain_write tool reuses the same function. Path-traversal
guard is strengthened to explicitly reject filenames containing path
separators or "..", so the rejection is surfaced before filepath.Base
strips the suspicious component (the previous defense-in-depth prefix
check became unreachable for these inputs after Base normalisation).
HTTP error code for caller-input errors shifts from 500 to 400, which
is semantically correct and not exercised by any existing test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:25:38 +02:00
Mathias Bergqvist
7dcb5610fe feat(ingestion): implement brain_query MCP tool
Wraps the existing search.Query function. Same BM25 over
brain/knowledge/ and brain/wiki/ that the HTTP /query endpoint serves.
Plan note: handleCall switch replaces the single-line stub from Task 1
— no unknownToolError type to remove since Task 1 inlined the error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 09:56:40 +02:00
Mathias Bergqvist
63c8d114e8 feat(ingestion): add session package for JSONL log persistence
Copy of internal/session from the supervisor module — the ingestion
service needs it for the upcoming session_log MCP tool. The supervisor
copy will be removed in the supervisor-retirement plan; until then
the two packages are intentionally identical and pinned (no edits).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 09:54:24 +02:00
Mathias Bergqvist
54f7d373bd feat(ingestion): add MCP server skeleton with tools/list
Adds an MCP HTTP handler under ingestion/internal/mcp. Implements
initialize, tools/list, and the JSON-RPC notification skip from prior
work. Tool dispatch is stubbed (returns unknown-tool error) and will be
filled in by subsequent tasks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 09:43:23 +02:00
Mathias Bergqvist
a412eee427 docs: add brain MCP migration plan
13 TDD-disciplined tasks moving brain_* and session_log out of the
supervisor pod and into the ingestion pod's MCP handler. Slice 1 of
the larger SKILL.md + routing-MCP architecture migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 09:27:28 +02:00
Mathias Bergqvist
3d6f33881b chore: remove sources: from context:sync to disable Task source-cache
Some checks failed
CI / Lint / Test / Vet (push) Failing after 1s
CI / Mirror to GitHub (push) Has been skipped
Task's sources: declaration cached the regeneration on the assumption
that adapter outputs depend only on .context/AGENT.md (or PROJECT.md +
.skills/). The cache occasionally skipped legitimate runs after manual
edits to root or template content not detectable from those source paths.
context-sync.sh is idempotent and cheap; running it on every invocation
is the right default. The freshness gate (git status --porcelain) is
unaffected — it always checked the actual git tree state.
2026-04-30 23:14:20 +02:00
Mathias Bergqvist
07e3f341ef chore: re-sync context adapters after root prose cleanup
Some checks failed
CI / Lint / Test / Vet (push) Failing after 1s
CI / Mirror to GitHub (push) Has been skipped
Root AGENT.md dropped a stale paragraph; adapters that embed root
(AGENTS.md, .cursorrules, .aider.conventions.md, system-prompt.txt)
need to be regenerated to match. CLAUDE.md is project-only by design
and unchanged.
2026-04-30 23:00:31 +02:00
Mathias Bergqvist
5c532e708c chore: drop HANDROLLED sentinel machinery from context-sync.sh
Some checks failed
CI / Lint / Test / Vet (push) Failing after 2s
CI / Mirror to GitHub (push) Has been skipped
Nothing in the workspace uses the HANDROLLED escape hatch anymore — the
infra repo (its only consumer) was migrated to the standard pattern.
Removing the dead code: HANDROLLED_MARKER constant, skip_if_handrolled
function, and call sites in each generator. Sync output for any non-
HANDROLLED file (i.e., every file we have) is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:48:28 +02:00
Mathias Bergqvist
a34c66d7cd fix: update scripts/context-sync.sh with HANDROLLED sentinel support
Some checks failed
CI / Lint / Test / Vet (push) Failing after 1s
CI / Mirror to GitHub (push) Has been skipped
Phase 4 sweep updated .gitignore and Taskfile.yml but missed this
script. Copy the canonical from ~/dev/project-template/ so the
HANDROLLED escape hatch works in this project (a CLAUDE.md or
adapter file containing '<!-- HANDROLLED: do not regenerate -->'
near the top is now skipped on regeneration).
2026-04-29 16:41:56 +02:00
Mathias Bergqvist
cc401d92d6 chore: commit adapters; add context freshness gate to task check
Adapters are now tracked so non-Mac hosts get full agent context after
a plain git pull. task check runs context:sync first and fails on drift
via git status --porcelain over the 5 adapter paths.
2026-04-29 15:59:52 +02:00
Mathias Bergqvist
9bdf00f51f refactor(mcp): connect Claude Code via direct HTTP, remove stdio bridge
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
Claude Code now supports MCP servers over HTTP natively (type: "http"
in .mcp.json). The stdio↔HTTP bridge binary was a workaround for the
older stdio-only constraint and is no longer needed — the supervisor
NodePort on koala (30320) is reachable over Tailscale from any client
machine.

Removed:
- cmd/bridge/ (Go source, ~60 lines)
- bin/supervisor-bridge artifact
- Taskfile bridge:build target and the build aggregate's reference
- README "Build the bridge binary" instruction

Updated:
- .mcp.json switched to {type:"http", url:"http://koala:30320/mcp"}
- README architecture diagram and "Connect a project" section

Behavioural prerequisite for this change shipped in 7f7524c
(notifications fix). Verified end-to-end: tier tool call returns
{tier:2, label:"lan-only"} via direct HTTP, no shim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:38:58 +02:00
Mathias Bergqvist
7f7524c859 fix(mcp): do not respond to JSON-RPC notifications
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
The supervisor's MCP HTTP handler was answering every parsed request,
including notifications (messages with no id field). Per JSON-RPC 2.0,
notifications must not receive a response. The Apr-29 incident saw
Claude Code's MCP client receive a -32601 error for the standard
notifications/initialized handshake step and disconnect immediately
after a successful initialize.

Skip writing the response when req.ID == nil. Cover both the
known-method (notifications/initialized) and unknown-method paths
with tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:39:20 +02:00
Mathias Bergqvist
0a70d9e972 feat(pipeline): add POST /ingest-raw for direct batch ingestion without LLM
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Has been skipped
Allows callers to provide pre-structured RawPage data directly, bypassing the
LLM extraction step. The pipeline still handles slug computation, frontmatter,
link canonicalization, source back-references, and dedup — only the extraction
is skipped. Useful when a more capable model or manual curation produces the
structured data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 11:15:59 +02:00
Mathias Bergqvist
3e9a648115 fix(pipeline): repair invalid JSON escape sequences from LLM output before parsing
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 22:04:27 +02:00
44 changed files with 6506 additions and 152 deletions

244
.aider.conventions.md Normal file
View File

@@ -0,0 +1,244 @@
# Agent context — Mathias workspace
<!-- Canonical root context for all AI coding agents.
Lives at: ~/dev/.context/AGENT.md
Applies to every project under ~/dev/ unless overridden.
Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
Project-level context in .context/PROJECT.md layers on top of this. -->
## Who I am
I'm Mathias, a digital product manager and technology consultant based in Sweden.
I build software, research emerging tech, and deliver consulting engagements
for clients under NDA. I work across AI/ML, financial automation, web applications,
and climate/sustainability tech.
## How I work with agents
- I think like a product manager — I care about *why* before *how*
- I want agents to be opinionated and push back, not just execute blindly
- I prefer concise responses; skip ceremony and get to the point
- When I say "build this", I mean production-quality with tests, not a demo
- Ask me before making irreversible changes or adding heavy dependencies
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK
## Behavior rules
These rules apply to every task across every project, regardless of harness.
1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
Think before coding; if the problem is unclear, ask or state assumptions before acting.
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
files, and formatting alone. Diffs should be small and reviewable.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
## Default stack
| Layer | Default | Fallback | Last resort |
|-------|---------|----------|-------------|
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | — | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
## Code conventions
- **Go style**: golines, gofumpt, golangci-lint
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
## Infrastructure
Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
- **Orchestration**: k3s cluster across all three machines
- **Networking**: Tailscale mesh
## Project landscape
All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).
Organized in thematic folders:
| Folder | Focus | Count |
|--------|-------|-------|
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |
See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
### Key active projects
- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
When available, agents can query the shared knowledge base:
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
## Client work rules
When working on a project tagged with a client name:
1. Never send code, data, or context to cloud APIs — use local models only
2. Never reference other client projects or their data
3. Keep all artifacts within the client's git org / directory
4. Treat everything as confidential unless told otherwise
## Harness-agnostic principles
This context is designed to work with any AI coding tool:
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
- Pi Coding Agent, Mistral Vibe, Antigravity
- Any tool that accepts a system prompt or reads a markdown context file
The canonical source is always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.
## How context propagates
Canonical sources of truth:
- Universal: `~/dev/.context/AGENT.md` (this file)
- Project: `<repo>/.context/PROJECT.md` (per-repo)
Derived files (committed, regenerated by `task context:sync`):
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
`.context/system-prompt.txt`
Workflow:
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
derived together. Push.
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
3. `task check` runs `context:sync` then asserts `git status --porcelain`
is empty over the derived files (catches both modified-tracked drift
and missing-untracked adapters). A drift fails the check with a
message telling you to stage the regenerated files.
Behavior rules in this file and per-project rules in `PROJECT.md` apply
unconditionally on every host, every harness.
## Engineering Skills
Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.
See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.
Key skills:
- **TDD**: always write tests first — load `tdd` skill
- **Code Review**: load `code-review` skill before any review
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
- **Problem first**: load `problem-analysis` skill before coding non-trivial features
---
# Project context
<!-- Canonical project context. Edit this, run `task context:sync`.
Root agent context from ~/dev/.context/AGENT.md is automatically
prepended for harnesses that don't walk the directory tree. -->
## Identity
- **Name**: supervisor
- **Owner**: Mathias
- **Client**: personal
- **Repo**:
- **Status**: active
## Stack
- **Primary language**: Go
- **UI layer**: HTMX + Templ (when applicable)
- **Fallback languages**: Python, TypeScript (justify in PR if used)
- **Build**: Task (taskfile.dev), not Make
- **Containers**: Docker (compose for dev, k3s for deploy)
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
## Conventions
### Code style
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
### Architecture preferences
- Prefer standard library over frameworks (net/http over gin/echo)
- Dependency injection via constructor functions, not containers
- Configuration via environment variables, parsed at startup into a typed struct
- Structured logging via `slog`
### Git
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- Branch naming: `feat/short-description`, `fix/short-description`
- PRs: one concern per PR, description explains *why* not *what*
### Security
- No secrets in code, ever — use env vars or SOPS-encrypted files
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
## Agent instructions
When acting as a coding agent on this project:
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
2. Run `task check` before committing (lint + test + vet)
3. If unsure about a convention, check `DECISIONS.md` or ask
4. Never modify files outside the project root without explicit permission
5. When adding a dependency, explain why in the commit message
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM

View File

@@ -45,13 +45,21 @@
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## Knowledge base access
## MCP endpoints
This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers expose this project's tooling, both reachable over Tailscale:
- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
## Agent instructions

251
.context/system-prompt.txt Normal file
View File

@@ -0,0 +1,251 @@
You are a coding assistant working on a specific project.
Follow all conventions from both the root agent context and project context.
---
# Agent context — Mathias workspace
<!-- Canonical root context for all AI coding agents.
Lives at: ~/dev/.context/AGENT.md
Applies to every project under ~/dev/ unless overridden.
Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
Project-level context in .context/PROJECT.md layers on top of this. -->
## Who I am
I'm Mathias, a digital product manager and technology consultant based in Sweden.
I build software, research emerging tech, and deliver consulting engagements
for clients under NDA. I work across AI/ML, financial automation, web applications,
and climate/sustainability tech.
## How I work with agents
- I think like a product manager — I care about *why* before *how*
- I want agents to be opinionated and push back, not just execute blindly
- I prefer concise responses; skip ceremony and get to the point
- When I say "build this", I mean production-quality with tests, not a demo
- Ask me before making irreversible changes or adding heavy dependencies
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK
## Behavior rules
These rules apply to every task across every project, regardless of harness.
1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
Think before coding; if the problem is unclear, ask or state assumptions before acting.
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
files, and formatting alone. Diffs should be small and reviewable.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
## Default stack
| Layer | Default | Fallback | Last resort |
|-------|---------|----------|-------------|
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | — | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
## Code conventions
- **Go style**: golines, gofumpt, golangci-lint
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
## Infrastructure
Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
- **Orchestration**: k3s cluster across all three machines
- **Networking**: Tailscale mesh
## Project landscape
All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).
Organized in thematic folders:
| Folder | Focus | Count |
|--------|-------|-------|
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |
See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
### Key active projects
- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
When available, agents can query the shared knowledge base:
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
## Client work rules
When working on a project tagged with a client name:
1. Never send code, data, or context to cloud APIs — use local models only
2. Never reference other client projects or their data
3. Keep all artifacts within the client's git org / directory
4. Treat everything as confidential unless told otherwise
## Harness-agnostic principles
This context is designed to work with any AI coding tool:
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
- Pi Coding Agent, Mistral Vibe, Antigravity
- Any tool that accepts a system prompt or reads a markdown context file
The canonical source is always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.
## How context propagates
Canonical sources of truth:
- Universal: `~/dev/.context/AGENT.md` (this file)
- Project: `<repo>/.context/PROJECT.md` (per-repo)
Derived files (committed, regenerated by `task context:sync`):
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
`.context/system-prompt.txt`
Workflow:
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
derived together. Push.
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
3. `task check` runs `context:sync` then asserts `git status --porcelain`
is empty over the derived files (catches both modified-tracked drift
and missing-untracked adapters). A drift fails the check with a
message telling you to stage the regenerated files.
Behavior rules in this file and per-project rules in `PROJECT.md` apply
unconditionally on every host, every harness.
## Engineering Skills
Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.
See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.
Key skills:
- **TDD**: always write tests first — load `tdd` skill
- **Code Review**: load `code-review` skill before any review
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
- **Problem first**: load `problem-analysis` skill before coding non-trivial features
---
# Project context
<!-- Canonical project context. Edit this, run `task context:sync`.
Root agent context from ~/dev/.context/AGENT.md is automatically
prepended for harnesses that don't walk the directory tree. -->
## Identity
- **Name**: supervisor
- **Owner**: Mathias
- **Client**: personal
- **Repo**:
- **Status**: active
## Stack
- **Primary language**: Go
- **UI layer**: HTMX + Templ (when applicable)
- **Fallback languages**: Python, TypeScript (justify in PR if used)
- **Build**: Task (taskfile.dev), not Make
- **Containers**: Docker (compose for dev, k3s for deploy)
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
## Conventions
### Code style
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
### Architecture preferences
- Prefer standard library over frameworks (net/http over gin/echo)
- Dependency injection via constructor functions, not containers
- Configuration via environment variables, parsed at startup into a typed struct
- Structured logging via `slog`
### Git
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- Branch naming: `feat/short-description`, `fix/short-description`
- PRs: one concern per PR, description explains *why* not *what*
### Security
- No secrets in code, ever — use env vars or SOPS-encrypted files
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
## Agent instructions
When acting as a coding agent on this project:
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
2. Run `task check` before committing (lint + test + vet)
3. If unsure about a convention, check `DECISIONS.md` or ask
4. Never modify files outside the project root without explicit permission
5. When adding a dependency, explain why in the commit message
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM
---

247
.cursorrules Normal file
View File

@@ -0,0 +1,247 @@
# Cursor rules — auto-generated
# Do not edit. Run: task context:sync
# Agent context — Mathias workspace
<!-- Canonical root context for all AI coding agents.
Lives at: ~/dev/.context/AGENT.md
Applies to every project under ~/dev/ unless overridden.
Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
Project-level context in .context/PROJECT.md layers on top of this. -->
## Who I am
I'm Mathias, a digital product manager and technology consultant based in Sweden.
I build software, research emerging tech, and deliver consulting engagements
for clients under NDA. I work across AI/ML, financial automation, web applications,
and climate/sustainability tech.
## How I work with agents
- I think like a product manager — I care about *why* before *how*
- I want agents to be opinionated and push back, not just execute blindly
- I prefer concise responses; skip ceremony and get to the point
- When I say "build this", I mean production-quality with tests, not a demo
- Ask me before making irreversible changes or adding heavy dependencies
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK
## Behavior rules
These rules apply to every task across every project, regardless of harness.
1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
Think before coding; if the problem is unclear, ask or state assumptions before acting.
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
files, and formatting alone. Diffs should be small and reviewable.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
## Default stack
| Layer | Default | Fallback | Last resort |
|-------|---------|----------|-------------|
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | — | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
## Code conventions
- **Go style**: golines, gofumpt, golangci-lint
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
## Infrastructure
Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
- **Orchestration**: k3s cluster across all three machines
- **Networking**: Tailscale mesh
## Project landscape
All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).
Organized in thematic folders:
| Folder | Focus | Count |
|--------|-------|-------|
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |
See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
### Key active projects
- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
When available, agents can query the shared knowledge base:
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
## Client work rules
When working on a project tagged with a client name:
1. Never send code, data, or context to cloud APIs — use local models only
2. Never reference other client projects or their data
3. Keep all artifacts within the client's git org / directory
4. Treat everything as confidential unless told otherwise
## Harness-agnostic principles
This context is designed to work with any AI coding tool:
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
- Pi Coding Agent, Mistral Vibe, Antigravity
- Any tool that accepts a system prompt or reads a markdown context file
The canonical source is always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.
## How context propagates
Canonical sources of truth:
- Universal: `~/dev/.context/AGENT.md` (this file)
- Project: `<repo>/.context/PROJECT.md` (per-repo)
Derived files (committed, regenerated by `task context:sync`):
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
`.context/system-prompt.txt`
Workflow:
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
derived together. Push.
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
3. `task check` runs `context:sync` then asserts `git status --porcelain`
is empty over the derived files (catches both modified-tracked drift
and missing-untracked adapters). A drift fails the check with a
message telling you to stage the regenerated files.
Behavior rules in this file and per-project rules in `PROJECT.md` apply
unconditionally on every host, every harness.
## Engineering Skills
Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.
See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.
Key skills:
- **TDD**: always write tests first — load `tdd` skill
- **Code Review**: load `code-review` skill before any review
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
- **Problem first**: load `problem-analysis` skill before coding non-trivial features
---
# Project context
<!-- Canonical project context. Edit this, run `task context:sync`.
Root agent context from ~/dev/.context/AGENT.md is automatically
prepended for harnesses that don't walk the directory tree. -->
## Identity
- **Name**: supervisor
- **Owner**: Mathias
- **Client**: personal
- **Repo**:
- **Status**: active
## Stack
- **Primary language**: Go
- **UI layer**: HTMX + Templ (when applicable)
- **Fallback languages**: Python, TypeScript (justify in PR if used)
- **Build**: Task (taskfile.dev), not Make
- **Containers**: Docker (compose for dev, k3s for deploy)
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
## Conventions
### Code style
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
### Architecture preferences
- Prefer standard library over frameworks (net/http over gin/echo)
- Dependency injection via constructor functions, not containers
- Configuration via environment variables, parsed at startup into a typed struct
- Structured logging via `slog`
### Git
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- Branch naming: `feat/short-description`, `fix/short-description`
- PRs: one concern per PR, description explains *why* not *what*
### Security
- No secrets in code, ever — use env vars or SOPS-encrypted files
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
## Agent instructions
When acting as a coding agent on this project:
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
2. Run `task check` before committing (lint + test + vet)
3. If unsure about a convention, check `DECISIONS.md` or ask
4. Never modify files outside the project root without explicit permission
5. When adding a dependency, explain why in the commit message
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM

8
.gitignore vendored
View File

@@ -13,15 +13,7 @@ brain/training-data/**/*.jsonl
# Go
vendor/
# ── Generated context files (adapter outputs) ──
# Canonical sources: .context/PROJECT.md + .skills/*/SKILL.md
# Everything below is disposable — regenerate with: task context:sync
AGENTS.md
CLAUDE.md
.cursorrules
.aider.conventions.md
.aider.conf.yml
.context/system-prompt.txt
# ── Sensitive ──
.env

View File

@@ -1,10 +1,12 @@
{
"mcpServers": {
"supervisor": {
"command": "/Users/mathias/dev/AI/supervisor/bin/supervisor-bridge",
"env": {
"SUPERVISOR_URL": "http://koala:30320/mcp"
}
"type": "http",
"url": "http://koala:30320/mcp"
},
"brain": {
"type": "http",
"url": "http://koala:30330/mcp"
}
}
}

244
AGENTS.md Normal file
View File

@@ -0,0 +1,244 @@
# Agent context — Mathias workspace
<!-- Canonical root context for all AI coding agents.
Lives at: ~/dev/.context/AGENT.md
Applies to every project under ~/dev/ unless overridden.
Run `task context:sync` from ~/dev/ to regenerate harness-specific files.
Project-level context in .context/PROJECT.md layers on top of this. -->
## Who I am
I'm Mathias, a digital product manager and technology consultant based in Sweden.
I build software, research emerging tech, and deliver consulting engagements
for clients under NDA. I work across AI/ML, financial automation, web applications,
and climate/sustainability tech.
## How I work with agents
- I think like a product manager — I care about *why* before *how*
- I want agents to be opinionated and push back, not just execute blindly
- I prefer concise responses; skip ceremony and get to the point
- When I say "build this", I mean production-quality with tests, not a demo
- Ask me before making irreversible changes or adding heavy dependencies
- I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK
## Behavior rules
These rules apply to every task across every project, regardless of harness.
1. **No assumptions.** Don't hide confusion — surface it. Surface tradeoffs explicitly.
Think before coding; if the problem is unclear, ask or state assumptions before acting.
2. **Minimum viable code.** Solve with the smallest change that works. Nothing
speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
3. **Surgical changes.** Touch only what the task requires. Leave unrelated code,
files, and formatting alone. Diffs should be small and reviewable.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
## Default stack
| Layer | Default | Fallback | Last resort |
|-------|---------|----------|-------------|
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | — | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
## Code conventions
- **Go style**: golines, gofumpt, golangci-lint
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
## Infrastructure
Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
- **Model routing**: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
- **Orchestration**: k3s cluster across all three machines
- **Networking**: Tailscale mesh
## Project landscape
All development repos live at `~/dev/` (softlink from `~/Documents/local-dev/`).
Organized in thematic folders:
| Folder | Focus | Count |
|--------|-------|-------|
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |
See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
### Key active projects
- **super-koala** (`AGENTS/`) — multi-component agent stack with LangGraph, DSPy, MCP
- **azure-tiger** (`QKX/`) — invoice extraction → ISO 20022 payment instructions
- **gocrwl** (`AGENTS/`) — Go web crawler with containerized deployment
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
When available, agents can query the shared knowledge base:
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
## Client work rules
When working on a project tagged with a client name:
1. Never send code, data, or context to cloud APIs — use local models only
2. Never reference other client projects or their data
3. Keep all artifacts within the client's git org / directory
4. Treat everything as confidential unless told otherwise
## Harness-agnostic principles
This context is designed to work with any AI coding tool:
- Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
- Pi Coding Agent, Mistral Vibe, Antigravity
- Any tool that accepts a system prompt or reads a markdown context file
The canonical source is always `.context/AGENT.md` (root) and `.context/PROJECT.md` (per-project).
Derived files are committed (see *How context propagates* below) so a `git pull` on any host yields full agent context with no setup.
## How context propagates
Canonical sources of truth:
- Universal: `~/dev/.context/AGENT.md` (this file)
- Project: `<repo>/.context/PROJECT.md` (per-repo)
Derived files (committed, regenerated by `task context:sync`):
- `CLAUDE.md`, `AGENTS.md`, `.cursorrules`, `.aider.conventions.md`,
`.context/system-prompt.txt`
Workflow:
1. Edit a canonical file. Run `task context:sync`. Commit canonical and
derived together. Push.
2. On any other host, `git pull` brings both. Claude Code (tree-walking)
uses `CLAUDE.md`; Crush / Pi / Antigravity (cwd-only) use `AGENTS.md`;
Cursor uses `.cursorrules`; Aider uses `.aider.conventions.md`.
3. `task check` runs `context:sync` then asserts `git status --porcelain`
is empty over the derived files (catches both modified-tracked drift
and missing-untracked adapters). A drift fails the check with a
message telling you to stage the regenerated files.
Behavior rules in this file and per-project rules in `PROJECT.md` apply
unconditionally on every host, every harness.
## Engineering Skills
Shared engineering skills are available in `~/dev/.skills/`. Load on demand via the index.
See `~/dev/.skills/SKILLS_INDEX.md` for the full list with descriptions and "use when" triggers.
Key skills:
- **TDD**: always write tests first — load `tdd` skill
- **Code Review**: load `code-review` skill before any review
- **SOLID/Clean Code**: load `solid` or `clean-code` skill for design work
- **Problem first**: load `problem-analysis` skill before coding non-trivial features
---
# Project context
<!-- Canonical project context. Edit this, run `task context:sync`.
Root agent context from ~/dev/.context/AGENT.md is automatically
prepended for harnesses that don't walk the directory tree. -->
## Identity
- **Name**: supervisor
- **Owner**: Mathias
- **Client**: personal
- **Repo**:
- **Status**: active
## Stack
- **Primary language**: Go
- **UI layer**: HTMX + Templ (when applicable)
- **Fallback languages**: Python, TypeScript (justify in PR if used)
- **Build**: Task (taskfile.dev), not Make
- **Containers**: Docker (compose for dev, k3s for deploy)
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
## Conventions
### Code style
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
### Architecture preferences
- Prefer standard library over frameworks (net/http over gin/echo)
- Dependency injection via constructor functions, not containers
- Configuration via environment variables, parsed at startup into a typed struct
- Structured logging via `slog`
### Git
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- Branch naming: `feat/short-description`, `fix/short-description`
- PRs: one concern per PR, description explains *why* not *what*
### Security
- No secrets in code, ever — use env vars or SOPS-encrypted files
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
## Agent instructions
When acting as a coding agent on this project:
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
2. Run `task check` before committing (lint + test + vet)
3. If unsure about a convention, check `DECISIONS.md` or ask
4. Never modify files outside the project root without explicit permission
5. When adding a dependency, explain why in the commit message
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM

73
CLAUDE.md Normal file
View File

@@ -0,0 +1,73 @@
# Project context
<!-- Canonical project context. Edit this, run `task context:sync`.
Root agent context from ~/dev/.context/AGENT.md is automatically
prepended for harnesses that don't walk the directory tree. -->
## Identity
- **Name**: supervisor
- **Owner**: Mathias
- **Client**: personal
- **Repo**:
- **Status**: active
## Stack
- **Primary language**: Go
- **UI layer**: HTMX + Templ (when applicable)
- **Fallback languages**: Python, TypeScript (justify in PR if used)
- **Build**: Task (taskfile.dev), not Make
- **Containers**: Docker (compose for dev, k3s for deploy)
- **Target infra**: koala (GPU workloads), iguana (services), flamingo (edge)
## Conventions
### Code style
- Go: follow `golines`, `gofumpt`, `golangci-lint` with project config
- Tests: table-driven, in `_test.go` next to source, `testify` for assertions
- Errors: wrap with `fmt.Errorf("operation: %w", err)`, no naked returns
- Naming: stdlib conventions, no stuttering (`http.Client` not `http.HTTPClient`)
### Architecture preferences
- Prefer standard library over frameworks (net/http over gin/echo)
- Dependency injection via constructor functions, not containers
- Configuration via environment variables, parsed at startup into a typed struct
- Structured logging via `slog`
### Git
- Conventional commits: `feat:`, `fix:`, `chore:`, `docs:`, `refactor:`
- Branch naming: `feat/short-description`, `fix/short-description`
- PRs: one concern per PR, description explains *why* not *what*
### Security
- No secrets in code, ever — use env vars or SOPS-encrypted files
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## MCP endpoints
Two MCP servers expose this project's tooling, both reachable over Tailscale:
- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
`brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
by the ingestion service directly.
- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
migration.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
shell scripts and non-MCP clients.
## Agent instructions
When acting as a coding agent on this project:
1. Read this file and all `SKILL.md` files in `.skills/` before starting work
2. Run `task check` before committing (lint + test + vet)
3. If unsure about a convention, check `DECISIONS.md` or ask
4. Never modify files outside the project root without explicit permission
5. When adding a dependency, explain why in the commit message
6. For client projects: never send code or context to cloud APIs — use local models via LiteLLM

View File

@@ -10,10 +10,11 @@ into a searchable brain.
```
Your Claude Code session (in any project)
│ MCP tools (over stdio bridge → HTTP)
supervisor :3200 — skill workers: tdd, retrospective
ingestion :3300 — brain HTTP API: query wiki, write notes
│ MCP over HTTP (Tailscale)
├──▶ supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
└──▶ brain :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
└─ also serves the legacy REST endpoints (/query, /write, /ingest, …)
brain/
@@ -55,18 +56,28 @@ Create `.mcp.json` in your project root:
{
"mcpServers": {
"supervisor": {
"command": "/Users/mathias/dev/AI/supervisor/bin/supervisor-bridge",
"env": {
"SUPERVISOR_URL": "http://localhost:3200/mcp"
}
"type": "http",
"url": "http://koala:30320/mcp"
},
"brain": {
"type": "http",
"url": "http://koala:30330/mcp"
}
}
}
```
Build the bridge binary once: `task bridge:build`
Two MCP servers are exposed today, both reachable over Tailscale:
Then open Claude Code in your project — run `/mcp` to confirm `supervisor` is listed.
- **`supervisor`** at `koala:30320` — skill workers (`tdd_red/green/refactor`,
`review`, `debug`, `spec`, `retrospective`, `trainer`, `tier`).
- **`brain`** at `koala:30330` — knowledge access (`brain_query`, `brain_write`,
`brain_ingest`, `brain_ingest_raw`) and `session_log`. Hosted by the ingestion
service directly, no separate pod.
No local binary or stdio shim is required — Claude Code talks to both via HTTP.
Open Claude Code in your project — run `/mcp` to confirm both servers are listed.
## A typical TDD session

View File

@@ -12,9 +12,6 @@ tasks:
desc: Regenerate all harness-specific context files
cmds:
- bash scripts/context-sync.sh
sources:
- .context/PROJECT.md
- .skills/*/SKILL.md
context:sync:claude:
cmds: [bash scripts/context-sync.sh claude]
@@ -42,6 +39,22 @@ tasks:
cmds:
- go run ./cmd/supervisor
hyperguild:dev:
desc: Run hyperguild CLI from source (e.g. task hyperguild:dev -- tier)
cmds:
- go run ./cmd/hyperguild {{.CLI_ARGS}}
hyperguild:build:
desc: Build the hyperguild binary into ./bin/hyperguild
cmds:
- mkdir -p bin
- go build -o bin/hyperguild ./cmd/hyperguild
hyperguild:install:
desc: Install hyperguild into $GOBIN
cmds:
- go install ./cmd/hyperguild
ingestion:dev:
desc: Run ingestion server in development mode
dir: ingestion
@@ -57,7 +70,6 @@ tasks:
desc: Build all binaries
cmds:
- task: supervisor:build
- task: bridge:build
- task: ingestion:build
supervisor:build:
@@ -65,11 +77,6 @@ tasks:
cmds:
- go build -trimpath -ldflags="-s -w -X main.version={{.VERSION}}" -o bin/supervisor ./cmd/supervisor
bridge:build:
desc: Build stdio↔HTTP bridge for Claude Code MCP integration
cmds:
- go build -trimpath -ldflags="-s -w" -o bin/supervisor-bridge ./cmd/bridge
ingestion:build:
desc: Build ingestion server binary
dir: ingestion
@@ -79,8 +86,20 @@ tasks:
# ── Quality ────────────────────────────────────────────────────────────────
check:
desc: Run all checks (lint + test + vet) across all modules
desc: Run all checks (context freshness + lint + test + vet) across all modules
cmds:
- task: context:sync
- cmd: |
drift=$(git status --porcelain -- AGENTS.md CLAUDE.md .cursorrules .aider.conventions.md .context/system-prompt.txt 2>/dev/null)
if [ -n "$drift" ]; then
echo "ERROR: derived adapters drifted from canonical context." >&2
echo "$drift" >&2
echo "" >&2
echo "Run: git add AGENTS.md CLAUDE.md .cursorrules .aider.conventions.md .context/system-prompt.txt" >&2
echo " git commit -m 'chore: re-sync context adapters'" >&2
exit 1
fi
echo "✓ context: canonical and adapters are in sync"
- task: lint
- task: test
- task: vet

View File

@@ -1,59 +0,0 @@
// bridge is a stdio↔HTTP adapter that lets Claude Code connect to the
// supervisor MCP server via the stdio transport.
//
// Claude Code spawns this binary as a subprocess and communicates over
// stdin/stdout. Each newline-delimited JSON-RPC message from stdin is
// forwarded to the supervisor HTTP server and the response is written back.
//
// Usage:
//
// SUPERVISOR_URL=http://localhost:3200/mcp bridge
package main
import (
"bufio"
"bytes"
"fmt"
"io"
"net/http"
"os"
)
func main() {
url := os.Getenv("SUPERVISOR_URL")
if url == "" {
url = "http://localhost:3200/mcp"
}
client := &http.Client{}
scanner := bufio.NewScanner(os.Stdin)
scanner.Buffer(make([]byte, 1024*1024), 1024*1024)
for scanner.Scan() {
line := scanner.Bytes()
if len(bytes.TrimSpace(line)) == 0 {
continue
}
req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(line))
if err != nil {
fmt.Fprintf(os.Stderr, "bridge: build request: %v\n", err)
continue
}
req.Header.Set("Content-Type", "application/json")
resp, err := client.Do(req)
if err != nil {
fmt.Fprintf(os.Stderr, "bridge: request failed: %v\n", err)
continue
}
_, _ = io.Copy(os.Stdout, resp.Body)
_ = resp.Body.Close()
_, _ = os.Stdout.Write([]byte("\n"))
}
if err := scanner.Err(); err != nil {
fmt.Fprintf(os.Stderr, "bridge: scanner: %v\n", err)
os.Exit(1)
}
}

110
cmd/hyperguild/README.md Normal file
View File

@@ -0,0 +1,110 @@
# hyperguild CLI
A small Go binary for tier probing, brain HTTP REST access, and
`.mcp.json` mode bootstrap. Replaces the supervisor's `tier` MCP and
gives shell scripts a stable interface to the brain.
## Install
```bash
task hyperguild:install
# or: go install ./cmd/hyperguild
```
The binary lands at `$(go env GOBIN)/hyperguild` (typically
`~/go/bin/hyperguild`). Make sure that's on your PATH.
## Subcommands
### `hyperguild tier`
Probes Anthropic and LiteLLM and reports the current operating tier.
```bash
$ hyperguild tier
tier 1 (full-online) managed_agents=true
$ hyperguild tier --json
{
"tier": 1,
"label": "full-online",
"available_models": null,
"managed_agents": true
}
```
Probe URLs are read from environment:
| Var | Default |
|-----------------------|-------------------------------|
| `ANTHROPIC_PROBE_URL` | `https://api.anthropic.com` |
| `LITELLM_BASE_URL` | (empty → falls through to airplane) |
### `hyperguild brain query <topic>`
BM25 search over the brain's knowledge + wiki entries.
```bash
$ hyperguild brain query "find -H symlink"
knowledge/2026-05-03-find-h-not-l-symlinked-root.md score=12 Use find -H, not find -L
...
```
Flags:
- `--limit N` — max results (default 5)
- `--json` — emit the raw response envelope
### `hyperguild brain write <type> <slug>`
Reads markdown from stdin, writes a knowledge entry.
```bash
$ cat <<EOF | hyperguild brain write knowledge example-lesson
# Example lesson
## Lesson
...
EOF
knowledge/example-lesson.md
```
### `hyperguild mode <cloud|client-local|sovereign>`
Writes a `.mcp.json` template for the chosen operating mode.
```bash
$ hyperguild mode cloud --out ./.mcp.json
wrote ./.mcp.json (mode: cloud)
```
Flags:
- `--out PATH` — output file (default `./.mcp.json`)
- `--force` — overwrite an existing file
Modes:
- **cloud** — brain MCP only. Claude Code with no routing.
- **client-local** — brain + routing placeholder. The routing entry's
URL points at `koala:30310/mcp`; a `_routing_pending` field marks it
as awaiting Plan 6 of the hyperguild migration.
- **sovereign** — brain only, with a `_mode_note` explaining that this
mode primarily uses Crush + LiteLLM and the `.mcp.json` is a Claude
Code fallback for emergency offline use.
## Environment
| Var | Default | Used by |
|-----------------------|--------------------------|---------------------|
| `BRAIN_URL` | `http://koala:30330` | `brain *`, `mode *` |
| `ANTHROPIC_PROBE_URL` | `https://api.anthropic.com` | `tier` |
| `LITELLM_BASE_URL` | (empty) | `tier` |
Override `BRAIN_URL` if your brain pod is at a different Tailscale name
or port.
## See also
- `docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md` — full spec
- `docs/superpowers/plans/2026-05-03-hyperguild-cli.md` — implementation plan

73
cmd/hyperguild/brain.go Normal file
View File

@@ -0,0 +1,73 @@
package main
import (
"context"
"encoding/json"
"errors"
"flag"
"fmt"
"io"
)
func runBrain(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error {
if len(args) == 0 {
return errors.New("subcommand required (query|write)")
}
switch args[0] {
case "query":
return runBrainQuery(ctx, args[1:], stdin, stdout, stderr)
case "write":
return runBrainWrite(ctx, args[1:], stdin, stdout, stderr)
default:
return fmt.Errorf("unknown subcommand: %s (expected query|write)", args[0])
}
}
func runBrainQuery(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
fs := flag.NewFlagSet("brain query", flag.ContinueOnError)
fs.SetOutput(stderr)
asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
limit := fs.Int("limit", 5, "maximum number of results")
if err := fs.Parse(args); err != nil {
return fmt.Errorf("parse flags: %w", err)
}
if fs.NArg() < 1 {
return errors.New("topic required")
}
topic := fs.Arg(0)
res, err := newBrainClient().Query(ctx, topic, *limit)
if err != nil {
return err
}
if *asJSON {
enc := json.NewEncoder(stdout)
enc.SetIndent("", " ")
return enc.Encode(res)
}
for _, hit := range res.Results {
fmt.Fprintf(stdout, "%s score=%d %s\n", hit.Path, hit.Score, hit.Title) //nolint:errcheck
}
return nil
}
func runBrainWrite(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error {
fs := flag.NewFlagSet("brain write", flag.ContinueOnError)
fs.SetOutput(stderr)
if err := fs.Parse(args); err != nil {
return fmt.Errorf("parse flags: %w", err)
}
if fs.NArg() < 2 {
return errors.New("type and slug required (e.g. brain write knowledge my-slug)")
}
kind := fs.Arg(0)
slug := fs.Arg(1)
res, err := newBrainClient().Write(ctx, kind, slug, stdin)
if err != nil {
return err
}
fmt.Fprintln(stdout, res.Path) //nolint:errcheck
return nil
}

View File

@@ -0,0 +1,156 @@
package main
import (
"bytes"
"context"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func brainQueryServer(t *testing.T, body string) *httptest.Server {
t.Helper()
return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(body))
}))
}
func TestRunBrainQuery_Human(t *testing.T) {
srv := brainQueryServer(t, `{"results":[{"path":"knowledge/a.md","title":"A","excerpt":"...","score":9},{"path":"knowledge/b.md","title":"B","excerpt":"...","score":3}]}`)
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query", "topic"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
got := out.String()
assert.Contains(t, got, "knowledge/a.md")
assert.Contains(t, got, "score=9")
assert.Contains(t, got, "knowledge/b.md")
}
func TestRunBrainQuery_JSON(t *testing.T) {
srv := brainQueryServer(t, `{"results":[{"path":"x.md","title":"X","excerpt":"e","score":5}]}`)
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query", "--json", "topic"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), `"path": "x.md"`)
assert.Contains(t, out.String(), `"score": 5`)
}
func TestRunBrainQuery_Limit(t *testing.T) {
gotLimit := -1
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
var p struct {
Query string `json:"query"`
Limit int `json:"limit"`
}
_ = json.Unmarshal(body, &p)
gotLimit = p.Limit
_, _ = w.Write([]byte(`{"results":[]}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query", "--limit", "12", "topic"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Equal(t, 12, gotLimit)
}
func TestRunBrainQuery_MissingTopic(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query"}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
}
func TestRunBrain_NoSubsubcommand(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
assert.Contains(t, err.Error(), "subcommand required")
}
func TestRunBrain_UnknownSubsubcommand(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"bogus"}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
}
func TestRunBrainWrite_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodPost, r.Method)
assert.Equal(t, "/write", r.URL.Path)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"path":"knowledge/test-slug.md"}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(
context.Background(),
[]string{"write", "knowledge", "test-slug"},
strings.NewReader("# Test\n\nSome body content.\n"),
&out, &errBuf,
)
require.NoError(t, err)
assert.Contains(t, out.String(), "knowledge/test-slug.md")
}
func TestRunBrainWrite_MissingArgs(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"write", "knowledge"}, strings.NewReader("x"), &out, &errBuf)
assert.Error(t, err)
assert.Contains(t, err.Error(), "type and slug required")
}
func TestRunBrainWrite_BackendError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusBadRequest)
_, _ = w.Write([]byte("invalid slug"))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(
context.Background(),
[]string{"write", "knowledge", "bad slug"},
strings.NewReader("body"),
&out, &errBuf,
)
assert.Error(t, err)
assert.Contains(t, err.Error(), "400")
}
func TestRunBrainWrite_EmptyStdin(t *testing.T) {
gotLen := -1
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
var p struct {
Content string `json:"content"`
}
_ = json.Unmarshal(body, &p)
gotLen = len(p.Content)
_, _ = w.Write([]byte(`{"path":"x.md"}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"write", "knowledge", "empty"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Equal(t, 0, gotLen, "empty stdin should produce empty content payload")
}

121
cmd/hyperguild/http.go Normal file
View File

@@ -0,0 +1,121 @@
package main
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"os"
"time"
)
const defaultBrainURL = "http://koala:30330"
// brainClient calls the brain HTTP REST API exposed alongside the MCP
// endpoint at the same host:port. /mcp serves MCP framing; /query and /write
// serve plain REST. We use the REST surface because the CLI is a
// shell-friendly client; MCP framing is unnecessary.
type brainClient struct {
baseURL string
http *http.Client
}
func newBrainClient() *brainClient {
u := os.Getenv("BRAIN_URL")
if u == "" {
u = defaultBrainURL
}
return &brainClient{
baseURL: u,
http: &http.Client{Timeout: 5 * time.Second},
}
}
// QueryHit mirrors a single result from the brain's /query endpoint.
type QueryHit struct {
Path string `json:"path"`
Title string `json:"title"`
Excerpt string `json:"excerpt"`
Score int `json:"score"`
}
// QueryResult mirrors the /query response envelope.
type QueryResult struct {
Results []QueryHit `json:"results"`
}
func (c *brainClient) Query(ctx context.Context, topic string, limit int) (*QueryResult, error) {
payload, err := json.Marshal(struct {
Query string `json:"query"`
Limit int `json:"limit"`
}{Query: topic, Limit: limit})
if err != nil {
return nil, fmt.Errorf("marshal payload: %w", err)
}
u := c.baseURL + "/query"
req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, bytes.NewReader(payload))
if err != nil {
return nil, fmt.Errorf("build request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
resp, err := c.http.Do(req)
if err != nil {
return nil, fmt.Errorf("brain POST /query: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
return nil, fmt.Errorf("brain POST /query: status %d: %s", resp.StatusCode, string(body))
}
var out QueryResult
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
return nil, fmt.Errorf("decode /query response: %w", err)
}
return &out, nil
}
// WriteResult mirrors the /write response envelope.
type WriteResult struct {
Path string `json:"path"`
}
func (c *brainClient) Write(ctx context.Context, kind, slug string, content io.Reader) (*WriteResult, error) {
body, err := io.ReadAll(content)
if err != nil {
return nil, fmt.Errorf("read content: %w", err)
}
payload, err := json.Marshal(struct {
Type string `json:"type"`
Slug string `json:"slug"`
Content string `json:"content"`
}{Type: kind, Slug: slug, Content: string(body)})
if err != nil {
return nil, fmt.Errorf("marshal payload: %w", err)
}
u := c.baseURL + "/write"
req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, bytes.NewReader(payload))
if err != nil {
return nil, fmt.Errorf("build request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
resp, err := c.http.Do(req)
if err != nil {
return nil, fmt.Errorf("brain POST /write: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return nil, fmt.Errorf("brain POST /write: status %d: %s", resp.StatusCode, string(respBody))
}
var out WriteResult
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
return nil, fmt.Errorf("decode /write response: %w", err)
}
return &out, nil
}

View File

@@ -0,0 +1,97 @@
package main
import (
"context"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestBrainClient_Query_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodPost, r.Method)
assert.Equal(t, "/query", r.URL.Path)
body, _ := io.ReadAll(r.Body)
var got struct {
Query string `json:"query"`
Limit int `json:"limit"`
}
require.NoError(t, json.Unmarshal(body, &got))
assert.Equal(t, "find-h", got.Query)
assert.Equal(t, 3, got.Limit)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"results":[{"path":"knowledge/x.md","title":"x","excerpt":"...","score":7}]}`))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
res, err := c.Query(context.Background(), "find-h", 3)
require.NoError(t, err)
require.Len(t, res.Results, 1)
assert.Equal(t, "knowledge/x.md", res.Results[0].Path)
assert.Equal(t, 7, res.Results[0].Score)
}
func TestBrainClient_Query_TransportError(t *testing.T) {
c := &brainClient{baseURL: "http://127.0.0.1:1", http: http.DefaultClient}
_, err := c.Query(context.Background(), "x", 5)
assert.Error(t, err)
}
func TestBrainClient_Query_Non200(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusInternalServerError)
_, _ = w.Write([]byte("boom"))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
_, err := c.Query(context.Background(), "x", 5)
require.Error(t, err)
assert.Contains(t, err.Error(), "500")
}
func TestBrainClient_Write_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, "/write", r.URL.Path)
assert.Equal(t, http.MethodPost, r.Method)
body, _ := io.ReadAll(r.Body)
var got struct {
Type string `json:"type"`
Slug string `json:"slug"`
Content string `json:"content"`
}
require.NoError(t, json.Unmarshal(body, &got))
assert.Equal(t, "knowledge", got.Type)
assert.Equal(t, "find-h", got.Slug)
assert.Equal(t, "# body\n", got.Content)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"path":"knowledge/find-h.md"}`))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
res, err := c.Write(context.Background(), "knowledge", "find-h", strings.NewReader("# body\n"))
require.NoError(t, err)
assert.Equal(t, "knowledge/find-h.md", res.Path)
}
func TestNewBrainClient_DefaultURL(t *testing.T) {
t.Setenv("BRAIN_URL", "")
c := newBrainClient()
assert.Equal(t, "http://koala:30330", c.baseURL)
}
func TestNewBrainClient_OverrideURL(t *testing.T) {
t.Setenv("BRAIN_URL", "http://localhost:9999")
c := newBrainClient()
assert.Equal(t, "http://localhost:9999", c.baseURL)
}

71
cmd/hyperguild/main.go Normal file
View File

@@ -0,0 +1,71 @@
// Package main implements the hyperguild CLI: tier probe, brain HTTP REST
// access, and .mcp.json mode bootstrap. See docs/superpowers/specs/.
package main
import (
"context"
"fmt"
"io"
"os"
)
// subcommand is the contract every hyperguild subcommand satisfies.
// Functions take an explicit context, args (without the subcommand name
// itself), and explicit IO so tests can exercise full flows without
// touching os.Stdin / os.Stdout / os.Exit.
type subcommand func(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error
func subcommands() map[string]subcommand {
return map[string]subcommand{
"tier": runTier,
"brain": runBrain,
"mode": runMode,
}
}
const usage = `Usage: hyperguild <subcommand> [options]
Subcommands:
tier Probe Anthropic + LiteLLM, print current operating tier.
brain query <q> BM25 search the brain (HTTP REST).
brain write <t> <s>
Write stdin as a knowledge entry of type <t>, slug <s>.
mode <name> Bootstrap .mcp.json for a chosen mode:
cloud | client-local | sovereign
Environment:
BRAIN_URL Brain HTTP REST + MCP base URL.
Default: http://koala:30330
ANTHROPIC_PROBE_URL Tier probe URL for the Anthropic API.
Default: https://api.anthropic.com
LITELLM_BASE_URL Tier probe URL for the LiteLLM gateway.
Optional; if empty, falls through to airplane tier.
`
// dispatch routes args to a subcommand and returns the process exit code.
// Split from main() so tests can drive it without process exit.
func dispatch(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) int {
if len(args) == 0 {
fmt.Fprint(stderr, usage) //nolint:errcheck
return 2
}
switch args[0] {
case "-h", "--help", "help":
fmt.Fprint(stdout, usage) //nolint:errcheck
return 0
}
cmd, ok := subcommands()[args[0]]
if !ok {
fmt.Fprintf(stderr, "hyperguild: unknown subcommand: %s\n%s", args[0], usage) //nolint:errcheck
return 2
}
if err := cmd(ctx, args[1:], stdin, stdout, stderr); err != nil {
fmt.Fprintf(stderr, "hyperguild %s: %v\n", args[0], err) //nolint:errcheck
return 1
}
return 0
}
func main() {
os.Exit(dispatch(context.Background(), os.Args[1:], os.Stdin, os.Stdout, os.Stderr))
}

View File

@@ -0,0 +1,45 @@
package main
import (
"bytes"
"context"
"strings"
"testing"
"github.com/stretchr/testify/assert"
)
func TestDispatch_Help_PrintsUsageAndReturnsZero(t *testing.T) {
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{"--help"}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 0, code)
assert.Contains(t, out.String(), "Usage: hyperguild")
assert.Contains(t, out.String(), "tier")
assert.Contains(t, out.String(), "brain")
assert.Contains(t, out.String(), "mode")
}
func TestDispatch_NoArgs_PrintsUsageAndReturnsTwo(t *testing.T) {
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 2, code)
assert.Contains(t, errBuf.String(), "Usage: hyperguild")
}
func TestDispatch_UnknownSubcommand_ReturnsTwo(t *testing.T) {
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{"bogus"}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 2, code)
assert.Contains(t, errBuf.String(), "unknown subcommand: bogus")
}
func TestDispatch_KnownSubcommand_RoutesToHandler(t *testing.T) {
// "mode" without args fails → exit 1, message on stderr.
// (Confirms dispatch reached the handler rather than printing "unknown
// subcommand: mode".)
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{"mode"}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 1, code)
assert.Contains(t, errBuf.String(), "name required")
assert.NotContains(t, errBuf.String(), "unknown subcommand")
}

99
cmd/hyperguild/mode.go Normal file
View File

@@ -0,0 +1,99 @@
package main
import (
"context"
"encoding/json"
"errors"
"flag"
"fmt"
"io"
"os"
)
func runMode(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
fs := flag.NewFlagSet("mode", flag.ContinueOnError)
fs.SetOutput(stderr)
out := fs.String("out", ".mcp.json", "output file path")
force := fs.Bool("force", false, "overwrite an existing file")
// Pull the first positional (mode name) out so flags after it still parse
// with stdlib flag (which stops at the first non-flag arg).
if len(args) < 1 {
return errors.New("name required (cloud|client-local|sovereign)")
}
name := args[0]
if err := fs.Parse(args[1:]); err != nil {
return fmt.Errorf("parse flags: %w", err)
}
brainURL := os.Getenv("BRAIN_URL")
if brainURL == "" {
brainURL = defaultBrainURL
}
var doc map[string]any
switch name {
case "cloud":
doc = modeCloud(brainURL)
case "client-local":
doc = modeClientLocal(brainURL)
case "sovereign":
doc = modeSovereign(brainURL)
default:
return fmt.Errorf("unknown mode: %s (expected cloud|client-local|sovereign)", name)
}
if !*force {
if _, err := os.Stat(*out); err == nil {
return fmt.Errorf("%s exists (use --force to overwrite)", *out)
}
}
body, err := json.MarshalIndent(doc, "", " ")
if err != nil {
return fmt.Errorf("marshal mode doc: %w", err)
}
if err := os.WriteFile(*out, append(body, '\n'), 0o644); err != nil {
return fmt.Errorf("write %s: %w", *out, err)
}
fmt.Fprintf(stdout, "wrote %s (mode: %s)\n", *out, name) //nolint:errcheck
return nil
}
func modeCloud(brainURL string) map[string]any {
return map[string]any{
"mcpServers": map[string]any{
"brain": map[string]any{
"url": brainURL + "/mcp",
"description": "Brain MCP — knowledge query, write, ingestion, session log",
},
},
}
}
func modeClientLocal(brainURL string) map[string]any {
return map[string]any{
"mcpServers": map[string]any{
"brain": map[string]any{
"url": brainURL + "/mcp",
"description": "Brain MCP — knowledge query, write, ingestion, session log",
},
"routing": map[string]any{
"url": "http://koala:30310/mcp",
"description": "Mode 2 routing pod — routes skill calls to LiteLLM/local",
"_routing_pending": "Plan 6 — routing pod not deployed yet; this URL is a placeholder",
},
},
}
}
func modeSovereign(brainURL string) map[string]any {
return map[string]any{
"_mode_note": "Sovereign mode primarily uses Crush + LiteLLM. This .mcp.json is provided as Claude Code fallback (e.g. emergency offline editing).",
"mcpServers": map[string]any{
"brain": map[string]any{
"url": brainURL + "/mcp",
"description": "Brain MCP — knowledge query, write, ingestion, session log",
},
},
}
}

123
cmd/hyperguild/mode_test.go Normal file
View File

@@ -0,0 +1,123 @@
package main
import (
"bytes"
"context"
"encoding/json"
"os"
"path/filepath"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func readJSON(t *testing.T, path string) map[string]any {
t.Helper()
b, err := os.ReadFile(path)
require.NoError(t, err)
var out map[string]any
require.NoError(t, json.Unmarshal(b, &out))
return out
}
func TestRunMode_Cloud_Default(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
t.Setenv("BRAIN_URL", "http://koala:30330")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
servers, ok := got["mcpServers"].(map[string]any)
require.True(t, ok, "mcpServers must be a JSON object")
assert.Contains(t, servers, "brain")
assert.NotContains(t, servers, "routing")
assert.NotContains(t, got, "_mode_note")
}
func TestRunMode_ClientLocal_HasRoutingPlaceholder(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
t.Setenv("BRAIN_URL", "http://koala:30330")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"client-local", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
servers := got["mcpServers"].(map[string]any)
require.Contains(t, servers, "brain")
require.Contains(t, servers, "routing")
routing := servers["routing"].(map[string]any)
assert.Contains(t, routing, "_routing_pending")
}
func TestRunMode_Sovereign_HasModeNote(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"sovereign", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
assert.Contains(t, got, "_mode_note")
servers := got["mcpServers"].(map[string]any)
assert.Contains(t, servers, "brain")
assert.NotContains(t, servers, "routing")
}
func TestRunMode_DefaultsOutToCwd(t *testing.T) {
dir := t.TempDir()
t.Chdir(dir) // Go 1.24+ — replaces the older os.Chdir-with-cleanup pattern
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud"}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
_, statErr := os.Stat(filepath.Join(dir, ".mcp.json"))
assert.NoError(t, statErr, ".mcp.json should exist in cwd")
}
func TestRunMode_UnknownMode(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"bogus", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
assert.Error(t, err)
assert.Contains(t, err.Error(), "unknown mode")
}
func TestRunMode_NoArgs(t *testing.T) {
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{}, strings.NewReader(""), &stdout, &stderr)
assert.Error(t, err)
}
func TestRunMode_RefusesToOverwrite(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
require.NoError(t, os.WriteFile(outPath, []byte(`{"existing":"file"}`), 0o644))
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.Error(t, err)
assert.Contains(t, err.Error(), "exists")
}
func TestRunMode_Force(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
require.NoError(t, os.WriteFile(outPath, []byte(`{"existing":"file"}`), 0o644))
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud", "--out", outPath, "--force"}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
assert.Contains(t, got, "mcpServers")
assert.NotContains(t, got, "existing")
}

42
cmd/hyperguild/tier.go Normal file
View File

@@ -0,0 +1,42 @@
package main
import (
"context"
"encoding/json"
"flag"
"fmt"
"io"
"os"
"github.com/mathiasbq/supervisor/internal/tier"
)
const defaultAnthropicProbe = "https://api.anthropic.com"
func runTier(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
fs := flag.NewFlagSet("tier", flag.ContinueOnError)
fs.SetOutput(stderr)
asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
if err := fs.Parse(args); err != nil {
return fmt.Errorf("parse flags: %w", err)
}
anthropicURL := os.Getenv("ANTHROPIC_PROBE_URL")
if anthropicURL == "" {
anthropicURL = defaultAnthropicProbe
}
liteLLMURL := os.Getenv("LITELLM_BASE_URL") // empty → tier falls through to airplane
info := tier.Detect(ctx, anthropicURL, liteLLMURL)
if *asJSON {
enc := json.NewEncoder(stdout)
enc.SetIndent("", " ")
if err := enc.Encode(info); err != nil {
return fmt.Errorf("encode json: %w", err)
}
return nil
}
fmt.Fprintf(stdout, "tier %d (%s) managed_agents=%t\n", int(info.Tier), info.Label, info.ManagedAgents) //nolint:errcheck
return nil
}

View File

@@ -0,0 +1,77 @@
package main
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func okServer(t *testing.T) *httptest.Server {
t.Helper()
return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}))
}
func TestRunTier_Full_Human(t *testing.T) {
anthropic := okServer(t)
defer anthropic.Close()
litellm := okServer(t)
defer litellm.Close()
t.Setenv("ANTHROPIC_PROBE_URL", anthropic.URL)
t.Setenv("LITELLM_BASE_URL", litellm.URL)
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), "tier 1")
assert.Contains(t, out.String(), "full-online")
assert.Contains(t, out.String(), "managed_agents=true")
}
func TestRunTier_LANOnly_JSON(t *testing.T) {
litellm := okServer(t)
defer litellm.Close()
t.Setenv("ANTHROPIC_PROBE_URL", "http://127.0.0.1:1") // unreachable
t.Setenv("LITELLM_BASE_URL", litellm.URL)
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{"--json"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
var got struct {
Tier int `json:"tier"`
Label string `json:"label"`
ManagedAgents bool `json:"managed_agents"`
}
require.NoError(t, json.Unmarshal(out.Bytes(), &got))
assert.Equal(t, 2, got.Tier)
assert.Equal(t, "lan-only", got.Label)
assert.False(t, got.ManagedAgents)
}
func TestRunTier_Airplane_NoLiteLLMBaseURL(t *testing.T) {
t.Setenv("ANTHROPIC_PROBE_URL", "http://127.0.0.1:1")
t.Setenv("LITELLM_BASE_URL", "")
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), "tier 3")
assert.Contains(t, out.String(), "airplane")
}
func TestRunTier_UnknownFlag_ReturnsError(t *testing.T) {
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{"--bogus"}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,79 @@
# Spec: hyperguild CLI
> Plan 4 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
## Problem Statement
Three needs converge on a single small Go binary:
1. **Tier probing as MCP is overkill.** The supervisor's `tier` MCP runs on `koala:30320` and answers a one-shot question (which models are reachable right now?). Pulling Claude Code through MCP startup, tool listing, and a JSON-RPC call for a 2-second probe is wasteful and adds a network hop the answer doesn't need.
2. **Brain access from shell scripts has no good front door.** The brain's HTTP REST API exists (Plan 1) at `koala:3300` for non-MCP clients, but every shell script that wants to query or write to the brain re-implements the curl invocation. A CLI gives shell pipelines, ad-hoc agent prompts, and quick-debug scenarios a stable interface.
3. **Mode bootstrap is manual.** Each new project that wants to operate in a chosen mode (cloud / client-local / sovereign) needs a `.mcp.json` written by hand. Without automation, mode adoption is gated on remembering the right MCP server URLs.
**Why now:** Plans 13 are merged. The CLI is the next building block in shrinking the supervisor pod toward a thin Mode-2 routing layer. Plans 5 and 6 build on the CLI's tier and brain helpers.
## Success Criteria
- [ ] `hyperguild tier` returns the same `tier.Info` that `internal/tier.Detect` produces for the same probe URLs, in < 3 s under all three tier conditions, with both human-readable and `--json` output.
- [ ] `hyperguild brain query <topic>` returns BM25 results from the brain HTTP REST `/query` endpoint, exit 0 on success and non-zero on transport failure.
- [ ] `hyperguild brain write <type> <slug>` reads markdown content from stdin, posts to `/write` with the type and slug, and creates `brain/knowledge/<slug>.md`. A round-trip (`hyperguild brain query <slug>` immediately after) finds the entry.
- [ ] `hyperguild mode <cloud|client-local|sovereign>` writes a parseable JSON file at the target path with the per-mode `mcpServers` entries; `jq -e .mcpServers` succeeds on the output.
- [ ] All commands print usage on `--help`, exit 2 on unknown flags, exit non-zero on operational errors.
- [ ] `task check` passes (lint + test + vet) on each task and on the merged branch.
## Constraints
- **Stdlib only.** No `cobra`, `urfave/cli`, `viper`, etc. CLI router and flag parsing use `flag.NewFlagSet`.
- **Go 1.26.1**, project default.
- **Module:** `github.com/mathiasbq/supervisor`, peer to `cmd/supervisor/`. New code at `cmd/hyperguild/`. The module name keeps its historical `supervisor` value — renaming the module is out of scope and would touch every import.
- **Reuse `internal/tier`** unchanged. The CLI is a thin wrapper around `tier.Detect`.
- **Brain endpoint configurable** via `BRAIN_URL` env var (default `http://koala:30330` — Tailscale-exposed NodePort, both MCP at `/mcp` and HTTP REST at `/query`, `/write`, etc., share the port). No hostname literals embedded in the CLI body — sourced from env per the existing "logical-addresses-in-instructions" memory.
- **Test discipline:** table-driven, testify, fakes for HTTP and tier probing. No live network in tests.
- **Errors:** wrapped via `fmt.Errorf("op: %w", err)`. No naked returns. Stderr for errors, stdout for results.
## Out of Scope
- The Mode 6 routing pod itself — `mode client-local` writes a placeholder entry pointing at the future routing URL with a `_routing_pending` annotation; the CLI does not provision the pod.
- Pass-rate logging (Plan 5) — the CLI's `brain write` does not emit `session_log` events.
- Skill worker CLIs (`hyperguild tdd_red`, `hyperguild review`, etc.) — those stay on the supervisor MCP until Plan 7.
- Brain HTTP server changes — the REST endpoints already exist.
- Authentication / TLS — Tailscale provides network isolation; no auth currently.
- Windows/Linux binaries — macOS-only per the user's setup. `go build` is portable but no cross-compilation in CI.
- A `crush` config writer for Mode 3 — Mode 3 (sovereign) writes a Claude-Code-compatible `.mcp.json` with brain-only MCP, on the assumption that even Crush-primary users may fall back to Claude Code with brain access. Crush's own config is owned by the user manually.
- A unified `--config` file for the CLI — env var + flags is enough today.
## Technical Approach
- **Single binary, inline subcommand router.** `cmd/hyperguild/main.go` dispatches on `os.Args[1]` to per-subcommand functions, each owning its own `flag.NewFlagSet`. Rationale: 4 top-level subcommands (`tier`, `brain`, `mode`, plus `--help`) and one nested level (`brain query`, `brain write`); ~80 lines of routing plumbing in stdlib beats pulling cobra's ~3 KLOC of dependencies for a tiny CLI. The router is testable by injecting `args []string` instead of reading `os.Args` directly.
- **`tier` subcommand reuses `internal/tier.Detect` verbatim.** Probe URLs (`https://api.anthropic.com` and the LiteLLM base URL) come from environment: `ANTHROPIC_PROBE_URL` (default the literal Anthropic URL) and `LITELLM_BASE_URL` (no default — error if `--mode-needs-llm` and unset). Rationale: matching the supervisor's existing wiring means the CLI cannot disagree with the supervisor about tier; a single source of truth.
- **`brain` subcommand calls the HTTP REST API.** Two nested subcommands:
- `brain query <topic>` issues `POST /query` with JSON body `{query, limit}` (default `--limit 5`), prints results in human-readable form by default and with `--json` for machine consumption.
- `brain write <type> <slug>` reads stdin, posts `POST /write` with JSON body `{type, slug, content}`, prints the resulting path on success.
Rationale: HTTP REST is simpler than MCP framing for a CLI. Per CLAUDE.md, the REST endpoints are documented as the official non-MCP interface.
- **`mode <name>` writes a per-mode `.mcp.json` template.** Defaults to writing `./.mcp.json` (cwd); accepts `--out <path>`. Per-mode bodies:
- `cloud``mcpServers` contains only `brain` at `http://koala:30330/mcp`.
- `client-local``mcpServers` contains `brain` at `http://koala:30330/mcp` and a `routing` placeholder entry with `url` set to a marker (`http://koala:30310/mcp`) and an extra field `"_routing_pending": "Plan 6 — routing pod not deployed yet"`. Rationale: keeping strict-JSON parseable means using a placeholder field rather than a JSON comment, which the spec parser would reject.
- `sovereign``mcpServers` contains only `brain`, plus a top-level `"_mode_note": "Sovereign mode primarily uses Crush + LiteLLM. This .mcp.json is provided as Claude Code fallback."`.
All three are valid JSON and all three round-trip through `jq` for verification.
Rationale: a single subcommand with three clearly-different outputs is easier to evolve than three nearly-duplicate subcommands. The placeholder fields are intentional documentation in the file itself, which the user actually opens and edits.
- **No global state.** Each subcommand is a function `(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error`, allowing table-driven tests to exercise full subcommand flows without `os.Exit` or fd capture.
- **HTTP client injection.** A package-level `http.Client` with 5s timeout for `brain` calls, overridable in tests via a constructor. Real client for `main`, `httptest.Server` for tests.
## Risks
- **`.mcp.json` schema may evolve.** Claude Code's MCP config format is defined by the harness, and Anthropic could change it. Mitigation: document the format in the CLI's `--help` text and in the spec; if it breaks, the fix is local to one template function.
- **Brain endpoint hostname drift.** If the brain moves off `koala`, the env-var override avoids breaking the CLI but the `mode` template's hardcoded `koala:30330` becomes stale. Mitigation: source the URL in the `mode` template from the same env var (`BRAIN_URL`) so all three subcommands stay in lockstep with the user's actual environment.
- **`tier` probe URL gap.** The CLI inherits the supervisor's hardcoded `https://api.anthropic.com` probe URL via `internal/tier`. If Anthropic changes the URL, both supervisor and CLI break together. Mitigation: env-var override `ANTHROPIC_PROBE_URL`; default unchanged.
- **No HTTP retry logic.** The CLI returns first-error to the user. For ad-hoc shell use this is fine; for automation a future `--retry` flag may be needed. Out of scope for this iteration.
- **Tests don't cover live network.** Pure-fake tests catch regression but not "does the brain pod actually answer." Mitigation: add a smoke-test `task hyperguild:smoke` in a follow-up that runs against the real brain — separate concern, not in Plan 4.
- **Mode 3 sovereign output may surprise users** who expect Mode 3 to skip writing a `.mcp.json` entirely (since Crush is the primary harness). Mitigation: the `_mode_note` field explains the choice; the `--out /dev/null` escape hatch lets users skip the write if they want.

View File

@@ -0,0 +1,125 @@
# Spec: Pass-rate logging
> Plan 5 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
## Problem Statement
Plan 6 (Mode 2 routing pod) needs a per-skill signal to decide whether to route a call to the local model or keep it on Claude. The natural signal is recent pass rate: a skill that succeeds 95% of the time on local is safe to route; a skill that succeeds 60% is not. Today there is no such signal — the `session_log` MCP exists (shipped in Plan 1) but skills don't reliably call it, and no endpoint computes pass rate from the resulting logs.
Two consequences:
1. **Plan 6 cannot be trusted without baseline data.** Routing decisions made on guesses will produce regressions that erode confidence in Mode 2 entirely.
2. **The skill library has no observability.** When a skill regresses (model swap, prompt drift, environment change), there's no way to notice until a downstream task explicitly fails.
**Why now:** Plans 14 are merged. Plan 5 instruments the discipline that Plan 6 will consume. Several weeks of usage data between Plan 5 merge and Plan 6 deploy will mean Plan 6 lands on real numbers, not synthetic.
## Success Criteria
- [ ] After Plan 5 merges, every invocation of `tdd` (pilot skill) calls `session_log` at the end of each phase (red, green, refactor) with `final_status` ∈ {pass, fail, skip}.
- [ ] At least 6 of the remaining "binary-outcome" skills get the same treatment: `code-review`, `debug`, `feature-spec`, `session-retrospective`, `trainer`, `spec-driven-dev`. (Skills with no clear pass/fail — `clean-code`, `cognitive-load`, `solid`, `refactoring`, `test-design`, `problem-analysis`, `user-stories`, `planning`, `atdd`, `gitea-ci` — are out of scope.)
- [ ] A new HTTP REST endpoint `GET /pass-rate?skill=X&window=7d` on the brain pod returns valid JSON `{skill, window, pass, fail, skip, total, pass_rate}` for any skill name. Skills with no logged invocations return zeros (not 404, not error). Pass rate is `pass / (pass + fail)`; if `pass + fail == 0`, returns `pass_rate: null`.
- [ ] The endpoint's aggregator normalizes legacy values: `pass``ok`, `fail``error`, `skip``skipped`. No data loss when scanning historical logs.
- [ ] An optional CLI subcommand `hyperguild brain pass-rate <skill> [--window 7d] [--json]` calls the endpoint and prints either human-readable (`tdd: 47 / 50 = 94% (window: 7d)`) or JSON.
- [ ] `task check` passes (lint + test + vet + drift + govulncheck) on each task and on the merged branch.
- [ ] One week post-merge, `GET /pass-rate?skill=tdd&window=7d` returns non-zero counts and a real `pass_rate`.
## Constraints
- **Stdlib + existing deps only.** The endpoint adds to the existing ingestion pod's HTTP handler (Go, `net/http`). No new service, no new pod, no new persistence layer.
- **No auth on `/pass-rate`.** Same model as the rest of the brain HTTP REST API: Tailscale-only network, no token.
- **Schema:** the SKILL.md template uses `pass | fail | skip` for `final_status`. The aggregator treats `pass` and `ok` as equivalent, `fail` and `error` as equivalent, `skip` and `skipped` as equivalent. New writes from skills MUST use the new vocabulary; the aggregator handles both for read-back.
- **Storage:** continues to use the existing JSONL files at `<pod>/brain/sessions/*.jsonl`. No format change. No materialized aggregates. If on-demand scans become slow (>500ms p99), revisit in a follow-up; not now.
- **Backwards compatibility:** the existing `session_log` MCP tool's signature does not change. Its docstring should be updated to reflect the new vocabulary, but argument types stay the same.
- **Pilot-before-rollout:** the first SKILL.md instrumentation (`tdd`) must dogfood successfully — at least one real `tdd` invocation post-instrumentation produces a session log entry — before the other six skills get their updates.
## Out of Scope
- Plan 6 routing pod itself (the consumer of `/pass-rate`).
- Materialized rolling counters (compute on-demand for now).
- Auth, rate limiting, or per-user filtering on `/pass-rate`.
- Dashboards or visualization (`hyperguild brain pass-rate` text/JSON is the only UI).
- Real-time streaming or push notifications (`/pass-rate` is poll-only).
- Skills with no clear binary outcome (the 10 skills listed in Success Criteria).
- Per-model or per-mode breakdown (`session_log` already records `model_used`; the endpoint aggregates across all models for now). Plan 6 may want sharper aggregation; we'll add fields when it lands.
- Migration of the one historical entry in `2026-04-17-validate-hyperguild.jsonl` from `pass` (which is the new vocabulary, by accident) — no migration needed.
## Technical Approach
### Component A — SKILL.md instrumentation pattern
Each instrumented skill gets a standardized "Logging" subsection under its existing "Brain MCP Integration" section. The subsection names the required `session_log` fields with explicit copy-paste examples:
```
**At each phase end:** call `session_log` with:
- `skill`: "<this-skill-name>"
- `phase`: "<the-phase>"
- `final_status`: "pass" | "fail" | "skip"
- `message`: "<one-line summary>"
- `duration_ms`: <wall clock>
- `project_root`: "<absolute path to the project under work>"
```
The pilot SKILL.md (`~/dev/.skills/tdd/SKILL.md`) gets instrumented first. The implementation defines the contract; the rollout commits replicate the pattern across the other six SKILL.md files.
Rationale: SKILL.md as the source of truth means the contract is visible to every agent that loads the skill — no hidden middleware. Mode-agnostic: the agent calls `session_log` whether it's Claude (Mode 1), Claude+routing (Mode 2), or Crush (Mode 3). The pattern is uniform; only the skill name + phase set differ.
### Component B — `/pass-rate` HTTP endpoint
New handler at the existing ingestion pod, peer to `/query`, `/write`, `/ingest`, etc.
```
GET /pass-rate?skill=<name>&window=<duration>
→ 200 { "skill": "tdd", "window": "7d", "pass": 47, "fail": 3, "skip": 0, "total": 50, "pass_rate": 0.94 }
```
Algorithm:
1. Parse `skill` (required) and `window` (default `7d`, accept Go-style `1h`, `12h`, `7d`, `30d`).
2. Walk `brain/sessions/*.jsonl` in the pod's volume. For each line: parse JSON, filter by `skill == query.skill` and `timestamp >= now - window`.
3. Tally `pass` (counts both `pass` and `ok`), `fail` (`fail` and `error`), `skip` (`skip` and `skipped`).
4. Compute `pass_rate = pass / (pass + fail)`; if `pass + fail == 0`, return `pass_rate: null`.
5. Return JSON.
Rationale for on-demand: the JSONL files are append-only and small (one entry per skill phase, kilobytes per session at most). For the first months of Plan 5 usage, scanning all sessions for a single query is fast enough. If it ever isn't, a materialized index is a follow-up — the endpoint shape doesn't change.
### Component C — Optional CLI subcommand
`hyperguild brain pass-rate <skill> [--window 7d] [--json]`. Adds a third nested verb under `brain` (sibling to `query` and `write`). Calls `GET /pass-rate?skill=<>&window=<>` via the existing `brainClient` infrastructure. Default human output: `tdd: 47 / 50 = 94% (window: 7d)`. `--json` passes through the response envelope.
Rationale: shell access to pass-rate without curl + jq. Optional in the strict sense — Plan 6's routing pod will call the endpoint directly, not via the CLI — but cheap to add (one new method on `brainClient`, one new dispatch case in `runBrain`).
### Schema and normalization
`session_log` JSONL line shape (unchanged today, codified by this plan):
```json
{
"session_id": "<id>",
"timestamp": "2026-05-03T20:30:00Z",
"skill": "tdd",
"phase": "red",
"project_root": "/abs/path",
"final_status": "pass",
"duration_ms": 12345,
"message": "Test written, function undefined, red confirmed."
}
```
`final_status` values:
- New writes (this plan onward): `pass | fail | skip`
- Read aggregator accepts both new and legacy: `pass`/`ok` → pass, `fail`/`error` → fail, `skip`/`skipped` → skip
- Anything else → counted as `skip` for safety (don't pollute pass/fail with malformed entries)
### Tests
- Endpoint: table-driven tests with a temp `brain/sessions/` directory containing JSONL files spanning multiple skills, multiple statuses (both vocabularies), edge cases (empty file, malformed line, timestamp outside window, future timestamp). Tests run via `httptest.NewServer` against the real handler.
- CLI: tests for `runBrainPassRate` against `httptest.Server` fake of `/pass-rate`. Human and `--json` output paths.
- Pilot dogfood: after instrumenting `tdd/SKILL.md`, one real TDD task in this plan exercises the logging path. The corresponding session log entry verifies end-to-end.
- `task check` per task.
## Risks
- **Skills that don't reliably log produce missing data.** The aggregator returns zero counts for those, which Plan 6 may misread as "this skill always passes" or "this skill is broken". Mitigation: the endpoint returns `pass_rate: null` when `pass + fail == 0`, signalling "no data" distinct from "always passes". Plan 6 must check for null.
- **Agents may forget to call `session_log` mid-skill.** No way to enforce in cloud Mode 1 — Claude may skip the call if instructions are unclear. Mitigation: SKILL.md template makes the call literal and copy-pasteable. After 1 week, if instrumentation rate is < 80% of expected calls, escalate; consider a wrapper at the routing-pod layer in Plan 6 as belt-and-suspenders.
- **Schema drift between legacy `ok` and new `pass`.** Mitigation: the aggregator's normalization rule. Documented in the endpoint's response and in the `session_log` tool docstring update.
- **`/pass-rate` walks all session files for each request.** With ~1 file per session and tens of sessions per week, this is microseconds today. At hundreds of files per day, may need a date-bounded directory layout. Mitigation: monitor; if scan time > 100ms p99, revisit. Not in this plan.
- **The pilot may fail on the first dogfood.** If `tdd` instrumentation doesn't produce a log entry (e.g. agent didn't call `session_log`, JSON shape wrong, file permissions), the rollout to the other six skills is blocked until the pilot succeeds. Mitigation: explicit "pilot validates end-to-end" gate as the last step of Component A.
- **Adding a third verb under `brain` slightly stretches the inline-router pattern.** Three verbs in a switch is still simple; if it grows to five, the CLI may want a per-verb registration map. Mitigation: deferred — three is fine.

View File

@@ -12,6 +12,7 @@ import (
"github.com/mathiasbq/hyperguild/ingestion/internal/api"
"github.com/mathiasbq/hyperguild/ingestion/internal/llm"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/mathiasbq/hyperguild/ingestion/internal/watcher"
)
@@ -54,6 +55,8 @@ func main() {
h := api.NewHandler(brainDir, logger, pipelineCfg)
mcpSrv := mcp.NewServer(brainDir, &pipelineCfg, llmClient.Complete)
ctx := context.Background()
if watchInterval > 0 {
watcher.Start(ctx, watcher.Config{
@@ -68,7 +71,9 @@ func main() {
mux.HandleFunc("POST /write", h.Write)
mux.HandleFunc("POST /ingest", h.Ingest)
mux.HandleFunc("POST /ingest-path", h.IngestPath)
mux.HandleFunc("POST /ingest-raw", h.IngestRaw)
mux.HandleFunc("POST /backfill-refs", h.BackfillRefs)
mux.Handle("POST /mcp", mcpSrv)
addr := ":" + port
watchIntervalLog := "disabled"
@@ -82,6 +87,7 @@ func main() {
"llm_model", llmModel,
"chunk_size", chunkSize,
"watch_interval", watchIntervalLog,
"mcp_enabled", true,
)
if err := http.ListenAndServe(addr, mux); err != nil {
logger.Error("server stopped", "err", err)

View File

@@ -85,6 +85,57 @@ func (h *Handler) Query(w http.ResponseWriter, r *http.Request) {
writeJSON(w, map[string]any{"results": results})
}
// WriteNote writes a markdown file to brainDir/knowledge/<filename>, optionally
// prefixed with YAML frontmatter built from typ and domain. Returns the path
// relative to brainDir (forward-slashed). Filename traversal is rejected.
func WriteNote(brainDir, content, filename, typ, domain string) (string, error) {
if content == "" {
return "", fmt.Errorf("content is required")
}
if filename == "" {
filename = fmt.Sprintf("%s-auto.md", time.Now().UTC().Format("2006-01-02-150405"))
}
rawDir := filepath.Join(brainDir, "knowledge")
if err := os.MkdirAll(rawDir, 0o755); err != nil {
return "", fmt.Errorf("create raw dir: %w", err)
}
finalContent := content
if typ != "" || domain != "" {
var fm strings.Builder
fm.WriteString("---\n")
if typ != "" {
fmt.Fprintf(&fm, "type: %s\n", typ)
}
if domain != "" {
fmt.Fprintf(&fm, "domain: %s\n", domain)
}
fm.WriteString("---\n")
finalContent = fm.String() + content
}
// Reject path separators outright; any non-flat filename is misuse.
if strings.ContainsAny(filename, `/\`) {
return "", fmt.Errorf("invalid filename")
}
base := filepath.Base(filename)
// After Base, "." and ".." remain. Reject those before adding .md.
if base == "." || base == ".." || base == "" {
return "", fmt.Errorf("invalid filename")
}
if !strings.HasSuffix(base, ".md") {
base += ".md"
}
dest := filepath.Join(rawDir, base)
if err := os.WriteFile(dest, []byte(finalContent), 0o644); err != nil {
return "", fmt.Errorf("write: %w", err)
}
rel, _ := filepath.Rel(brainDir, dest)
return filepath.ToSlash(rel), nil
}
// Write handles POST /write — write raw content to brain/knowledge/.
func (h *Handler) Write(w http.ResponseWriter, r *http.Request) {
var req writeRequest
@@ -92,53 +143,13 @@ func (h *Handler) Write(w http.ResponseWriter, r *http.Request) {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
if req.Content == "" {
writeError(w, http.StatusBadRequest, "content is required")
return
}
filename := req.Filename
if filename == "" {
filename = fmt.Sprintf("%s-auto.md", time.Now().UTC().Format("2006-01-02-150405"))
}
rawDir := filepath.Join(h.brainDir, "knowledge")
if err := os.MkdirAll(rawDir, 0o755); err != nil {
writeError(w, http.StatusInternalServerError, "failed to create raw dir")
return
}
finalContent := req.Content
if req.Type != "" || req.Domain != "" {
var fm strings.Builder
fm.WriteString("---\n")
if req.Type != "" {
fmt.Fprintf(&fm, "type: %s\n", req.Type)
}
if req.Domain != "" {
fmt.Fprintf(&fm, "domain: %s\n", req.Domain)
}
fm.WriteString("---\n")
finalContent = fm.String() + req.Content
}
base := filepath.Base(filename)
if !strings.HasSuffix(base, ".md") {
base += ".md"
}
dest := filepath.Join(rawDir, base)
if !strings.HasPrefix(filepath.Clean(dest)+string(os.PathSeparator), filepath.Clean(rawDir)+string(os.PathSeparator)) {
writeError(w, http.StatusBadRequest, "invalid filename")
return
}
if err := os.WriteFile(dest, []byte(finalContent), 0o644); err != nil {
relPath, err := WriteNote(h.brainDir, req.Content, req.Filename, req.Type, req.Domain)
if err != nil {
h.logger.Error("write failed", "err", err)
writeError(w, http.StatusInternalServerError, "write error")
writeError(w, http.StatusBadRequest, err.Error())
return
}
rel, _ := filepath.Rel(h.brainDir, dest)
writeJSON(w, map[string]string{"path": filepath.ToSlash(rel)})
writeJSON(w, map[string]string{"path": relPath})
}
// Ingest handles POST /ingest — run the pipeline on provided content.
@@ -272,6 +283,48 @@ func (h *Handler) IngestPath(w http.ResponseWriter, r *http.Request) {
writeJSON(w, ingestResponse{Pages: allPages, Warnings: allWarnings})
}
type ingestRawRequest struct {
Source string `json:"source"`
Pages []pipeline.RawPage `json:"pages"`
DryRun bool `json:"dry_run"`
}
// IngestRaw handles POST /ingest-raw — run the pipeline on pre-parsed RawPages,
// skipping the LLM extraction step. Use when the caller has already produced
// structured page data (e.g. from a more capable model or manual curation).
func (h *Handler) IngestRaw(w http.ResponseWriter, r *http.Request) {
var req ingestRawRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
if strings.TrimSpace(req.Source) == "" {
writeError(w, http.StatusBadRequest, "source is required")
return
}
if len(req.Pages) == 0 {
writeError(w, http.StatusBadRequest, "pages is required and must be non-empty")
return
}
result, err := pipeline.RunRaw(h.brainDir, req.Source, req.Pages, req.DryRun)
if err != nil {
h.logger.Error("ingest-raw failed", "source", req.Source, "err", err)
writeError(w, http.StatusInternalServerError, "ingest error")
return
}
pages := result.Pages
if pages == nil {
pages = []string{}
}
warnings := result.Warnings
if warnings == nil {
warnings = []string{}
}
writeJSON(w, ingestResponse{Pages: pages, Warnings: warnings})
}
// BackfillRefs handles POST /backfill-refs — injects source back-references
// into all concept and entity pages based on existing wiki/sources/ pages.
func (h *Handler) BackfillRefs(w http.ResponseWriter, r *http.Request) {

View File

@@ -226,6 +226,85 @@ func TestIngestPath_File(t *testing.T) {
assert.NotEmpty(t, pagesSlice)
}
// ---------------------------------------------------------------------------
// POST /ingest-raw
// ---------------------------------------------------------------------------
func TestIngestRaw_Validation(t *testing.T) {
cases := []struct {
name string
body map[string]any
}{
{"missing source", map[string]any{"pages": []any{map[string]any{"title": "X", "type": "concept", "content": "x"}}}},
{"missing pages", map[string]any{"source": "test-source"}},
{"empty pages", map[string]any{"source": "test-source", "pages": []any{}}},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
_, h := setup(t)
body, _ := json.Marshal(tc.body)
req := httptest.NewRequest(http.MethodPost, "/ingest-raw", bytes.NewReader(body))
rec := httptest.NewRecorder()
h.IngestRaw(rec, req)
assert.Equal(t, http.StatusBadRequest, rec.Code)
})
}
}
func TestIngestRaw_Success(t *testing.T) {
dir, h := setup(t)
body, _ := json.Marshal(map[string]any{
"source": "test-article",
"pages": []any{
map[string]any{"title": "Test Article", "type": "source", "subtype": "article", "domain": "Testing", "content": "## Summary\n\nThis is a test article about [[Test Concept]].\n"},
map[string]any{"title": "Test Concept", "type": "concept", "domain": "Testing", "content": "A concept for testing.\n"},
},
})
req := httptest.NewRequest(http.MethodPost, "/ingest-raw", bytes.NewReader(body))
rec := httptest.NewRecorder()
h.IngestRaw(rec, req)
require.Equal(t, http.StatusOK, rec.Code)
var resp map[string]any
require.NoError(t, json.Unmarshal(rec.Body.Bytes(), &resp))
pages := resp["pages"].([]any)
assert.Len(t, pages, 2)
// Verify files were written
sourcePath := filepath.Join(dir, "wiki", "sources", "test-article.md")
assert.FileExists(t, sourcePath)
conceptPath := filepath.Join(dir, "wiki", "concepts", "test-concept.md")
assert.FileExists(t, conceptPath)
}
func TestIngestRaw_DryRun(t *testing.T) {
dir, h := setup(t)
body, _ := json.Marshal(map[string]any{
"source": "dry-run-test",
"pages": []any{
map[string]any{"title": "Dry Run Source", "type": "source", "subtype": "article", "content": "Content."},
},
"dry_run": true,
})
req := httptest.NewRequest(http.MethodPost, "/ingest-raw", bytes.NewReader(body))
rec := httptest.NewRecorder()
h.IngestRaw(rec, req)
require.Equal(t, http.StatusOK, rec.Code)
var resp map[string]any
require.NoError(t, json.Unmarshal(rec.Body.Bytes(), &resp))
pages := resp["pages"].([]any)
assert.NotEmpty(t, pages)
// Verify no files were written
sourcePath := filepath.Join(dir, "wiki", "sources", "dry-run-test.md")
assert.NoFileExists(t, sourcePath)
}
func TestIngestPath_Directory(t *testing.T) {
_, h := setup(t)

View File

@@ -0,0 +1,256 @@
package mcp
import (
"context"
"encoding/json"
"fmt"
"path/filepath"
"strings"
"time"
"github.com/mathiasbq/hyperguild/ingestion/internal/api"
"github.com/mathiasbq/hyperguild/ingestion/internal/extract"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
"github.com/mathiasbq/hyperguild/ingestion/internal/session"
)
// tools returns the tool descriptors. Handler bodies for each tool are filled
// in subsequent tasks; this file currently only provides the descriptors.
func (s *Server) tools() []map[string]any {
str := func(desc string) map[string]any {
return map[string]any{"type": "string", "description": desc}
}
int_ := func(desc string) map[string]any {
return map[string]any{"type": "integer", "description": desc}
}
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{
"type": "object", "required": required, "properties": props,
})
return b
}
return []map[string]any{
{
"name": "brain_query",
"description": "BM25 full-text search across brain/knowledge/ and brain/wiki/ markdown files.",
"inputSchema": schema([]string{"query"}, map[string]any{
"query": str("search terms"),
"limit": int_("max results, default 5"),
}),
},
{
"name": "brain_write",
"description": "Write a raw knowledge note to brain/knowledge/.",
"inputSchema": schema([]string{"content"}, map[string]any{
"content": str("markdown content"),
"filename": str("optional filename"),
"type": str("optional frontmatter type"),
"domain": str("optional frontmatter domain"),
}),
},
{
"name": "brain_ingest_raw",
"description": "Ingest pre-structured pages into the brain wiki, bypassing the LLM extraction step.",
"inputSchema": schema([]string{"source", "pages"}, map[string]any{
"source": str("source name"),
"pages": map[string]any{"type": "array"},
"dry_run": map[string]any{"type": "boolean"},
}),
},
{
"name": "brain_ingest",
"description": "Ingest content into the brain wiki via the LLM extraction pipeline.",
"inputSchema": schema([]string{}, map[string]any{
"content": str("raw content; required when path is empty"),
"source": str("source name; required when path is empty"),
"path": str("file path; mutually exclusive with content+source"),
"dry_run": map[string]any{"type": "boolean"},
}),
},
{
"name": "session_log",
"description": "Append a structured entry to brain/sessions/<session_id>.jsonl.",
"inputSchema": schema([]string{"session_id"}, map[string]any{
"session_id": str("session identifier"),
"skill": str("skill name"),
"phase": str("phase within the skill"),
"project_root": str("absolute project root"),
"final_status": str("ok | error | skipped"),
"file_path": str("optional file produced"),
"model_used": str("optional model identifier"),
"duration_ms": int_("optional duration in ms"),
"message": str("optional free-text"),
}),
},
}
}
type brainQueryArgs struct {
Query string `json:"query"`
Limit int `json:"limit,omitempty"`
}
func (s *Server) brainQuery(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a brainQueryArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Query == "" {
return nil, fmt.Errorf("query is required")
}
if a.Limit == 0 {
a.Limit = 5
}
results, err := search.Query(s.brainDir, a.Query, a.Limit)
if err != nil {
return nil, fmt.Errorf("search: %w", err)
}
return json.Marshal(map[string]any{"results": results})
}
type brainWriteArgs struct {
Content string `json:"content"`
Filename string `json:"filename,omitempty"`
Type string `json:"type,omitempty"`
Domain string `json:"domain,omitempty"`
}
func (s *Server) brainWrite(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a brainWriteArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
relPath, err := api.WriteNote(s.brainDir, a.Content, a.Filename, a.Type, a.Domain)
if err != nil {
return nil, err
}
return json.Marshal(map[string]string{"path": relPath})
}
type brainIngestRawArgs struct {
Source string `json:"source"`
Pages []pipeline.RawPage `json:"pages"`
DryRun bool `json:"dry_run,omitempty"`
}
func (s *Server) brainIngestRaw(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a brainIngestRawArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Source == "" {
return nil, fmt.Errorf("source is required")
}
if len(a.Pages) == 0 {
return nil, fmt.Errorf("pages must be non-empty")
}
result, err := pipeline.RunRaw(s.brainDir, a.Source, a.Pages, a.DryRun)
if err != nil {
return nil, fmt.Errorf("ingest: %w", err)
}
pages := result.Pages
if pages == nil {
pages = []string{}
}
warnings := result.Warnings
if warnings == nil {
warnings = []string{}
}
return json.Marshal(map[string]any{"pages": pages, "warnings": warnings})
}
type brainIngestArgs struct {
Content string `json:"content,omitempty"`
Source string `json:"source,omitempty"`
Path string `json:"path,omitempty"`
DryRun bool `json:"dry_run,omitempty"`
}
func (s *Server) brainIngest(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a brainIngestArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Path != "" && a.Content != "" {
return nil, fmt.Errorf("path and content+source are mutually exclusive")
}
if a.Path == "" && a.Content == "" {
return nil, fmt.Errorf("either path or content+source is required")
}
if s.pipeline.Complete == nil {
return nil, fmt.Errorf("LLM not configured: set INGEST_LLM_URL")
}
if a.Path != "" {
text, err := extract.Text(a.Path)
if err != nil {
return nil, fmt.Errorf("extract: %w", err)
}
source := a.Source
if source == "" {
source = filepath.Base(strings.TrimSuffix(a.Path, filepath.Ext(a.Path)))
}
return s.runIngest(ctx, text, source, a.DryRun)
}
if a.Source == "" {
return nil, fmt.Errorf("source is required when content is provided")
}
return s.runIngest(ctx, a.Content, a.Source, a.DryRun)
}
type sessionLogArgs struct {
SessionID string `json:"session_id"`
Skill string `json:"skill,omitempty"`
Phase string `json:"phase,omitempty"`
ProjectRoot string `json:"project_root,omitempty"`
FinalStatus string `json:"final_status,omitempty"`
FilePath string `json:"file_path,omitempty"`
ModelUsed string `json:"model_used,omitempty"`
DurationMs int64 `json:"duration_ms,omitempty"`
Message string `json:"message,omitempty"`
}
func (s *Server) sessionLog(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a sessionLogArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.SessionID == "" {
return nil, fmt.Errorf("session_id is required")
}
entry := session.Entry{
SessionID: a.SessionID,
Timestamp: time.Now().UTC(),
Skill: a.Skill,
Phase: a.Phase,
ProjectRoot: a.ProjectRoot,
FinalStatus: a.FinalStatus,
FilePath: a.FilePath,
ModelUsed: a.ModelUsed,
DurationMs: a.DurationMs,
Message: a.Message,
}
dir := filepath.Join(s.brainDir, "sessions")
if err := session.Append(dir, a.SessionID, entry); err != nil {
return nil, fmt.Errorf("append: %w", err)
}
return json.Marshal(map[string]string{"status": "ok", "session_id": a.SessionID})
}
func (s *Server) runIngest(ctx context.Context, content, source string, dryRun bool) (json.RawMessage, error) {
result, err := pipeline.Run(ctx, s.pipeline, s.brainDir, content, source, dryRun)
if err != nil {
return nil, fmt.Errorf("ingest: %w", err)
}
pages := result.Pages
if pages == nil {
pages = []string{}
}
warnings := result.Warnings
if warnings == nil {
warnings = []string{}
}
return json.Marshal(map[string]any{"pages": pages, "warnings": warnings})
}

View File

@@ -0,0 +1,196 @@
package mcp_test
import (
"bytes"
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func toolCall(t *testing.T, srv http.Handler, name string, args map[string]any) map[string]any {
t.Helper()
bodyBytes, err := json.Marshal(map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "tools/call",
"params": map[string]any{"name": name, "arguments": args},
})
require.NoError(t, err)
req := httptest.NewRequest(http.MethodPost, "/mcp", bytes.NewReader(bodyBytes))
rr := httptest.NewRecorder()
srv.ServeHTTP(rr, req)
require.Equal(t, http.StatusOK, rr.Code)
var resp map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
return resp
}
func TestBrainQueryReturnsResults(t *testing.T) {
brainDir := t.TempDir()
knowledge := filepath.Join(brainDir, "knowledge")
require.NoError(t, os.MkdirAll(knowledge, 0o755))
require.NoError(t, os.WriteFile(
filepath.Join(knowledge, "tdd.md"),
[]byte("# TDD\n\nTest-driven development is iterative.\n"),
0o644,
))
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_query", map[string]any{"query": "tdd"})
require.Nil(t, resp["error"])
result := resp["result"].(map[string]any)
content := result["content"].([]any)
require.NotEmpty(t, content)
text := content[0].(map[string]any)["text"].(string)
assert.Contains(t, text, "tdd.md")
}
func TestBrainWriteCreatesFile(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "# Test\n\nbody",
"filename": "test.md",
"type": "note",
"domain": "personal",
})
require.Nil(t, resp["error"])
got, err := os.ReadFile(filepath.Join(brainDir, "knowledge", "test.md"))
require.NoError(t, err)
assert.Contains(t, string(got), "type: note")
assert.Contains(t, string(got), "domain: personal")
assert.Contains(t, string(got), "# Test")
}
func TestBrainWriteRejectsTraversal(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "x",
"filename": "../escape.md",
})
require.NotNil(t, resp["error"])
}
func TestBrainWriteAcceptsDoubleDotInName(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "x",
"filename": "notes..draft.md",
})
require.Nil(t, resp["error"])
_, err := os.Stat(filepath.Join(brainDir, "knowledge", "notes..draft.md"))
require.NoError(t, err, "filename with embedded .. should be allowed")
}
func TestBrainIngestRawDryRun(t *testing.T) {
brainDir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(brainDir, "wiki", "concepts"), 0o755))
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_ingest_raw", map[string]any{
"source": "test-source",
"dry_run": true,
"pages": []map[string]any{
{
"title": "Test Concept",
"type": "concept",
"content": "## Definition\nA test concept.",
},
},
})
require.Nil(t, resp["error"])
result := resp["result"].(map[string]any)
content := result["content"].([]any)
text := content[0].(map[string]any)["text"].(string)
var parsed struct {
Pages []string `json:"pages"`
}
require.NoError(t, json.Unmarshal([]byte(text), &parsed))
require.NotEmpty(t, parsed.Pages, "expected at least one page path")
assert.Contains(t, parsed.Pages[0], "wiki/concepts/test-concept.md")
// dry_run: no file should exist
_, err := os.Stat(filepath.Join(brainDir, "wiki", "concepts", "test-concept.md"))
assert.True(t, os.IsNotExist(err))
}
func TestBrainIngestRejectsBoth(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "x",
"source": "y",
"path": "/tmp/foo.md",
})
require.NotNil(t, resp["error"])
}
func TestBrainIngestRequiresOne(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{})
require.NotNil(t, resp["error"])
}
func TestBrainIngestRejectsContentWithoutSource(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "x",
})
require.NotNil(t, resp["error"])
}
func TestBrainIngestRequiresLLMConfigured(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil) // nil pipelineCfg → no LLM
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "some content",
"source": "test",
})
require.NotNil(t, resp["error"])
errObj := resp["error"].(map[string]any)
assert.Contains(t, errObj["message"].(string), "LLM not configured")
}
func TestSessionLogAppends(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
resp := toolCall(t, srv, "session_log", map[string]any{
"session_id": "session-x",
"skill": "tdd",
"phase": "red",
"final_status": "ok",
})
require.Nil(t, resp["error"])
got, err := os.ReadFile(filepath.Join(brainDir, "sessions", "session-x.jsonl"))
require.NoError(t, err)
assert.Contains(t, string(got), `"skill":"tdd"`)
assert.Contains(t, string(got), `"phase":"red"`)
}
func TestSessionLogRequiresSessionID(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
resp := toolCall(t, srv, "session_log", map[string]any{"skill": "tdd"})
require.NotNil(t, resp["error"])
}

View File

@@ -0,0 +1,35 @@
package mcp_test
import (
"bytes"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestMCPMountedHandler(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
mux := http.NewServeMux()
mux.Handle("POST /mcp", srv)
ts := httptest.NewServer(mux)
defer ts.Close()
body, err := json.Marshal(map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "tools/list",
})
require.NoError(t, err)
resp, err := http.Post(ts.URL+"/mcp", "application/json", bytes.NewReader(body))
require.NoError(t, err)
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusOK, resp.StatusCode)
out, _ := io.ReadAll(resp.Body)
assert.Contains(t, string(out), `"brain_query"`)
}

View File

@@ -0,0 +1,132 @@
// Package mcp implements an MCP HTTP handler for the ingestion service.
// Exposed tools: brain_query, brain_write, brain_ingest, brain_ingest_raw, session_log.
package mcp
import (
"context"
"encoding/json"
"fmt"
"net/http"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
)
type request struct {
JSONRPC string `json:"jsonrpc"`
ID any `json:"id"`
Method string `json:"method"`
Params json.RawMessage `json:"params"`
}
type response struct {
JSONRPC string `json:"jsonrpc"`
ID any `json:"id,omitempty"`
Result any `json:"result,omitempty"`
Error *rpcError `json:"error,omitempty"`
}
type rpcError struct {
Code int `json:"code"`
Message string `json:"message"`
}
// Server handles MCP JSON-RPC over HTTP for the ingestion service.
type Server struct {
brainDir string
pipeline pipeline.Config
llm pipeline.CompleteFunc
}
// NewServer constructs a Server bound to brainDir. pipelineCfg supplies the
// LLM-backed pipeline; llm may be nil for non-LLM tools only.
func NewServer(brainDir string, pipelineCfg *pipeline.Config, llm pipeline.CompleteFunc) *Server {
cfg := pipeline.Config{}
if pipelineCfg != nil {
cfg = *pipelineCfg
}
return &Server{brainDir: brainDir, pipeline: cfg, llm: llm}
}
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
var req request
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, nil, -32700, "parse error")
return
}
// JSON-RPC 2.0 notifications (no id) must not receive a response.
if req.ID == nil {
return
}
var result any
var rpcErr *rpcError
switch req.Method {
case "initialize":
result = map[string]any{
"protocolVersion": "2024-11-05",
"capabilities": map[string]any{"tools": map[string]any{}},
"serverInfo": map[string]any{"name": "ingestion-brain", "version": "0.1.0"},
}
case "tools/list":
result = map[string]any{"tools": s.tools()}
case "tools/call":
var p struct {
Name string `json:"name"`
Arguments json.RawMessage `json:"arguments"`
}
if err := json.Unmarshal(req.Params, &p); err != nil {
rpcErr = &rpcError{Code: -32602, Message: "invalid params"}
break
}
out, err := s.handleCall(r.Context(), p.Name, p.Arguments)
if err != nil {
rpcErr = &rpcError{Code: -32000, Message: err.Error()}
break
}
result = map[string]any{
"content": []map[string]any{{"type": "text", "text": string(out)}},
}
default:
rpcErr = &rpcError{Code: -32601, Message: "method not found: " + req.Method}
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(response{
JSONRPC: "2.0",
ID: req.ID,
Result: result,
Error: rpcErr,
})
}
func writeError(w http.ResponseWriter, id any, code int, msg string) {
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(response{
JSONRPC: "2.0",
ID: id,
Error: &rpcError{Code: code, Message: msg},
})
}
// handleCall dispatches a tools/call to the appropriate tool handler.
func (s *Server) handleCall(ctx context.Context, name string, args json.RawMessage) (json.RawMessage, error) {
switch name {
case "brain_query":
return s.brainQuery(ctx, args)
case "brain_write":
return s.brainWrite(ctx, args)
case "brain_ingest_raw":
return s.brainIngestRaw(ctx, args)
case "brain_ingest":
return s.brainIngest(ctx, args)
case "session_log":
return s.sessionLog(ctx, args)
default:
return nil, fmt.Errorf("unknown tool: %s", name)
}
}

View File

@@ -0,0 +1,91 @@
package mcp_test
import (
"bytes"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func body(t *testing.T, v any) *bytes.Buffer {
t.Helper()
b, err := json.Marshal(v)
require.NoError(t, err)
return bytes.NewBuffer(b)
}
func TestServerInitialize(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "initialize",
"params": map[string]any{},
}))
rr := httptest.NewRecorder()
srv.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
var resp map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
result := resp["result"].(map[string]any)
assert.Equal(t, "2024-11-05", result["protocolVersion"])
}
func TestServerToolsList(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 2, "method": "tools/list",
}))
rr := httptest.NewRecorder()
srv.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
var resp map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
tools := resp["result"].(map[string]any)["tools"].([]any)
names := make([]string, 0, len(tools))
for _, t := range tools {
names = append(names, t.(map[string]any)["name"].(string))
}
assert.ElementsMatch(t, []string{
"brain_query", "brain_write", "brain_ingest_raw", "brain_ingest", "session_log",
}, names)
}
func TestServerNotificationGetsNoBody(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "method": "notifications/initialized",
}))
rr := httptest.NewRecorder()
srv.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.Empty(t, strings.TrimSpace(rr.Body.String()))
}
func TestServerUnknownMethodReturnsError(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 3, "method": "unknown/method",
}))
rr := httptest.NewRecorder()
srv.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
var resp map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
require.NotNil(t, resp["error"])
errObj := resp["error"].(map[string]any)
assert.Equal(t, float64(-32601), errObj["code"])
assert.Contains(t, errObj["message"].(string), "unknown/method")
}

View File

@@ -18,7 +18,8 @@ type RawPage struct {
}
// ParseRawPages parses LLM output as a JSON array of RawPage objects.
// If the array is truncated mid-object (token limit), it salvages all complete objects.
// If the output contains invalid JSON escape sequences (e.g. \. from Markdown),
// it attempts repair before falling back to truncation recovery.
func ParseRawPages(output string) ([]RawPage, []string) {
output = strings.TrimSpace(output)
if output == "" {
@@ -27,23 +28,30 @@ func ParseRawPages(output string) ([]RawPage, []string) {
output = stripFences(output)
// Fast path: valid JSON.
var pages []RawPage
if err := json.Unmarshal([]byte(output), &pages); err == nil {
return pages, nil
}
// Repair pass: fix invalid escape sequences (e.g. \. \d from Markdown content).
repaired := repairJSON(output)
if err := json.Unmarshal([]byte(repaired), &pages); err == nil {
return pages, []string{"repaired invalid JSON escape sequences in LLM output"}
}
// Truncation recovery: find last `}` that closes a complete object.
idx := strings.LastIndex(output, "}")
idx := strings.LastIndex(repaired, "}")
if idx < 0 {
return nil, []string{"LLM output contained no complete JSON objects"}
}
start := strings.Index(output, "[")
start := strings.Index(repaired, "[")
if start < 0 {
return nil, []string{"LLM output contained no JSON array opening bracket"}
}
candidate := output[start:idx+1] + "]"
candidate := repaired[start:idx+1] + "]"
if err := json.Unmarshal([]byte(candidate), &pages); err != nil {
return nil, []string{fmt.Sprintf("truncation recovery failed: %v", err)}
}
@@ -51,6 +59,45 @@ func ParseRawPages(output string) ([]RawPage, []string) {
return pages, []string{fmt.Sprintf("LLM output was truncated; recovered %d page(s)", len(pages))}
}
// repairJSON replaces invalid JSON escape sequences (e.g. \. \d \p) with
// a properly escaped backslash followed by the same character.
// It iterates byte-by-byte to correctly skip already-valid escape sequences
// (including \\) without requiring lookbehind support.
func repairJSON(s string) string {
var b strings.Builder
b.Grow(len(s))
i := 0
for i < len(s) {
if s[i] != '\\' {
b.WriteByte(s[i])
i++
continue
}
// We have a backslash. Peek at the next character.
if i+1 >= len(s) {
// Trailing backslash — emit as-is.
b.WriteByte(s[i])
i++
continue
}
next := s[i+1]
switch next {
case '"', '\\', '/', 'b', 'f', 'n', 'r', 't', 'u':
// Valid JSON escape sequence — emit both characters as-is.
b.WriteByte(s[i])
b.WriteByte(next)
i += 2
default:
// Invalid escape — double the backslash.
b.WriteByte('\\')
b.WriteByte('\\')
b.WriteByte(next)
i += 2
}
}
return b.String()
}
func stripFences(s string) string {
for _, prefix := range []string{"```json\n", "```json\r\n", "```\n", "```\r\n"} {
if strings.HasPrefix(s, prefix) {

View File

@@ -59,3 +59,29 @@ func TestParseRawPages_MissingTitle(t *testing.T) {
assert.Empty(t, warnings)
assert.Empty(t, pages[0].Title)
}
func TestParseRawPages_InvalidEscapeRepaired(t *testing.T) {
// LLM copied markdown escaped list numbers (\.) into JSON — invalid escape
raw := "[{\"title\":\"Foo\",\"type\":\"concept\",\"content\":\"Step 4\\. Do it.\"}]"
pages, warnings := ParseRawPages(raw)
require.Len(t, pages, 1)
assert.Equal(t, "Foo", pages[0].Title)
assert.Contains(t, pages[0].Content, `4\.`)
assert.NotEmpty(t, warnings) // repair warning
}
func TestRepairJSON_FixesInvalidEscapes(t *testing.T) {
cases := []struct {
in string
want string
}{
{`{"a":"foo\.bar"}`, `{"a":"foo\\.bar"}`},
{`{"a":"\\n is fine"}`, `{"a":"\\n is fine"}`}, // valid \n untouched
{`{"a":"\d+ items"}`, `{"a":"\\d+ items"}`},
{`{"a":"already \\ escaped"}`, `{"a":"already \\ escaped"}`}, // valid \\ untouched
}
for _, tc := range cases {
got := repairJSON(tc.in)
assert.Equal(t, tc.want, got, "input: %s", tc.in)
}
}

View File

@@ -59,11 +59,31 @@ func Run(ctx context.Context, cfg Config, brainDir, content, source string, dryR
allWarnings = append(allWarnings, warnings...)
}
pages, buildWarnings := BuildPages(allRaw, sourceSlug, date)
allWarnings = append(allWarnings, buildWarnings...)
return buildAndWrite(allRaw, sourceSlug, date, brainDir, source, inventory, allWarnings, dryRun)
}
// RunRaw runs the pipeline on pre-parsed RawPages, skipping the LLM extraction
// step. Use this when the caller has already produced the structured RawPage data
// (e.g. from a more capable model or manual curation).
func RunRaw(brainDir, source string, rawPages []RawPage, dryRun bool) (Result, error) {
inventory, err := wiki.LoadInventory(brainDir)
if err != nil {
return Result{}, fmt.Errorf("load inventory: %w", err)
}
sourceSlug := wiki.Slug(source)
date := time.Now().UTC().Format("2006-01-02")
return buildAndWrite(rawPages, sourceSlug, date, brainDir, source, inventory, nil, dryRun)
}
// buildAndWrite runs BuildPages through write for both Run and RunRaw.
func buildAndWrite(rawPages []RawPage, sourceSlug, date, brainDir, source string, inventory map[wiki.PageType][]wiki.Entry, warnings []string, dryRun bool) (Result, error) {
pages, buildWarnings := BuildPages(rawPages, sourceSlug, date)
warnings = append(warnings, buildWarnings...)
resolved := Resolve(pages, inventory)
canonicalized, linkWarnings := CanonicalizeLinks(resolved, inventory)
allWarnings = append(allWarnings, linkWarnings...)
warnings = append(warnings, linkWarnings...)
withRefs := injectSourceRefs(canonicalized, inventory, brainDir)
merged := mergeAll(withRefs)
@@ -83,14 +103,14 @@ func Run(ctx context.Context, cfg Config, brainDir, content, source string, dryR
if !dryRun {
if err := wiki.RebuildIndex(brainDir, date); err != nil {
allWarnings = append(allWarnings, fmt.Sprintf("rebuild index: %v", err))
warnings = append(warnings, fmt.Sprintf("rebuild index: %v", err))
}
if err := wiki.AppendLog(brainDir, source, written, allWarnings, date); err != nil {
allWarnings = append(allWarnings, fmt.Sprintf("append log: %v", err))
if err := wiki.AppendLog(brainDir, source, written, warnings, date); err != nil {
warnings = append(warnings, fmt.Sprintf("append log: %v", err))
}
}
return Result{Pages: written, Warnings: allWarnings}, nil
return Result{Pages: written, Warnings: warnings}, nil
}
// mergeAll deduplicates pages by path, merging content from later occurrences.

View File

@@ -0,0 +1,98 @@
// ingestion/internal/session/session.go
package session
import (
"bufio"
"encoding/json"
"errors"
"fmt"
"io/fs"
"os"
"path/filepath"
"time"
)
// Entry is one skill invocation record, appended to the session JSONL log.
type Entry struct {
SessionID string `json:"session_id"`
Timestamp time.Time `json:"timestamp"`
Skill string `json:"skill"`
Phase string `json:"phase,omitempty"`
ProjectRoot string `json:"project_root,omitempty"`
Input json.RawMessage `json:"input,omitempty"`
Attempts []Attempt `json:"attempts,omitempty"`
FinalStatus string `json:"final_status"`
FilePath string `json:"file_path,omitempty"`
ModelUsed string `json:"model_used,omitempty"`
DurationMs int64 `json:"duration_ms,omitempty"`
Message string `json:"message,omitempty"`
}
// Attempt represents one subprocess invocation within a skill call.
type Attempt struct {
Attempt int `json:"attempt"`
Model string `json:"model"`
Tier string `json:"tier"` // local | subagent | managed
DurationMs int64 `json:"duration_ms"`
WarmStart bool `json:"warm_start"` // model already loaded in llama-swap
Verified bool `json:"verified"`
Verdict string `json:"verdict,omitempty"` // accept | escalate | error
Feedback string `json:"feedback,omitempty"` // verifier feedback on escalation
OutputSummary string `json:"output_summary,omitempty"`
RunnerOutput string `json:"runner_output,omitempty"`
}
// Append writes entry as a single JSON line to sessionsDir/{sessionID}.jsonl.
func Append(sessionsDir, sessionID string, entry Entry) error {
if err := os.MkdirAll(sessionsDir, 0o755); err != nil {
return fmt.Errorf("create sessions dir: %w", err)
}
path := filepath.Join(sessionsDir, sessionID+".jsonl")
f, err := os.OpenFile(path, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
if err != nil {
return fmt.Errorf("open session log: %w", err)
}
line, err := json.Marshal(entry)
if err != nil {
_ = f.Close()
return fmt.Errorf("marshal entry: %w", err)
}
if _, err = fmt.Fprintf(f, "%s\n", line); err != nil {
_ = f.Close()
return fmt.Errorf("write entry: %w", err)
}
if err = f.Close(); err != nil {
return fmt.Errorf("close session log: %w", err)
}
return nil
}
// Read returns all entries for sessionID. Returns empty slice if no log exists.
func Read(sessionsDir, sessionID string) ([]Entry, error) {
path := filepath.Join(sessionsDir, sessionID+".jsonl")
f, err := os.Open(path)
if errors.Is(err, fs.ErrNotExist) {
return []Entry{}, nil
}
if err != nil {
return nil, fmt.Errorf("open session log: %w", err)
}
defer f.Close() //nolint:errcheck
var entries []Entry
scanner := bufio.NewScanner(f)
scanner.Buffer(make([]byte, 0, 256*1024), 1<<20) // up to 1 MB per line
for scanner.Scan() {
line := scanner.Bytes()
if len(line) == 0 {
continue
}
var e Entry
if err := json.Unmarshal(line, &e); err != nil {
return nil, fmt.Errorf("parse entry: %w", err)
}
entries = append(entries, e)
}
return entries, scanner.Err()
}

View File

@@ -0,0 +1,50 @@
package session_test
import (
"os"
"path/filepath"
"testing"
"time"
"github.com/mathiasbq/hyperguild/ingestion/internal/session"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestAppendAndRead(t *testing.T) {
dir := t.TempDir()
sid := "test-session"
e1 := session.Entry{
SessionID: sid,
Timestamp: time.Now().UTC().Truncate(time.Second),
Skill: "tdd",
Phase: "red",
FinalStatus: "ok",
}
e2 := session.Entry{
SessionID: sid,
Timestamp: time.Now().UTC().Truncate(time.Second),
Skill: "tdd",
Phase: "green",
FinalStatus: "ok",
}
require.NoError(t, session.Append(dir, sid, e1))
require.NoError(t, session.Append(dir, sid, e2))
got, err := session.Read(dir, sid)
require.NoError(t, err)
require.Len(t, got, 2)
assert.Equal(t, "red", got[0].Phase)
assert.Equal(t, "green", got[1].Phase)
_, statErr := os.Stat(filepath.Join(dir, sid+".jsonl"))
require.NoError(t, statErr, "session file should exist on disk")
}
func TestReadMissingReturnsEmpty(t *testing.T) {
got, err := session.Read(t.TempDir(), "nope")
require.NoError(t, err)
assert.Empty(t, got)
}

View File

@@ -43,6 +43,11 @@ func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
return
}
// JSON-RPC 2.0 notifications (no id) must not receive a response.
if req.ID == nil {
return
}
var result any
var rpcErr *rpcError

View File

@@ -5,6 +5,7 @@ import (
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/mathiasbq/supervisor/internal/mcp"
@@ -76,3 +77,39 @@ func TestMCPUnknownMethod(t *testing.T) {
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &resp))
assert.NotNil(t, resp["error"])
}
func TestMCPNotificationKnownMethodGetsNoResponseBody(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg)
// JSON-RPC 2.0 notification: "id" field absent. Per spec, server MUST NOT
// reply. notifications/initialized is part of the standard MCP handshake.
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
"jsonrpc": "2.0",
"method": "notifications/initialized",
}))
req.Header.Set("Content-Type", "application/json")
rr := httptest.NewRecorder()
srv.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.Empty(t, strings.TrimSpace(rr.Body.String()),
"notifications must not receive a response body")
}
func TestMCPNotificationUnknownMethodGetsNoResponseBody(t *testing.T) {
reg := registry.New()
srv := mcp.NewServer(reg)
req := httptest.NewRequest(http.MethodPost, "/mcp", jsonBody(t, map[string]any{
"jsonrpc": "2.0",
"method": "notifications/totally-unknown",
}))
req.Header.Set("Content-Type", "application/json")
rr := httptest.NewRecorder()
srv.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.Empty(t, strings.TrimSpace(rr.Body.String()),
"unknown notifications must also receive no response body")
}

View File

@@ -17,6 +17,8 @@ func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (
return s.query(ctx, args)
case "brain_write":
return s.write(ctx, args)
case "brain_ingest_raw":
return s.ingestRaw(ctx, args)
case "brain_ingest":
return s.ingest(ctx, args)
case "brain_search":
@@ -98,6 +100,33 @@ func (s *Skill) ingest(ctx context.Context, args json.RawMessage) (json.RawMessa
return nil, fmt.Errorf("either content+source or path is required")
}
type ingestRawArgs struct {
Source string `json:"source"`
Pages []any `json:"pages"`
DryRun bool `json:"dry_run,omitempty"`
}
func (s *Skill) ingestRaw(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a ingestRawArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if s.cfg.IngestSvcURL == "" {
return nil, fmt.Errorf("brain_ingest_raw: INGEST_SVC_URL not configured")
}
if a.Source == "" {
return nil, fmt.Errorf("source is required")
}
if len(a.Pages) == 0 {
return nil, fmt.Errorf("pages is required and must be non-empty")
}
return s.postTo(ctx, s.cfg.IngestSvcURL+"/ingest-raw", map[string]any{
"source": a.Source,
"pages": a.Pages,
"dry_run": a.DryRun,
})
}
type searchArgs struct {
Query string `json:"query"`
Collection string `json:"collection,omitempty"`

View File

@@ -55,6 +55,32 @@ func (s *Skill) Tools() []registry.ToolDef {
},
}
if s.cfg.IngestSvcURL != "" {
tools = append(tools, registry.ToolDef{
Name: "brain_ingest_raw",
Description: "Ingest pre-structured pages into the brain wiki, bypassing the LLM extraction step. " +
"Use when you (the calling agent) have already extracted entities, concepts, and content from a source. " +
"Provide source (human-readable name) and pages (array of {title, type, subtype, domain, content} objects). " +
"The pipeline computes slugs, paths, frontmatter, wikilink canonicalization, and source back-references. " +
"Returns the list of wiki pages written.",
InputSchema: schema([]string{"source", "pages"}, map[string]any{
"source": map[string]any{"type": "string", "description": "human-readable name for the source, e.g. 'shape-up-book'"},
"pages": map[string]any{
"type": "array",
"items": map[string]any{
"type": "object",
"required": []string{"title", "type", "content"},
"properties": map[string]any{
"title": map[string]any{"type": "string", "description": "page title, e.g. 'Hash Encoding'"},
"type": map[string]any{"type": "string", "enum": []string{"source", "concept", "entity"}, "description": "page type"},
"subtype": map[string]any{"type": "string", "description": "entity: person|company|tool|model|framework|technology; source: article|pdf|book|video|note|project"},
"domain": map[string]any{"type": "string", "description": "knowledge domain, e.g. 'Machine Learning'"},
"content": map[string]any{"type": "string", "description": "markdown body — no frontmatter, use [[Display Name]] for wikilinks"},
},
},
},
"dry_run": map[string]any{"type": "boolean"},
}),
})
tools = append(tools, registry.ToolDef{
Name: "brain_ingest",
Description: "Ingest content into the brain wiki (brain/wiki/). Calls an LLM to produce structured wiki pages. " +

View File

@@ -39,7 +39,17 @@ fi
if [ -n "$ROOT_CONTEXT" ] && [ -f "$ROOT_CONTEXT" ]; then
echo " Root context: $ROOT_CONTEXT"
else
echo " No root AGENT.md found (project context only)"
# No reachable root AGENT.md — common in CI's clean checkout. The root+project
# adapters (AGENTS.md, .cursorrules, .aider.conventions.md, system-prompt.txt)
# require the root context to regenerate correctly, so we skip them entirely
# and only regenerate CLAUDE.md (which is project-only and inherits root via
# tree walk in Claude Code itself).
echo " No root AGENT.md found — regenerating CLAUDE.md only"
echo "Syncing project context from $PROJECT_FILE..."
cat "$PROJECT_FILE" > CLAUDE.md
echo " → CLAUDE.md (project-only)"
echo "Done."
exit 0
fi
# Emit root context + separator