Decisions log
Record why things are the way they are. Future-you will thank present-you.
2026-04-08 — AGENTS.md as cross-tool standard, not CLAUDE.md
Context: Multiple tools (Crush, Pi, Antigravity) read AGENTS.md natively. Claude Code reads CLAUDE.md. Building on CLAUDE.md as the primary format locks into one vendor.
Decision: Canonical source is .context/AGENT.md (root) and .context/PROJECT.md (per-project). The adapter script generates both AGENTS.md and CLAUDE.md — identical content, two filenames. Crush, Pi, and Antigravity read AGENTS.md; Claude Code reads CLAUDE.md.
Consequences: One canonical file serves five+ tools. Adding a new tool that reads AGENTS.md requires zero adapter work.
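The adapter step can be pictured with a minimal Go sketch. It assumes the canonical text has already been read from .context/AGENT.md; the function names `generate` and `writeAll` are hypothetical and are not taken from the actual adapter script.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// generate maps each output filename to its content. Both files receive
// the canonical text unchanged: identical content, two filenames.
func generate(canonical string) map[string]string {
	return map[string]string{
		"AGENTS.md": canonical,
		"CLAUDE.md": canonical,
	}
}

// writeAll writes every generated file into dir.
func writeAll(dir string, files map[string]string) error {
	for name, content := range files {
		path := filepath.Join(dir, name)
		if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
			return fmt.Errorf("write %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	// Demo in a throwaway directory so the sketch never touches a real repo.
	dir, err := os.MkdirTemp("", "context-sync")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer os.RemoveAll(dir)

	// In the real layout this text would come from .context/AGENT.md.
	canonical := "# Conventions\n\nPrefer Go with HTMX.\n"
	if err := writeAll(dir, generate(canonical)); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("generated AGENTS.md and CLAUDE.md in", dir)
}
```

Supporting a further tool that expects another filename would, under this sketch, mean adding one entry to the map in `generate`.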
2026-04-08 — Agent Skills standard (SKILL.md in folders) over flat markdown
Context: Claude Code, Pi, Crush, and Antigravity all support the Agent Skills open standard: a folder containing SKILL.md with frontmatter (name, description). Skills are discovered on-demand — only the description enters context, full instructions load when triggered.
Decision: Skills live in .skills/{name}/SKILL.md at project level. This replaces the earlier .context/skills/{name}.md flat-file approach.
Consequences: Skills are cross-compatible without adaptation. Pi auto-discovers them from .pi/skills/ (symlink). Crush reads them natively. Progressive disclosure keeps context window lean.
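To illustrate progressive disclosure, here is a rough Go sketch that extracts only the name and description from a SKILL.md file. It assumes the frontmatter is a block of simple `key: value` lines between `---` delimiters and is not a full YAML parser. The skill name `go-review` and the helper `parseFrontmatter` are made up for the example.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// SkillMeta holds the two frontmatter fields named in the decision above.
type SkillMeta struct {
	Name        string
	Description string
}

// parseFrontmatter reads a leading "---" delimited block of simple
// "key: value" lines and ignores everything after the closing delimiter.
func parseFrontmatter(doc string) (SkillMeta, error) {
	var meta SkillMeta
	sc := bufio.NewScanner(strings.NewReader(doc))
	if !sc.Scan() || strings.TrimSpace(sc.Text()) != "---" {
		return meta, errors.New("missing opening frontmatter delimiter")
	}
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "---" {
			if meta.Name == "" || meta.Description == "" {
				return meta, errors.New("frontmatter needs name and description")
			}
			return meta, nil
		}
		// Split at the first colon only, so values may contain colons.
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		switch strings.TrimSpace(key) {
		case "name":
			meta.Name = strings.TrimSpace(value)
		case "description":
			meta.Description = strings.TrimSpace(value)
		}
	}
	return meta, errors.New("missing closing frontmatter delimiter")
}

func main() {
	// Hypothetical skill file; only the frontmatter is inspected.
	doc := "---\nname: go-review\ndescription: Review Go code for idiomatic style.\n---\n\nFull instructions load only when the skill is triggered.\n"
	meta, err := parseFrontmatter(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %s\n", meta.Name, meta.Description)
}
```

A discovery pass over .skills/ would keep only these two fields per skill in context and load the body of a file once that skill is triggered.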
2026-04-08 — Go + HTMX as default stack
Context: Need a default that's fast to prototype, easy to deploy as a single binary, and doesn't require a Node/npm toolchain for the UI layer.
Decision: Go with HTMX + Templ for server-rendered UI. Python as fallback for ML/data tasks. TypeScript only when a project genuinely needs a rich client-side SPA.
Consequences: Simpler deployment and dependency management. Agents need Go-specific skills.
2026-04-08 — Task over Make
Context: Makefiles have arcane syntax and poor cross-platform support.
Decision: Use Taskfile (taskfile.dev) — YAML-based, cross-platform, supports task dependencies.
Consequences: One extra binary to install. All project automation in Taskfile.yml.
2026-04-08 — Qdrant over ChromaDB for vector store
Context: Need collection-level isolation for client separation, payload filtering, and a store that runs well in k3s.
Decision: Qdrant. Native collection isolation, rich filtering, mature gRPC API.
Consequences: More operational complexity than Chroma, but isolation is non-negotiable for client work.
2026-04-22 — Hyperguild scope reset: drop parametric learning, simplify brain
Context: After shipping Phases 1–4 (MCP server, 6 skills, model orchestration, session logging, CD pipeline), we critically reviewed what was theater vs genuinely useful.
Decisions:
- Drop the parametric learning pipeline. SFT/DPO/RL extraction, brain/training-data/ directory structure, Axolotl/LLaMA-Factory fine-tuning loop: all cut. The loop requires thousands of high-quality examples to move the needle, which a solo consultant won't generate. Better base models ship faster than any fine-tuning effort could keep up with. This is a research project, not a productivity tool.
- Simplify the brain to plain markdown. brain/knowledge/ replaces brain/wiki/ + brain/raw/ + brain/training-data/. The trainer and retrospective workers write markdown entries. brain_query searches markdown. No ingestion pipeline, no tagging for significance review, no structured JSONL formats.
- Measure the escalation chain before assuming it's useful. Local model (phi4) only belongs in a skill's chain if it passes Claude verification at a meaningful rate. Where it fails >70% of the time, it adds cost not value. Per-skill hit rate logging is the prerequisite to honest chain configuration.
- Keep what's real: MCP tool surface, session logging with attempt records, tier detection, CD pipeline, bridge to Claude Code.
What to build next (in priority order):
1. brain_query injection into skill handlers before spawning workers (this makes the declarative brain actually function)
2. protocols.md: behavioral contract injected into every worker prompt
3. Per-skill pass rate logging and chain tuning
Consequences: Simpler system with a shorter feedback loop. The brain becomes real only when skill handlers query it. Training data ambitions deferred indefinitely — revisit if local model capabilities improve enough that fine-tuning becomes worthwhile.
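The per-skill measurement described in this entry could look roughly like the Go sketch below. The `Attempt` and `Stats` types and the function names are hypothetical; the only figure taken from the log is the cut-off, under which a local tier that fails more than 70% of the time is dropped from a skill's chain.

```go
package main

import "fmt"

// Attempt records whether one local-model answer passed verification.
type Attempt struct {
	Skill  string
	Passed bool
}

// Stats is the running tally for one skill.
type Stats struct {
	Passed int
	Total  int
}

// tally groups attempt records by skill.
func tally(attempts []Attempt) map[string]Stats {
	out := map[string]Stats{}
	for _, a := range attempts {
		s := out[a.Skill]
		s.Total++
		if a.Passed {
			s.Passed++
		}
		out[a.Skill] = s
	}
	return out
}

// keepLocalTier reports whether the local model stays in a skill's chain.
// It is dropped only when the failure rate is strictly above 70%. Integer
// arithmetic avoids float rounding at the boundary. A skill with no data
// is kept so that it can still be measured.
func keepLocalTier(s Stats) bool {
	failed := s.Total - s.Passed
	return failed*10 <= s.Total*7
}

func main() {
	attempts := []Attempt{
		{"review", true}, {"review", false},
		{"debug", false}, {"debug", false}, {"debug", false}, {"debug", false},
	}
	stats := tally(attempts)
	for _, skill := range []string{"review", "debug"} {
		s := stats[skill]
		fmt.Printf("%s: %d/%d passed, keep local tier: %t\n",
			skill, s.Passed, s.Total, keepLocalTier(s))
	}
}
```

Keeping the decision a pure function of the tally makes it easy to rerun chain configuration whenever new session logs arrive.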
Plan 6: routing pod reuses internal/skills/{review,debug,retrospective,trainer}
Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of
the four cost-routable skill packages. The routing pod constructs each
skill via <pkg>.New(Config{...}) and hands it routing.Router.Run as
the CompleteFunc.
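One reading of the paragraph above is that each skill receives the router's Run method as its completion function, so the skill code stays unaware of which model answers. The Go sketch below illustrates that wiring with stand-in types. The `CompleteFunc` signature, the `Config` field name, and the stub `Router` are assumptions for illustration and do not reproduce the real internal/routing or internal/skills packages.

```go
package main

import "fmt"

// CompleteFunc is a hypothetical signature: prompt in, completion out.
type CompleteFunc func(prompt string) (string, error)

// Router stands in for routing.Router; the real type is not shown in this log.
type Router struct {
	Model string
}

// Run matches CompleteFunc, so the method value can be injected into a skill.
func (r *Router) Run(prompt string) (string, error) {
	return fmt.Sprintf("[%s] %s", r.Model, prompt), nil
}

// Config and Skill stand in for one skill package, for example review.
type Config struct {
	Complete CompleteFunc
}

type Skill struct {
	complete CompleteFunc
}

// New mirrors the <pkg>.New(Config{...}) constructor shape.
func New(cfg Config) *Skill {
	return &Skill{complete: cfg.Complete}
}

// Handle builds a prompt and delegates completion to the injected function.
func (s *Skill) Handle(input string) (string, error) {
	return s.complete("review: " + input)
}

func main() {
	router := &Router{Model: "stub"}
	skill := New(Config{Complete: router.Run})
	out, err := skill.Handle("diff goes here")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

With this shape a second consumer needs only to supply its own CompleteFunc, which is why the skill packages can be shared rather than copied.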
Preserved code (do not delete):
- internal/skills/{review,debug,retrospective,trainer}/
- internal/registry, internal/mcp, internal/exec/litellm.go
- internal/routing/, cmd/routing/
Plan 7: supervisor pod retired (2026-05-12)
What was deleted: cmd/supervisor/, internal/skills/{tdd,spec}/,
root Dockerfile, supervisor k8s manifests (Deployment, Service, Ingress,
NodePort 30320), supervisor entry removed from all .mcp.json configs.
Coverage: tdd/spec → SKILL.md files in ~/dev/.skills/; review,
debug, retrospective, trainer → routing pod; brain_*/session_log →
brain MCP; tier → hyperguild tier CLI.
2026-05-12 — brain_answer and brain_classify: LLM routing via berget.ai → iguana
Context: Brain MCP returned raw BM25 excerpts with no synthesis. Adding LLM-backed tools enables Q&A and ingestion enrichment without a separate service.
Decision: Two new MCP tools in the ingestion service (ingestion/internal/mcp/):
- brain_answer(query): BM25 top-10 → LLM synthesis → answer + sources
- brain_classify(text): LLM classifies doc into type/title/tags
Primary LLM: berget.ai gemma4:31b (EU cloud, spend tokens while available).
Fallback: iguana gemma4:31b (local Ollama). Reranker deferred to follow-up.
Router lives in ingestion/internal/llm.Router; opt-in via BRAIN_LLM_PRIMARY_URL.
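The retrieval half of brain_answer can be sketched as a small Okapi BM25 ranker in Go. This is an illustration under assumptions, not the ingestion service's code: it tokenizes on whitespace, uses the common defaults k1 = 1.2 and b = 0.75, and the names `Hit` and `bm25TopK` are hypothetical. The returned hits would then be passed to the LLM for synthesis.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// Hit pairs a corpus index with its BM25 score.
type Hit struct {
	Doc   int
	Score float64
}

// bm25TopK scores every document against the query with Okapi BM25 and
// returns at most k hits with a positive score, best first.
func bm25TopK(corpus []string, query string, k int) []Hit {
	const k1, b = 1.2, 0.75
	docs := make([][]string, len(corpus))
	df := map[string]int{} // number of documents containing each term
	totalLen := 0
	for i, text := range corpus {
		docs[i] = strings.Fields(strings.ToLower(text))
		totalLen += len(docs[i])
		seen := map[string]bool{}
		for _, t := range docs[i] {
			if !seen[t] {
				seen[t] = true
				df[t]++
			}
		}
	}
	if totalLen == 0 {
		return nil
	}
	n := float64(len(corpus))
	avgLen := float64(totalLen) / n
	queryTerms := strings.Fields(strings.ToLower(query))

	var hits []Hit
	for i, terms := range docs {
		tf := map[string]int{}
		for _, t := range terms {
			tf[t]++
		}
		score := 0.0
		for _, q := range queryTerms {
			f := float64(tf[q])
			if f == 0 {
				continue
			}
			d := float64(df[q])
			idf := math.Log(1 + (n-d+0.5)/(d+0.5))
			norm := f + k1*(1-b+b*float64(len(terms))/avgLen)
			score += idf * f * (k1 + 1) / norm
		}
		if score > 0 {
			hits = append(hits, Hit{Doc: i, Score: score})
		}
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	corpus := []string{
		"qdrant collection isolation",
		"taskfile replaces make",
		"qdrant qdrant filtering",
	}
	for _, h := range bm25TopK(corpus, "qdrant", 10) {
		fmt.Printf("doc %d score %.3f\n", h.Doc, h.Score)
	}
}
```

Returning the document index alongside the score lets the caller attach the sources that brain_answer reports with its answer.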
Consequences: Brain becomes a knowledge assistant, not just a search index.
When berget.ai tokens run out, flip BRAIN_LLM_PRIMARY_URL to iguana.
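The primary-then-fallback behavior can be sketched in Go as follows. The `Backend` interface and the stub backends are hypothetical stand-ins. The real ingestion/internal/llm.Router and its wiring to BRAIN_LLM_PRIMARY_URL are not shown in this log, so environment handling is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical abstraction over one LLM endpoint.
type Backend interface {
	Complete(prompt string) (string, error)
}

// Router tries Primary first and uses Fallback when Primary is unset or errors.
type Router struct {
	Primary  Backend
	Fallback Backend
}

// Complete returns the first successful completion in the chain.
func (r Router) Complete(prompt string) (string, error) {
	var primaryErr error
	if r.Primary != nil {
		out, err := r.Primary.Complete(prompt)
		if err == nil {
			return out, nil
		}
		primaryErr = err
	}
	if r.Fallback == nil {
		if primaryErr != nil {
			return "", primaryErr
		}
		return "", errors.New("no LLM backend configured")
	}
	out, err := r.Fallback.Complete(prompt)
	if err != nil && primaryErr != nil {
		return "", fmt.Errorf("primary failed (%v); fallback failed: %w", primaryErr, err)
	}
	return out, err
}

// stub simulates an endpoint that either answers or is unavailable.
type stub struct {
	name string
	fail bool
}

func (s stub) Complete(prompt string) (string, error) {
	if s.fail {
		return "", errors.New(s.name + " unavailable")
	}
	return s.name + ": " + prompt, nil
}

func main() {
	// Simulate berget.ai being out of tokens so iguana answers instead.
	r := Router{
		Primary:  stub{name: "berget", fail: true},
		Fallback: stub{name: "iguana"},
	}
	out, err := r.Complete("summarize the decisions log")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

In this sketch, pointing the primary at iguana simply makes both links of the chain hit the same local model, which matches the "flip the URL" escape hatch described above.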
2026-04-08 — Mistral Vibe gets its own adapter
Context: Vibe doesn't read AGENTS.md — it uses ~/.vibe/prompts/ and ~/.vibe/agents/ with TOML config.
Decision: The root context-sync generates a mathias.md prompt and mathias.toml agent config in ~/.vibe/. This is the one tool that needs a custom adapter path.
Consequences: Run vibe --agent mathias to use your conventions. Other Vibe users on the machine aren't affected.