hyperguild/DECISIONS.md
Mathias 937355cabe
fix(project_create): commit staging namespace directly to infra main
Drops the intermediate `staging/<name>` branch so Flux begins reconciling the
namespace within ~60s of `project_create` instead of waiting on a human PR
merge. Consistent with project-wide trunk-based development.

Rationale: ADR 2026-05-18 in DECISIONS.md.

Closes hyperguild#14 (item 1). Item 2 (GITEA_MCP_TOKEN in SOPS) verified
already-present in infra@408a527 secrets.enc.yaml.

Note: TestRoutingPodEndToEnd is failing on main pre-existing this commit
(context deadline waiting for port 33310 in <5s). Not caused by this change;
project skill tests pass. To track in a separate issue.
2026-05-18 17:20:53 +02:00


Decisions log

Record why things are the way they are. Future-you will thank present-you.


2026-04-08 — AGENTS.md as cross-tool standard, not CLAUDE.md

Context: Multiple tools (Crush, Pi, Antigravity) read AGENTS.md natively. Claude Code reads CLAUDE.md. Building on CLAUDE.md as the primary format locks into one vendor.

Decision: Canonical source is .context/AGENT.md (root) and .context/PROJECT.md (per-project). The adapter script generates both AGENTS.md and CLAUDE.md — identical content, two filenames. Crush, Pi, and Antigravity read AGENTS.md; Claude Code reads CLAUDE.md.

Consequences: One canonical file serves five+ tools. Adding a new tool that reads AGENTS.md requires zero adapter work.
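
A minimal Go sketch of the "identical content, two filenames" idea, for illustration only. The real adapter script is not shown in this log, so the function names `generate` and `syncContext`, the output directory, and the file mode are all assumptions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// targets are the two generated filenames named in the decision.
var targets = []string{"AGENTS.md", "CLAUDE.md"}

// generate maps each target filename to the canonical content, unchanged.
// (Hypothetical helper; the actual adapter may transform or template.)
func generate(canonical string) map[string]string {
	out := make(map[string]string, len(targets))
	for _, name := range targets {
		out[name] = canonical
	}
	return out
}

// syncContext reads the canonical file and writes every target into outDir.
func syncContext(canonicalPath, outDir string) error {
	data, err := os.ReadFile(canonicalPath)
	if err != nil {
		return fmt.Errorf("read canonical source: %w", err)
	}
	for name, content := range generate(string(data)) {
		if err := os.WriteFile(filepath.Join(outDir, name), []byte(content), 0o644); err != nil {
			return fmt.Errorf("write %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	// Canonical root source per the decision: .context/AGENT.md
	if err := syncContext(filepath.Join(".context", "AGENT.md"), "."); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Keeping the filename list in one slice is what would make "zero adapter work" hold for a new tool that reads AGENTS.md: nothing changes unless a tool needs a new filename.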

2026-04-08 — Agent Skills standard (SKILL.md in folders) over flat markdown

Context: Claude Code, Pi, Crush, and Antigravity all support the Agent Skills open standard: a folder containing SKILL.md with frontmatter (name, description). Skills are discovered on-demand — only the description enters context, full instructions load when triggered.

Decision: Skills live in .skills/{name}/SKILL.md at project level. This replaces the earlier .context/skills/{name}.md flat-file approach.

Consequences: Skills are cross-compatible without adaptation. Pi auto-discovers them from .pi/skills/ (symlink). Crush reads them natively. Progressive disclosure keeps context window lean.
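
To make progressive disclosure concrete, here is a hedged Go sketch that extracts only `name` and `description` from a SKILL.md frontmatter block. It assumes flat `key: value` pairs between `---` delimiters and is not a YAML parser; `SkillMeta` and `parseFrontmatter` are hypothetical names, not part of any tool's API.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// SkillMeta holds the two frontmatter fields named in the decision.
type SkillMeta struct {
	Name        string
	Description string
}

// parseFrontmatter reads flat "key: value" pairs between the opening and
// closing "---" lines. Nested or quoted YAML values are out of scope.
func parseFrontmatter(doc string) (SkillMeta, error) {
	var meta SkillMeta
	sc := bufio.NewScanner(strings.NewReader(doc))
	if !sc.Scan() || strings.TrimSpace(sc.Text()) != "---" {
		return meta, errors.New("missing opening frontmatter delimiter")
	}
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "---" {
			if meta.Name == "" || meta.Description == "" {
				return meta, errors.New("frontmatter needs name and description")
			}
			return meta, nil
		}
		// Split at the first colon only, so values may contain colons.
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		switch strings.TrimSpace(key) {
		case "name":
			meta.Name = strings.TrimSpace(value)
		case "description":
			meta.Description = strings.TrimSpace(value)
		}
	}
	return meta, errors.New("missing closing frontmatter delimiter")
}

func main() {
	doc := "---\nname: review\ndescription: Review a diff for defects.\n---\n\n# Full instructions\n"
	meta, err := parseFrontmatter(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Only this summary would enter context up front; the body loads on trigger.
	fmt.Printf("%s: %s\n", meta.Name, meta.Description)
}
```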

2026-04-08 — Go + HTMX as default stack

Context: Need a default that's fast to prototype, easy to deploy as a single binary, and doesn't require a Node/npm toolchain for the UI layer.

Decision: Go with HTMX + Templ for server-rendered UI. Python as fallback for ML/data tasks. TypeScript only when a project genuinely needs a rich client-side SPA.

Consequences: Simpler deployment and dependency management. Agents need Go-specific skills.

2026-04-08 — Task over Make

Context: Makefiles have arcane syntax and poor cross-platform support.

Decision: Use Taskfile (taskfile.dev) — YAML-based, cross-platform, supports task dependencies.

Consequences: One extra binary to install. All project automation in Taskfile.yml.

2026-04-08 — Qdrant over ChromaDB for vector store

Context: Need a vector store with collection-level isolation for client separation and payload filtering, and one that runs well in k3s.

Decision: Qdrant. Native collection isolation, rich filtering, mature gRPC API.

Consequences: More operational complexity than Chroma, but isolation is non-negotiable for client work.

2026-04-22 — Hyperguild scope reset: drop parametric learning, simplify brain

Context: After shipping Phases 1-4 (MCP server, 6 skills, model orchestration, session logging, CD pipeline), we critically reviewed what was theater vs genuinely useful.

Decisions:

  1. Drop the parametric learning pipeline. SFT/DPO/RL extraction, brain/training-data/ directory structure, Axolotl/LLaMA-Factory fine-tuning loop — all cut. The loop requires thousands of high-quality examples to move the needle, which a solo consultant won't generate. Better base models ship faster than any fine-tuning effort could keep up with. This is a research project, not a productivity tool.

  2. Simplify the brain to plain markdown. brain/knowledge/ replaces brain/wiki/ + brain/raw/ + brain/training-data/. The trainer and retrospective workers write markdown entries. brain_query searches markdown. No ingestion pipeline, no tagging for significance review, no structured JSONL formats.

  3. Measure the escalation chain before assuming it's useful. Local model (phi4) only belongs in a skill's chain if it passes Claude verification at a meaningful rate. Where it fails >70% of the time, it adds cost not value. Per-skill hit rate logging is the prerequisite to honest chain configuration.

  4. Keep what's real: MCP tool surface, session logging with attempt records, tier detection, CD pipeline, bridge to Claude Code.

What to build next (in priority order):

  • brain_query injection into skill handlers before spawning workers — this makes the declarative brain actually function
  • protocols.md — behavioral contract injected into every worker prompt
  • Per-skill pass rate logging and chain tuning

Consequences: Simpler system with a shorter feedback loop. The brain becomes real only when skill handlers query it. Training data ambitions deferred indefinitely — revisit if local model capabilities improve enough that fine-tuning becomes worthwhile.
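
The per-skill hit rate check from decision 3 could be sketched as follows. This is an illustration under assumptions: the `Attempt` record shape, the `tally` helper, and the rule that a skill with no recorded attempts is excluded are all hypothetical. Only the ">70% failure" threshold comes from the decision itself. Integer arithmetic is used so the threshold comparison has no floating-point edge cases.

```go
package main

import "fmt"

// Attempt is a hypothetical record of one local-model attempt on a skill.
type Attempt struct {
	Skill  string
	Passed bool // true when the output passed Claude verification
}

// Stats counts attempts for one skill.
type Stats struct {
	Passed int
	Total  int
}

// maxFailPercent mirrors the ">70%" failure threshold from decision 3.
const maxFailPercent = 70

// tally aggregates attempt records into per-skill counts.
func tally(attempts []Attempt) map[string]Stats {
	out := make(map[string]Stats)
	for _, a := range attempts {
		s := out[a.Skill]
		s.Total++
		if a.Passed {
			s.Passed++
		}
		out[a.Skill] = s
	}
	return out
}

// KeepLocal reports whether the local model stays in the skill's chain.
// With no data it returns false: measure first (an assumption of this sketch).
func (s Stats) KeepLocal() bool {
	if s.Total == 0 {
		return false
	}
	failures := s.Total - s.Passed
	return failures*100 <= maxFailPercent*s.Total
}

func main() {
	stats := tally([]Attempt{
		{"review", true}, {"review", false},
		{"debug", false}, {"debug", false}, {"debug", false}, {"debug", true},
	})
	for _, skill := range []string{"review", "debug"} {
		s := stats[skill]
		fmt.Printf("%s: %d/%d passed, keep local model: %v\n", skill, s.Passed, s.Total, s.KeepLocal())
	}
}
```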


Plan 6: routing pod reuses internal/skills/{review,debug,retrospective,trainer}

Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of the four cost-routable skill packages. The routing pod constructs each skill via <pkg>.New(Config{...}) and hands it routing.Router.Run as the CompleteFunc.

Preserved code (do not delete):

  • internal/skills/{review,debug,retrospective,trainer}/
  • internal/registry, internal/mcp, internal/exec/litellm.go
  • internal/routing/, cmd/routing/
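
The wiring described above, a skill built with New(Config{...}) and given Router.Run as its CompleteFunc, might look roughly like the sketch below. Every type here is a stand-in: the real signatures of `CompleteFunc`, `Config`, and `routing.Router` are not shown in this log, so the field names and the stub `Run` body are assumptions.

```go
package main

import (
	"context"
	"fmt"
)

// CompleteFunc is an assumed shape for the completion callback.
type CompleteFunc func(ctx context.Context, prompt string) (string, error)

// Router stands in for routing.Router; its real fields are unknown here.
type Router struct {
	Model string
}

// Run matches CompleteFunc, so the method value router.Run can be passed directly.
// The real Run would pick a model tier; this stub only echoes the prompt.
func (r *Router) Run(ctx context.Context, prompt string) (string, error) {
	return fmt.Sprintf("[%s] %s", r.Model, prompt), nil
}

// Config and Skill stand in for one skill package, e.g. internal/skills/review.
type Config struct {
	Complete CompleteFunc
}

type Skill struct {
	complete CompleteFunc
}

// New mirrors the <pkg>.New(Config{...}) constructor mentioned above.
func New(cfg Config) *Skill {
	return &Skill{complete: cfg.Complete}
}

// Handle builds a prompt and delegates completion to the injected function.
func (s *Skill) Handle(ctx context.Context, input string) (string, error) {
	return s.complete(ctx, "review: "+input)
}

// runDemo wires a skill to the router the way a routing pod might.
func runDemo() (string, error) {
	router := &Router{Model: "stub"}
	skill := New(Config{Complete: router.Run})
	return skill.Handle(context.Background(), "diff")
}

func main() {
	out, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Injecting the completion function is what lets one skill package serve more than one consumer without importing either of them.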

Plan 7: supervisor pod retired (2026-05-12)

What was deleted: cmd/supervisor/, internal/skills/{tdd,spec}/, root Dockerfile, supervisor k8s manifests (Deployment, Service, Ingress, NodePort 30320), supervisor entry removed from all .mcp.json configs.

Coverage: tdd/spec → SKILL.md files in ~/dev/.skills/; review, debug, retrospective, trainer → routing pod; brain_*/session_log → brain MCP; tier → hyperguild tier CLI.


2026-05-12 — brain_answer and brain_classify: LLM routing via berget.ai → iguana

Context: Brain MCP returned raw BM25 excerpts with no synthesis. Adding LLM-backed tools enables Q&A and ingestion enrichment without a separate service.

Decision: Two new MCP tools in the ingestion service (ingestion/internal/mcp/):

  • brain_answer(query) — BM25 top-10 → LLM synthesis → answer + sources
  • brain_classify(text) — LLM classifies doc into type/title/tags

Primary LLM: berget.ai gemma4:31b (EU cloud, spend tokens while available). Fallback: iguana gemma4:31b (local Ollama). Reranker deferred to follow-up. Router lives in ingestion/internal/llm.Router; opt-in via BRAIN_LLM_PRIMARY_URL.

Consequences: Brain becomes a knowledge assistant, not just a search index. When berget.ai tokens run out, flip BRAIN_LLM_PRIMARY_URL to iguana.
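
A hedged sketch of primary-then-fallback routing in Go. The `Backend` interface, the `stub` type, and the treatment of an unset primary as "skip straight to fallback" are assumptions for illustration. The actual ingestion/internal/llm.Router is not reproduced here, and a real version would likely take a context.Context.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is an assumed interface for one LLM endpoint.
type Backend interface {
	Complete(prompt string) (string, error)
}

// Router tries Primary first and falls back on any error.
type Router struct {
	Primary  Backend // assumed nil when BRAIN_LLM_PRIMARY_URL is unset
	Fallback Backend
}

// Complete returns the first successful answer; if every backend fails,
// the returned error joins all failures.
func (r Router) Complete(prompt string) (string, error) {
	var primaryErr error
	if r.Primary != nil {
		out, err := r.Primary.Complete(prompt)
		if err == nil {
			return out, nil
		}
		primaryErr = err
	}
	if r.Fallback == nil {
		return "", errors.Join(primaryErr, errors.New("no fallback backend configured"))
	}
	out, err := r.Fallback.Complete(prompt)
	if err != nil {
		return "", errors.Join(primaryErr, err)
	}
	return out, nil
}

// errDown simulates an exhausted or unreachable backend.
var errDown = errors.New("backend unavailable")

// stub is a canned backend used only for this example.
type stub struct {
	reply string
	err   error
}

func (s stub) Complete(string) (string, error) { return s.reply, s.err }

func main() {
	r := Router{
		Primary:  stub{err: errDown},          // e.g. cloud tokens exhausted
		Fallback: stub{reply: "local answer"}, // e.g. local Ollama host
	}
	out, err := r.Complete("summarise the top-10 BM25 excerpts")
	fmt.Println(out, err)
}
```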


2026-04-08 — Mistral Vibe gets its own adapter

Context: Vibe doesn't read AGENTS.md — it uses ~/.vibe/prompts/ and ~/.vibe/agents/ with TOML config.

Decision: The root context-sync generates a mathias.md prompt and mathias.toml agent config in ~/.vibe/. This is the one tool that needs a custom adapter path.

Consequences: Run vibe --agent mathias to use your conventions. Other Vibe users on the machine aren't affected.


2026-05-18 — project_create commits staging namespace directly to infra main

Context: project_create writes a k8s namespace manifest into the infra repo so Flux brings up a staging environment for the new project. Initial implementation pushed to a staging/<name> branch, which required manual PR merge before Flux saw the namespace — defeating the "one tool call, project exists, staging reconciling within 60s" goal.

Decision: Option A — commit directly to main. callInfraCommit passes branch: "main" to gitea-mcp's file_write_branch; no PR, no merge step.

Consequences: The staging namespace appears in the cluster within ~60s of the project_create call. Consistent with the project-wide trunk-based development (TBD) policy (CLAUDE.md): commit directly to main, every commit deployable. Acceptable because the manifest is a fresh namespace under k3s/staging/<name>/: isolated, low blast-radius, and Flux will simply recreate it if the file is bad. Manual review gating was friction for no compensating safety gain on experiment namespaces.
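
For illustration, the arguments callInfraCommit might hand to file_write_branch could be assembled as in the Go sketch below. Only `branch: "main"` and the k3s/staging/<name>/ directory come from the decision. The other field names, the `namespace.yaml` filename, the commit message, and using the project name as the namespace name are assumptions, not gitea-mcp's documented schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fileWriteArgs is an assumed argument shape for file_write_branch.
// Only the branch value is confirmed by the decision text.
type fileWriteArgs struct {
	Path    string `json:"path"`
	Content string `json:"content"`
	Branch  string `json:"branch"`
	Message string `json:"message"`
}

// namespaceManifest renders a minimal Kubernetes Namespace manifest.
func namespaceManifest(name string) string {
	return fmt.Sprintf("apiVersion: v1\nkind: Namespace\nmetadata:\n  name: %s\n", name)
}

// infraCommitArgs builds the commit payload for a new project's staging namespace.
func infraCommitArgs(project string) fileWriteArgs {
	return fileWriteArgs{
		// The filename is hypothetical; the directory matches the decision.
		Path:    fmt.Sprintf("k3s/staging/%s/namespace.yaml", project),
		Content: namespaceManifest(project),
		Branch:  "main", // direct commit: no staging/<name> branch, no PR
		Message: "feat(staging): add namespace " + project,
	}
}

func main() {
	payload, err := json.MarshalIndent(infraCommitArgs("demo"), "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(payload))
}
```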