Opt in by setting CLAUDE_SESSIONS_DIR to the ~/.claude/projects path.
When set, the server starts claudewatcher.Watch in a goroutine that
ticks every CLAUDE_INGEST_INTERVAL seconds (default 60). Requires
BRAIN_PG_DSN for the cursor table — fail-fast if missing.
Each Batch becomes one wiki note at:
brain/wiki/claude-sessions/facts/session-<host>-<session_id>.md
with frontmatter type=source + domain=<project basename>. Per-turn
content capped at 2000 chars (full transcripts stay in
~/.claude/projects already); the brain entry is a digest, not a
mirror.
CLAUDE_INGEST_HOST overrides the os.Hostname()-derived host label,
useful when multiple ingestion pods consume the same DSN from
different machines.
Closes hyperguild#27.
Bump-Type: minor
New package internal/claudewatcher. The volume gate (24 turns/week of
agentsquad logs vs 500/week gate) exposed that the real signal lives
in daily Claude Code usage at ~/.claude/projects/*/<uuid>.jsonl, not
in agentsquad output. This package captures that signal. See infra#73
Track E + hyperguild#27 for the full reframe.
Components:
- parser: tolerant JSONL parser over the observed Claude Code session
schema (user / assistant / attachment / system + bookkeeping types).
Skip-flag fast-paths queue-operation, last-prompt, permission-mode,
ai-title, bridge-session, file-history-snapshot.
- scrubber: 11-rule fail-closed regex set for credential shapes
(bearer, postgres URIs, PEM, ssh-key, ghp_/sk-/sk-ant-/AKIA, homelab
env tokens, SOPS markers). Drop turn + log on match.
- cursor: postgres-backed claude_session_cursors table, keyed by
(host, file_path) with byte_offset. Resumable across pod restarts.
- watcher: poll loop. Walks SessionsDir, processes each .jsonl from
its cursor offset, runs scrubber, emits a Batch per file to a
Sink interface, advances cursor on successful Ingest.
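The fail-closed scrubber described above can be sketched as follows. This is a minimal illustration, not the package's code: the five patterns are an assumed subset standing in for the 11 real rules, the name scrub is hypothetical, and the log-on-match step is left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// credentialRules is an illustrative subset (assumption), not the real 11-rule set.
var credentialRules = []*regexp.Regexp{
	regexp.MustCompile(`(?i)bearer\s+[a-z0-9._~+/-]{8,}`),
	regexp.MustCompile(`postgres(?:ql)?://\S+`),
	regexp.MustCompile(`-----BEGIN [A-Z ]*PRIVATE KEY-----`),
	regexp.MustCompile(`\bghp_[A-Za-z0-9]{20,}`),
	regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`),
}

// matchesAny reports whether any credential rule matches s.
func matchesAny(s string) bool {
	for _, re := range credentialRules {
		if re.MatchString(s) {
			return true
		}
	}
	return false
}

// scrub returns the turns that match no rule plus the number dropped.
// Fail-closed: a single match drops the whole turn; nothing is redacted in place.
func scrub(turns []string) (kept []string, dropped int) {
	for _, t := range turns {
		if matchesAny(t) {
			dropped++
			continue
		}
		kept = append(kept, t)
	}
	return kept, dropped
}

func main() {
	kept, dropped := scrub([]string{
		"how do I rebase onto main?",
		"DSN is postgres://brain_app:hunter2@koala:5432/brain",
	})
	fmt.Println(len(kept), dropped)
}
```

Dropping the whole turn rather than masking the match trades recall for safety: a partially matched secret never reaches the brain.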
No classifier integration in this commit — every kept turn is emitted
in a per-session batch. The cmd/server wiring (next commit) routes
batches to brain/wiki/claude-sessions/facts/. Classifier-driven hall
routing (decisions / failures / hypotheses) is a follow-up.
19 unit tests across parser + scrubber + watcher. task check green.
Refs: infra#73, hyperguild#27
Returns top-N relevant brain entries for a project context. Combines
BM25 hits on project name with 2-hop graph expansion via Track A's
graphstore (when BRAIN_GRAPH_ENABLED is set). Closes hyperguild#28.
Notes on implementation choices that deviate slightly from the spec:
- Excerpt length: 200 chars per spec (vs the 300 used by search.Result).
truncateExcerpt clamps the already-stripped BM25 excerpt; graph-only
neighbours load their excerpt from disk via a private readExcerpt
helper (search.hydrate is unexported).
- Graph scoring: 0.6 / max(1, distance) per neighbour, so distance-1
contributes 0.6 and distance-2 contributes 0.3. BM25 hits decay
linearly from 3.0 (rank-0) to 1.0 (rank-2), giving BM25 hits a
natural ceiling above pure-graph hits while still letting a doc
surfaced via both edge types outrank a BM25-only one.
- Test placement: package mcp (internal) rather than mcp_test, because
graphReader is unexported and WithGraph only accepts *PGStore; an
internal test can install a dual-interface fake directly on s.graph
without spinning up postgres.
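The scoring rule in the notes above can be sketched as two small functions. Assumptions: the names graphScore and bm25Score are hypothetical, the linear decay is parameterised by the number of BM25 hits n (n=3 reproduces the 3.0 to 1.0 range quoted above), and a doc found by both signals is assumed to receive the sum.

```go
package main

import "fmt"

// graphScore implements 0.6 / max(1, distance) for a graph neighbour.
func graphScore(distance int) float64 {
	if distance < 1 {
		distance = 1
	}
	return 0.6 / float64(distance)
}

// bm25Score decays linearly from 3.0 at rank 0 to 1.0 at rank n-1.
// Assumption: n is the number of BM25 hits considered.
func bm25Score(rank, n int) float64 {
	if n <= 1 {
		return 3.0
	}
	return 3.0 - 2.0*float64(rank)/float64(n-1)
}

func main() {
	fmt.Println(graphScore(1), graphScore(2))
	fmt.Println(bm25Score(0, 3), bm25Score(1, 3), bm25Score(2, 3))
	// Assumed combination: a doc found by both signals gets the sum.
	fmt.Println(bm25Score(2, 3) + graphScore(1))
}
```

Under these assumptions the lowest BM25 hit (1.0) still sits above the best pure-graph hit (0.6), which matches the ceiling described in the notes.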
Bump-Type: minor
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier-weighted retrieval against the qa-2026-05.md 20-question set:
| run | top-1 | top-3 |
|--------------------------------|-------|-------|
| baseline (pre-phase-1) | 20% | 65% |
| post phase 1 (parser+content) | 20% | 70% |
| post M4 (tier weighting) | 30% | 75% |
| post M4b (entities → K tier) | 35% | 80% |
Net lift over the pre-phase-1 baseline: +15pt top-1, +15pt top-3 (Phase 2
alone, measured from the post-phase-1 run: +15pt top-1, +10pt top-3),
comfortably above the ≥10pt close-gate set in infra#72.
Three remaining misses are content-keyword issues, not structure
issues (the questions don't share enough lexical surface with the
target entries to surface via BM25 alone). Vector search would
help here but the iguana embedder is off-mesh (see infra#64).
Initial M4 mapping put wiki/entities/* in tier=note. Post-M4 eval
regressed qwen35-9b-fast from rank 2 → off top-5: knowledge entries
that cite the entity in passing now outscore the entity page itself
(1.5× weight vs 1.0×).
Entity anchor pages are durable facts about concrete things — they
map cleanly to the knowledge/facts/ slot in the post-M3 layout
target. Promote them now so the path inference matches.
Eval re-run after deploy is in infra#72.
The eval set under brain/eval/qa-2026-05.md showed BM25 top-1 at 20%
with 5 of the missing slugs being short focused knowledge entries
that lost to long aggregate docs on raw term-frequency. Tier weighting
addresses that without touching the BM25 algorithm itself.
How
- Result struct gains a Tier field, populated during the file walk
via extractTier (frontmatter wins, path prefix as fallback —
mirrors the graph.inferTierFromPath logic so the two callers stay
in lockstep).
- After the existing sort (and optional hybridMerge), do a final
stable re-sort by float64(Score) * tierWeight(Tier). Knowledge
×1.5, note ×1.0, inbox ×0.3, unknown ×1.0.
- hydrate() (vector-only hits) also fills Tier so re-ranking covers
the hybrid path.
Test covers the load-bearing case: a long note-tier doc with raw=10
loses to a short knowledge-tier doc with raw=8 after weighting
(8×1.5=12 vs 10×1.0=10).
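A minimal sketch of the final re-sort, reproducing the load-bearing test case. The Result struct here is a stand-in with only the fields needed; Score is assumed to be float32 (the source only shows it being converted with float64(Score)), and rerankByTier is a hypothetical name.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a reduced stand-in for search.Result (assumed shape).
type Result struct {
	Path  string
	Score float32
	Tier  string
}

// tierWeight returns the multipliers listed in the commit message.
func tierWeight(tier string) float64 {
	switch tier {
	case "knowledge":
		return 1.5
	case "inbox":
		return 0.3
	default: // "note" and unknown tiers
		return 1.0
	}
}

// rerankByTier does a stable re-sort by raw score times tier weight, descending.
func rerankByTier(rs []Result) {
	sort.SliceStable(rs, func(i, j int) bool {
		return float64(rs[i].Score)*tierWeight(rs[i].Tier) >
			float64(rs[j].Score)*tierWeight(rs[j].Tier)
	})
}

func main() {
	rs := []Result{
		{Path: "notes/long-aggregate.md", Score: 10, Tier: "note"},
		{Path: "knowledge/short-entry.md", Score: 8, Tier: "knowledge"},
	}
	rerankByTier(rs)
	fmt.Println(rs[0].Path)
}
```

The stable sort keeps the incoming order for entries whose weighted scores tie, so the earlier BM25 or hybrid ordering acts as the tie-breaker.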
Measurement gate is in infra#72: re-run brain/eval/score.py against
the live brain after this image lands; close the issue when top-1
hit rate lifts by ≥10 absolute points.
extract.go now reads `tier:` and `topic:` from YAML frontmatter, with
a path-based fallback when frontmatter is absent (the pre-M3 state on
every existing entry):
knowledge/* → tier=knowledge
notes/* → tier=note
wiki/** → tier=note (sources + concepts + entities are I-level)
inbox/**, raw/**, sessions/**, clips/** → tier=inbox
Frontmatter wins when present — covers the M3-migrated case where an
entry's path may not match the tier the author chose for it.
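The frontmatter-first, path-fallback rule can be sketched as one function. Assumptions: the signature is hypothetical (the real extractTier may take a parsed document), paths are relative to the brain dir with forward slashes, and an unmatched path returns the empty string.

```go
package main

import (
	"fmt"
	"strings"
)

// extractTier returns the frontmatter tier when present, else infers it
// from the path prefix using the mapping in the commit message.
func extractTier(frontmatterTier, relPath string) string {
	if frontmatterTier != "" {
		return frontmatterTier // frontmatter wins
	}
	switch {
	case strings.HasPrefix(relPath, "knowledge/"):
		return "knowledge"
	case strings.HasPrefix(relPath, "notes/"), strings.HasPrefix(relPath, "wiki/"):
		return "note"
	}
	for _, prefix := range []string{"inbox/", "raw/", "sessions/", "clips/"} {
		if strings.HasPrefix(relPath, prefix) {
			return "inbox"
		}
	}
	return "" // unknown (assumption: callers treat this as the default tier)
}

func main() {
	fmt.Println(extractTier("", "wiki/entities/k3s.md"))
	fmt.Println(extractTier("knowledge", "wiki/entities/k3s.md"))
	fmt.Println(extractTier("", "raw/2026-05-18.md"))
}
```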
UpsertEntity persists both columns. M1's schema already has them.
Backfill on next pod start populates tier for the whole corpus
without any file moves; M3 will follow up with the actual layout
migration and explicit frontmatter writes.
Schema-only change. DDL adds tier + topic on fresh tables and uses
ADD COLUMN IF NOT EXISTS on existing tables (idempotent across pod
restarts). New conditional indexes match the wing/hall pattern.
No behavior change in this commit — UpsertEntity still writes only
the original columns; tier + topic stay '' on every row. M2 plumbs
the parser through. The empty default means existing queries are
untouched until the rest of the chain lands.
Part of infra#72 — brain DIKW tier redesign.
Top-1 stayed at 20% (4/20), top-3 +5pt (65→70%) after:
- extract.go wing/topic parser fix (commit 3084c41)
- qwen35-9b-fast entity pad (was 239-byte stub → full entity)
- grafana entry: add "pod restart" synonym to lesson body
- dangling refs stripped from index.md + entities/k3s.md
The only retrieval move: qwen35-9b-fast climbed from rank 0 (off top-5)
to rank 2, so the entity pad worked. The other 5 misses are ranker
behaviour: the entries already overlap the query keywords, but BM25
doesn't rank the right slugs first.
Per the proposal's gate (≥10pt lift = stop, <10pt = Phase 2 justified),
the DIKW tier redesign earns its cost. Next session: tier column +
file moves + tier-weighted retrieval, then re-measure against this
same eval set.
classifyByPath had a hole: paths like wiki/index.md or wiki/<slug>.md
(direct children of wiki/, no subdirectory) hit the default branch and
wrote Wing=parts[1] — which IS the filename, not a wing. Symptom in
brain_entities: rows like (slug=index, wing=index.md) and
(slug=autobe-..., wing=autobe-evaluation-pattern-....md).
Fix: when len(parts) < 3 (no subdirectory at all), fall through to
Type=knowledge and let frontmatter set wing/hall if present.
Add brain/eval/ artifacts at the same time:
- qa-2026-05.md — 20 hand-authored Q→expected-slug pairs covering the
homelab knowledge corpus across mcp, dex, gitops, postgres, go,
models, methodology
- score.py — calls brain_query for each pair, scores top-1 + top-3,
emits per-question detail. BRAIN_MCP_TOKEN via env.
Pre-fix baseline against the live brain: top-1 = 20% (4/20),
top-3 = 65% (13/20). Six hard misses where the expected slug doesn't
even land in the top-5.
Used to gate the phase 2 DIKW redesign (infra#62 follow-up): if
phase 1 fixes (this parser fix + 20 backlink authoring on top
orphans) lift top-1 by <10 absolute points, structure is the
bottleneck and the tier redesign is justified.
Follow-up to infra#70. LiteLLM moved off piguard into k3s and the
public llm-api.d-ma.be hostname now upstreams to koala:30401. The
piguard:4000 default in the source bit-rots — works today because
piguard:4000 is still alive during the 7-day soak, breaks the moment
the compose comes down.
Pointing the default at the public hostname survives the cutover
without needing a follow-up. Production deploys via k3s already
override via env (in-cluster Service DNS) so this only affects local
dev shells without LITELLM_BASE_URL set.
- internal/config/routing.go: comment + envOr fallback
- internal/config/routing_test.go: expected value in defaults test
- scripts/smoke-routing.sh: shell default
task check: clean (tests + vet + govulncheck).
Commit 4 of Track A — the no-shelfware close-out the grill demanded.
brain_answer now folds the 1-hop outgoing neighbourhood of its top
BM25/rerank hit into the LLM's context as a <related> block when
BRAIN_GRAPH_ENABLED is on. With the flag off the prompt is byte-for-
byte identical to the pre-Track-A behaviour, so existing tests still
pass without modification.
The hop list contains slug, edge_type, doc_path — no extra retrieval
pass, no second LLM call, no file reads. The model can ignore the
block when irrelevant; when it adds signal we get GraphRAG for free.
Refs: docs/superpowers/specs/2026-05-homelab-training-graph-next-step.md
in infra repo + grill addendum item "Track A: GraphRAG wiring into
brain_answer is mandatory in same commit chain (no shelfware risk)".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 3 of Track A. The MCP server now publishes a new tool that
opens the brain knowledge graph (entities + wikilink edges) for
external consumers (claude.ai connectors, gitea-mcp, agentsquad).
- tools_graph.go: brain_graph handler dispatches by op:
neighbors — 1-hop outgoing from slug, optional edge_type filter
subgraph — every reachable slug within depth hops (≤6)
path — shortest directed path src→dst within depth (≤8)
Returns slug + entity metadata + edge_type + hop distance.
- server.go: handleCall routes "brain_graph" to brainGraph.
- handlers.go: tool descriptor with the op enum + per-op required
fields documented in the description.
- server_test.go: TestServerToolsList expects brain_graph in the
listing.
The tool returns an error when BRAIN_GRAPH_ENABLED is unset — same
shape as brain_answer when the answer LLM is unconfigured.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 2 of Track A. Service stays a no-op until
BRAIN_GRAPH_ENABLED=true; flipping it on creates the schema (idempotent),
starts indexing every successful write, and optionally backfills the
existing brain dir.
- internal/graphsync: best-effort wrapper around graph.Extract +
graphstore. IndexDoc reads docPath under brainDir, parses, upserts
entity + replaces edges. BackfillFromBrainDir walks wiki/ +
knowledge/. Both are no-ops on nil store so callers wire
unconditionally.
- mcp.Server gains WithGraph builder + graphsync.Store field.
brain_write, brain_ingest, brain_ingest_raw, brain_tunnel call
indexInGraph after success — failures slog.Warn but never
propagate (graph is augmentation, not correctness).
- cmd/server gates the wiring on BRAIN_GRAPH_ENABLED=true (default
off so first rollout doesn't surprise). BRAIN_GRAPH_BACKFILL=true
triggers a one-shot walk of the brain dir on boot.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundation for Track A (GraphRAG on top of existing wiki). Two new
packages, both unwired — service behaviour unchanged until commit 2
hooks the pipeline.
- internal/graph: pure parser. Extract() walks markdown + frontmatter
and emits one Entity + N wikilink Edges per doc. Dedupes per (dst,
line), ignores self-references, classifies hall/concept/entity/
source/knowledge from path layout.
- internal/graphstore: pgx-backed PGStore mirroring vectorstore's
shape. Idempotent Init() creates brain_entities + brain_edges with
indexes on src_slug, dst_slug, src_doc, wing, type. Operations:
UpsertEntity, ReplaceEdgesForDoc (tx), DeleteByDoc, Neighbors,
Subgraph (recursive CTE, depth ≤6), Path (shortest path, depth ≤8).
Schema lives on the shared postgres18 instance alongside the
brain_embeddings table — no new datastore. See
docs/superpowers/specs/2026-05-homelab-training-graph-next-step.md
in infra repo + infra#62.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same reason as gitea-mcp ci retrigger commit — mcp-chassis was created
private; the ingestion port (commit ca22df2) couldn't fetch it in CI.
Chassis is now public; this empty commit retriggers the Build and deploy
pipeline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same fix as gitea-mcp commit for the same reason — mcp-chassis (added
in commit ca22df2) is hosted at gitea.d-ma.be and Gitea returns http://
in its go-import meta tag, breaking the default go module resolution
inside the Docker build.
GOPRIVATE+GOPROXY=direct+GOSUMDB=off plus a git config insteadOf rewrite
to flip http:// → https:// for gitea.d-ma.be clones.
Without this, hyperguild CI Build and deploy failed on the chassis
port (sha=ca22df2). Re-running CI should now succeed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second port of the MCP chassis (gitea-mcp was first, commit 658f4ba).
Closes the chassis-adoption loop on the two highest-LOC consumers.
Changes:
- Drop ingestion/internal/auth/ entirely (jwt.go + jwt_test.go +
protected_resource.go + protected_resource_test.go) — chassis provides
JWTValidator + ProtectedResourceHandler with identical semantics.
- Drop ingestion/internal/mcp/auth.go (BearerAuth function, ~65 LOC)
and the integration test auth_test.go (~200 LOC) — chassis
BearerMiddleware replaces it. Static-Bearer-or-Dex-JWT precedence and
RFC 9728 resource_metadata challenge behavior preserved 1:1.
- cmd/server/main.go: import chassis as `chassisauth`, rewire the three
call sites. Use realm="brain" in the BearerMiddleware call so a 401
challenge identifies the resource as the brain MCP.
OAuth client_credentials handler (ingestion/internal/oauth) stays —
chassis v0.1.0 covers only the JWT path; OAuth flow is a candidate for
chassis v0.2.0 once a second MCP needs it (rule of three).
Net delta: roughly -330 LOC of duplicated auth code; +1 import; +1 GOPRIVATE
env requirement on dev machines (documented in the spike handoff
2026-05-22-mcp-chassis-spike.md).
task check green (lint + test + vet + govulncheck).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes infra#50.
Adds an internal/metrics package with a hand-rolled Prometheus
exposition layer (stdlib + sync/atomic only — no new dep) and wraps the
HTTP mux with a timing middleware. Every request emits one observation
on the `brain_query_duration_seconds` histogram labeled by
`path` (request Pattern, low cardinality) and `status` (2xx/3xx/4xx/5xx).
Dependency choice: hand-rolled rather than github.com/prometheus/client_golang
because the surface needed is small (one histogram + bucket constants)
and the repo CLAUDE.md keeps deps stdlib + jwx + testify only. ~150 LOC
of code + tests is cheaper than the tree of transitive prometheus deps.
Endpoints:
- GET /metrics — OpenMetrics text exposition, no auth (cluster-internal)
Wire format pinned by tests in internal/metrics/metrics_test.go. The
ServiceMonitor that drives the kube-prometheus-stack scrape lives in
infra/k3s/apps/supervisor/ (separate commit on mathias/infra).
After this image deploys, the canary alert from
docs/superpowers/specs/2026-05-homelab-architecture-review.md becomes
wireable:
histogram_quantile(0.95,
sum(rate(brain_query_duration_seconds_bucket[5m])) by (le))
> 1.5 * histogram_quantile(0.95,
sum(rate(brain_query_duration_seconds_bucket[5m] offset 7d)) by (le))
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the TODO in Sync that left files static after their first embed.
Edits to brain/wiki/ and brain/knowledge/ now surface in subsequent
syncs without manual /backfill-embeddings calls.
Approach
- Store interface: KnownPaths → KnownPathsWithTime returning path →
updated_at. Callers compare against file mtime to detect edits.
- PGStore: SELECT path, updated_at FROM brain_embeddings.
- Sync groups known chunks by parent path and tracks the EARLIEST
updated_at per parent. A file is stale when its mtime is after that
oldest chunk's timestamp — any chunk older than the file means at
least one chunk hasn't been refreshed since the last edit.
- Stale-path rewrite: delete every old chunk for the parent (handles
"file shrunk → fewer chunks → orphan rows at higher #NNNN" cleanly),
then re-chunk + re-embed + re-upsert.
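The staleness test above reduces to comparing the file mtime with the oldest chunk timestamp for that parent. A minimal sketch, with hypothetical names (isStale, plus the at and times helpers that only build sample data) and the assumption that a parent with no chunks counts as stale:

```go
package main

import (
	"fmt"
	"time"
)

// at and times are small helpers for building sample timestamps.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func times(secs ...int64) []time.Time {
	out := make([]time.Time, len(secs))
	for i, s := range secs {
		out[i] = at(s)
	}
	return out
}

// isStale reports whether a file needs re-embedding: its mtime is after
// the earliest updated_at among its chunks.
func isStale(mtime time.Time, chunkUpdatedAt []time.Time) bool {
	if len(chunkUpdatedAt) == 0 {
		return true // assumption: nothing embedded yet
	}
	oldest := chunkUpdatedAt[0]
	for _, t := range chunkUpdatedAt[1:] {
		if t.Before(oldest) {
			oldest = t
		}
	}
	return mtime.After(oldest)
}

func main() {
	chunks := times(100, 200) // two chunks, refreshed at different times
	fmt.Println(isStale(at(150), chunks))
	fmt.Println(isStale(at(50), chunks))
}
```

Using the earliest timestamp is the conservative choice: a file edited between two chunk refreshes is re-embedded in full rather than left half-updated.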
Tests
- New: TestSync_ReembedsFileWhenMtimeNewer — file mtime forced into the
future vs store updated_at; Sync deletes old chunk + upserts fresh one.
- New: TestSync_SkipsFileWhenMtimeOlder — file mtime backdated; Sync is
a no-op (no upserts, no deletes).
- Updated: stubStore.known is now map[string]time.Time. A zero value
resolves to a far-future sentinel so existing "skip if already known"
tests keep passing without per-test setup.
- pg_test renamed KnownPaths integration → KnownPathsWithTime; asserts
updated_at is non-zero and within 5s of insert wall-clock.
Backward compat
- brain_embeddings rows pre-dating this change carry valid updated_at
values (column was always populated via `DEFAULT now()` + ON CONFLICT
`updated_at = now()`). No migration needed. Live pod will start
re-embedding any file whose source has been edited since its chunks
were originally written.
Closes gitea/mathias/hyperguild#23.
Per the Gitea-as-true-master ADR (infra#34), GitHub mirror is now an
explicit opt-in via mirror_to_github=true. Default (omit / false) provisions
a Gitea repo + staging namespace + experiment-brief issue only — no GitHub
repo, no push-mirror.
Rationale: US cloud providers (Microsoft/GitHub) are subject to the CLOUD Act
and NSLs. Client code, business logic, and infra-adjacent repos should
never live on US-owned infrastructure. Only open-source projects intended
for public community (hyperguild, gitea-mcp, template-*) should opt in.
Changes
- internal/skills/project/handlers.go
- createArgs gains MirrorToGitHub bool (json:"mirror_to_github,omitempty").
- res.GitHubURL is set only when MirrorToGitHub is true; empty string otherwise.
- Steps 2 (create_github_repo) + 3 (mirror) are wrapped in `if args.MirrorToGitHub`.
- experimentBrief renders "Gitea-only" line by default and the existing
"Push-mirror configured" line only on opt-in.
- internal/skills/project/skill.go
- Tool schema gains mirror_to_github (boolean, default false) with description
spelling out when to opt in. Tool Description updated to reflect new default.
- internal/skills/project/handlers_test.go
- Added mirroredArgs() helper (happyArgs + mirror_to_github:true).
- Tests that exercise the GitHub flow (HappyPath, GitHubExists_Idempotent,
GitHubFails, NoGitHubClient_DegradedMode, Idempotent_RepoExists,
MirrorFails, InfraCommitFails) switched to mirroredArgs.
- Added TestProjectCreate_DefaultSkipsGitHubMirror covering the Gitea-only
path: 3 gitea-mcp calls, zero GitHub calls, empty github_url,
reached=[create_repo, infra_commit, issue], body reflects Gitea-only.
Closes gitea/mathias/hyperguild#17. Moves infra#34 acceptance item
"project_create updated: mirror_to_github defaults to false".
Long markdown files (>~8KB) silently failed to embed because nomic-embed-text
on iguana has a 2048-token context. embed sync logged errors=1 every cycle
with no useful body until #37 added per-item logging. Three files exceed
the ceiling: finbert source (8 KB), koala-machine-state (7.1 KB),
litellm-absorption (8.8 KB). Curated knowledge entries should never be
vector-blind.
Approach: chunk-before-embed, no schema change.
vectorstore/chunk.go (new)
- ChunkMarkdown splits at H1/H2 boundaries; sections over maxBytes are
further split at paragraph boundaries, packing greedily under budget.
- NumberChunks assigns "<parent>#NNNN" storage paths (1-based, zero-padded
to 4 digits — handles files with up to ~10k sections in stable sort order).
- ParentPath strips the chunk suffix for retrieval-side dedup.
vectorstore/sync.go
- After ChunkMarkdown produces N pieces, each is embedded + upserted as a
separate brain_embeddings row at "<parent>#NNNN". maxChunkBytes = 4000
(≈1000 nomic tokens, well under the 2048 ceiling with headroom for
unicode/code blocks).
- "Already embedded?" check now reduces known paths to parent set via
ParentPath, so the first chunk hit short-circuits the file.
- Delete walk also reduces via ParentPath; when a parent file disappears,
every chunk row (and any pre-existing bare-path row, for backward
compatibility with rows written before this change) gets dropped.
search/search.go
- hybridMerge collapses chunk-path vector hits to parent via ParentPath
before scope check, RRF accumulation, and hydration. A file with three
chunk hits returns one result row, not three.
Backward compatibility: pre-existing bare-path rows in brain_embeddings
keep working — ParentPath returns them unchanged, knownParents handles
them as if they were "wiki/foo.md#NNNN" hits, sync skips re-embed, and
search dedup is a no-op for them. No migration required to ship.
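The storage-path convention can be sketched as a pair of helpers. These are hypothetical stand-ins (chunkPath, parentPath), not the real NumberChunks and ParentPath signatures, and the sketch assumes the suffix is always exactly four digits.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkPath builds "<parent>#NNNN" for a 1-based chunk number.
func chunkPath(parent string, n int) string {
	return fmt.Sprintf("%s#%04d", parent, n)
}

// parentPath strips a trailing "#NNNN" suffix; bare paths (rows written
// before chunking existed) are returned unchanged.
func parentPath(p string) string {
	i := strings.LastIndexByte(p, '#')
	if i < 0 || len(p)-i-1 != 4 {
		return p
	}
	for _, c := range p[i+1:] {
		if c < '0' || c > '9' {
			return p
		}
	}
	return p[:i]
}

func main() {
	fmt.Println(chunkPath("wiki/foo.md", 1))
	fmt.Println(parentPath("wiki/foo.md#0003"))
	fmt.Println(parentPath("wiki/foo.md"))
}
```

Because bare paths pass through unchanged, the same reduction serves both the "already embedded?" check and the search-side dedup without a migration.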
Tests:
- chunk_test.go covers short / heading split / oversized section /
content preservation / chunk numbering / parent-path stripping.
- sync_test.go adds long-file chunking, single-chunk-row short file,
skip-if-any-chunk-known, delete-all-chunks-of-disappeared-file.
Existing tests updated for #NNNN paths.
- search_test.go adds chunk-paths-dedupe-to-parent.
Closes gitea/mathias/infra#38.
The embed sync goroutine only walked brain/wiki/. brain/knowledge/ (112
curated entries, per CLAUDE.md the most-important brain content) had zero
coverage in brain_embeddings — vector retrieval was blind to it. Hybrid
BM25 + pgvector retrieval would never surface a curated knowledge entry
via the vector arm.
Extract the per-root walk into a loop over a small subdir list and add
"knowledge" alongside "wiki". scanDirs is package-level so it stays a
single source of truth for what gets embedded.
Also log each failing item's path + error string from StartSync.
Previously only the aggregate count was logged, so a persistent
`errors=1` per cycle was opaque. With per-item warnings, the actual
ollama error ("input length exceeds the context length") surfaces immediately.
Refs gitea/mathias/infra#37 (this commit covers the knowledge/ scan
bug; the long-file chunking bug is a separate change).
The previous "crude redaction" — pgDSN[:strings.IndexByte(pgDSN+"@", '@')] —
sliced up to the `@` character, which sits *after* the password in a
postgres URL, so the log line included the password in plaintext (caught
on first activation, 2026-05-18 startup log).
Use url.Parse + URL.Redacted() instead. Falls back to "postgres://***"
if parsing fails — we never log a raw DSN.
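A minimal sketch of the old slice next to the url.Parse + URL.Redacted() replacement. The function name redactDSN and the sample DSN are illustrative assumptions; the fallback string is the one named above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// redactDSN masks the password of a URL-form DSN and never returns the
// raw input when parsing fails.
func redactDSN(dsn string) string {
	u, err := url.Parse(dsn)
	if err != nil {
		return "postgres://***"
	}
	return u.Redacted()
}

func main() {
	dsn := "postgres://brain_app:s3cret@koala:5432/brain" // sample value

	// Old approach: slicing up to '@' keeps everything before it,
	// and the password sits before the '@'.
	fmt.Println(dsn[:strings.IndexByte(dsn+"@", '@')])

	// New approach: the password is replaced by a fixed placeholder.
	fmt.Println(redactDSN(dsn))
}
```

This sketch assumes URL-form DSNs; a keyword/value DSN (host=... password=...) carries no URL userinfo, so Redacted() would have nothing to mask.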
CREATE DATABASE doesn't work inside a DO $$ ... $$ block (transactional
restriction). And psql `:'var'` substitutions resolve client-side, so
they can't reach inside a DO block either.
Replace both DO blocks with psql-native idioms:
- `\gexec` for the conditional CREATE DATABASE
- `\if` + `\gset` for the create-or-rotate-password branch on the
brain_app role
Verified end-to-end on koala postgres18: brain DB created, vector
0.8.1 extension installed, brain_app role login works.
Wires nomic-embed-text (iguana ollama) + pgvector on the shared
postgres18 into brain_query / brain_answer via Reciprocal Rank Fusion.
Pure BM25 stays the default; setting BRAIN_PG_DSN and BRAIN_EMBED_URL
together opts in. Setting one without the other is misconfiguration →
exit 1.
New packages:
- internal/embed
Client.Embed(ctx, text) → []float32 via POST {URL}/api/embed.
Defaults to nomic-embed-text:latest (768 dim). nil-on-empty-URL so
callers gate on a single nil check.
- internal/vectorstore
PGStore wraps a pgxpool against postgres18. Init creates
brain_embeddings(path PK, vector(768), updated_at) + HNSW cosine
index idempotently. Upsert / Delete / Search / KnownPaths.
Sync(brainDir, store, embedder) diffs brain/wiki/ against the store
and upserts new files / deletes removed ones; StartSync runs it on
a ticker (default 300s). Integration tests gated by BRAIN_PG_TEST_DSN.
- scripts/brain-embeddings-init.sql
One-time DBA setup: brain DB, brain_app role, vector extension,
GRANTs. Idempotent.
Search layer:
- search.QueryOptions gains Vector + Embedder fields.
- QueryContext is the cancellable variant; Query stays for existing callers.
- When both are set, BM25 (top-N) and pgvector (top-4N) candidates
merge via Reciprocal Rank Fusion (k=60, Cormack et al. 2009 — no
tuning knob, robust to scale differences between rankers).
- Vector-only hits are hydrated from disk so callers see uniform
Result records (path, title, excerpt, wing, hall, score).
- Wing/hall filters still apply to vector candidates via path-prefix.
- On embedder/vector errors the search falls back to BM25 — embedding
outage degrades quality but doesn't take the brain offline.
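Reciprocal Rank Fusion as used above can be sketched in a few lines: each ranker contributes 1/(k + rank) per document, with 1-based ranks and k=60. The function name rrf and the alphabetical tie-break are assumptions for illustration; the real hybridMerge also handles scope filtering and hydration.

```go
package main

import (
	"fmt"
	"sort"
)

const rrfK = 60 // constant from Cormack et al. 2009

// rrf fuses several ranked lists of paths into one list, best first.
func rrf(rankings ...[]string) []string {
	scores := map[string]float64{}
	for _, ranking := range rankings {
		for i, path := range ranking {
			scores[path] += 1.0 / float64(rrfK+i+1) // i+1 is the 1-based rank
		}
	}
	out := make([]string, 0, len(scores))
	for path := range scores {
		out = append(out, path)
	}
	sort.Slice(out, func(i, j int) bool {
		if scores[out[i]] != scores[out[j]] {
			return scores[out[i]] > scores[out[j]]
		}
		return out[i] < out[j] // deterministic tie-break (assumption)
	})
	return out
}

func main() {
	bm25 := []string{"a.md", "b.md"}
	vector := []string{"c.md", "b.md", "a.md"}
	fmt.Println(rrf(bm25, vector))
}
```

Only ranks enter the formula, which is why the raw BM25 and cosine scores never need to be normalised against each other.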
MCP wiring:
- mcp.Server.WithHybridRetrieval(v, e) opt-in setter, same shape as
WithReranker.
- brainQuery and brainAnswer pass the wired vector/embedder through
to search.QueryContext.
REST:
- POST /backfill-embeddings drives Sync synchronously. Returns
{added, deleted, errors[]}. 503 when feature is unconfigured.
cmd/server/main.go:
- BRAIN_PG_DSN + BRAIN_EMBED_URL together enable hybrid; one alone
→ exit 1.
- vectorAdapter bridges *PGStore (returns []Hit) to
search.VectorSearcher (which takes []VectorHit) without either
package importing the other.
- BRAIN_EMBED_SYNC_INTERVAL (default 300s) controls the background
Sync ticker.
Backend pivot from Qdrant to pgvector recorded in DECISIONS.md
2026-05-18 (supersedes 2026-04-08): postgres18 already runs in
databases/ ns, Qdrant was never deployed, one engine beats two.
Dependency: github.com/jackc/pgx/v5 — modern, native pgvector via
parametric vector literals.
Tests:
- embed.Client: empty-URL nil, request shape, dimension, upstream
error propagation, empty-text rejection.
- vectorstore.PGStore: dimension validation (unit); upsert/search/
KnownPaths (integration, BRAIN_PG_TEST_DSN-gated).
- vectorstore.Sync: adds new files, skips known, deletes
disappeared, skips _index.md, no-op when nil, collects embedder
errors.
- search.Query: hybrid promotes vector-only hits via RRF; falls
back to BM25 on embedder error.
Closes hyperguild#8.
Adds an opt-in cross-encoder rerank step between BM25 retrieval and LLM
synthesis. With BRAIN_RERANKER_URL set, brain_answer retrieves BM25
top-20, scores each excerpt against the query via Qwen3-Reranker on
Ollama, drops the "no" answers, and forwards up to 5 surviving sources
to the LLM. Unset, behaviour is unchanged (BM25 top-10 → LLM).
The reranker is a *filter*, not a re-ranker: Qwen3-Reranker emits a
binary yes/no token under its native chat template, and ties within the
"yes" set are broken by BM25 rank — what got retrieved first stays
ahead.
New package ingestion/internal/reranker:
- Client with URL, Model, HTTP fields.
- New(url, model) returns nil on empty url so callers can treat
"feature disabled" as a single nil check.
- Score(ctx, query, docs) issues one /api/generate call per doc using
the Qwen3-Reranker yes/no chat template (verbatim, because the model
was trained on this exact wording). Parses the first non-think token.
Wiring:
- mcp.Server gains a WithReranker fluent setter to keep NewServer
signature stable.
- brain_answer's BM25 limit jumps to 20 only when a reranker is wired,
to give the filter something to do.
- cmd/server/main.go reads BRAIN_RERANKER_URL (+ optional
BRAIN_RERANKER_MODEL, default dengcao/Qwen3-Reranker-0.6B:F16).
Tests cover: nil-on-empty-url, ordered yes/no scoring, request shape
(model, prompt contents, yes/no template), ambiguous response → 0,
empty doc slice, upstream-error propagation, plus an end-to-end
brain_answer integration that proves only the relevant note reaches the
LLM when noise.md is rejected.
Closes hyperguild#7.
Adds a minimal RFC 8414 + RFC 6749 client_credentials flow so claude.ai's
custom-MCP integration (no static-Bearer field in the UI) can exchange a
client_id + client_secret pair for the existing BRAIN_MCP_TOKEN and use
it as a Bearer on /mcp. No JWTs, no refresh, no expiry — the rest of
the auth middleware is unchanged.
New package ingestion/internal/oauth:
- MetadataHandler(issuer): serves /.well-known/oauth-authorization-server
with grant_types=[client_credentials] and both
token_endpoint_auth_methods (post + basic).
- TokenHandler(cfg): serves /oauth/token. Validates client_id and
client_secret via constant-time compare; returns BRAIN_MCP_TOKEN as
access_token. RFC 6749 §5.2 error JSON on bad grant / bad creds.
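The constant-time credential check can be sketched with crypto/subtle. The function name validClient is hypothetical, and evaluating both comparisons before combining them is an assumption about how the handler avoids short-circuiting.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// validClient compares both credentials in constant time and combines the
// results without short-circuiting, so a wrong client_id does not skip the
// secret comparison.
func validClient(gotID, gotSecret, wantID, wantSecret string) bool {
	idOK := subtle.ConstantTimeCompare([]byte(gotID), []byte(wantID))
	secretOK := subtle.ConstantTimeCompare([]byte(gotSecret), []byte(wantSecret))
	return idOK&secretOK == 1
}

func main() {
	fmt.Println(validClient("claude", "s3cret", "claude", "s3cret"))
	fmt.Println(validClient("claude", "wrong", "claude", "s3cret"))
}
```

Note that subtle.ConstantTimeCompare returns early when the lengths differ, so only the content of equal-length inputs is protected from timing comparison.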
Wiring in cmd/server/main.go: opt-in by setting both OAUTH_CLIENT_ID and
OAUTH_CLIENT_SECRET. Setting only one is misconfiguration → exit 1.
Mounts both endpoints with no auth; MCP_RESOURCE_URL supplies the
issuer.
Also pivots issue #8's vector backend from Qdrant to pgvector (see
DECISIONS.md 2026-05-18) — Qdrant was never deployed and postgres18 with
pgvector already runs as the project default; supersedes 2026-04-08 for
this use case.
Tests cover post-auth, basic-auth, wrong secret, bad grant, GET
rejection, malformed Basic header, and Basic without colon.
Closes hyperguild#5.
Adds the `brain_tunnel` MCP tool and auto-tunnel behaviour for
`brain_write`, so concepts that appear in multiple wings become
navigable from any of them.
New surface in package brain:
- WriteTunnel(brainDir, src, tgt) — appends a `## See also` bidirectional
wikilink between two notes in different wings. Idempotent (link not
duplicated on re-call) and reuses an existing See also section.
- DetectTunnels(brainDir, content) — walks brain/wiki/, returns
TunnelCandidates for notes whose title appears in content. Tags
whole-word case-insensitive hits as Exact=true and substring-only hits
as Exact=false.
- AutoTunnel(brainDir, src, content) — wraps DetectTunnels: writes
cross-wing exact matches, stages fuzzy matches into
brain/raw/tunnel-candidates-<YYYY-MM-DD>.md for human review.
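The exact versus fuzzy distinction can be sketched as a whole-word regex check followed by a substring check. matchKind is a hypothetical helper, not the real DetectTunnels, and the sketch assumes titles begin and end with word characters so that \b behaves as expected.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchKind classifies how a note title appears in content:
// "exact" for a whole-word case-insensitive hit, "fuzzy" for a
// substring-only hit, "" when the title does not appear at all.
func matchKind(title, content string) string {
	if title == "" {
		return ""
	}
	wholeWord := regexp.MustCompile(`(?i)\b` + regexp.QuoteMeta(title) + `\b`)
	if wholeWord.MatchString(content) {
		return "exact"
	}
	if strings.Contains(strings.ToLower(content), strings.ToLower(title)) {
		return "fuzzy"
	}
	return ""
}

func main() {
	fmt.Println(matchKind("dex", "Dex handles OIDC for the cluster"))
	fmt.Println(matchKind("dex", "rebuilt the index overnight"))
	fmt.Println(matchKind("dex", "unrelated content"))
}
```

Staging the fuzzy hits for review instead of linking them keeps short titles such as "dex" from tunnelling into every note that mentions "index".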
MCP wiring:
- `brain_tunnel` tool: explicit manual link (source, target).
- `brain_write` with wing+hall now triggers AutoTunnel on the new
content. Failures are logged and never abort the primary write.
readTitleAndCreated also humanises the slug fallback (hyphens → spaces)
so titleless notes participate in content matching.
Closes hyperguild#16.
Tests: idempotency, same-wing rejection, missing-note rejection,
See-also reuse, exact/fuzzy detection, slug fallback, MCP tool happy
path, auto-tunnel hook (cross-wing exact → linked; same-wing → skipped;
fuzzy → candidates file).
Reorders BearerAuth so a valid BRAIN_MCP_TOKEN match wins instantly and
never emits WWW-Authenticate. Adds RFC 9728 resource_metadata challenge
header on 401 (only when MCP_RESOURCE_URL is configured) so claude.ai's
OAuth-discovery path still works.
Why: claude CLI on koala/flamingo with `.mcp.json` `Authorization: Bearer
$BRAIN_MCP_TOKEN` was being kicked into RFC 7591 dynamic client
registration against Dex (static-only) and dying. Cause was the auth
middleware running JWT validation first and emitting an OAuth challenge
on the fall-through 401 even when the caller had a valid static token.
Inverting the precedence and gating the challenge on resourceMetadataURL
keeps the LAN/Tailscale CLI path silent and only invites OAuth discovery
on actually-unauthenticated requests.
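A rough sketch of the inverted precedence, assuming a simplified middleware shape. The function name `bearerAuth`, its parameters, the `validateJWT` callback and the `probe` helper are illustrative assumptions; only the ordering and the gated challenge header are taken from the description above.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerAuth checks the static token first, then the (optional) JWT path,
// and only attaches an OAuth challenge to the 401 when a resource metadata
// URL is configured. Signature is an assumption for illustration.
func bearerAuth(staticToken, resourceMetadataURL string, validateJWT func(string) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		token, isBearer := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		// 1. A valid static token wins immediately and never sees a challenge.
		if isBearer && staticToken != "" &&
			subtle.ConstantTimeCompare([]byte(token), []byte(staticToken)) == 1 {
			next.ServeHTTP(w, r)
			return
		}
		// 2. Otherwise fall through to JWT validation.
		if isBearer && validateJWT != nil && validateJWT(token) {
			next.ServeHTTP(w, r)
			return
		}
		// 3. Unauthenticated: invite OAuth discovery only when configured.
		if resourceMetadataURL != "" {
			w.Header().Set("WWW-Authenticate", `Bearer resource_metadata="`+resourceMetadataURL+`"`)
		}
		http.Error(w, "unauthorized", http.StatusUnauthorized)
	})
}

// probe sends one request through the middleware (no JWT validator) and
// returns the status code and any WWW-Authenticate header.
func probe(staticToken, resourceMetadataURL, authHeader string) (int, string) {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { w.WriteHeader(http.StatusOK) })
	h := bearerAuth(staticToken, resourceMetadataURL, nil, ok)
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("WWW-Authenticate")
}

func main() {
	code, challenge := probe("static-token", "https://example.invalid/meta", "Bearer static-token")
	fmt.Println(code, challenge == "")
	code, challenge = probe("static-token", "https://example.invalid/meta", "")
	fmt.Println(code, challenge)
	code, challenge = probe("static-token", "", "")
	fmt.Println(code, challenge == "")
}
```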
Regression guards in the test file:
- valid static Bearer 200 has no WWW-Authenticate
- 401 with resourceMetadataURL set carries the challenge
- 401 with empty resourceMetadataURL emits no challenge
Closes hyperguild#9 in code. Live verification (claude CLI on koala
listing brain tools) blocked on ingestion image rebuild + redeploy.
Adds a two-dimensional address (wing, hall) to brain notes. A wing is a
topic domain (e.g. jepa-fx, hyperguild); a hall is one of a closed
vocabulary of memory types (facts, decisions, failures, hypotheses,
sources). Notes route to brain/wiki/<wing>/<hall>/<slug>.md with
wing/hall/created_at YAML frontmatter, making the directory a valid
Obsidian vault.
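The routing rule can be illustrated with a small sketch. `notePath` and `validHalls` below are hypothetical stand-ins for the package's NotePath and ValidHalls, whose real signatures are not shown here; the segment checks are an added assumption, and brainDir is assumed to be the directory that contains wiki/.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// validHalls is the closed vocabulary of memory types named above.
var validHalls = map[string]bool{
	"facts": true, "decisions": true, "failures": true,
	"hypotheses": true, "sources": true,
}

// notePath maps (wing, hall, slug) to <brainDir>/wiki/<wing>/<hall>/<slug>.md.
// Hypothetical signature; the path-segment validation is an assumption.
func notePath(brainDir, wing, hall, slug string) (string, error) {
	if !validHalls[hall] {
		return "", fmt.Errorf("invalid hall %q", hall)
	}
	for _, seg := range []string{wing, slug} {
		if seg == "" || seg == "." || seg == ".." || strings.ContainsAny(seg, `/\`) {
			return "", fmt.Errorf("invalid path segment %q", seg)
		}
	}
	return filepath.Join(brainDir, "wiki", wing, hall, slug+".md"), nil
}

func main() {
	p, err := notePath("brain", "hyperguild", "decisions", "trunk-based")
	fmt.Println(p, err)
	_, err = notePath("brain", "hyperguild", "musings", "x")
	fmt.Println(err)
}
```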
Changes:
- new package ingestion/internal/brain (NotePath, ValidHalls, Sanitise,
BuildWingIndex, BuildAllWingIndexes)
- api.WriteNote refactored to WriteNoteOptions; wing+hall routes to
brain/wiki/, otherwise falls back to brain/knowledge/ (legacy)
- search.Query → QueryOptions with optional Wing/Hall filtering; Result
carries wing/hall extracted from frontmatter or path segments
- MCP tools brain_write and brain_query gain optional wing/hall params
(hall enum-validated); new brain_index tool regenerates _index.md MOC
- POST /index REST endpoint mirrors brain_index
- brain_write auto-rebuilds the wing's _index.md after a wing+hall write
- scripts/migrate-brain-halls.sh migrates flat brain/wiki/{concepts,entities}/
into the new layout (dry-run by default, --commit applies)
All existing tests pass; new tests cover wing/hall write routing, scope
filtering, invalid hall rejection, _index.md generation, and migration
script paths.
Closes hyperguild#1.
- Random port via net.Listen(":0") replaces hardcoded 33310 (was the
primary failure mode under parallel test load).
- Bump waitForPort deadline 5s → 30s — `go build` under -race can exceed
5s on a loaded machine.
- Replace osPath() (always returned empty PATH because exec.Command("env").Env
is the *child's* env, not the parent's) with explicit PATH+HOME via
os.Getenv. Don't inherit full env: would leak ROUTING_MCP_TOKEN from the
parent shell and flip the routing pod into auth-required mode, breaking
the test.
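For reference, the port and environment handling might look roughly like this. `freePort` and `minimalEnv` are hypothetical helper names, and binding to 127.0.0.1 rather than all interfaces is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// freePort asks the kernel for an unused TCP port by listening on port 0,
// then releases it so the process under test can bind it.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

// minimalEnv builds a child environment from PATH and HOME only, plus any
// explicit extras, so secrets in the parent shell are not inherited.
func minimalEnv(extra ...string) []string {
	env := []string{
		"PATH=" + os.Getenv("PATH"),
		"HOME=" + os.Getenv("HOME"),
	}
	return append(env, extra...)
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	// The result would be assigned to exec.Cmd.Env for the child process.
	env := minimalEnv(fmt.Sprintf("PORT=%d", port))
	fmt.Println(port > 0, len(env))
}
```

There is a small window between closing the listener and the child binding the port in which another process could take it; for a test harness that is usually an acceptable trade-off against a hardcoded port.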
Closes #15. Verified: 10 cold-cache test runs pass, 3 consecutive `task check`
runs pass.
Drops the intermediate `staging/<name>` branch so Flux begins reconciling the
namespace within ~60s of `project_create` instead of waiting on a human PR
merge. Consistent with project-wide trunk-based development.
Rationale: ADR 2026-05-18 in DECISIONS.md.
Closes hyperguild#14 (item 1). Item 2 (GITEA_MCP_TOKEN in SOPS) verified
already-present in infra@408a527 secrets.enc.yaml.
Note: TestRoutingPodEndToEnd was already failing on main before this commit
(context deadline waiting for port 33310 in <5s). Not caused by this change;
project skill tests pass. To be tracked in a separate issue.
mcpclient.New previously accepted an empty token and silently omitted
the Authorization header at request time. When the env var sourcing
the token was missing from a Kubernetes Secret (envFrom doesn't warn
on missing keys), this surfaced as an opaque 401 from the upstream
MCP server with no log trail; see hyperguild#13 and brain entry
"mcpclient-empty-token-silent-401-envfrom-missing-key".
mcpclient.New now returns ErrTokenRequired when token is empty.
The routing pod's project_create init checks the error and exits
with a clear message pointing at routing-secrets, turning a runtime
401 storm into a startup crashloop the operator can fix immediately.
Tests pass a dummy "test" token (httptest servers don't enforce
bearer auth, so any non-empty value works). Added a regression
test asserting empty-token construction returns ErrTokenRequired.
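A minimal sketch of the fail-fast constructor, assuming a simplified Client struct. The field names, the error text and the startup check in main are illustrative; only the ErrTokenRequired behaviour comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"os"
)

// ErrTokenRequired is returned when the client is built without a token.
var ErrTokenRequired = errors.New("mcpclient: bearer token is required")

// Client is a reduced stand-in for the real mcpclient type (assumed fields).
type Client struct {
	baseURL string
	token   string
	httpc   *http.Client
}

// New refuses an empty token instead of silently omitting the
// Authorization header later at request time.
func New(baseURL, token string) (*Client, error) {
	if token == "" {
		return nil, ErrTokenRequired
	}
	return &Client{baseURL: baseURL, token: token, httpc: http.DefaultClient}, nil
}

func main() {
	// Startup check: surface the misconfiguration immediately.
	_, err := New("http://gitea-mcp.invalid", os.Getenv("GITEA_MCP_TOKEN"))
	if errors.Is(err, ErrTokenRequired) {
		fmt.Println("GITEA_MCP_TOKEN is empty; check the routing-secrets Secret")
		return // a real pod would exit non-zero here
	}
	fmt.Println("client ready")
}
```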
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Gitea's push-mirror cannot push to a non-existent remote — it just
runs 'git push' against whatever URL it's given. So a project_create
flow that only configures the mirror leaves the GitHub side as an
unfulfillable URL.
New internal/githubclient package: single-purpose client that POSTs
/user/repos to create an empty private repo (auto_init=false so the
first mirror push doesn't conflict with a generated README). Treats
422 'name already exists' as idempotent success via ErrAlreadyExists.
401/403 are surfaced as 'PAT missing repo scope or invalid' so the
operator sees the real cause instead of a vague upstream error.
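The status handling could be sketched like this. `statusToError` and `createRepo` are hypothetical names, the base URL is a parameter so a fake server can stand in for api.github.com, and mapping every 422 to ErrAlreadyExists is a simplification (a real client might inspect the response body first).

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ErrAlreadyExists signals an idempotent outcome: the repo is already there.
var ErrAlreadyExists = errors.New("githubclient: repository already exists")

// statusToError maps the response status of POST /user/repos to an error.
func statusToError(status int) error {
	switch status {
	case http.StatusCreated:
		return nil
	case http.StatusUnprocessableEntity:
		return ErrAlreadyExists // simplification: any 422 counts as "exists"
	case http.StatusUnauthorized, http.StatusForbidden:
		return fmt.Errorf("PAT missing repo scope or invalid (HTTP %d)", status)
	default:
		return fmt.Errorf("unexpected status %d from GitHub", status)
	}
}

// createRepo creates an empty private repo; auto_init=false keeps the first
// mirror push from conflicting with a generated README.
func createRepo(ctx context.Context, baseURL, pat, name string) error {
	body, err := json.Marshal(map[string]any{"name": name, "private": true, "auto_init": false})
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/user/repos", bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+pat)
	req.Header.Set("Accept", "application/vnd.github+json")
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return statusToError(resp.StatusCode)
}

func main() {
	for _, status := range []int{201, 422, 401} {
		srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
			w.WriteHeader(status)
		}))
		err := createRepo(context.Background(), srv.URL, "dummy-pat", "demo")
		srv.Close()
		// The caller treats ErrAlreadyExists as success and continues.
		proceed := err == nil || errors.Is(err, ErrAlreadyExists)
		fmt.Println(status, "proceed:", proceed, "err:", err)
	}
}
```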
Skill wiring:
- New stepCreateGitHub between stepCreateRepo and stepMirror in the
orchestrator.
- Skipped entirely when Config.GitHub is nil (degraded mode — the
routing pod runs without GITHUB_PAT, mirror config still lands,
but the actual sync to github fails until the repo exists).
- cmd/routing/main.go constructs githubclient.New(GitHubPAT) only
when the PAT is set; the skill receives nil otherwise.
Tests:
- happy path: fake github 201 + assertions that the 'reached' array
is [create_repo, create_github_repo, mirror, infra_commit, issue].
- github 422 already-exists: idempotent, all gitea steps still run.
- github 401: returns failed_step=create_github_repo, no mirror or
later steps.
- degraded mode (Config.GitHub nil): reached omits create_github_repo,
rest of the flow runs unchanged.
Updated existing tests to read [skill, gh] from newSkill instead of
just skill, and adjusted reached-array expectations to include the
new step.
Tracks #10.
Plan 7 (2026-05-12) retired the supervisor pod, deleted cmd/supervisor/
and the root Dockerfile, but cd.yml still tried to:
- buildctl a supervisor image using the (non-existent) root Dockerfile
- sed gitea.d-ma.be/mathias/supervisor: in k3s/apps/supervisor/deployment.yaml
(also non-existent — k3s/apps/supervisor/ only ships ingestion-* files now)
- wait for and rollout-verify a supervisor Deployment that no longer exists
Result: every CD run since the retirement has been failing at 'Build and push
supervisor image', leaving ingestion + routing un-deployed despite the binaries
being built. The routing pod was last deployed at sha 189ff89c (weeks stale).
This commit:
- Removes the supervisor build step and supervisor sed/git add lines.
- Adds 'Wait for Flux to apply new routing image' + 'Verify routing rollout'
steps that mirror the ingestion equivalents, so failures land loudly rather
than 5 min later when something tries to call the new tool.
- Updates the chore(deploy) commit message to 'ingestion+routing' to match
reality.
Unblocks deployment of feat: project_create (#10).
Adds the project_create tool to the routing pod that automates the
"new project" bootstrap end-to-end from claude.ai. Gitea-first
architecture: GitHub receives the repo only via push-mirror, never
via a direct GitHub API call from this server.
Four sequential calls to the gitea-mcp server (configured via
GITEA_MCP_URL):
1. create_project_from_template — Gitea repo from
template-go-{agent,web} per the 'stack' arg
2. repo_mirror_push (action=add) — push-mirror to
github.com/<GITHUB_OWNER>/<name>.git, interval 8h, sync_on_commit
3. file_write_branch — k3s/staging/<name>/namespace.yaml committed
on a staging/<name> branch in the infra repo
4. issue_create — experiment brief (hypothesis + description + stack
+ provisioning log) on the new repo, returns the issue_url
Returns gitea_url, github_url, issue_url, next_steps. The next_steps
string is the exact shell sequence the operator runs locally to
clone, scaffold via local-dev 'task new-project', and push.
Idempotency: create_project_from_template + repo_mirror_push +
file_write_branch all return JSON-RPC code -32003 (Conflict) when
their target already exists; the orchestrator swallows the conflict
and continues. Re-running on an existing repo restates the brief in
a fresh issue.
Error handling: on any non-conflict downstream failure the response
returns {reached: ["<step>",...], failed_step: "<step>"} alongside
a JSON-RPC error. No rollback — partial state stays so the operator
can resume manually.
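The conflict-swallowing and partial-state reporting can be sketched as a small step runner. All type and function names here (`rpcError`, `step`, `outcome`, `runSteps`, `simulate`) are hypothetical; the step names, the -32003 code and the reached/failed_step shape follow the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// codeConflict is the JSON-RPC code returned when the target already exists.
const codeConflict = -32003

type rpcError struct {
	Code    int
	Message string
}

func (e *rpcError) Error() string { return fmt.Sprintf("rpc error %d: %s", e.Code, e.Message) }

type step struct {
	name       string
	idempotent bool // true when a conflict should be treated as success
	run        func() error
}

type outcome struct {
	Reached    []string
	FailedStep string
}

// runSteps executes steps in order, swallows conflicts on idempotent steps,
// and stops at the first real failure without rolling anything back.
func runSteps(steps []step) (outcome, error) {
	var out outcome
	for _, s := range steps {
		err := s.run()
		var re *rpcError
		if err != nil && s.idempotent && errors.As(err, &re) && re.Code == codeConflict {
			err = nil
		}
		if err != nil {
			out.FailedStep = s.name
			return out, err
		}
		out.Reached = append(out.Reached, s.name)
	}
	return out, nil
}

// simulate runs the four-step flow with canned errors keyed by step name.
func simulate(errs map[string]error) (outcome, error) {
	mk := func(name string, idempotent bool) step {
		return step{name: name, idempotent: idempotent, run: func() error { return errs[name] }}
	}
	return runSteps([]step{
		mk("create_repo", true), mk("mirror", true),
		mk("infra_commit", true), mk("issue", false),
	})
}

func main() {
	out, err := simulate(map[string]error{"create_repo": &rpcError{codeConflict, "repo exists"}})
	fmt.Println(out.Reached, out.FailedStep, err)
	out, err = simulate(map[string]error{"mirror": &rpcError{-32000, "upstream down"}})
	fmt.Println(out.Reached, out.FailedStep, err)
}
```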
New env vars (all optional except GITEA_MCP_URL):
GITEA_MCP_URL enables the tool
GITEA_MCP_TOKEN bearer auth for gitea-mcp
GITEA_OWNER default mathias
GITHUB_OWNER default mathiasb
INFRA_REPO default infra
GITHUB_PAT repo scope, used as mirror remote_password; never logged
Without GITEA_MCP_URL set, the tool is not registered and the
routing pod starts normally (degrades open).
internal/mcpclient/: new minimal JSON-RPC tools/call client with
bearer auth, used by project_create. Unwraps MCP's
content[0].text envelope and surfaces typed errors via mcpclient.Error.
Tests: table-driven against an httptest fake gitea-mcp covering happy
path (4-step success + correct PATCH-style arg shapes), idempotent
repo-exists, mirror failure (partial-success response with reached=
[create_repo] + failed_step=mirror), infra-commit failure (reached up
to mirror + failed_step=infra_commit), and validation errors.
Closes #10.
Removes the supervisor binary and its two exclusive skill packages (tdd,
spec) now that all functionality is covered by SKILL.md files, the routing
pod, and the brain MCP. Routing pod reuses review/debug/retrospective/trainer
skill packages which are intentionally preserved.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds two new LLM-backed MCP tools to the ingestion service:
- brain_answer(query): BM25 retrieval + LLM synthesis → answer + sources
- brain_classify(text): classifies doc into type/title/tags via LLM
Adds llm.Router for primary→fallback routing (berget.ai → iguana).
Wired via BRAIN_LLM_PRIMARY_URL/BRAIN_LLM_FALLBACK_URL env vars;
no-op when unset so existing deployments are unaffected.
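One way to picture the primary→fallback routing, as a sketch only: the `Completer` interface, the `completerFunc` adapter, `ErrNoBackend` and the `demo` helper are assumptions for illustration, not the real llm.Router API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Completer is an assumed minimal interface for an LLM backend.
type Completer interface {
	Complete(ctx context.Context, prompt string) (string, error)
}

// completerFunc adapts a plain function to Completer.
type completerFunc func(ctx context.Context, prompt string) (string, error)

func (f completerFunc) Complete(ctx context.Context, prompt string) (string, error) {
	return f(ctx, prompt)
}

var ErrNoBackend = errors.New("llm: no backend configured")

// Router tries Primary first and falls back only when Primary fails.
type Router struct {
	Primary, Fallback Completer
}

func (r Router) Complete(ctx context.Context, prompt string) (string, error) {
	if r.Primary == nil && r.Fallback == nil {
		return "", ErrNoBackend
	}
	var primaryErr error
	if r.Primary != nil {
		out, err := r.Primary.Complete(ctx, prompt)
		if err == nil {
			return out, nil
		}
		primaryErr = err
	}
	if r.Fallback == nil {
		return "", primaryErr
	}
	out, err := r.Fallback.Complete(ctx, prompt)
	if err != nil {
		return "", errors.Join(primaryErr, err) // keep both causes
	}
	return out, nil
}

// demo wires two fake backends that are either up or down.
func demo(primaryUp, fallbackUp bool) (string, error) {
	backend := func(name string, up bool) Completer {
		return completerFunc(func(context.Context, string) (string, error) {
			if !up {
				return "", fmt.Errorf("%s unavailable", name)
			}
			return "answer from " + name, nil
		})
	}
	r := Router{Primary: backend("primary", primaryUp), Fallback: backend("fallback", fallbackUp)}
	return r.Complete(context.Background(), "ping")
}

func main() {
	fmt.Println(demo(true, true))
	fmt.Println(demo(false, true))
	fmt.Println(demo(false, false))
}
```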
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The routing decision is about reasoning capacity, not cost or provider.
Fast model (koala/qwen35-9b-fast) handles high-pass-rate calls; thinking
model (iguana/gemma4-26b) handles low-pass-rate calls. Removes the
implicit Anthropic dependency from the routing pod — both models go
through LiteLLM.
Renames: HYPERGUILD_LOCAL_MODEL → HYPERGUILD_FAST_MODEL,
HYPERGUILD_CLAUDE_MODEL → HYPERGUILD_THINKING_MODEL,
Router.LocalModel → FastModel, Router.ClaudeModel → ThinkingModel,
log decision "claude_fallback" → "thinking_fallback".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Brain and supervisor now behind NPM with Let's Encrypt. Use canonical
hostnames (brain-mcp.d-ma.be, supervisor-mcp.d-ma.be) over NodePorts so
connections work across networks without Tailscale for DNS.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
claude.ai probes with GET before initialize; without this the supervisor
returned an application/json parse error instead of text/event-stream, causing
"Couldn't reach the MCP server" in the claude.ai connector setup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
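The GET-probe fix described in the last commit above can be illustrated with a small handler sketch. `mcpHandler` and `contentTypeFor` are hypothetical names, the POST branch returns a placeholder JSON-RPC result, and the GET branch returns right after sending headers, whereas a real server would likely keep the stream open.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// mcpHandler answers a GET probe with an event-stream content type instead
// of running the request through the JSON-RPC parser.
func mcpHandler(w http.ResponseWriter, r *http.Request) {
	switch r.Method {
	case http.MethodGet:
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		w.WriteHeader(http.StatusOK)
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // push headers to the client immediately
		}
	case http.MethodPost:
		// Placeholder for the real JSON-RPC dispatch.
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `{"jsonrpc":"2.0","id":1,"result":{}}`)
	default:
		w.Header().Set("Allow", "GET, POST")
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
	}
}

// contentTypeFor runs one request through the handler and reports the
// status code and Content-Type it produced.
func contentTypeFor(method string) (int, string) {
	rec := httptest.NewRecorder()
	mcpHandler(rec, httptest.NewRequest(method, "/mcp", nil))
	return rec.Code, rec.Header().Get("Content-Type")
}

func main() {
	for _, m := range []string{http.MethodGet, http.MethodPost, http.MethodDelete} {
		code, ct := contentTypeFor(m)
		fmt.Println(m, code, ct)
	}
}
```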