fix(graph): route wiki/<flat>.md to Type=knowledge, not Type=hall with filename-as-wing

classifyByPath had a hole: paths like wiki/index.md or wiki/<slug>.md (direct children of wiki/, no subdirectory) hit the default branch and wrote Wing=parts[1], which IS the filename, not a wing.

Symptom in brain_entities: rows like (slug=index, wing=index.md) and (slug=autobe-..., wing=autobe-evaluation-pattern-....md).

Fix: when len(parts) < 3 (no subdirectory at all), fall through to Type=knowledge and let frontmatter set wing/hall if present.

Add brain/eval/ artifacts at the same time:
- qa-2026-05.md: 20 hand-authored Q→expected-slug pairs covering the homelab knowledge corpus across mcp, dex, gitops, postgres, go, models, methodology
- score.py: calls brain_query for each pair, scores top-1 + top-3, emits per-question detail. BRAIN_MCP_TOKEN via env.

Pre-fix baseline against the live brain: top-1 = 20% (4/20), top-3 = 65% (13/20). Six hard misses where the expected slug doesn't even land in the top-5.

Used to gate the phase 2 DIKW redesign (infra#62 follow-up): if phase 1 fixes (this parser fix + 20 backlink authoring on top orphans) lift top-1 by <10 absolute points, structure is the bottleneck and the tier redesign is justified.
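
The routing rule from the commit message can be sketched as follows. This is an illustration only: the real classifyByPath is Go and its source is not shown here, so the Python names (`Classification`, `classify_by_path`), the empty-string convention for "wing left to frontmatter", and the example path `wiki/dex/some-note.md` are all assumptions. Any other branches the real function has are omitted.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    entity_type: str
    wing: str = ""  # assumption: empty means "let frontmatter decide"


def classify_by_path(path: str) -> Classification:
    """Sketch of the fixed default branch described in the commit message."""
    parts = path.strip("/").split("/")
    if len(parts) < 3:
        # Flat file such as wiki/index.md: parts[1] is the filename,
        # so it must not be written as the wing.
        return Classification(entity_type="knowledge")
    # wiki/<wing>/<file>.md: parts[1] really is a directory name.
    return Classification(entity_type="hall", wing=parts[1])


flat = classify_by_path("wiki/index.md")
nested = classify_by_path("wiki/dex/some-note.md")  # hypothetical path
print(flat, nested)
```

Before the fix, the flat case would have reached the second return and produced a wing of `index.md`, which matches the symptom rows reported in brain_entities.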
2026-05-24 22:33:04 +02:00
parent 72be87b4e7
commit 3084c4173d
5 changed files with 413 additions and 0 deletions
--- a/brain/eval/qa-2026-05.md
+++ b/brain/eval/qa-2026-05.md
@@ -0,0 +1,76 @@
+# Brain retrieval eval set — 2026-05-24
+
+20 hand-authored Q→expected-top-1-slug pairs. Used by `score.py` to
+measure brain_query top-1 + top-3 hit rate against the live brain.
+
+Authoring rules:
+- Each question maps to **one** clear-best entry. Avoid ambiguous
+  questions where multiple slugs could be the right answer.
+- Questions are phrased the way a future-me would actually ask, not
+  the way the entry's title reads. Some lexical distance is the point.
+- `expected` is the slug as stored in `brain_entities.slug`. Update
+  if the slug renames.
+
+## Pairs
+
+```
+q: how do I stop dex from logging users out on every pod restart?
+expected: dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+
+q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
+expected: postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+
+q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
+expected: homelab-network-perimeter-model
+
+q: what does container exit code 255 with reason Unknown mean?
+expected: exit-255-unknown-reason-not-oom
+
+q: can gitea push-mirror create the github repo automatically?
+expected: gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+
+q: a flux kustomization is stuck after I removed a resource — why?
+expected: flux-healthcheck-stale-on-resource-removal
+
+q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
+expected: go-bytes-buffer-bytes-reset-aliasing-trap
+
+q: what are the homelab architecture principles from may 2026?
+expected: homelab-architecture-principles-2026-05
+
+q: where does the sops age private key live in the cluster?
+expected: 2026-05-04-sops-age-key-from-flux-cluster
+
+q: why do my grafana dashboards disappear after a pod restart?
+expected: grafana-dashboards-as-code-not-ui-state
+
+q: what is the double diamond methodology?
+expected: double-diamond-methodology
+
+q: my MCP server works from claude code but fails on claude.ai — what's different?
+expected: 2026-05-04-mcp-transport-version-claude-ai-strict
+
+q: how should I rate security findings — isolated bugs or exploit chains?
+expected: homelab-security-chains-not-bugs
+
+q: how should canonical context files relate to derived adapter files?
+expected: 2026-05-03-canonical-vs-derived-context-flow
+
+q: what is the homelab core vocabulary glossary?
+expected: homelab-core-glossary
+
+q: which models on koala llama-swap actually emit native tool_calls correctly?
+expected: koala-llama-swap-native-tool-calls-survey-2026-05
+
+q: what is qwen35-9b-fast and what's it used for?
+expected: qwen35-9b-fast
+
+q: in go, how do I prevent defer body close from silently dropping errors?
+expected: go-defer-errcheck-body-close
+
+q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
+expected: hyperguild-level3-pipeline-rewrite
+
+q: what's the new-project ADR — is it gitea-first or github-first?
+expected: adr-new-project-gitea-first-github-mirror
+```
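
A minimal sketch of how the pairs above could be parsed and scored, and how the "<10 absolute points" gate from the commit message could be checked. It is not the shipped `score.py`: the function names are hypothetical, and `search` stands in for the brain_query call, whose interface is not shown here. The stand-in ranking in `fake_search` is invented purely to exercise the code.

```python
from typing import Callable, Iterable

Pair = tuple[str, str]  # (question, expected slug)


def parse_pairs(text: str) -> list[Pair]:
    """Collect (question, expected) pairs from `q:` / `expected:` lines."""
    pairs: list[Pair] = []
    question = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("q:"):
            question = line[len("q:"):].strip()
        elif line.startswith("expected:") and question is not None:
            pairs.append((question, line[len("expected:"):].strip()))
            question = None
    return pairs


def count_hits(
    pairs: Iterable[Pair],
    search: Callable[[str], Iterable[str]],
    ks: tuple[int, ...] = (1, 3),
) -> dict[int, int]:
    """Count questions whose expected slug appears in the top-k results."""
    hits = {k: 0 for k in ks}
    for question, expected in pairs:
        ranked = list(search(question))  # slugs, best match first
        for k in ks:
            if expected in ranked[:k]:
                hits[k] += 1
    return hits


def redesign_justified(
    baseline_hits: int, new_hits: int, total: int, min_lift_points: int = 10
) -> bool:
    """True when the top-1 lift is below the threshold in absolute points."""
    lift_points = (new_hits - baseline_hits) * 100 / total
    return lift_points < min_lift_points


SAMPLE = """\
q: what is the double diamond methodology?
expected: double-diamond-methodology

q: what is the homelab core vocabulary glossary?
expected: homelab-core-glossary
"""


def fake_search(question: str) -> list[str]:
    # Invented ranking, used only so the sketch runs without the live brain.
    if "glossary" in question:
        return ["homelab-architecture-principles-2026-05", "homelab-core-glossary"]
    return ["double-diamond-methodology"]


pairs = parse_pairs(SAMPLE)
print(count_hits(pairs, fake_search))
```

With 20 questions each hit is worth 5 points, so against the 4/20 top-1 baseline the gate passes only at 6/20 or better: `redesign_justified(4, 5, 20)` is true, while `redesign_justified(4, 6, 20)` is false. Working from hit counts rather than percentages keeps the comparison exact at the threshold.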