Top-1 stayed at 20% (4/20), top-3 +5pt (65→70%) after:
- extract.go wing/topic parser fix (commit 3084c41)
- qwen35-9b-fast entity pad (was 239-byte stub → full entity)
- grafana entry: add "pod restart" synonym to lesson body
- dangling refs stripped from index.md + entities/k3s.md
The only retrieval move: qwen35-9b-fast climbed from rank 0 (off top-5)
to rank 2 — the entity pad worked. Other 5 misses are ranker behaviour
on already-keyword-overlapping entries; BM25 doesn't weight the right
slugs to the top.
Per the proposal's gate (≥10pt lift = stop, <10pt = Phase 2 justified),
the DIKW tier redesign earns its cost. Next session: tier column +
file moves + tier-weighted retrieval, then re-measure against this
same eval set.
classifyByPath had a hole: paths like wiki/index.md or wiki/<slug>.md
(direct children of wiki/, no subdirectory) hit the default branch and
wrote Wing=parts[1] — which IS the filename, not a wing. Symptom in
brain_entities: rows like (slug=index, wing=index.md) and
(slug=autobe-..., wing=autobe-evaluation-pattern-....md).
Fix: when len(parts) < 3 (no subdirectory at all), fall through to
Type=knowledge and let frontmatter set wing/hall if present.
Add brain/eval/ artifacts at the same time:
- qa-2026-05.md — 20 hand-authored Q→expected-slug pairs covering the
homelab knowledge corpus across mcp, dex, gitops, postgres, go,
models, methodology
- score.py — calls brain_query for each pair, scores top-1 + top-3,
emits per-question detail. BRAIN_MCP_TOKEN via env.
Pre-fix baseline against the live brain: top-1 = 20% (4/20),
top-3 = 65% (13/20). Six hard misses where the expected slug doesn't
even land in the top-5.
Used to gate the phase 2 DIKW redesign (infra#62 follow-up): if
phase 1 fixes (this parser fix + 20 backlink authoring on top
orphans) lift top-1 by <10 absolute points, structure is the
bottleneck and the tier redesign is justified.