• v0.8.0 37fdd33b2d

    v0.8.0 — chunk-before-embed for arbitrarily long markdown (#38)

    mathias released this 2026-05-19 20:00:05 +00:00 | 22 commits to main since this release

    ChunkMarkdown splits at H1/H2 boundaries, sub-splits oversized sections
    at paragraph boundaries with greedy packing under maxChunkBytes=4000
    (≈1000 nomic tokens — well under the 2048 ceiling).
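
The two-stage split described above can be sketched roughly as follows. This is an illustration, not the released ChunkMarkdown: the helper names (`splitSections`, `packParagraphs`, `chunkMarkdownSketch`) are hypothetical, headings are detected by a simple `# ` / `## ` line prefix (so a `#` line inside a fenced code block would be mis-detected), and a single paragraph larger than the limit is emitted as-is rather than split further.

```go
package main

import "strings"

// maxChunkBytes mirrors the limit named in the release notes.
const maxChunkBytes = 4000

// isHeadingBoundary reports whether a line opens an H1 or H2 section.
// Assumption: ATX-style headings only ("# " or "## ").
func isHeadingBoundary(line string) bool {
	return strings.HasPrefix(line, "# ") || strings.HasPrefix(line, "## ")
}

// splitSections cuts the document so each H1/H2 heading starts a new section.
func splitSections(md string) []string {
	var sections, cur []string
	for _, line := range strings.Split(md, "\n") {
		if isHeadingBoundary(line) && len(cur) > 0 {
			sections = append(sections, strings.Join(cur, "\n"))
			cur = nil
		}
		cur = append(cur, line)
	}
	if len(cur) > 0 {
		sections = append(sections, strings.Join(cur, "\n"))
	}
	return sections
}

// packParagraphs greedily appends blank-line-separated paragraphs to the
// current chunk until the next paragraph would push it past limit bytes.
func packParagraphs(section string, limit int) []string {
	var chunks []string
	cur := ""
	for _, p := range strings.Split(section, "\n\n") {
		if strings.TrimSpace(p) == "" {
			continue
		}
		switch {
		case cur == "":
			cur = p
		case len(cur)+2+len(p) <= limit:
			cur += "\n\n" + p
		default:
			chunks = append(chunks, cur)
			cur = p
		}
	}
	if cur != "" {
		chunks = append(chunks, cur)
	}
	return chunks
}

// chunkMarkdownSketch keeps sections that fit in limit bytes whole and
// sub-splits the oversized ones at paragraph boundaries.
func chunkMarkdownSketch(md string, limit int) []string {
	var out []string
	for _, s := range splitSections(md) {
		if strings.TrimSpace(s) == "" {
			continue
		}
		if len(s) <= limit {
			out = append(out, s)
			continue
		}
		out = append(out, packParagraphs(s, limit)...)
	}
	return out
}
```

Calling `chunkMarkdownSketch(doc, maxChunkBytes)` yields the chunks in document order, which is the order the 1-based chunk numbers would follow.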

    Storage: each chunk is stored in brain_embeddings under its parent path
    plus a "#NNNN" suffix, where NNNN is the 1-based chunk number zero-padded
    to 4 digits for stable sort order. No schema change.
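
A minimal sketch of how such a key might be built, assuming the "#NNNN" suffix is appended to the parent file path. The function name `chunkPath` and the example path are hypothetical and do not come from the release.

```go
package main

import "fmt"

// chunkPath builds the storage key for the chunk at zero-based position
// index, using a 1-based, 4-digit zero-padded suffix.
// Assumption: the key is the parent path followed by "#NNNN".
func chunkPath(parent string, index int) string {
	return fmt.Sprintf("%s#%04d", parent, index+1)
}
```

Zero-padding is what keeps plain string ordering aligned with chunk order: "notes/a.md#0002" sorts before "notes/a.md#0010", whereas unpadded "#2" would sort after "#10".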

    Retrieval: hybridMerge collapses chunk-path vector hits to parent via
    ParentPath before scope check, RRF accumulation, and hydration. Three
    chunk hits → one result row.
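
The collapse step can be illustrated with a small reciprocal rank fusion (RRF) sketch over a single ranked list of vector hits. Everything here is an assumption made for illustration: the names `scored`, `parentOf`, and `collapseRRF`, the constant k=60, and the choice to sum the contribution of every chunk hit. The release notes do not say whether hybridMerge sums all chunk hits or keeps only the best-ranked one, and the scope check and hydration steps are omitted.

```go
package main

import (
	"sort"
	"strings"
)

// rrfK is the usual RRF smoothing constant. Assumption: the value used by
// the real hybridMerge is not given in the release notes.
const rrfK = 60.0

// scored is one merged result row.
type scored struct {
	Path  string
	Score float64
}

// parentOf is a simplified stand-in for ParentPath: it drops everything
// from the last '#' onward and returns bare paths unchanged.
func parentOf(p string) string {
	if i := strings.LastIndex(p, "#"); i >= 0 {
		return p[:i]
	}
	return p
}

// collapseRRF maps each hit to its parent path before accumulating
// 1/(k+rank), so several chunk hits of one file end up in a single row.
func collapseRRF(ranked []string) []scored {
	acc := make(map[string]float64)
	for i, hit := range ranked {
		acc[parentOf(hit)] += 1.0 / (rrfK + float64(i+1))
	}
	out := make([]scored, 0, len(acc))
	for path, s := range acc {
		out = append(out, scored{Path: path, Score: s})
	}
	// Highest score first; break ties by path for deterministic output.
	sort.Slice(out, func(a, b int) bool {
		if out[a].Score != out[b].Score {
			return out[a].Score > out[b].Score
		}
		return out[a].Path < out[b].Path
	})
	return out
}
```

With hits `a.md#0001`, `b.md`, `a.md#0003`, `a.md#0002`, the sketch returns two rows, with `a.md` ranked first because its three chunk hits accumulate into one score.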

    Backward compatibility: pre-existing bare-path rows in brain_embeddings
    keep working — ParentPath is a no-op for them. No migration needed.
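
One way the no-op behaviour for bare paths could work is sketched below. `parentPathSketch` is a hypothetical name, not the released ParentPath, and it assumes a chunk key is recognised only by a trailing "#" followed by exactly four digits.

```go
package main

// parentPathSketch strips a trailing "#NNNN" chunk suffix. Any path that
// does not end in '#' plus exactly four ASCII digits is returned unchanged,
// which is what makes pre-existing bare-path rows a no-op.
func parentPathSketch(p string) string {
	n := len(p)
	if n < 5 || p[n-5] != '#' {
		return p
	}
	for _, c := range p[n-4:] {
		if c < '0' || c > '9' {
			return p
		}
	}
	return p[:n-5]
}
```

Checking the digits, rather than cutting at any "#", leaves a path that merely contains a "#" untouched.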

    First production sync after deploy reported added=32 deleted=0 errors=0,
    the first errors=0 cycle in days. The three previously failing files now
    have 9 / 11 / 12 chunks each, all retrievable via brain_query.

    Closes infra#38.
