feat(brain): re-embed on file edit (Sync should respect mtime) #23
Context
`vectorstore.Sync` currently only embeds files it has never seen. Once a path appears in `brain_embeddings`, edits to the underlying `.md` are invisible to the vector store. The code has a TODO marker for this.
Result: any note revised after its first sync silently drifts from its embedding.
Proposed change
`Store` to expose `KnownPathsWithMtime() map[string]time.Time`, which returns each path's `updated_at` from `brain_embeddings`. (Or `KnownPaths` returns a richer struct; small API choice.)
In `Sync`, compare each file's `mtime` against the store's `updated_at`. If newer → re-embed + upsert.
Acceptance criteria
`Sync` re-embeds files whose `mtime > updated_at`
Out of scope
Content-hash invalidation (mtime is enough for the brain's edit pattern; Syncthing preserves it).
Shipped in commit 8157397.
Approach
Store interface evolved from `KnownPaths() map[string]struct{}` → `KnownPathsWithTime() map[string]time.Time`. PGStore: `SELECT path, updated_at FROM brain_embeddings`. Sync groups chunks by parent and tracks the earliest `updated_at` per parent. If a file's mtime is after that, at least one chunk is stale, so the file is re-embedded.
Re-embed path deletes every old chunk for the parent first, then re-chunks + re-embeds + re-upserts. Handles shrunk files cleanly (no orphan `#NNNN` rows at higher indexes).
Tests
Existing 15 tests updated for the new stub signature (`stubStore.known` is now `map[string]time.Time`; zero values default to a far-future sentinel so "skip if already known" tests keep passing without per-test setup).
Backward compatibility
`brain_embeddings` rows pre-dating this change carry valid `updated_at` values: the column was always populated via `DEFAULT now()` + `ON CONFLICT ... updated_at = now()` on every upsert. No schema migration. Live pod will start re-embedding any file whose source has been edited since its chunks were originally written.
Acceptance
`mtime > updated_at`
(`SELECT path, updated_at` per cycle replaces single `SELECT path`)
Closes.