Walks wiki/sources/, extracts wikilinks from each source page, and injects
## Sources back-refs into all linked concept and entity pages. All refs from
all sources are accumulated in memory before writing, so multiple sources
referencing the same concept are merged in a single write. Running the
endpoint multiple times is safe because wiki.Merge deduplicates bullet items.
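
The accumulate-then-write step can be sketched roughly as below. This is an illustration only: the wikilink regular expression, extractLinks, and collectBackRefs are hypothetical names, and the real code hands deduplication to wiki.Merge rather than to an in-memory set.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// wikilinkRe matches [[Target]], [[Target|label]] and [[Target#heading]];
// only Target is captured. The exact link syntax handled by the real
// extractor is an assumption.
var wikilinkRe = regexp.MustCompile(`\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]`)

// extractLinks returns the distinct wikilink targets in body, first-seen order.
func extractLinks(body string) []string {
	seen := map[string]bool{}
	var out []string
	for _, m := range wikilinkRe.FindAllStringSubmatch(body, -1) {
		target := strings.TrimSpace(m[1])
		if target != "" && !seen[target] {
			seen[target] = true
			out = append(out, target)
		}
	}
	return out
}

// collectBackRefs maps every linked page to the sorted, deduplicated list of
// source names that link to it, so each page needs only one write.
func collectBackRefs(sources map[string]string) map[string][]string {
	sets := map[string]map[string]bool{}
	for name, body := range sources {
		for _, target := range extractLinks(body) {
			if sets[target] == nil {
				sets[target] = map[string]bool{}
			}
			sets[target][name] = true
		}
	}
	refs := make(map[string][]string, len(sets))
	for target, set := range sets {
		for name := range set {
			refs[target] = append(refs[target], name)
		}
		sort.Strings(refs[target])
	}
	return refs
}

func main() {
	refs := collectBackRefs(map[string]string{
		"paper-a": "Discusses [[Raft]] and [[Paxos|classic Paxos]].",
		"paper-b": "Compares [[Raft]] with other [[Raft]] variants.",
	})
	fmt.Println(refs["Raft"], refs["Paxos"])
}
```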
After each ingestion, every concept and entity page linked from the
source page gains a ## Sources entry pointing back to that source.
Pages already on disk (from prior ingestions) are loaded and updated,
so each newly ingested source adds to the accumulated references over time.
Deduplication is handled by wiki.Merge's existing bullet-section logic.
- New extract package: Text() dispatcher for .md/.txt passthrough and
PDF extraction via pdftotext subprocess
- wiki.Entry gains Aliases []string, loaded from YAML frontmatter
- Fuzzy entity resolution in pipeline: normalizes titles (lowercase,
strip articles, collapse hyphens) and matches proposed pages against
existing inventory slugs and aliases to prevent proliferation
- Watcher and API handler now use extract.Text() instead of os.ReadFile
- Dockerfile: apk add poppler-utils in Alpine runtime stage
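
A minimal sketch of the title normalization and inventory match described in the fuzzy-resolution bullet. The function names, the article list (a, an, the), and the reading of "collapse hyphens" as "treat hyphens as spaces" are all assumptions, not the pipeline's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeTitle lowercases the title, treats hyphens and underscores as
// spaces, drops one leading English article, and collapses whitespace.
func normalizeTitle(title string) string {
	s := strings.ToLower(title)
	s = strings.NewReplacer("-", " ", "_", " ").Replace(s)
	words := strings.Fields(s)
	if len(words) > 1 {
		switch words[0] {
		case "a", "an", "the":
			words = words[1:]
		}
	}
	return strings.Join(words, " ")
}

// resolve returns the slug of an existing page whose slug or any alias
// normalizes to the same key as the proposed title. inventory maps
// slug -> aliases (hypothetical shape).
func resolve(proposed string, inventory map[string][]string) (string, bool) {
	key := normalizeTitle(proposed)
	for slug, aliases := range inventory {
		if normalizeTitle(slug) == key {
			return slug, true
		}
		for _, alias := range aliases {
			if normalizeTitle(alias) == key {
				return slug, true
			}
		}
	}
	return "", false
}

func main() {
	inventory := map[string][]string{
		"event-loop": {"Reactor Pattern"},
		"raft":       nil,
	}
	fmt.Println(resolve("The Event Loop", inventory))
	fmt.Println(resolve("reactor-pattern", inventory))
	fmt.Println(resolve("Paxos", inventory))
}
```

Matching on a normalized key keeps "The Event Loop" and "event-loop" on one page instead of creating a near-duplicate.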
Files dropped into brain/raw/ are now copied to processed/ or failed/ rather
than moved. A .processed or .failed marker is written next to the original so
the watcher skips it on subsequent polls without deleting it. This keeps
Syncthing-synced Obsidian vaults intact after ingestion.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Start background watcher on startup when INGEST_WATCH_INTERVAL > 0
- Procfile: add INGEST_WATCH_INTERVAL=30 and INGEST_SVC_URL for supervisor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CLAUDE.md has a specific meaning in the Claude Code ecosystem (agent
instructions). The wiki schema for the ingestion pipeline should live
in schema.md to avoid confusion.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Return error when both path and content are supplied simultaneously
- Improve tool description to clearly state the two valid call forms
- Add per-field descriptions so LLMs understand what each parameter requires
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds an optional path field to brain_ingest so Claude can ingest files
or directories directly by path without embedding content in the call.
Routing: path set → /ingest-path; content+source set → /ingest; neither → error.
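
The routing rule above, as a small sketch. The struct and function names are hypothetical, and letting path win when both forms are supplied is an assumption that this message does not state.

```go
package main

import (
	"errors"
	"fmt"
)

// ingestArgs mirrors the three brain_ingest fields named in this commit.
type ingestArgs struct {
	Path    string
	Content string
	Source  string
}

// route picks the ingestion endpoint for a call:
// path set -> /ingest-path; content+source set -> /ingest; neither -> error.
func route(a ingestArgs) (string, error) {
	switch {
	case a.Path != "":
		return "/ingest-path", nil
	case a.Content != "" && a.Source != "":
		return "/ingest", nil
	default:
		return "", errors.New("brain_ingest: supply path, or content and source")
	}
}

func main() {
	fmt.Println(route(ingestArgs{Path: "notes/"}))
	fmt.Println(route(ingestArgs{Content: "raw text", Source: "Meeting notes"}))
	fmt.Println(route(ingestArgs{Content: "raw text"}))
}
```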
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wires pipeline.Run into the HTTP layer so callers can ingest raw text
or files/directories without touching the filesystem directly. Rewrites
main.go to parse LLM and watcher env vars and build pipeline.Config.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Polls brain/raw/ on a configurable ticker, derives human-readable source
names from filenames, runs the pipeline, and moves files to
processed/YYYY-MM-DD/ on success or failed/ on error with a log.md entry.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds prompt.go (BuildPrompt + systemPrompt) and pipeline.go (Run, Config,
Result, mergeAll) that wire chunking, LLM calls, parse, merge, index rebuild,
and log append into a single ingestion pipeline. Includes integration tests
covering write, dry-run, and duplicate-path merge scenarios.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
brain_write with a custom filename omitted the .md extension, causing
search to skip the file (search.go filters on HasSuffix .md).
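
One plausible shape for the fix, shown only as a sketch: normalize the filename before writing so it always passes the search filter. The helper name ensureMarkdownExt is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// ensureMarkdownExt appends ".md" unless the name already ends with it,
// so files written under a custom name still match a HasSuffix(".md") filter.
func ensureMarkdownExt(name string) string {
	if strings.HasSuffix(name, ".md") {
		return name
	}
	return name + ".md"
}

func main() {
	fmt.Println(ensureMarkdownExt("design-notes"))
	fmt.Println(ensureMarkdownExt("design-notes.md"))
}
```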
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Small models (phi4-mini) produce correct markdown analysis but then
append the old {status/phase/skill} JSON schema out of training habit.
stripResultJSON() detects and removes these trailing fences so Claude
Code receives clean prose regardless of model behaviour.
Non-schema JSON blocks (config examples, etc.) are preserved.
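
A guess at how such a filter could work, not the actual stripResultJSON: strip a trailing fenced json block only when it parses as an object carrying all of status, phase, and skill. The function name and the exact detection rule are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fence is the three-backtick markdown fence, built at runtime.
var fence = strings.Repeat("`", 3)

// stripTrailingSchemaJSON removes a final fenced json block if it decodes to
// an object with "status", "phase" and "skill" keys; otherwise it returns
// the text unchanged, so config examples and other JSON survive.
func stripTrailingSchemaJSON(text string) string {
	trimmed := strings.TrimRight(text, " \t\r\n")
	if !strings.HasSuffix(trimmed, fence) {
		return text
	}
	body := strings.TrimSuffix(trimmed, fence)
	opener := fence + "json"
	open := strings.LastIndex(body, opener)
	if open < 0 {
		return text
	}
	var obj map[string]json.RawMessage
	if err := json.Unmarshal([]byte(body[open+len(opener):]), &obj); err != nil {
		return text
	}
	for _, key := range []string{"status", "phase", "skill"} {
		if _, ok := obj[key]; !ok {
			return text
		}
	}
	return strings.TrimRight(body[:open], " \t\r\n")
}

func main() {
	reply := "## Analysis\nThe nil check is missing.\n\n" +
		fence + "json\n{\"status\":\"ok\",\"phase\":\"review\",\"skill\":\"debug\"}\n" + fence + "\n"
	fmt.Println(stripTrailingSchemaJSON(reply))
}
```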
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Local models (phi4-mini, qwen3-coder-30b) ignore soft instructions
and revert to the JSON output from their training. Move the JSON prohibition
to the very top of the prompt, in bold caps, before any other content.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove JSON output contracts from all skill files (debug, review, spec,
tdd, retrospective, trainer-reader, trainer-writer). Local models now
return markdown prose, which Claude Code reads and acts on.
Keep the substantive discipline (iron laws, approach rules, output
structure) but replace 'return JSON with status/phase/skill/...' with
clear markdown format instructions.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove JSON output contract, verification rules, escalation, and scope
limits that applied to the old Claude subprocess workers. Local models
are now consultants returning markdown prose, not JSON executors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ingestion server is a pure-Go HTTP binary: Alpine runtime, no Node.js.
CD now builds both supervisor and ingestion images on every push and
updates both deployment.yaml files in the infra repo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace ollama/ prefix with iguana/ and koala/ prefixes to match
actual model IDs exposed by LiteLLM on this cluster.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Drop the three-layer Claude subprocess orchestration (local model →
Claude verifier → cloud escalation). Skills now call LiteLLM directly
and return plain text to Claude Code, which decides what to do with it.
- Delete executor, orchestrator, verifier, result, attempts packages
- Simplify LiteLLMExecutor: Run(Request)→Result becomes Complete(model,sys,user)→(string,int64,error)
- Replace ExecutorFn with CompleteFunc in all 6 skill configs
- Rewrite all skill handlers to call Complete and return {"text","model","duration_ms"}
- Simplify config/models: remove Verifier/LlamaSwapURL, add ModelFor
- Bump version to v0.5.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>