
Mathias Bergqvist 986e3e1d12 docs(hyperguild): document brain pass-rate subcommand and /pass-rate endpoint

Adds pass-rate to the CLI README's subcommand block. Updates CLAUDE.md
to note the new /pass-rate endpoint alongside the existing brain
HTTP REST API surface. Updates the session_log MCP tool's
final_status description to reflect the new pass|fail|skip vocabulary
introduced by Plan 5's SKILL.md instrumentation; the aggregator
still accepts legacy ok|error|skipped values for backwards compat.

2026-05-03 22:55:35 +02:00

11 KiB


Agent context — Mathias workspace

Who I am

I'm Mathias, a digital product manager and technology consultant based in Sweden. I build software, research emerging tech, and deliver consulting engagements for clients under NDA. I work across AI/ML, financial automation, web applications, and climate/sustainability tech.

How I work with agents

I think like a product manager — I care about why before how
I want agents to be opinionated and push back, not just execute blindly
I prefer concise responses; skip ceremony and get to the point
When I say "build this", I mean production-quality with tests, not a demo
Ask me before making irreversible changes or adding heavy dependencies
I work with confidential client data — never send it to cloud APIs unless I explicitly say it's OK

Behavior rules

These rules apply to every task across every project, regardless of harness.

No silent assumptions. Don't hide confusion; surface it. Surface tradeoffs explicitly. Think before coding; if the problem is unclear, ask, or state your assumptions before acting.
Minimum viable code. Solve with the smallest change that works. Nothing speculative, no "while we're here" cleanups, no premature abstractions. Simplicity first.
Surgical changes. Touch only what the task requires. Leave unrelated code, files, and formatting alone. Diffs should be small and reviewable.
Goal-driven execution. Define clear success criteria up front for every task. Loop — implement, verify, refine — until those criteria are met. Don't claim completion without evidence (tests pass, command output, observed behavior).

Default stack

| Layer | Default | Fallback | Last resort |
| --- | --- | --- | --- |
| Language | Go | Python | TypeScript, Java, C |
| UI | HTMX + Templ | Server-rendered HTML | React (only if SPA is justified) |
| Build | Task (taskfile.dev) | Make | - |
| Containers | Docker Compose (dev), k3s (prod) | - | - |
| DB | PostgreSQL + sqlc | SQLite | - |
| Search | Qdrant (vector), BM25 | - | - |
| Logging | slog (structured) | - | - |
| Testing | Table-driven, testify | - | - |

Exploratory: Rust, Zig — I'll tell you when I want these.

Code conventions

Go style: golines, gofumpt, golangci-lint
Errors: fmt.Errorf("operation: %w", err) — never naked, never log-and-return
Naming: stdlib conventions, no stuttering
Architecture: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
Git: conventional commits (feat:, fix:, chore:), one concern per PR, PR describes why not what
Security: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
Dependencies: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message

Infrastructure

Three machines on Tailscale:

| Machine | Role | Key specs |
| --- | --- | --- |
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |

Model routing: LiteLLM in front of llama-swap (local) + cloud APIs (when permitted)
Orchestration: k3s cluster across all three machines
Networking: Tailscale mesh

Project landscape

All development repos live at ~/dev/ (symlink from ~/Documents/local-dev/).

Organized in thematic folders:

| Folder | Focus | Count |
| --- | --- | --- |
| `GO/` | Go web frameworks, API integrations, learning projects | ~10 |
| `AI/` | ML research, AI frameworks (FinRL, DSPy, crawl4ai) | ~6 |
| `AGENTS/` | Autonomous agents, coding agents, MCP servers, infra | ~15 |
| `QKX/` | Invoice processing, financial automation, payment systems | ~13 |
| `XT/` | Climate data, sustainability (Klimatkollen, Garbo) | ~2 |

See ~/dev/PROJECT_SUMMARY.md for detailed descriptions of each project.

Key active projects

super-koala (AGENTS/) — multi-component agent stack with LangGraph, DSPy, MCP
azure-tiger (QKX/) — invoice extraction → ISO 20022 payment instructions
gocrwl (AGENTS/) — Go web crawler with containerized deployment
koala-ai-stack (AGENTS/) — local AI server infrastructure management
klimatkollen (XT/) — Swedish municipal climate data platform

Knowledge base

When available, agents can query the shared knowledge base:

MCP: mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge
HTTP: http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search

Scoping: defaults to public collection; client projects filter to {client} + public
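
The scoping rule above can be sketched as a small helper. This is only an illustration: the function name, the use of an empty string for "no client", and the client name `acme` are hypothetical and do not come from the knowledge base API.

```go
package main

import "fmt"

// searchCollections returns the collections a query may read, following the
// scoping rule: public by default, {client} + public for client projects.
// Treating an empty client string as "not a client project" is an assumption.
func searchCollections(client string) []string {
	if client == "" {
		return []string{"public"}
	}
	return []string{client, "public"}
}

func main() {
	fmt.Println(searchCollections(""))     // [public]
	fmt.Println(searchCollections("acme")) // [acme public]
}
```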

Client work rules

When working on a project tagged with a client name:

Never send code, data, or context to cloud APIs — use local models only
Never reference other client projects or their data
Keep all artifacts within the client's git org / directory
Treat everything as confidential unless told otherwise

Harness-agnostic principles

This context is designed to work with any AI coding tool:

Claude Code, Cursor, Aider, Open WebUI, Charmbracelet Mods/Crush
Pi Coding Agent, Mistral Vibe, Antigravity
Any tool that accepts a system prompt or reads a markdown context file

The canonical source is always .context/AGENT.md (root) and .context/PROJECT.md (per-project). Derived files are committed (see How context propagates below) so a git pull on any host yields full agent context with no setup.

How context propagates

Canonical sources of truth:

Universal: ~/dev/.context/AGENT.md (this file)
Project: <repo>/.context/PROJECT.md (per-repo)

Derived files (committed, regenerated by task context:sync):

CLAUDE.md, AGENTS.md, .cursorrules, .aider.conventions.md, .context/system-prompt.txt

Workflow:

Edit a canonical file. Run task context:sync. Commit canonical and derived together. Push.
On any other host, git pull brings both. Claude Code (tree-walking) uses CLAUDE.md; Crush / Pi / Antigravity (cwd-only) use AGENTS.md; Cursor uses .cursorrules; Aider uses .aider.conventions.md.
task check runs context:sync then asserts git status --porcelain is empty over the derived files (catches both modified-tracked drift and missing-untracked adapters). A drift fails the check with a message telling you to stage the regenerated files.
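
For illustration, the drift assertion could look roughly like the sketch below. It is not the actual task definition: it assumes the output of `git status --porcelain` (v1 format, two status characters, a space, then the path) is already captured in a string, and it ignores renames and quoted paths.

```go
package main

import (
	"fmt"
	"strings"
)

// derived lists the generated adapter files named in this document.
var derived = map[string]bool{
	"CLAUDE.md":                  true,
	"AGENTS.md":                  true,
	".cursorrules":               true,
	".aider.conventions.md":      true,
	".context/system-prompt.txt": true,
}

// driftedFiles returns the derived files that appear in porcelain output,
// whether modified (" M") or untracked ("??").
func driftedFiles(porcelain string) []string {
	var out []string
	for _, line := range strings.Split(porcelain, "\n") {
		if len(line) < 4 {
			continue // blank or malformed line
		}
		if path := line[3:]; derived[path] {
			out = append(out, path)
		}
	}
	return out
}

func main() {
	// Hypothetical output: one modified adapter, one missing from the index,
	// and one unrelated source file that the check should ignore.
	sample := " M CLAUDE.md\n?? AGENTS.md\n M main.go\n"
	if drift := driftedFiles(sample); len(drift) > 0 {
		fmt.Printf("derived files out of date, stage them: %v\n", drift)
	}
}
```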

Behavior rules in this file and per-project rules in PROJECT.md apply unconditionally on every host, every harness.

Engineering Skills

Shared engineering skills are available in ~/dev/.skills/. Load on demand via the index.

See ~/dev/.skills/SKILLS_INDEX.md for the full list with descriptions and "use when" triggers.

Key skills:

TDD: always write tests first — load tdd skill
Code Review: load code-review skill before any review
SOLID/Clean Code: load solid or clean-code skill for design work
Problem first: load problem-analysis skill before coding non-trivial features

Project context

Identity

Name: supervisor
Owner: Mathias
Client: personal
Repo:
Status: active

Stack

Primary language: Go
UI layer: HTMX + Templ (when applicable)
Fallback languages: Python, TypeScript (justify in PR if used)
Build: Task (taskfile.dev), not Make
Containers: Docker (compose for dev, k3s for deploy)
Target infra: koala (GPU workloads), iguana (services), flamingo (edge)

Conventions

Code style

Go: follow golines, gofumpt, golangci-lint with project config
Tests: table-driven, in _test.go next to source, testify for assertions
Errors: wrap with fmt.Errorf("operation: %w", err), no naked returns
Naming: stdlib conventions, no stuttering (http.Client not http.HTTPClient)

Architecture preferences

Prefer standard library over frameworks (net/http over gin/echo)
Dependency injection via constructor functions, not containers
Configuration via environment variables, parsed at startup into a typed struct
Structured logging via slog
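
The preferences above might combine as in the following sketch. The environment variable names (`APP_ADDR`, `APP_TIMEOUT_SECONDS`), the defaults, and the `Service` type are all hypothetical; only the shape (typed config parsed at startup, constructor injection, slog) reflects the conventions listed here.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
	"strconv"
)

// Config is a hypothetical typed configuration struct.
type Config struct {
	Addr           string
	TimeoutSeconds int
}

// LoadConfig parses environment variables once at startup. getenv is injected
// so tests can pass a map lookup instead of os.Getenv.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{Addr: ":8080", TimeoutSeconds: 30} // assumed defaults
	if v := getenv("APP_ADDR"); v != "" {
		cfg.Addr = v
	}
	if v := getenv("APP_TIMEOUT_SECONDS"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil {
			return Config{}, fmt.Errorf("parse APP_TIMEOUT_SECONDS: %w", err)
		}
		cfg.TimeoutSeconds = n
	}
	return cfg, nil
}

// Service receives its dependencies through the constructor, not a container.
type Service struct {
	cfg Config
	log *slog.Logger
}

func NewService(cfg Config, log *slog.Logger) *Service {
	return &Service{cfg: cfg, log: log}
}

// Describe logs the configuration as structured fields and returns a summary.
func (s *Service) Describe() string {
	s.log.Info("service configured", "addr", s.cfg.Addr, "timeout_s", s.cfg.TimeoutSeconds)
	return fmt.Sprintf("%s/%ds", s.cfg.Addr, s.cfg.TimeoutSeconds)
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	logger := slog.New(slog.NewJSONHandler(os.Stderr, nil))
	fmt.Println(NewService(cfg, logger).Describe())
}
```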

Git

Conventional commits: feat:, fix:, chore:, docs:, refactor:
Branch naming: feat/short-description, fix/short-description
PRs: one concern per PR, description explains why not what

Security

No secrets in code, ever — use env vars or SOPS-encrypted files
Client data never leaves local network unless explicitly cleared
Dependencies: audit with govulncheck before adding

MCP endpoints

Two MCP servers expose this project's tooling, both reachable over Tailscale:

brain at http://koala:30330/mcp — preferred path for brain_query, brain_write, brain_ingest, brain_ingest_raw, and session_log. Hosted by the ingestion service directly.
supervisor at http://koala:30320/mcp — skill workers (tdd_red, tdd_green, tdd_refactor, review, debug, spec, retrospective, trainer, tier). Will shrink as skill workers move to SKILL.md in a later migration.

The brain HTTP REST API (/query, /write, /ingest, /ingest-raw, /ingest-path, /backfill-refs) remains available on the same port (3300) for shell scripts and non-MCP clients.

The brain HTTP REST API also serves a read-only GET /pass-rate?skill=X&window=Y endpoint that aggregates final_status counts from session logs and returns {skill, window, pass, fail, skip, total, pass_rate}. Plan 6 (routing pod) reads this to decide whether to route skill calls to local models. pass_rate is null when the window contains no logged invocations.
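
A client consuming /pass-rate might decode the response as sketched below. The field names come from the description above, but the Go types, the `7d` window format, the base URL `http://brain.example`, and the sample body are assumptions made for illustration. `pass_rate` is a pointer so that JSON null stays distinguishable from a real 0.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// PassRate mirrors the documented response fields; the types are assumed.
type PassRate struct {
	Skill    string   `json:"skill"`
	Window   string   `json:"window"`
	Pass     int      `json:"pass"`
	Fail     int      `json:"fail"`
	Skip     int      `json:"skip"`
	Total    int      `json:"total"`
	PassRate *float64 `json:"pass_rate"` // nil when the window has no invocations
}

// passRateURL builds the request URL with properly escaped query parameters.
func passRateURL(base, skill, window string) string {
	q := url.Values{}
	q.Set("skill", skill)
	q.Set("window", window)
	return base + "/pass-rate?" + q.Encode()
}

// decodePassRate parses a response body into a PassRate.
func decodePassRate(body []byte) (PassRate, error) {
	var pr PassRate
	if err := json.Unmarshal(body, &pr); err != nil {
		return PassRate{}, fmt.Errorf("decode pass-rate: %w", err)
	}
	return pr, nil
}

func main() {
	// Placeholder base URL; substitute the real brain host and port.
	fmt.Println(passRateURL("http://brain.example", "tdd_red", "7d"))

	// Hypothetical body for a window with no logged invocations.
	body := []byte(`{"skill":"tdd_red","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`)
	pr, err := decodePassRate(body)
	if err != nil {
		fmt.Println(err)
		return
	}
	if pr.PassRate == nil {
		fmt.Println("no data in window")
		return
	}
	fmt.Printf("pass rate: %.2f over %d runs\n", *pr.PassRate, pr.Total)
}
```

A caller such as a router would need to decide explicitly what a nil pass rate means, rather than treating it as 0.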

Agent instructions

When acting as a coding agent on this project:

Read this file and all SKILL.md files in .skills/ before starting work
Run task check before committing (lint + test + vet)
If unsure about a convention, check DECISIONS.md or ask
Never modify files outside the project root without explicit permission
When adding a dependency, explain why in the commit message
For client projects: never send code or context to cloud APIs — use local models via LiteLLM
