| name | description |
|---|---|
| session-retrospective | Review a completed coding session (from brain/sessions/*.jsonl) and surface non-obvious learnings worth preserving. Use after a session ends, before context is lost. |
Session Retrospective
Overview
A session retrospective reads the structured log of a completed session — every skill invocation, what was attempted, what failed, what passed, how long it took — and identifies what is worth preserving in the brain for future sessions. The output is a small list of learnings, each with enough context that someone reading them cold understands why they matter.
Core principle: A retrospective that surfaces only obvious patterns is a retrospective that wastes the brain. Filter aggressively for non-obvious.
Iron Laws
- Do not call `brain_write` from this skill. Surface candidates only; the caller decides what to write.
- Run `brain_query` first to avoid surfacing learnings that are already in the brain.
- Every Recommendation must generalize beyond this session. Project-specific paths and identifiers belong in Context, not Recommendation.
When to Use
- After a coding session ends, before the operator switches context
- After a multi-day push on the same area, summarizing the whole stretch
- When the brain is feeling stale on a project area — re-running retrospective on older sessions can re-surface forgotten context
Do not use for:
- Sessions with no novel content (single-attempt passes, mechanical operations only)
- Sessions shorter than ~30 minutes, unless something dramatic happened in them
Inputs
The session log is in JSON Lines format. Each project that uses the brain stores logs in its own brain/sessions/ directory (for hyperguild, that is ~/dev/AI/hyperguild/brain/sessions/<session-id>.jsonl). Each line is one event:
{"ts": "...", "phase": "tdd_red", "skill": "tdd", "outcome": "fail", "notes": "..."}
Either pass the session ID explicitly, or default to the most recent file.
What is Worth Preserving
- Patterns that worked — approaches clean enough to repeat next time, especially if non-obvious from the codebase alone
- Failures that revealed something — bugs whose root cause taught you something about the codebase, the toolchain, or the approach
- Decisions made during the session — architectural, structural, tooling choices that future-you would want to know the reasoning behind
- Anything that contradicts or extends established patterns — these are the highest-value learnings
What is NOT Worth Preserving
- Routine red-green-refactor cycles with no surprises
- Single-attempt passes with no interesting context
- Mechanical operations (file moves, renames, formatting fixes)
- Failures whose cause was a typo
- Anything that is already in `CLAUDE.md` or an existing memory file
Decision procedure: If the answer would be discoverable by reading CLAUDE.md, the codebase, or standard toolchain docs, drop it. If you are still unsure, ask: "would future-me, three months from now, want this to surface when querying the brain?" If no, drop it.
Output Format
Respond in markdown. For each learning worth preserving:
**Learning:** One sentence describing what was learned.
**Context:** Why this session surfaced it — what made it non-obvious.
**Recommendation:** What should be done differently or repeated going forward.
End with a summary line: `N learnings worth writing to brain` or `No novel learnings in this session`.
The caller (operator or another agent) decides which learnings to actually write to the brain via brain_write. This skill does not write directly.
Worked Example
Input: the log of a 3-hour session implementing a JSON ingestion pipeline. It shows three failed attempts at handling escape characters in LLM-generated JSON, then a passing fix that introduced a `repairJSON` helper.
Output:
**Learning:** LLM-generated JSON containing markdown can produce invalid escape sequences (e.g. `\_` and `\#`) that break the standard JSON parser even when the structure is correct.
**Context:** The first three TDD cycles failed because the test corpus included markdown-flavored content from the LLM and the parser rejected the entire payload. The fourth cycle introduced a `repairJSON` pre-pass that normalizes escapes before parsing. Without this context, future sessions will rediscover the same failure mode.
**Recommendation:** When ingesting LLM output as JSON, route it through `repairJSON` before `json.Unmarshal`. Add this to the ingestion-pipeline reference doc.
**Learning:** Integration tests with `defer resp.Body.Close()` will fail CI's errcheck even when local `go test` is green.
**Context:** Task 8 of the brain MCP migration shipped this pattern; CI caught it 30 minutes later. The codebase enforces `defer func() { _ = resp.Body.Close() }()` to satisfy errcheck.
**Recommendation:** Implementer subagents must run `task check` (lint + test + vet) per task, not just `go test`. (Note: this learning has since been written to memory — a real run of this skill would now find it via `brain_query` and drop the candidate. Kept here as illustration.)
2 learnings worth writing to brain
Anti-Patterns
| Anti-Pattern | Why It Fails |
|---|---|
| "Lots of tests passed today" | Outcome summary, not a learning. Brain queries on this return nothing useful. |
| "Refactored the parser" | Activity log, not a learning. What was learned by doing it? |
| Listing every failure as a learning | Most failures are typos or mechanical fixes. Filter to the ones with insight. |
| Project-specific paths in the recommendation | Recommendations should generalize where possible — "ingestion pipelines" not "internal/ingest/v3/parser.go". |
| Writing to brain directly from this skill | This skill surfaces candidates only. The caller decides what to write. |
Brain MCP Integration
The brain holds prior session learnings; checking it prevents this retrospective from re-surfacing what is already preserved.
At retrospective start:
- Run `brain_query` with the session's main topic to surface recent related learnings. Avoid emitting a "learning" that is already in the brain.
After retrospective output:
- This skill does NOT call `brain_write`. The caller does, selectively.
Logging
Call `session_log` once at the end of every phase to record the outcome.
Pass-rate is computed downstream by the `/pass-rate` HTTP endpoint, which
treats `pass` as success, `fail` as failure, and `skip` as neither.
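The exact formula behind `/pass-rate` is not given here. One reading of "skip as neither" is that skipped phases are left out of both the numerator and the denominator, which the hypothetical `passRate` function below sketches.

```go
package main

import "fmt"

// passRate is an illustrative reading of the rule above, not the endpoint's
// actual code: pass counts as success, fail as failure, skip is ignored.
// ok is false when there is no pass or fail to compute a rate from.
func passRate(statuses []string) (rate float64, ok bool) {
	var pass, fail int
	for _, s := range statuses {
		switch s {
		case "pass":
			pass++
		case "fail":
			fail++
		}
	}
	if pass+fail == 0 {
		return 0, false
	}
	return float64(pass) / float64(pass+fail), true
}

func main() {
	rate, ok := passRate([]string{"pass", "fail", "skip", "pass"})
	fmt.Printf("rate=%.2f ok=%v\n", rate, ok)
}
```

Under this reading a skill that only ever logs `skip` contributes no rate at all, which is different from a rate of zero.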
At end of each phase:
- `session_log` with `{skill: "session-retrospective", phase: "<phase-name>", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}`
Phases for this skill: `surfaced`
Status semantics:
- `pass`: the phase's intended outcome was reached (candidates were emitted).
- `fail`: the phase's intended outcome was NOT reached (retrospective could not produce coherent candidates).
- `skip`: phase was skipped intentionally (no novel learnings worth surfacing).
Why this matters: the routing pod (Plan 6) reads pass-rate to decide whether to route a future call to a local model. If your skill never logs, the routing pod sees no data.
Mode 2 Routing Note
Reading long session logs and pulling themes is high-volume, mechanical pattern-recognition work, which makes this skill a strong Mode 2 routing candidate. Until Plan 6 ships the routing pod, treat it as Mode 1 only.
Cross-References
- Load the `trainer` skill when you have more than ~5 candidates, or when the retrospective output feels too generous and needs a stricter quality gate before any of it goes into the brain. Trainer is the stricter cousin of this skill.
- Pair with `brain_query` calls for prior session learnings on the same topic to avoid duplication.