chore: bootstrap skills library — 19 skills + installer + CI auto-tag
Some checks failed
release / tag (push) Has been cancelled
Some checks failed
release / tag (push) Has been cancelled
Phase 1 of mathias/skills extraction (infra#62 Track D — homelab next-step plan addendum). Imports ~/dev/.skills/ verbatim (19 skill dirs + SKILLS_INDEX.md) and adds the installation surface: - Taskfile.yml — install / update / list / release / check targets - install.sh — bootstrap installer for hosts without Task. Idempotent symlink wirer; default checkout at ~/.local/share/skills/ on every host; SKILLS_REF env var pins a tag (default: main). - .gitea/workflows/release.yml — auto-tag every push to main by Bump-Type footer (major/minor/patch, default patch). Skipped when commit contains [skip-release]. - README — usage, versioning, contribution flow, secret-hygiene rule. Phase 1 wires Claude Code only (~/.claude/skills/<name> global + <repo>/.claude/skills/<name> per-repo). Phase 2 adds Crush, opencode, antigravity, and gitea-resident agents (cobalt-dingo, agentsquad) once their skill conventions are researched. Public repo, markdown-only — no secrets, no client names. Verified via pre-push grep before initial push. [skip-release]
This commit is contained in:
146
debug/SKILL.md
Normal file
146
debug/SKILL.md
Normal file
@@ -0,0 +1,146 @@
|
||||
---
|
||||
name: debug
|
||||
description: Systematic hypothesis-first debugging. Generate 3-5 ranked hypotheses with verification steps before suggesting any fix. Use when encountering a bug, test failure, or unexpected behavior.
|
||||
---
|
||||
|
||||
# Debug
|
||||
|
||||
## Overview
|
||||
|
||||
Debugging is hypothesis generation, not fix generation. The first instinct — "try this and see if it works" — wastes time and teaches nothing. A disciplined debug session produces a small set of ranked, falsifiable hypotheses, each with a concrete verification command. The fix comes after one hypothesis is confirmed, not before.
|
||||
|
||||
**Core principle:** A hypothesis you cannot verify in one command is not a hypothesis — it is a guess.
|
||||
|
||||
## When to Use
|
||||
|
||||
- Any test failure whose cause is not immediately obvious
|
||||
- Any production error or unexpected behavior
|
||||
- Any bug report from a user or stakeholder
|
||||
- Before reaching for a debugger or adding print statements: form hypotheses first
|
||||
|
||||
**Do not use for:**
|
||||
- Compile errors with clear messages (just fix the typo)
|
||||
- Already-diagnosed bugs where you know the cause (go straight to TDD with a regression test)
|
||||
|
||||
## Iron Laws
|
||||
|
||||
1. **Never suggest "try X and see what happens."** Every hypothesis must have a specific expected outcome if correct.
|
||||
2. **Generate 3-5 hypotheses, ordered by likelihood (most likely first).** Fewer than 3 means you stopped thinking; more than 5 means you are not prioritizing.
|
||||
3. **Diagnose only — do not fix in this skill.** The fix happens in a separate TDD cycle (load `tdd` skill) once a hypothesis is confirmed.
|
||||
|
||||
## Process
|
||||
|
||||
### Step 1: Read the failure
|
||||
|
||||
Before forming hypotheses:
|
||||
- Read the full error message and stack trace, not just the headline
|
||||
- Read the file where the failure originated, around the failing line
|
||||
- If the failure is from a test, read the test and the code under test
|
||||
- Identify the **failure mode** — what actually went wrong (e.g. "nil pointer dereference in goroutine spawned by handler") not just what the error says ("runtime error")
|
||||
|
||||
### Step 2: Generate hypotheses
|
||||
|
||||
For each hypothesis, capture three things:
|
||||
|
||||
- **Mechanism:** what specific code path or state would produce this exact failure
|
||||
- **Verification:** the single command or file inspection that confirms or denies it
|
||||
- **Expected outcome if correct:** the specific output you would see
|
||||
|
||||
Order by likelihood. The most likely cause is hypothesis 1.
|
||||
|
||||
### Step 3: Output
|
||||
|
||||
Use this format:
|
||||
|
||||
```
|
||||
HYPOTHESIS 1 (likelihood: high): <mechanism in one sentence>
|
||||
VERIFY: <exact command or file:line to inspect>
|
||||
EXPECTED IF CORRECT: <specific output, value, or condition>
|
||||
|
||||
HYPOTHESIS 2 (likelihood: medium): <mechanism>
|
||||
VERIFY: <exact command>
|
||||
EXPECTED IF CORRECT: <specific output>
|
||||
|
||||
[... up to 5 ...]
|
||||
|
||||
RECOMMENDED NEXT STEP: Run VERIFY for hypothesis 1 first.
|
||||
```
|
||||
|
||||
End with the recommendation, not a fix.
|
||||
|
||||
## Worked Example
|
||||
|
||||
**Failure:**
|
||||
```
|
||||
--- FAIL: TestInvoiceParser_HandlesEmptyPDF (0.00s)
|
||||
parser_test.go:47: panic: runtime error: index out of range [0] with length 0
|
||||
```
|
||||
|
||||
**Output:**
|
||||
|
||||
```
|
||||
HYPOTHESIS 1 (likelihood: high): parser indexes into pages[0] before checking len(pages) > 0; empty PDFs produce a zero-page document
|
||||
VERIFY: rg -n 'pages\[0\]' internal/parser/
|
||||
EXPECTED IF CORRECT: at least one site reads pages[0] without a preceding length check
|
||||
|
||||
HYPOTHESIS 2 (likelihood: medium): the test fixture is a zero-byte file rather than a valid empty PDF; pdf library returns nil pages slice instead of empty
|
||||
VERIFY: ls -la testdata/empty.pdf && file testdata/empty.pdf
|
||||
EXPECTED IF CORRECT: file size 0 bytes or "data" rather than "PDF document"
|
||||
|
||||
HYPOTHESIS 3 (likelihood: low): a recently changed dependency reordered the page-extraction API; pages[0] now refers to metadata, not content
|
||||
VERIFY: git log --oneline -10 -- go.sum | grep -i pdf
|
||||
EXPECTED IF CORRECT: a pdf-library bump in the last few commits
|
||||
|
||||
RECOMMENDED NEXT STEP: Run VERIFY for hypothesis 1 first.
|
||||
```
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
| Anti-Pattern | Why It Fails |
|
||||
|---|---|
|
||||
| "Maybe try restarting the service" | Not a mechanism. Not falsifiable. Teaches nothing if it works. |
|
||||
| "Could be a race condition" | Mechanism without specifics. Which two operations race? On what state? |
|
||||
| "Let me add some print statements and see" | Skips hypothesis generation. Generates noise, not understanding. |
|
||||
| Single hypothesis presented as fact | If you are sure, write the regression test, do not run a debug skill. |
|
||||
| Mixing hypotheses with fix suggestions | The skill is diagnose-only. The fix is a TDD task on its hypothesis. |
|
||||
|
||||
## Brain MCP Integration
|
||||
|
||||
The brain holds prior debug sessions across the project. Use it to skip rediscovering known failure modes.
|
||||
|
||||
**At debug start:**
|
||||
- Run `brain_query` with the error message snippet + the package name. Past sessions may have logged identical or similar failures with their resolved hypotheses.
|
||||
|
||||
**After a hypothesis is confirmed:**
|
||||
- Run `brain_write` with the failure signature → confirmed mechanism. Future debug sessions on the same area get the answer immediately.
|
||||
|
||||
**Never:**
|
||||
- Run `brain_write` for an unconfirmed hypothesis. Speculation in the brain pollutes future queries.
|
||||
|
||||
### Logging
|
||||
|
||||
Call `session_log` once at the end of every phase to record the outcome.
|
||||
Pass-rate is computed downstream by the `/pass-rate` HTTP endpoint, which
|
||||
treats `pass` as success, `fail` as failure, `skip` as neither.
|
||||
|
||||
**At end of each phase:**
|
||||
- `session_log` with `{skill: "debug", phase: "<phase-name>", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}`
|
||||
|
||||
**Phases for this skill:** read-failure, generate-hypotheses, output
|
||||
|
||||
**Status semantics:**
|
||||
- `pass` — the phase's intended outcome was reached.
|
||||
- `fail` — the phase's intended outcome was NOT reached.
|
||||
- `skip` — phase was skipped intentionally.
|
||||
|
||||
**Why this matters:** the routing pod (Plan 6) reads pass-rate to decide whether to route a future call to a local model. If your skill never logs, the routing pod sees no data.
|
||||
|
||||
## Mode 2 Routing Note
|
||||
|
||||
This skill produces high-volume mechanical output (hypothesis enumeration) and is a candidate for Mode 2 routing to a local model in the future. Until Plan 6 ships the routing pod, treat as Mode 1 only. The hypothesis format and discipline are identical regardless of which model generates them.
|
||||
|
||||
## Cross-References
|
||||
|
||||
- After a hypothesis is confirmed, load `tdd` skill — write a failing regression test that proves the bug exists, then fix it.
|
||||
- For test failures specifically caused by mock-vs-real divergence, also load `tdd/references/testing-anti-patterns.md`.
|
||||
- Load `code-review` skill if the diagnosis surfaces a structural issue (god object, shotgun surgery) rather than a pointwise bug.
|
||||
Reference in New Issue
Block a user