| name | description |
|---|---|
| spec-driven-dev | Write a structured specification before writing any code. Use when starting a new project, feature, or significant change. Adapted for a PM-first workflow where why comes before how. |
Spec-Driven Development
Overview
Write a structured specification before writing any code. The spec is the shared source of truth — it defines what we're building, why it matters, and how we'll know it's done.
Code without a spec is guessing.
A spec doesn't need to be long. A two-paragraph spec beats no spec. The value is in forcing clarity before code is written, not in the length of the document.
Mathias PM Context
As a digital product manager building software:
- Why before how: The spec must capture the business context and user need before technical decisions. Agents reading the spec should understand why this matters, not just what to build.
- Explicit success criteria: Vague requirements produce vague results. Every spec must have testable success criteria.
- Surfaces assumptions: The spec's primary job is to surface misunderstandings before they become expensive code.
- Living document: Update the spec when decisions change. An outdated spec is still better than no spec.
When to Use
Always create a spec when:
- Starting a new project or feature
- Requirements are ambiguous or incomplete
- The change touches multiple files or modules
- You're about to make an architectural decision
- The task would take more than a day to implement
When NOT to use: Single-line fixes, typos, or changes where requirements are unambiguous and self-contained.
The Gated Workflow
Do not advance to the next phase without validation at each gate.
SPECIFY → [review] → PLAN → [review] → TASKS → [review] → IMPLEMENT
Each gate is a deliberate pause: does the next phase make sense given what we know?
Phase 1: SPECIFY
Surface Assumptions Immediately
Before writing spec content, list what you're assuming:
ASSUMPTIONS I'M MAKING:
1. This is a Go backend service (no frontend changes)
2. Authentication is handled by the existing middleware
3. The database is PostgreSQL (matching the rest of the stack)
4. The feature is used by authenticated users only, not public
→ Confirm or correct before I proceed.
Don't silently fill in ambiguous requirements.
Write the Spec
A spec covers six areas:
1. Objective — WHY are we building this?
This is the most important section. It must answer:
- What user problem does this solve?
- Who is the user?
- What does success look like from the user's perspective?
- Why now?
## Objective
Invoice importers at small accounting firms manually copy payment details
from PDF invoices into their banking system, taking 10–20 minutes per invoice.
**User:** Invoice processor at an accounting firm (10–50 invoices/day)
**Problem:** Manual data entry is slow, error-prone, and creates compliance risk
**Goal:** Reduce per-invoice processing time from ~15 minutes to < 2 minutes
Success: Invoice processor can extract and queue a payment from a PDF in under 2 minutes,
with confidence the data is correct.
2. Commands — Full executable commands
Build: task build
Test: task test (or: go test ./...)
Lint: task lint (or: golangci-lint run)
Dev: task dev
Deploy: task deploy:staging
3. Project Structure — Where things live
internal/
  domain/  → Core types and interfaces
  service/ → Business logic
  store/   → Database implementations
  handler/ → HTTP handlers
cmd/
  server/  → Main entry point
4. Code Style — One real example beats three paragraphs
// Error handling: always wrap with context
if err != nil {
	return fmt.Errorf("parse invoice PDF: %w", err)
}
// Dependency injection: accept interfaces
func NewInvoiceService(store InvoiceStore, parser PDFParser) *InvoiceService { ... }
// Context: always first parameter for I/O operations
func (s *InvoiceService) ProcessPDF(ctx context.Context, r io.Reader) (Invoice, error) { ... }
5. Testing Strategy
Framework: testing package + testify
Locations: *_test.go files in same package
Unit tests: table-driven, in-memory implementations for stores
Fast path: go test ./... (unit tests only)
Full suite: go test -tags=integration ./... (includes DB tests)
Coverage: >80% for business logic packages
6. Boundaries
Always:
- Run go test ./... before committing
- Wrap errors with fmt.Errorf("context: %w", err)
- Pass ctx as first parameter to any I/O function
- Run govulncheck before adding new dependencies
Ask first:
- Schema changes
- Adding new external dependencies
- Changing public API contracts
- Performance changes that affect existing behavior
Never:
- Commit secrets or API keys
- Remove or skip failing tests
- Send client data to external APIs
- Use naked returns on errors
Spec Template
# Spec: [Feature/Project Name]
## Objective
[What we're building and why. User story or problem statement.]
[Who is the user? What does success look like from their perspective?]
## Tech Stack
[Language, key libraries, relevant existing infrastructure]
## Commands
Build: [full command]
Test: [full command]
Lint: [full command]
Dev: [full command]
## Project Structure
[Directory layout with descriptions]
## Code Style
[One real code example showing the patterns to follow]
## Testing Strategy
[Framework, test locations, what to unit test vs integration test]
## Boundaries
- Always: [...]
- Ask first: [...]
- Never: [...]
## Success Criteria
[Specific, testable conditions that define "done"]
- [ ] [Condition 1: metric/threshold/method]
- [ ] [Condition 2: ...]
## Open Questions
[Anything unresolved that needs input before implementation begins]
Reframing Vague Requirements
When you receive a vague requirement, translate it into specific success criteria before writing any spec content:
Vague: "Make the invoice parser more reliable"
Reframed success criteria:
- Parser correctly extracts IBAN from 95% of Swedish invoice formats
- Parser correctly extracts total amount from 98% of tested invoices
- Parser returns a structured error (not a panic) for unrecognized formats
- Processing time < 2 seconds for PDFs up to 10MB
→ Are these the right targets?
Phase 2: PLAN
With a validated spec, create a technical implementation plan:
- Identify major components and their dependencies
- Determine implementation order (foundations first)
- Note risks and unknowns
- Identify what can be built in parallel vs. what must be sequential
- Define verification checkpoints between phases
The plan should be reviewable: anyone should be able to read it and say "yes, that's the right approach" or "no, change X."
Load the planning skill for detailed task breakdown.
Phase 3: TASKS
Break the plan into discrete, implementable tasks. Load the planning skill for the full task breakdown methodology.
Each task must have:
- Acceptance criteria
- Verification step (test command, build, manual check)
- File count estimate (no task should touch more than ~5 files)
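These three requirements are simple enough to check automatically before a task is accepted. A minimal sketch, assuming a hypothetical `Task` record; treating the "~5 files" guideline as a hard limit is also an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// maxFilesPerTask mirrors the "~5 files" guideline; the hard cutoff is an assumption.
const maxFilesPerTask = 5

// Task is a hypothetical record for one entry in the task breakdown.
type Task struct {
	Title              string
	AcceptanceCriteria []string
	Verification       string // a test command, a build, or a manual check
	EstimatedFiles     int
}

// Problems lists every rule the task violates; an empty result means it is ready.
func (t Task) Problems() []string {
	var out []string
	if len(t.AcceptanceCriteria) == 0 {
		out = append(out, "missing acceptance criteria")
	}
	if strings.TrimSpace(t.Verification) == "" {
		out = append(out, "missing verification step")
	}
	if t.EstimatedFiles > maxFilesPerTask {
		out = append(out, fmt.Sprintf("touches %d files, split it (limit %d)",
			t.EstimatedFiles, maxFilesPerTask))
	}
	return out
}

func main() {
	t := Task{Title: "Add IBAN extraction", EstimatedFiles: 8}
	for _, p := range t.Problems() {
		fmt.Println("-", p)
	}
}
```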
Phase 4: IMPLEMENT
Execute tasks one at a time. For each task:
- Load tdd skill: write failing tests first
- Implement minimal code to pass
- Load clean-code skill: refactor
Keeping the Spec Alive
- Update when decisions change — spec first, then code
- Update when scope changes
- Commit the spec — it belongs in version control
- Reference the spec in PRs — link to the section each PR implements
Common Rationalizations
| Rationalization | Reality |
|---|---|
| "This is simple, I don't need a spec" | Simple tasks still need acceptance criteria. A two-line spec is fine. |
| "I'll write the spec after" | That's documentation, not specification. The spec's value is forcing clarity before code. |
| "The spec will slow us down" | A 15-minute spec prevents hours of rework. |
| "Requirements will change anyway" | That's why the spec is a living document. |
| "The user knows what they want" | Even clear requests have implicit assumptions. The spec surfaces those. |
Verification
Before starting implementation:
- Spec covers all six core areas
- Mathias has reviewed and approved the spec
- Success criteria are specific and testable
- Boundaries (Always/Ask First/Never) are defined
- Open questions are resolved or accepted as unknowns
- Spec is saved to a file in the repository
Brain MCP Integration
Logging
Call session_log once at the end of every phase to record the outcome.
Pass-rate is computed downstream by the /pass-rate HTTP endpoint, which
treats pass as success, fail as failure, skip as neither.
At end of each phase:
session_log with {skill: "spec-driven-dev", phase: "<phase-name>", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}
Phases for this skill: specify, plan, tasks, implement
Status semantics:
- pass: the phase's intended outcome was reached (gate passed).
- fail: the phase's intended outcome was NOT reached (gate blocked, rework required).
- skip: phase was skipped intentionally.
Why this matters: the routing pod (Plan 6) reads pass-rate to decide whether to route a future call to a local model. If your skill never logs, the routing pod sees no data.
Cross-References
- Load problem-analysis skill for deep requirement understanding before speccing
- Load user-stories skill to decompose the spec into stories
- Load planning skill for task breakdown
- Load feature-spec skill once implementation begins, to scope individual features inside the project
- Load tdd skill during implementation