Two corrections applied during Plan 5 execution:
- Task 2 test helper writeSession now joins a sessions/ subdir so it
matches the handler's <brainDir>/sessions/*.jsonl scan path.
(The original heredoc would have produced 0 records in tests.)
- Task 6 grew a Step 1.5 to update the session_log MCP tool's
final_status description, picking up a spec requirement that
didn't translate into a task in the original plan.
1103 lines
41 KiB
Markdown
1103 lines
41 KiB
Markdown
# Pass-rate Logging Implementation Plan
|
|
|
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
|
|
|
**Goal:** Instrument skill invocations with `session_log` calls, expose pass-rate aggregates via a new `/pass-rate` HTTP endpoint on the brain pod, and add an optional `hyperguild brain pass-rate` CLI subcommand. Provides Plan 6 (Mode 2 routing pod) with the per-skill signal it needs to make routing decisions.
|
|
|
|
**Architecture:** SKILL.md is the source of truth for the logging contract — each instrumented skill's "Brain MCP Integration" section gets a literal copy-paste-friendly `session_log` call template. A new `GET /pass-rate?skill=X&window=Y` handler in the ingestion pod walks `brain/sessions/*.jsonl` on demand, normalizes legacy vocabulary (`pass`≡`ok`, `fail`≡`error`, `skip`≡`skipped`), and returns aggregate counts + pass rate. The hyperguild CLI gains a `brain pass-rate` subcommand that calls the endpoint.
|
|
|
|
**Tech Stack:** Go 1.26.1 stdlib (`net/http`, `encoding/json`, `time`); existing testify; markdown for SKILL.md updates. Two repos touched: `local-dev` (skill content) and `hyperguild` (endpoint + CLI).
|
|
|
|
---
|
|
|
|
## Plan 5 of 7 — Hyperguild Skill Migration
|
|
|
|
Plans 1-4 done. Plan 5 instruments skills and exposes pass-rate, unblocking Plan 6 (routing pod) which then unblocks Plan 7 (supervisor retirement).
|
|
|
|
**Spec:** `docs/superpowers/specs/2026-05-03-pass-rate-logging-design.md` (committed `af52f50`).
|
|
|
|
**Two repos, two worktrees:**
|
|
|
|
- **Phase 1 worktree (`local-dev`):** `~/Documents/local-dev-worktrees/pass-rate-instrumentation/` on branch `feat/pass-rate-instrumentation`.
|
|
- **Phase 2 worktree (`hyperguild`):** `~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/` on branch `feat/pass-rate-endpoint`.
|
|
|
|
Each task's "Files" header names the worktree. Implementer subagents must `cd` into the named worktree before any read/edit/git operation. Internal absolute paths in code (e.g. `${HOME}/dev/.skills`) remain unchanged — they're correct for runtime.
|
|
|
|
**Lint convention:** project's golangci-lint has `errcheck` enabled. Append `//nolint:errcheck` to any `fmt.Fprint*` call that ignores its return value when writing to stdout/stderr. The plan's heredocs include these markers. (Same as Plans 3 and 4.)
|
|
|
|
**Verb decision (GET vs POST for `/pass-rate`):** the existing brain HTTP REST API uses POST for all endpoints (`/query`, `/write`, `/ingest`, etc.) — historically because those endpoints all take JSON bodies. `/pass-rate` is a pure read with query-string parameters. We use **GET** here to follow REST semantics for pure reads; the inconsistency with `/query` is acknowledged. Future read endpoints SHOULD follow GET.
|
|
|
|
## File Structure
|
|
|
|
### Phase 1 (`local-dev` worktree)
|
|
|
|
| Path | Action | Responsibility |
|
|
|---|---|---|
|
|
| `.skills/tdd/SKILL.md` | modify | Pilot — add the `session_log` contract under existing "Brain MCP Integration" |
|
|
| `.skills/code-review/SKILL.md` | modify | Rollout — same contract, code-review's phase set |
|
|
| `.skills/debug/SKILL.md` | modify | Rollout |
|
|
| `.skills/feature-spec/SKILL.md` | modify | Rollout |
|
|
| `.skills/session-retrospective/SKILL.md` | modify | Rollout |
|
|
| `.skills/trainer/SKILL.md` | modify | Rollout |
|
|
| `.skills/spec-driven-dev/SKILL.md` | modify | Rollout |
|
|
|
|
### Phase 2 (`hyperguild` worktree)
|
|
|
|
| Path | Action | Responsibility |
|
|
|---|---|---|
|
|
| `ingestion/internal/api/passrate.go` | create | Aggregator + handler for `/pass-rate` |
|
|
| `ingestion/internal/api/passrate_test.go` | create | Unit tests for the aggregator + handler |
|
|
| `ingestion/cmd/server/main.go` | modify | Register `GET /pass-rate` route |
|
|
| `ingestion/internal/api/handler_test.go` | modify (or create alongside) | Integration test that the route serves expected JSON |
|
|
| `cmd/hyperguild/brain.go` | modify | Add `runBrainPassRate` and dispatch case |
|
|
| `cmd/hyperguild/brain_test.go` | modify | Tests for `runBrainPassRate` |
|
|
| `cmd/hyperguild/http.go` | modify | Add `brainClient.PassRate` method |
|
|
| `cmd/hyperguild/http_test.go` | modify | Tests for `PassRate` method |
|
|
| `cmd/hyperguild/README.md` | modify | Document the new subcommand and env vars |
|
|
| `CLAUDE.md` (or `.context/PROJECT.md`) | modify | Document `/pass-rate` endpoint alongside other brain HTTP REST endpoints |
|
|
|
|
---
|
|
|
|
## Task 1: Pilot — instrument `tdd/SKILL.md`
|
|
|
|
**Worktree:** `~/Documents/local-dev-worktrees/pass-rate-instrumentation/` (`local-dev` repo)
|
|
|
|
Add a "Logging" subsection under the existing "Brain MCP Integration" in `tdd/SKILL.md`. The subsection states the literal `session_log` call shape with explicit field names for the three TDD phases (red, green, refactor).
|
|
|
|
**Files:**
|
|
- Modify: `.skills/tdd/SKILL.md`
|
|
|
|
- [ ] **Step 1: Read the existing SKILL.md to find the insertion point**
|
|
|
|
```bash
|
|
cd ~/Documents/local-dev-worktrees/pass-rate-instrumentation
|
|
grep -n "## Brain MCP Integration\|## Mode 2 Routing Note\|## Cross-References" .skills/tdd/SKILL.md
|
|
# Locate the "Brain MCP Integration" section and the next section after it.
|
|
# The new "Logging" subsection inserts at the END of "Brain MCP Integration",
|
|
# immediately before whatever section comes next (typically "Mode 2 Routing Note").
|
|
```
|
|
|
|
- [ ] **Step 2: Insert the Logging subsection**
|
|
|
|
Inside `.skills/tdd/SKILL.md`, find the "## Brain MCP Integration" section and append (before the next section starts) this exact subsection:
|
|
|
|
```markdown
|
|
### Logging
|
|
|
|
Call `session_log` once at the end of every phase to record the outcome.
|
|
Pass-rate is computed downstream by the `/pass-rate` HTTP endpoint, which
|
|
treats `pass` as success, `fail` as failure, `skip` as neither.
|
|
|
|
**At end of `red` phase:**
|
|
- `session_log` with `{skill: "tdd", phase: "red", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}`
|
|
|
|
**At end of `green` phase:**
|
|
- `session_log` with `{skill: "tdd", phase: "green", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}`
|
|
|
|
**At end of `refactor` phase:**
|
|
- `session_log` with `{skill: "tdd", phase: "refactor", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}`
|
|
|
|
**Status semantics:**
|
|
- `pass` — the phase's intended outcome was reached (red: test fails as expected; green: test passes; refactor: tests still pass after refactor).
|
|
- `fail` — the phase's intended outcome was NOT reached (test compiled when it shouldn't, test still fails after green attempt, refactor broke tests).
|
|
- `skip` — phase was skipped intentionally (e.g. refactor not warranted).
|
|
|
|
**Why this matters:** the routing pod (Plan 6) reads pass-rate to decide whether to route a future `tdd` call to a local model. If your skill never logs, the routing pod sees no data and may default-route or default-not-route in a way that doesn't reflect real performance.
|
|
```
|
|
|
|
- [ ] **Step 3: Sanity-check the edit**
|
|
|
|
```bash
|
|
grep -A2 "### Logging" .skills/tdd/SKILL.md | head -5
|
|
# Should show the subsection header
|
|
|
|
grep -c "session_log" .skills/tdd/SKILL.md
|
|
# Should show ≥ 4 (3 phase calls + 1 reference in the subsection intro,
|
|
# plus any existing references)
|
|
|
|
# Confirm the subsection sits inside Brain MCP Integration:
|
|
awk '/^## Brain MCP Integration$/,/^## /' .skills/tdd/SKILL.md | tail -20
|
|
```
|
|
|
|
- [ ] **Step 4: Commit**
|
|
|
|
```bash
|
|
cd ~/Documents/local-dev-worktrees/pass-rate-instrumentation
|
|
git add .skills/tdd/SKILL.md
|
|
git commit -m "feat(skills): instrument tdd with session_log contract — Plan 5 pilot
|
|
|
|
Adds a Logging subsection under Brain MCP Integration that names the
|
|
session_log fields and per-phase semantics (pass/fail/skip). Pilot
|
|
for Plan 5 of the hyperguild migration. Six other binary-outcome
|
|
skills will follow once this pilot validates end-to-end via the
|
|
endpoint work in Phase 2."
|
|
```
|
|
|
|
---
|
|
|
|
## Task 2: Endpoint — `/pass-rate` aggregator + handler + unit tests
|
|
|
|
**Worktree:** `~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/` (`hyperguild` repo)
|
|
|
|
Add a new aggregator and handler in `ingestion/internal/api/passrate.go`. Walks the pod's `brain/sessions/*.jsonl` files, filters by skill + window, normalizes status vocabulary, returns aggregates.
|
|
|
|
**Note on dogfooding:** This task is itself a TDD cycle. The implementing subagent should follow the now-instrumented `tdd/SKILL.md` (Task 1) and call `session_log` at the end of red, green, refactor. This validates Phase 1's pilot end-to-end. If the implementer doesn't have access to the brain MCP, they may skip the logging calls and report DONE_WITH_CONCERNS — the controller will validate manually.
|
|
|
|
**Files:**
|
|
- Create: `ingestion/internal/api/passrate.go`
|
|
- Create: `ingestion/internal/api/passrate_test.go`
|
|
|
|
- [ ] **Step 1: Write the failing test**
|
|
|
|
Create `ingestion/internal/api/passrate_test.go`:
|
|
|
|
```go
|
|
package api
|
|
|
|
import (
|
|
"encoding/json"
|
|
"net/http"
|
|
"net/http/httptest"
|
|
"os"
|
|
"path/filepath"
|
|
"testing"
|
|
"time"
|
|
|
|
"github.com/stretchr/testify/assert"
|
|
"github.com/stretchr/testify/require"
|
|
)
|
|
|
|
// writeSession writes one or more JSONL entries to <dir>/sessions/<sessionID>.jsonl.
|
|
// The handler scans <brainDir>/sessions/*.jsonl, so the helper creates the
|
|
// sessions/ subdir to match the production layout.
|
|
func writeSession(t *testing.T, dir, sessionID string, entries ...string) {
|
|
t.Helper()
|
|
sessionsDir := filepath.Join(dir, "sessions")
|
|
require.NoError(t, os.MkdirAll(sessionsDir, 0o755))
|
|
path := filepath.Join(sessionsDir, sessionID+".jsonl")
|
|
body := ""
|
|
for _, e := range entries {
|
|
body += e + "\n"
|
|
}
|
|
require.NoError(t, os.WriteFile(path, []byte(body), 0o644))
|
|
}
|
|
|
|
func TestPassRate_HappyPath(t *testing.T) {
|
|
dir := t.TempDir()
|
|
now := time.Now().UTC()
|
|
recent := now.Add(-1 * time.Hour).Format(time.RFC3339)
|
|
|
|
writeSession(t, dir, "s1",
|
|
`{"timestamp":"`+recent+`","skill":"tdd","phase":"red","final_status":"pass"}`,
|
|
`{"timestamp":"`+recent+`","skill":"tdd","phase":"green","final_status":"pass"}`,
|
|
`{"timestamp":"`+recent+`","skill":"tdd","phase":"refactor","final_status":"fail"}`,
|
|
)
|
|
writeSession(t, dir, "s2",
|
|
`{"timestamp":"`+recent+`","skill":"code-review","phase":"review","final_status":"pass"}`,
|
|
)
|
|
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
|
|
resp := w.Result()
|
|
require.Equal(t, http.StatusOK, resp.StatusCode)
|
|
|
|
var got passRateResponse
|
|
require.NoError(t, json.NewDecoder(resp.Body).Decode(&got))
|
|
assert.Equal(t, "tdd", got.Skill)
|
|
assert.Equal(t, "24h", got.Window)
|
|
assert.Equal(t, 2, got.Pass)
|
|
assert.Equal(t, 1, got.Fail)
|
|
assert.Equal(t, 0, got.Skip)
|
|
assert.Equal(t, 3, got.Total)
|
|
require.NotNil(t, got.PassRate)
|
|
assert.InDelta(t, 0.6667, *got.PassRate, 0.001)
|
|
}
|
|
|
|
func TestPassRate_LegacyVocabulary(t *testing.T) {
|
|
dir := t.TempDir()
|
|
now := time.Now().UTC().Format(time.RFC3339)
|
|
writeSession(t, dir, "s1",
|
|
`{"timestamp":"`+now+`","skill":"tdd","final_status":"ok"}`,
|
|
`{"timestamp":"`+now+`","skill":"tdd","final_status":"error"}`,
|
|
`{"timestamp":"`+now+`","skill":"tdd","final_status":"skipped"}`,
|
|
)
|
|
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
|
|
var got passRateResponse
|
|
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
|
|
assert.Equal(t, 1, got.Pass, "ok→pass")
|
|
assert.Equal(t, 1, got.Fail, "error→fail")
|
|
assert.Equal(t, 1, got.Skip, "skipped→skip")
|
|
}
|
|
|
|
func TestPassRate_OutsideWindow_Excluded(t *testing.T) {
|
|
dir := t.TempDir()
|
|
old := time.Now().UTC().Add(-30 * 24 * time.Hour).Format(time.RFC3339)
|
|
recent := time.Now().UTC().Add(-1 * time.Hour).Format(time.RFC3339)
|
|
writeSession(t, dir, "s1",
|
|
`{"timestamp":"`+old+`","skill":"tdd","final_status":"pass"}`,
|
|
`{"timestamp":"`+recent+`","skill":"tdd","final_status":"pass"}`,
|
|
)
|
|
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
|
|
var got passRateResponse
|
|
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
|
|
assert.Equal(t, 1, got.Pass)
|
|
assert.Equal(t, 1, got.Total)
|
|
}
|
|
|
|
func TestPassRate_NoData_ReturnsZerosAndNullRate(t *testing.T) {
|
|
dir := t.TempDir()
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
|
|
var got passRateResponse
|
|
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
|
|
assert.Equal(t, 0, got.Pass)
|
|
assert.Equal(t, 0, got.Fail)
|
|
assert.Equal(t, 0, got.Skip)
|
|
assert.Equal(t, 0, got.Total)
|
|
assert.Nil(t, got.PassRate, "pass_rate must be null when pass+fail == 0")
|
|
}
|
|
|
|
func TestPassRate_DefaultsTo7d(t *testing.T) {
|
|
dir := t.TempDir()
|
|
now := time.Now().UTC().Format(time.RFC3339)
|
|
writeSession(t, dir, "s1", `{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`)
|
|
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd", nil) // no window
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
|
|
var got passRateResponse
|
|
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
|
|
assert.Equal(t, "7d", got.Window)
|
|
assert.Equal(t, 1, got.Pass)
|
|
}
|
|
|
|
func TestPassRate_MissingSkill_ReturnsBadRequest(t *testing.T) {
|
|
dir := t.TempDir()
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate", nil)
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
assert.Equal(t, http.StatusBadRequest, w.Result().StatusCode)
|
|
}
|
|
|
|
func TestPassRate_BadWindow_ReturnsBadRequest(t *testing.T) {
|
|
dir := t.TempDir()
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=foo", nil)
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
assert.Equal(t, http.StatusBadRequest, w.Result().StatusCode)
|
|
}
|
|
|
|
func TestPassRate_MalformedLine_Skipped(t *testing.T) {
|
|
dir := t.TempDir()
|
|
now := time.Now().UTC().Format(time.RFC3339)
|
|
writeSession(t, dir, "s1",
|
|
`{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`,
|
|
`not valid json`,
|
|
`{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`,
|
|
)
|
|
|
|
h := &Handler{brainDir: dir}
|
|
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
|
|
w := httptest.NewRecorder()
|
|
h.PassRate(w, req)
|
|
|
|
var got passRateResponse
|
|
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
|
|
assert.Equal(t, 2, got.Pass, "the malformed line is silently skipped")
|
|
}
|
|
```
|
|
|
|
- [ ] **Step 2: Run test to confirm build failure**
|
|
|
|
```bash
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint
|
|
cd ingestion && go test ./internal/api/... -run TestPassRate 2>&1 | tail -10
|
|
# Expected: build failure — `Handler.PassRate`, `passRateResponse` undefined
|
|
# Note: Handler.brainDir field already exists (existing pattern); just adding a new method.
|
|
```
|
|
|
|
- [ ] **Step 3: Write the implementation**
|
|
|
|
Create `ingestion/internal/api/passrate.go`:
|
|
|
|
```go
|
|
package api
|
|
|
|
import (
|
|
"encoding/json"
|
|
"net/http"
|
|
"os"
|
|
"path/filepath"
|
|
"strings"
|
|
"time"
|
|
)
|
|
|
|
type passRateResponse struct {
|
|
Skill string `json:"skill"`
|
|
Window string `json:"window"`
|
|
Pass int `json:"pass"`
|
|
Fail int `json:"fail"`
|
|
Skip int `json:"skip"`
|
|
Total int `json:"total"`
|
|
PassRate *float64 `json:"pass_rate"`
|
|
}
|
|
|
|
// PassRate handles GET /pass-rate?skill=X&window=Y.
|
|
// Walks brainDir/sessions/*.jsonl, filters by skill name and timestamp,
|
|
// returns aggregated counts and pass rate.
|
|
func (h *Handler) PassRate(w http.ResponseWriter, r *http.Request) {
|
|
skill := r.URL.Query().Get("skill")
|
|
if skill == "" {
|
|
writeError(w, http.StatusBadRequest, "skill is required")
|
|
return
|
|
}
|
|
|
|
windowStr := r.URL.Query().Get("window")
|
|
if windowStr == "" {
|
|
windowStr = "7d"
|
|
}
|
|
window, err := parseWindow(windowStr)
|
|
if err != nil {
|
|
writeError(w, http.StatusBadRequest, "invalid window: "+err.Error())
|
|
return
|
|
}
|
|
|
|
cutoff := time.Now().UTC().Add(-window)
|
|
pass, fail, skip := 0, 0, 0
|
|
|
|
sessionsDir := filepath.Join(h.brainDir, "sessions")
|
|
entries, err := os.ReadDir(sessionsDir)
|
|
if err != nil && !os.IsNotExist(err) {
|
|
writeError(w, http.StatusInternalServerError, "read sessions dir: "+err.Error())
|
|
return
|
|
}
|
|
|
|
for _, entry := range entries {
|
|
if entry.IsDir() || !strings.HasSuffix(entry.Name(), ".jsonl") {
|
|
continue
|
|
}
|
|
body, err := os.ReadFile(filepath.Join(sessionsDir, entry.Name()))
|
|
if err != nil {
|
|
continue // skip unreadable files
|
|
}
|
|
for _, line := range strings.Split(string(body), "\n") {
|
|
line = strings.TrimSpace(line)
|
|
if line == "" {
|
|
continue
|
|
}
|
|
var rec struct {
|
|
Timestamp string `json:"timestamp"`
|
|
Skill string `json:"skill"`
|
|
FinalStatus string `json:"final_status"`
|
|
}
|
|
if err := json.Unmarshal([]byte(line), &rec); err != nil {
|
|
continue // malformed — skip
|
|
}
|
|
if rec.Skill != skill {
|
|
continue
|
|
}
|
|
ts, err := time.Parse(time.RFC3339, rec.Timestamp)
|
|
if err != nil {
|
|
continue
|
|
}
|
|
if ts.Before(cutoff) {
|
|
continue
|
|
}
|
|
switch normalizeStatus(rec.FinalStatus) {
|
|
case "pass":
|
|
pass++
|
|
case "fail":
|
|
fail++
|
|
case "skip":
|
|
skip++
|
|
}
|
|
}
|
|
}
|
|
|
|
total := pass + fail + skip
|
|
resp := passRateResponse{
|
|
Skill: skill,
|
|
Window: windowStr,
|
|
Pass: pass,
|
|
Fail: fail,
|
|
Skip: skip,
|
|
Total: total,
|
|
}
|
|
if pass+fail > 0 {
|
|
rate := float64(pass) / float64(pass+fail)
|
|
resp.PassRate = &rate
|
|
}
|
|
|
|
w.Header().Set("Content-Type", "application/json")
|
|
_ = json.NewEncoder(w).Encode(resp)
|
|
}
|
|
|
|
// normalizeStatus maps both new (pass/fail/skip) and legacy (ok/error/skipped)
|
|
// vocabularies to the canonical pass/fail/skip set. Unknown values are treated
|
|
// as skip for safety.
|
|
func normalizeStatus(s string) string {
|
|
switch s {
|
|
case "pass", "ok":
|
|
return "pass"
|
|
case "fail", "error":
|
|
return "fail"
|
|
case "skip", "skipped":
|
|
return "skip"
|
|
default:
|
|
return "skip"
|
|
}
|
|
}
|
|
|
|
// parseWindow accepts Go-style durations plus "Nd" for days.
|
|
func parseWindow(s string) (time.Duration, error) {
|
|
if strings.HasSuffix(s, "d") {
|
|
// Replace "d" with "h" * 24
|
|
days := strings.TrimSuffix(s, "d")
|
|
d, err := time.ParseDuration(days + "h")
|
|
if err != nil {
|
|
return 0, err
|
|
}
|
|
return d * 24, nil
|
|
}
|
|
return time.ParseDuration(s)
|
|
}
|
|
```
|
|
|
|
- [ ] **Step 4: Run tests, expect all 8 PASS**
|
|
|
|
```bash
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/ingestion
|
|
go test ./internal/api/... -run TestPassRate -v 2>&1 | tail -25
|
|
# Expected: 8/8 PASS
|
|
```
|
|
|
|
- [ ] **Step 5: Run task check**
|
|
|
|
```bash
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint
|
|
task check 2>&1 | tail -15
|
|
```
|
|
|
|
- [ ] **Step 6: Commit**
|
|
|
|
```bash
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint
|
|
git add ingestion/internal/api/passrate.go ingestion/internal/api/passrate_test.go
|
|
git commit -m "feat(brain): /pass-rate aggregator and handler
|
|
|
|
Adds a new HTTP GET handler at the ingestion pod that walks
|
|
brain/sessions/*.jsonl, filters by skill name and timestamp window
|
|
(default 7d, accepts Nh and Nd), normalizes legacy status vocabulary
|
|
(ok→pass, error→fail, skipped→skip), and returns aggregated counts
|
|
plus pass_rate.
|
|
|
|
Pass rate is null when pass+fail == 0, distinguishing 'no data' from
|
|
'always passes'. Plan 6 routing pod will check for null before
|
|
making decisions.
|
|
|
|
Route registration in cmd/server/main.go lands in a follow-up commit."
|
|
```
|
|
|
|
---
|
|
|
|
## Task 3: Wire `/pass-rate` into the route table
|
|
|
|
**Worktree:** `~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/`
|
|
|
|
Register the handler at `GET /pass-rate` in the ingestion server's mux. Note the verb is GET (consistent with REST semantics for a pure read), unlike the other brain HTTP REST endpoints which are POST.
|
|
|
|
**Files:**
|
|
- Modify: `ingestion/cmd/server/main.go`
|
|
|
|
- [ ] **Step 1: Read the existing route table**
|
|
|
|
```bash
|
|
grep -n "mux.HandleFunc" ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/ingestion/cmd/server/main.go
|
|
# Should list the existing POST routes (/query, /write, etc.)
|
|
```
|
|
|
|
- [ ] **Step 2: Add the route**
|
|
|
|
In `ingestion/cmd/server/main.go`, find the block of `mux.HandleFunc(...)` calls. After the last existing one (likely `mux.HandleFunc("POST /backfill-refs", h.BackfillRefs)`), add:
|
|
|
|
```go
|
|
mux.HandleFunc("GET /pass-rate", h.PassRate)
|
|
```
|
|
|
|
- [ ] **Step 3: Build and check the route is reachable**
|
|
|
|
```bash
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint
|
|
go build ./ingestion/cmd/server
|
|
# Expected: build succeeds, no errors
|
|
|
|
# A route-reachability check: spin up the server pointing at a tmp brain dir
|
|
# and curl the endpoint.
|
|
TMPBRAIN=$(mktemp -d)
|
|
mkdir "$TMPBRAIN/sessions"
|
|
BRAIN_DIR="$TMPBRAIN" go run ./ingestion/cmd/server &
|
|
SERVER_PID=$!
|
|
sleep 1
|
|
curl -sS "http://localhost:3300/pass-rate?skill=tdd&window=7d" | head -3
|
|
# Expected: {"skill":"tdd","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}
|
|
kill $SERVER_PID
|
|
rm -rf "$TMPBRAIN"
|
|
# Note: server may bind to a different port if BRAIN_DIR isn't enough config.
|
|
# If this step fails due to env var or port issues, fall back to confirming
|
|
# the route compiles by `go build ./ingestion/...` and rely on Task 4's
|
|
# integration test.
|
|
```
|
|
|
|
- [ ] **Step 4: Run task check**
|
|
|
|
```bash
|
|
task check 2>&1 | tail -15
|
|
```
|
|
|
|
- [ ] **Step 5: Commit**
|
|
|
|
```bash
|
|
git add ingestion/cmd/server/main.go
|
|
git commit -m "feat(brain): register GET /pass-rate route
|
|
|
|
Adds the route entry alongside the existing POST routes. Note: this
|
|
is the brain HTTP REST API's first GET endpoint — it follows REST
|
|
semantics for pure reads, while the legacy POST routes (query, write,
|
|
ingest, etc.) all take JSON bodies. Future read endpoints SHOULD use
|
|
GET; future write endpoints continue with POST."
|
|
```
|
|
|
|
---
|
|
|
|
## Task 4: CLI subcommand `hyperguild brain pass-rate`
|
|
|
|
**Worktree:** `~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/`
|
|
|
|
Add a third nested verb under `brain` in the hyperguild CLI. New `brainClient.PassRate` method calls the endpoint. New `runBrainPassRate` subcommand wires it into the dispatcher.
|
|
|
|
**Files:**
|
|
- Modify: `cmd/hyperguild/http.go` (add `PassRate` method)
|
|
- Modify: `cmd/hyperguild/http_test.go` (tests for `PassRate`)
|
|
- Modify: `cmd/hyperguild/brain.go` (add `runBrainPassRate`, wire into `runBrain` switch)
|
|
- Modify: `cmd/hyperguild/brain_test.go` (tests for `runBrainPassRate`)
|
|
|
|
- [ ] **Step 1: Write the failing tests**
|
|
|
|
Append to `cmd/hyperguild/http_test.go`:
|
|
|
|
```go
|
|
func TestBrainClient_PassRate_Success(t *testing.T) {
|
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
|
assert.Equal(t, http.MethodGet, r.Method)
|
|
assert.Equal(t, "/pass-rate", r.URL.Path)
|
|
assert.Equal(t, "tdd", r.URL.Query().Get("skill"))
|
|
assert.Equal(t, "7d", r.URL.Query().Get("window"))
|
|
w.Header().Set("Content-Type", "application/json")
|
|
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
|
|
}))
|
|
defer srv.Close()
|
|
|
|
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
|
|
res, err := c.PassRate(context.Background(), "tdd", "7d")
|
|
require.NoError(t, err)
|
|
assert.Equal(t, "tdd", res.Skill)
|
|
assert.Equal(t, 47, res.Pass)
|
|
assert.Equal(t, 3, res.Fail)
|
|
require.NotNil(t, res.PassRate)
|
|
assert.InDelta(t, 0.94, *res.PassRate, 0.001)
|
|
}
|
|
|
|
func TestBrainClient_PassRate_NullRate(t *testing.T) {
|
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
|
w.Header().Set("Content-Type", "application/json")
|
|
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
|
|
}))
|
|
defer srv.Close()
|
|
|
|
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
|
|
res, err := c.PassRate(context.Background(), "tdd", "7d")
|
|
require.NoError(t, err)
|
|
assert.Nil(t, res.PassRate)
|
|
}
|
|
```
|
|
|
|
Append to `cmd/hyperguild/brain_test.go`:
|
|
|
|
```go
|
|
func TestRunBrainPassRate_Human(t *testing.T) {
|
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
|
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
|
|
}))
|
|
defer srv.Close()
|
|
t.Setenv("BRAIN_URL", srv.URL)
|
|
|
|
var out, errBuf bytes.Buffer
|
|
err := runBrain(context.Background(), []string{"pass-rate", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
|
require.NoError(t, err)
|
|
got := out.String()
|
|
assert.Contains(t, got, "tdd")
|
|
assert.Contains(t, got, "47 / 50")
|
|
assert.Contains(t, got, "94%")
|
|
}
|
|
|
|
func TestRunBrainPassRate_NoData(t *testing.T) {
|
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
|
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
|
|
}))
|
|
defer srv.Close()
|
|
t.Setenv("BRAIN_URL", srv.URL)
|
|
|
|
var out, errBuf bytes.Buffer
|
|
err := runBrain(context.Background(), []string{"pass-rate", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
|
require.NoError(t, err)
|
|
assert.Contains(t, out.String(), "no data")
|
|
}
|
|
|
|
func TestRunBrainPassRate_JSON(t *testing.T) {
|
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
|
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
|
|
}))
|
|
defer srv.Close()
|
|
t.Setenv("BRAIN_URL", srv.URL)
|
|
|
|
var out, errBuf bytes.Buffer
|
|
err := runBrain(context.Background(), []string{"pass-rate", "--json", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
|
require.NoError(t, err)
|
|
assert.Contains(t, out.String(), `"pass_rate": 0.94`)
|
|
}
|
|
|
|
func TestRunBrainPassRate_MissingSkill(t *testing.T) {
|
|
var out, errBuf bytes.Buffer
|
|
err := runBrain(context.Background(), []string{"pass-rate"}, strings.NewReader(""), &out, &errBuf)
|
|
assert.Error(t, err)
|
|
assert.Contains(t, err.Error(), "skill required")
|
|
}
|
|
|
|
func TestRunBrainPassRate_WindowFlag(t *testing.T) {
|
|
gotWindow := ""
|
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
|
gotWindow = r.URL.Query().Get("window")
|
|
_, _ = w.Write([]byte(`{"skill":"tdd","window":"30d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
|
|
}))
|
|
defer srv.Close()
|
|
t.Setenv("BRAIN_URL", srv.URL)
|
|
|
|
var out, errBuf bytes.Buffer
|
|
err := runBrain(context.Background(), []string{"pass-rate", "--window", "30d", "tdd"}, strings.NewReader(""), &out, &errBuf)
|
|
require.NoError(t, err)
|
|
assert.Equal(t, "30d", gotWindow)
|
|
}
|
|
```
|
|
|
|
- [ ] **Step 2: Run tests, expect failure (undefined symbols)**
|
|
|
|
```bash
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint
|
|
go test ./cmd/hyperguild/... -run "TestBrainClient_PassRate|TestRunBrainPassRate" 2>&1 | tail -10
|
|
# Expected: build failure — `PassRate`, `runBrainPassRate` undefined
|
|
```
|
|
|
|
- [ ] **Step 3: Add the brainClient method**
|
|
|
|
In `cmd/hyperguild/http.go`, append below the existing `Write` method:
|
|
|
|
```go
|
|
// PassRateResult mirrors the /pass-rate response envelope.
|
|
type PassRateResult struct {
|
|
Skill string `json:"skill"`
|
|
Window string `json:"window"`
|
|
Pass int `json:"pass"`
|
|
Fail int `json:"fail"`
|
|
Skip int `json:"skip"`
|
|
Total int `json:"total"`
|
|
PassRate *float64 `json:"pass_rate"`
|
|
}
|
|
|
|
func (c *brainClient) PassRate(ctx context.Context, skill, window string) (*PassRateResult, error) {
|
|
q := url.Values{}
|
|
q.Set("skill", skill)
|
|
q.Set("window", window)
|
|
u := c.baseURL + "/pass-rate?" + q.Encode()
|
|
|
|
req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
|
|
if err != nil {
|
|
return nil, fmt.Errorf("build request: %w", err)
|
|
}
|
|
resp, err := c.http.Do(req)
|
|
if err != nil {
|
|
return nil, fmt.Errorf("brain GET /pass-rate: %w", err)
|
|
}
|
|
defer func() { _ = resp.Body.Close() }()
|
|
if resp.StatusCode != http.StatusOK {
|
|
body, _ := io.ReadAll(resp.Body)
|
|
return nil, fmt.Errorf("brain GET /pass-rate: status %d: %s", resp.StatusCode, string(body))
|
|
}
|
|
var out PassRateResult
|
|
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
|
|
return nil, fmt.Errorf("decode /pass-rate response: %w", err)
|
|
}
|
|
return &out, nil
|
|
}
|
|
```
|
|
|
|
The `net/url` and `strconv` imports were removed in Plan 4's GET→POST fix; `net/url` needs to come back for this method. Add it to the imports.
|
|
|
|
- [ ] **Step 4: Add `runBrainPassRate` and wire into the dispatcher**
|
|
|
|
In `cmd/hyperguild/brain.go`, update the `runBrain` switch to add a `pass-rate` case:
|
|
|
|
```go
|
|
func runBrain(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error {
|
|
if len(args) == 0 {
|
|
return errors.New("subcommand required (query|write|pass-rate)")
|
|
}
|
|
switch args[0] {
|
|
case "query":
|
|
return runBrainQuery(ctx, args[1:], stdin, stdout, stderr)
|
|
case "write":
|
|
return runBrainWrite(ctx, args[1:], stdin, stdout, stderr)
|
|
case "pass-rate":
|
|
return runBrainPassRate(ctx, args[1:], stdin, stdout, stderr)
|
|
default:
|
|
return fmt.Errorf("unknown subcommand: %s (expected query|write|pass-rate)", args[0])
|
|
}
|
|
}
|
|
```
|
|
|
|
Then append `runBrainPassRate`:
|
|
|
|
```go
|
|
func runBrainPassRate(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
|
|
fs := flag.NewFlagSet("brain pass-rate", flag.ContinueOnError)
|
|
fs.SetOutput(stderr)
|
|
asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
|
|
window := fs.String("window", "7d", "lookback window (e.g. 1h, 24h, 7d, 30d)")
|
|
if err := fs.Parse(args); err != nil {
|
|
return fmt.Errorf("parse flags: %w", err)
|
|
}
|
|
if fs.NArg() < 1 {
|
|
return errors.New("skill required")
|
|
}
|
|
skill := fs.Arg(0)
|
|
|
|
res, err := newBrainClient().PassRate(ctx, skill, *window)
|
|
if err != nil {
|
|
return err
|
|
}
|
|
|
|
if *asJSON {
|
|
enc := json.NewEncoder(stdout)
|
|
enc.SetIndent("", " ")
|
|
return enc.Encode(res)
|
|
}
|
|
if res.PassRate == nil {
|
|
fmt.Fprintf(stdout, "%s: no data (window: %s)\n", res.Skill, res.Window) //nolint:errcheck
|
|
return nil
|
|
}
|
|
fmt.Fprintf(stdout, "%s: %d / %d = %.0f%% (window: %s)\n", res.Skill, res.Pass, res.Total, *res.PassRate*100, res.Window) //nolint:errcheck
|
|
return nil
|
|
}
|
|
```
|
|
|
|
- [ ] **Step 5: Run tests, expect 5 + 2 + previous all PASS**
|
|
|
|
```bash
|
|
go test ./cmd/hyperguild/... -v 2>&1 | tail -50
|
|
# Expected: previous tests still pass + 7 new tests pass
|
|
```
|
|
|
|
- [ ] **Step 6: Run task check**
|
|
|
|
```bash
|
|
task check 2>&1 | tail -15
|
|
```
|
|
|
|
- [ ] **Step 7: Commit**
|
|
|
|
```bash
|
|
git add cmd/hyperguild/http.go cmd/hyperguild/http_test.go cmd/hyperguild/brain.go cmd/hyperguild/brain_test.go
|
|
git commit -m "feat(hyperguild): brain pass-rate subcommand
|
|
|
|
Adds 'hyperguild brain pass-rate <skill> [--window 7d] [--json]'
|
|
calling the new /pass-rate endpoint. Human output:
|
|
tdd: 47 / 50 = 94% (window: 7d)
|
|
or 'no data (window: 7d)' when pass_rate is null.
|
|
|
|
PassRateResult mirrors the response envelope; PassRate is *float64
|
|
so null is preserved across decode."
|
|
```
|
|
|
|
---
|
|
|
|
## Task 5: Rollout — instrument 6 other binary-outcome SKILL.md files
|
|
|
|
**Worktree:** `~/Documents/local-dev-worktrees/pass-rate-instrumentation/` (`local-dev` repo)
|
|
|
|
Apply the same Logging subsection pattern from Task 1 to six more skills, each with its own phase set. Single commit covers all six (small, mechanical).
|
|
|
|
**Files (all six, same pattern):**
|
|
|
|
| File | Phases to log |
|
|
|---|---|
|
|
| `.skills/code-review/SKILL.md` | One log call per phase the skill defines (typically `phase-0` through `phase-3` or similar — check the skill's existing structure) |
|
|
| `.skills/debug/SKILL.md` | One log call at the end of each phase (`hypothesize`, `instrument`, `verify`, `fix` — or whatever the skill defines) |
|
|
| `.skills/feature-spec/SKILL.md` | One log call per output phase (typically `draft`, `review`, `approved`) |
|
|
| `.skills/session-retrospective/SKILL.md` | One log call at the end (`surfaced` with `final_status` reflecting whether candidates were emitted) |
|
|
| `.skills/trainer/SKILL.md` | Two log calls: one at end of READ phase, one at end of WRITE phase |
|
|
| `.skills/spec-driven-dev/SKILL.md` | One log call per output phase (typically per `spec`, `decide`, `plan`) |
|
|
|
|
For each file:
|
|
|
|
- [ ] **Step 1: Open the file and locate "## Brain MCP Integration"**
|
|
|
|
```bash
|
|
cd ~/Documents/local-dev-worktrees/pass-rate-instrumentation
|
|
grep -n "## Brain MCP Integration" .skills/code-review/SKILL.md .skills/debug/SKILL.md .skills/feature-spec/SKILL.md .skills/session-retrospective/SKILL.md .skills/trainer/SKILL.md .skills/spec-driven-dev/SKILL.md
|
|
```
|
|
|
|
- [ ] **Step 2: Append the Logging subsection to each**
|
|
|
|
The template (substitute `<skill-name>` and the per-skill phase set):
|
|
|
|
```markdown
|
|
### Logging
|
|
|
|
Call `session_log` once at the end of every phase to record the outcome.
|
|
Pass-rate is computed downstream by the `/pass-rate` HTTP endpoint, which
|
|
treats `pass` as success, `fail` as failure, `skip` as neither.
|
|
|
|
**At end of each phase:**
|
|
- `session_log` with `{skill: "<skill-name>", phase: "<phase-name>", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}`
|
|
|
|
**Phases for this skill:** [list the phases the skill defines, e.g. `red`, `green`, `refactor` for tdd; `phase-0`, `phase-1`, `phase-2`, `phase-3` for code-review; etc.]
|
|
|
|
**Status semantics:**
|
|
- `pass` — the phase's intended outcome was reached.
|
|
- `fail` — the phase's intended outcome was NOT reached.
|
|
- `skip` — phase was skipped intentionally.
|
|
|
|
**Why this matters:** the routing pod (Plan 6) reads pass-rate to decide whether to route a future call to a local model. If your skill never logs, the routing pod sees no data.
|
|
```
|
|
|
|
For each of the six SKILL.md files, insert this subsection at the end of the existing "## Brain MCP Integration" section (immediately before the next `##` header). Substitute `<skill-name>` with the skill's actual name. Read each skill's existing Process or Phases section to know what phases to list.
|
|
|
|
- [ ] **Step 3: Verify all six were updated**
|
|
|
|
```bash
|
|
for f in code-review debug feature-spec session-retrospective trainer spec-driven-dev; do
|
|
echo "=== $f ==="
|
|
grep -c "session_log" .skills/$f/SKILL.md
|
|
done
|
|
# Each count should be ≥ 2 (the call site + the introductory mention).
|
|
```
|
|
|
|
- [ ] **Step 4: Commit**
|
|
|
|
```bash
|
|
git add .skills/code-review/SKILL.md .skills/debug/SKILL.md .skills/feature-spec/SKILL.md .skills/session-retrospective/SKILL.md .skills/trainer/SKILL.md .skills/spec-driven-dev/SKILL.md
|
|
git commit -m "feat(skills): instrument 6 binary-outcome skills with session_log — Plan 5 rollout
|
|
|
|
Following the tdd pilot, applies the Logging subsection template to:
|
|
|
|
- code-review
|
|
- debug
|
|
- feature-spec
|
|
- session-retrospective
|
|
- trainer
|
|
- spec-driven-dev
|
|
|
|
Each skill's Brain MCP Integration section gains a literal session_log
|
|
call template with the skill's specific phase set and pass/fail/skip
|
|
semantics. Skills with no clear binary outcome (clean-code,
|
|
cognitive-load, solid, refactoring, test-design, problem-analysis,
|
|
user-stories, planning, atdd, gitea-ci) are intentionally not
|
|
instrumented — they're reading or process-shaping content, not
|
|
discrete actions."
|
|
```
|
|
|
|
---
|
|
|
|
## Task 6: Documentation updates
|
|
|
|
**Worktree:** `~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/`
|
|
|
|
Update `cmd/hyperguild/README.md` to document the new `pass-rate` subcommand. Update `CLAUDE.md` to mention the `/pass-rate` endpoint alongside the other brain HTTP REST endpoints.
|
|
|
|
**Files:**
|
|
- Modify: `cmd/hyperguild/README.md` (add `pass-rate` subcommand block)
|
|
- Modify: `CLAUDE.md` (mention `/pass-rate` in the brain HTTP REST API section)
|
|
- Modify: `internal/skills/sessionlog/skill.go` (or wherever the `session_log` MCP tool is registered) — update the `final_status` description from `ok | error | skipped` to `pass | fail | skip (legacy: ok | error | skipped accepted)` so the agent-facing tool definition matches the new SKILL.md vocabulary
|
|
|
|
- [ ] **Step 1: Update the CLI README**
|
|
|
|
Find the existing `### hyperguild brain write <type> <slug>` section in `cmd/hyperguild/README.md`. After it (before the next `###` heading), insert:
|
|
|
|
```markdown
|
|
### `hyperguild brain pass-rate <skill>`
|
|
|
|
Returns the pass rate for a skill over a lookback window. Computed
|
|
on-demand from `brain/sessions/*.jsonl`.
|
|
|
|
```bash
|
|
$ hyperguild brain pass-rate tdd
|
|
tdd: 47 / 50 = 94% (window: 7d)
|
|
|
|
$ hyperguild brain pass-rate tdd --window 30d --json
|
|
{
|
|
"skill": "tdd",
|
|
"window": "30d",
|
|
"pass": 142,
|
|
"fail": 8,
|
|
"skip": 5,
|
|
"total": 155,
|
|
"pass_rate": 0.9467
|
|
}
|
|
```
|
|
|
|
Flags:
|
|
|
|
- `--window` — lookback window (default `7d`; accepts `Nh`, `Nd`)
|
|
- `--json` — emit the raw response envelope
|
|
|
|
Skills with no logged invocations return zero counts and `pass_rate: null`
|
|
(indicating "no data", distinct from "always passes").
|
|
```
|
|
|
|
- [ ] **Step 1.5: Update the `session_log` MCP tool's `final_status` description**
|
|
|
|
Find the `session_log` tool registration in the supervisor (`internal/skills/sessionlog/`). Update the `final_status` field's description string to read:
|
|
|
|
```
|
|
pass | fail | skip (legacy: ok | error | skipped also accepted)
|
|
```
|
|
|
|
This keeps the agent-facing schema honest — Plan 5's SKILL.md contract uses the new vocabulary. The aggregator normalizes both for backward compat, but the tool's docstring should lead with the new vocabulary.
|
|
|
|
```bash
|
|
grep -rn "ok | error | skipped\|final_status.*description" ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/internal/skills/sessionlog/ ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint/ingestion/internal/skills/sessionlog/ 2>/dev/null | head
|
|
# Locate the registration; update the description string verbatim.
|
|
```
|
|
|
|
- [ ] **Step 2: Update CLAUDE.md**
|
|
|
|
Find the section in `CLAUDE.md` that documents the brain HTTP REST API (look for `## MCP endpoints` or similar). After the existing endpoint list, add a note about `/pass-rate`:
|
|
|
|
```markdown
|
|
The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
|
|
endpoint that aggregates `final_status` counts from session logs and returns
|
|
`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
|
|
reads this to decide whether to route skill calls to local models. Pass rate
|
|
is `null` when no logged invocations are in the window.
|
|
```
|
|
|
|
- [ ] **Step 3: Run task check**
|
|
|
|
```bash
|
|
task check 2>&1 | tail -15
|
|
```
|
|
|
|
- [ ] **Step 4: Commit**
|
|
|
|
```bash
|
|
git add cmd/hyperguild/README.md CLAUDE.md
|
|
git commit -m "docs(hyperguild): document brain pass-rate subcommand and /pass-rate endpoint
|
|
|
|
Adds pass-rate to the CLI README's subcommand block. Updates CLAUDE.md
|
|
to note the new /pass-rate endpoint alongside the existing brain
|
|
HTTP REST API surface."
|
|
```
|
|
|
|
---
|
|
|
|
## Task 7: Push both branches
|
|
|
|
**Both worktrees.** Push the `local-dev` instrumentation branch first (Phase 1), then the `hyperguild` endpoint+CLI branch (Phase 2).
|
|
|
|
- [ ] **Step 1: Verify both worktrees are clean**
|
|
|
|
```bash
|
|
cd ~/Documents/local-dev-worktrees/pass-rate-instrumentation
|
|
git status
|
|
git log --oneline origin/main..HEAD
|
|
# Expected: 2 commits (Task 1 pilot + Task 5 rollout), working tree clean
|
|
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint
|
|
git status
|
|
git log --oneline origin/main..HEAD
|
|
# Expected: 5 commits (Task 2-4, 6), working tree clean
|
|
```
|
|
|
|
- [ ] **Step 2: Push both**
|
|
|
|
```bash
|
|
cd ~/Documents/local-dev-worktrees/pass-rate-instrumentation
|
|
git push -u origin HEAD
|
|
# Note the gitea PR URL.
|
|
|
|
cd ~/dev/AI/hyperguild/.worktrees/pass-rate-endpoint
|
|
git push -u origin HEAD
|
|
# Note the gitea PR URL.
|
|
```
|
|
|
|
- [ ] **Step 3: Surface the two PR URLs to the controller**
|
|
|
|
Stop after both pushes. The controller decides merge timing — both branches are independent and can merge in either order, though Phase 1 (instrumentation) should land first so Phase 2's smoke test against a fresh `tdd` invocation has a chance of producing real data.
|
|
|
|
---
|
|
|
|
## Self-review
|
|
|
|
**1. Spec coverage:**
|
|
|
|
| Spec success criterion | Implementing task |
|
|
|---|---|
|
|
| `tdd` instrumented (pilot) | Task 1 |
|
|
| 6 other skills instrumented | Task 5 |
|
|
| `GET /pass-rate?skill=X&window=Y` returns aggregate JSON | Task 2 |
|
|
| Aggregator normalizes `pass`/`ok`, `fail`/`error`, `skip`/`skipped` | Task 2 (`normalizeStatus`) |
|
|
| Endpoint returns null pass_rate when pass+fail==0 | Task 2 (`TestPassRate_NoData_ReturnsZerosAndNullRate`) |
|
|
| Optional `hyperguild brain pass-rate` CLI subcommand | Task 4 |
|
|
| `task check` passes per task and on merged branches | Step in every task |
|
|
| Real data for `tdd` 1 week post-merge | Manual verification (out-of-band) |
|
|
|
|
**2. Placeholder scan:** No "TBD"/"TODO"/"appropriate"/"similar to Task N". Each rollout SKILL.md has a concrete template; the per-skill phase set is the only thing the implementer fills in (not a placeholder — it requires reading each skill's existing structure).
|
|
|
|
**3. Type/name consistency:**
|
|
- `passRateResponse` (lowercase) on the brain side; `PassRateResult` (uppercase, exported) on the CLI side. Different packages, different conventions. Field names match: `Skill`, `Window`, `Pass`, `Fail`, `Skip`, `Total`, `PassRate`. JSON tags identical.
|
|
- `normalizeStatus`, `parseWindow` defined in Task 2, used unchanged.
|
|
- `runBrainPassRate`, `PassRate` named consistently in Tasks 2/4.
|
|
- `BRAIN_URL` env var consumed by `newBrainClient()` (existing) — no new env vars.
|
|
|
|
---
|
|
|
|
## Risks (plan-level)
|
|
|
|
| Risk | Mitigation |
|
|
|---|---|
|
|
| Phase 1 worktree path conflicts with prior `cross-tool-skill-wiring` worktree (both under `local-dev-worktrees/`) | Different branch name; the cross-tool-skill-wiring worktree was removed after Plan 3 merge. New worktree is created fresh. |
|
|
| Implementer subagent doesn't have brain MCP access (can't dogfood the new tdd contract) | DONE_WITH_CONCERNS path documented in Task 2; controller validates manually post-merge. |
|
|
| Phase 2's smoke test (Task 3 Step 3) requires running the ingestion server locally, which may need additional config | Falls back to relying on Task 4's integration test (which uses httptest.Server + the real Handler). |
|
|
| `final_status` in legacy entry uses `pass` (not the documented `ok`) — confirmed during spec phase | Aggregator handles both; no migration. |
|
|
| Plan 6 may want per-model or per-mode aggregation that this endpoint doesn't provide | Endpoint shape is small and additive; Plan 6 can extend with new query params (`?model=X`) without breaking existing callers. |
|
|
| 6-skill rollout in a single commit could blur per-skill mistakes | Acceptable — the rollout is mechanical (same template, six substitutions). If an individual skill has a structural quirk that doesn't fit the template, surface as DONE_WITH_CONCERNS and fix in a follow-up. |
|