Follow-up to infra#70. LiteLLM moved off piguard into k3s and the
public llm-api.d-ma.be hostname now upstreams to koala:30401. The
piguard:4000 default in the source bit-rots — works today because
piguard:4000 is still alive during the 7-day soak, breaks the moment
the compose comes down.
Pointing the default at the public hostname survives the cutover
without needing a follow-up. Production deploys via k3s already
override via env (in-cluster Service DNS) so this only affects local
dev shells without LITELLM_BASE_URL set.
- internal/config/routing.go: comment + envOr fallback
- internal/config/routing_test.go: expected value in defaults test
- scripts/smoke-routing.sh: shell default
task check: clean (tests + vet + govulncheck).
The routing decision is about reasoning capacity, not cost or provider.
Fast model (koala/qwen35-9b-fast) handles high-pass-rate calls; thinking
model (iguana/gemma4-26b) handles low-pass-rate calls. Removes the
implicit Anthropic dependency from the routing pod — both models go
through LiteLLM.
Renames: HYPERGUILD_LOCAL_MODEL → HYPERGUILD_FAST_MODEL,
HYPERGUILD_CLAUDE_MODEL → HYPERGUILD_THINKING_MODEL,
Router.LocalModel → FastModel, Router.ClaudeModel → ThinkingModel,
log decision "claude_fallback" → "thinking_fallback".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Typed config struct and env parser for the routing pod. Kept separate
from the supervisor Config to avoid forcing routing fields onto the
supervisor and vice versa. Uses the existing envOr helper from config.go.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>