Accepted, February 2026

Routing Reliability

Fixture-backed routing regression controls that catch drift before release, using a golden corpus, overlap detection, and a deterministic scoring engine.

Context

As the number of workflows grew, their trigger prompts overlapped more, and misrouting became a real threat to confidence in production. Similar prompts could score nearly the same across multiple workflows, an edit to a trigger could silently change routing outcomes, and tie-breaking behavior had no explicit governance.

Core Decisions

1. Golden Routing Corpus

A fixed fixture defines representative prompts and expected workflows. Any behavior change must update this corpus explicitly.

The routing corpus acts as a regression safety net, making routing changes visible and reviewable in pull requests.
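
As an illustration of how such a corpus check might work, the sketch below runs each fixture prompt through a router and reports every disagreement. The entry shape (`prompt`, `expectedWorkflow`), the `checkCorpus` helper, and the toy router are all assumptions made for this example; the actual schema of routing-corpus.json is not shown in this record.

```typescript
// Hypothetical fixture entry; the real routing-corpus.json schema may differ.
interface CorpusEntry {
  prompt: string;
  expectedWorkflow: string;
}

// Any function that maps a prompt to a workflow name.
type Router = (prompt: string) => string;

interface CorpusMismatch {
  prompt: string;
  expected: string;
  actual: string;
}

// Route every corpus prompt and collect the entries whose result
// differs from the recorded expectation.
function checkCorpus(corpus: CorpusEntry[], route: Router): CorpusMismatch[] {
  const mismatches: CorpusMismatch[] = [];
  for (const entry of corpus) {
    const actual = route(entry.prompt);
    if (actual !== entry.expectedWorkflow) {
      mismatches.push({
        prompt: entry.prompt,
        expected: entry.expectedWorkflow,
        actual,
      });
    }
  }
  return mismatches;
}

// Toy stand-in for the real routing logic (assumption for the example).
const toyRouter: Router = (prompt) =>
  /\bdeploy\b/i.test(prompt) ? "deploy" : "triage";

const sampleCorpus: CorpusEntry[] = [
  { prompt: "Deploy the staging build", expectedWorkflow: "deploy" },
  { prompt: "Why is this test flaky?", expectedWorkflow: "triage" },
];

// An empty list means routing still agrees with the corpus.
const corpusMismatches = checkCorpus(sampleCorpus, toyRouter);
```

A test would fail whenever the mismatch list is non-empty, so an intentional routing change has to show up as an edit to the fixture in the same pull request.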

2. Overlap/Tie Risk Detection

A dedicated test suite checks for high-risk near-ties, where the winning workflow and the runner-up score close together. Known acceptable ambiguities are documented in a controlled allowlist.
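
One way a near-tie check could be expressed is sketched below: rank the scored candidates, measure the margin between the top two, and flag the prompt when that margin falls under a threshold, unless the pair is allowlisted. The margin threshold, the `AllowlistEntry` shape, and the function names are assumptions for illustration, not the contents of routing-overlap.test.ts or the allowlist file.

```typescript
interface ScoredCandidate {
  workflow: string;
  score: number;
}

interface NearTie {
  prompt: string;
  winner: string;
  runnerUp: string;
  margin: number;
}

// Hypothetical allowlist entry; the real allowlist schema may differ.
interface AllowlistEntry {
  prompt: string;
  workflows: [string, string];
  reason: string;
}

// Return a NearTie when the top two scores are closer than minMargin,
// otherwise null. Equal scores are ordered by name so results are stable.
function findNearTie(
  prompt: string,
  candidates: ScoredCandidate[],
  minMargin: number
): NearTie | null {
  if (candidates.length < 2) return null;
  const ranked = candidates.slice().sort(
    (a, b) => b.score - a.score || (a.workflow < b.workflow ? -1 : 1)
  );
  const winner = ranked[0];
  const runnerUp = ranked[1];
  const margin = winner.score - runnerUp.score;
  if (margin >= minMargin) return null;
  return { prompt, winner: winner.workflow, runnerUp: runnerUp.workflow, margin };
}

// A near-tie is acceptable only if the same prompt and the same pair of
// workflows (in either order) appear in the allowlist.
function isAllowlisted(tie: NearTie, allowlist: AllowlistEntry[]): boolean {
  return allowlist.some((entry) => {
    const [a, b] = entry.workflows;
    const samePair =
      (a === tie.winner && b === tie.runnerUp) ||
      (a === tie.runnerUp && b === tie.winner);
    return entry.prompt === tie.prompt && samePair;
  });
}

// Example data (invented for this sketch).
const overlapAllowlist: AllowlistEntry[] = [
  {
    prompt: "release the hotfix",
    workflows: ["deploy", "rollback"],
    reason: "Both workflows legitimately handle hotfix releases.",
  },
];

const tie = findNearTie(
  "release the hotfix",
  [
    { workflow: "deploy", score: 10 },
    { workflow: "rollback", score: 9 },
    { workflow: "triage", score: 2 },
  ],
  2
);

// True only for near-ties that nobody has reviewed and justified.
const unreviewedTie = tie !== null && !isAllowlisted(tie, overlapAllowlist);
```

Requiring a `reason` on each allowlist entry keeps accepted ambiguities documented, so they stay deliberate decisions and cannot be suppressed silently.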

3. Deterministic Scoring Engine for Tests

Routing behavior is validated through a deterministic engine with:

  • Trigger type weighting (regex, contains, exact)
  • Context satisfaction penalties/boosts
  • Explicit priority boosts for strong signals
  • Confidence ranking visibility for winner and runner-up
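
The four properties above can be sketched as a small pure function. Everything numeric here is an assumption chosen for the example: the trigger weights, the context boost and penalty, and the rule that context and priority adjustments apply only once a trigger has matched. The types and function names are likewise hypothetical and are not taken from routing-engine.ts.

```typescript
type TriggerType = "exact" | "regex" | "contains";

interface Trigger {
  type: TriggerType;
  pattern: string;
}

interface WorkflowRule {
  name: string;
  triggers: Trigger[];
  requiredContext?: string[]; // context keys this workflow expects
  priorityBoost?: number; // explicit boost for strong signals
}

interface RankedWorkflow {
  name: string;
  score: number;
}

// Assumed weights for illustration only.
const TRIGGER_WEIGHTS: Record<TriggerType, number> = {
  exact: 10,
  regex: 6,
  contains: 3,
};
const CONTEXT_BOOST = 2;
const CONTEXT_PENALTY = 4;

function triggerMatches(trigger: Trigger, prompt: string): boolean {
  const text = prompt.trim().toLowerCase();
  const pattern = trigger.pattern.toLowerCase();
  switch (trigger.type) {
    case "exact":
      return text === pattern;
    case "contains":
      return text.indexOf(pattern) !== -1;
    case "regex":
      return new RegExp(trigger.pattern, "i").test(prompt);
  }
}

// Sum the weights of matching triggers, then adjust for context and priority.
function scoreWorkflow(
  rule: WorkflowRule,
  prompt: string,
  context: Record<string, string>
): number {
  let score = 0;
  for (const trigger of rule.triggers) {
    if (triggerMatches(trigger, prompt)) score += TRIGGER_WEIGHTS[trigger.type];
  }
  if (score === 0) return 0; // no trigger fired: not a candidate
  for (const key of rule.requiredContext ?? []) {
    score += key in context ? CONTEXT_BOOST : -CONTEXT_PENALTY;
  }
  return score + (rule.priorityBoost ?? 0);
}

// Highest score first; equal scores fall back to name order, so the same
// inputs always yield the same winner and runner-up.
function rankWorkflows(
  rules: WorkflowRule[],
  prompt: string,
  context: Record<string, string>
): RankedWorkflow[] {
  return rules
    .map((rule) => ({ name: rule.name, score: scoreWorkflow(rule, prompt, context) }))
    .sort((a, b) => b.score - a.score || (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
}

// Example rules (invented for this sketch).
const exampleRules: WorkflowRule[] = [
  {
    name: "deploy",
    triggers: [
      { type: "regex", pattern: "\\bdeploy(ing|ed)?\\b" },
      { type: "contains", pattern: "release" },
    ],
    requiredContext: ["environment"],
  },
  {
    name: "rollback",
    triggers: [
      { type: "contains", pattern: "rollback" },
      { type: "contains", pattern: "release" },
    ],
    priorityBoost: 1,
  },
];

const ranking = rankWorkflows(exampleRules, "Deploy the release to staging", {
  environment: "staging",
});
const [winner, runnerUp] = ranking; // deploy scores 11, rollback scores 4
```

With these assumed weights, removing the `environment` key from the context drops `deploy` from 11 to 5 while `rollback` stays at 4. Exposing both winner and runner-up makes that kind of shrinking margin visible in a test.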

4. Change Management Rule

Trigger or routing logic changes are not considered complete unless corpus expectations are reviewed, overlap checks pass or allowlist entries are justified, and the full agent suite remains green.

Key References

  • .github/tests/fixtures/routing-corpus.json
  • .github/tests/fixtures/routing-overlap-allowlist.json
  • .github/tests/routing.test.ts
  • .github/tests/routing-overlap.test.ts
  • .github/tests/utils/routing-engine.ts

Trade-offs

Benefits:

  • Routing regressions are caught before merge.
  • Ambiguous-intent risks are visible and reviewable.
  • The effect of tuning a workflow trigger is measurable in CI.

Risks:

  • The corpus can become stale if it is not refreshed with real prompt patterns.
  • Overfitting to fixture prompts may miss unseen user phrasing.