feat: Phase 2 — community-friendly provider registry system #1195

Summary

  • Problem: After Phase 1, providers are cleanly extracted but still built-in only — the factory is a hardcoded switch, provider identity is closed ('claude' | 'codex'), and no extension path exists for community providers.
  • Why it matters: Community contributors cannot add a new provider without modifying core code. The closed union type propagates through schemas, config, deps, UI, and CLI.
  • What changed: Replace the factory switch with a typed ProviderRegistration registry. Each entry carries metadata (displayName, capabilities, isModelCompatible) alongside the factory function. Widen provider unions to string everywhere. Add GET /api/providers endpoint. Make all UI and CLI surfaces dynamic. Validate provider strings at config-entry surfaces.
  • What did not change (scope boundary): No community provider ships in this PR; adding one after merge will prove the system. No dynamic config forms for community providers. No out-of-tree plugin loading. Built-in auth flows and provider-specific UI panels remain hardcoded by design.
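
As a rough illustration of the registry described above, the following is a minimal sketch rather than the code in this PR. The field names (displayName, capabilities, isModelCompatible, factory) and UnknownProviderError come from this description; the exact signatures, the `Provider` shape, the helper names `registerProvider` and `createProvider`, the capability key, and the model-prefix check are all assumptions made for the example.

```typescript
// Hypothetical shapes: field names follow the PR summary, signatures are assumed.
interface Provider {
  readonly id: string;
}

interface ProviderRegistration {
  id: string;
  displayName: string;
  capabilities: Record<string, boolean>;
  isModelCompatible: (model: string) => boolean;
  factory: () => Provider;
}

class UnknownProviderError extends Error {
  constructor(id: string, available: string[]) {
    super(`Unknown provider '${id}'. Available: ${available.join(", ")}`);
    this.name = "UnknownProviderError";
  }
}

// Provider ids are plain strings, so the registry is open to new entries.
const registry = new Map<string, ProviderRegistration>();

function registerProvider(entry: ProviderRegistration): void {
  registry.set(entry.id, entry);
}

// Replaces a hardcoded switch: look up the entry, then call its factory.
function createProvider(id: string): Provider {
  const entry = registry.get(id);
  if (!entry) {
    throw new UnknownProviderError(id, Array.from(registry.keys()));
  }
  return entry.factory();
}

registerProvider({
  id: "claude",
  displayName: "Claude (Anthropic)",
  capabilities: { exampleCapability: true }, // placeholder capability name
  isModelCompatible: (model) => model.startsWith("claude"), // placeholder rule
  factory: () => ({ id: "claude" }),
});
```

Under these assumptions, a community provider would only need its own `registerProvider` call; no core switch statement has to change.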

UX Journey

Before

User                      Archon                         Config/UI
────                      ──────                         ─────────
sets provider ──────────▶ factory.ts switch(type)
                          case 'claude': new Claude()
                          case 'codex': new Codex()
                          default: throw

config YAML ────────────▶ z.enum(['claude','codex'])     hardcoded dropdowns
                          fixed assistants.claude/codex   fixed Settings fields

After

User                      Archon                         Config/UI
────                      ──────                         ─────────
sets provider ──────────▶ registry.get(id)
                          looks up ProviderRegistration
                          calls entry.factory()
                          unknown → UnknownProviderError

config YAML ────────────▶ z.string() + registry check    GET /api/providers
                          generic assistants[provider]    dynamic dropdowns
                          ProviderDefaultsMap             dynamic Settings

Architecture Diagram

Before

@archon/providers/factory.ts ──▶ switch('claude'|'codex') ──▶ ClaudeProvider / CodexProvider
@archon/workflows/schemas ──────▶ z.enum(['claude','codex'])
@archon/core/config-types ──────▶ assistants: { claude: ...; codex: ... }
@archon/web/SettingsPage ───────▶ hardcoded <option value="claude"> / <option value="codex">
model-validation.ts ────────────▶ isClaudeModel() hardcoded logic

After

[+] @archon/providers/registry.ts ──▶ Map<string, ProviderRegistration> ──▶ factory()
[-] @archon/providers/factory.ts (deleted)
[~] @archon/workflows/schemas ──────▶ z.string().min(1)
[~] @archon/core/config-types ──────▶ assistants: ProviderDefaultsMap & { claude; codex }
[~] @archon/web/SettingsPage ───────▶ fetches GET /api/providers → dynamic dropdowns
[~] model-validation.ts ────────────▶ delegates to registry isModelCompatible()
[+] @archon/server/routes/api.ts ───▶ GET /api/providers endpoint
[+] @archon/web/hooks/useProviders ─▶ React hook for provider list
[~] @archon/workflows/validator.ts ─▶ capability-driven warnings (not hardcoded Codex checks)
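
The last two [~] entries (model validation delegating to the registry, and capability-driven warnings) could look roughly like the sketch below. It is illustrative only: the capability names `featureA` and `featureB`, the model predicates, and the function name `capabilityWarnings` are placeholders, not the real ones.

```typescript
// Hypothetical subset of a registration; only what validation needs.
interface RegistrationLike {
  displayName: string;
  capabilities: Record<string, boolean>;
  isModelCompatible: (model: string) => boolean;
}

// Placeholder entries; real capability flags and model rules are not shown in the PR text.
const registrations = new Map<string, RegistrationLike>([
  [
    "claude",
    {
      displayName: "Claude (Anthropic)",
      capabilities: { featureA: true, featureB: true },
      isModelCompatible: (model) => model.startsWith("claude"),
    },
  ],
  [
    "codex",
    {
      displayName: "Codex (OpenAI)",
      capabilities: { featureA: true, featureB: false },
      isModelCompatible: (model) => !model.startsWith("claude"),
    },
  ],
]);

// Model validation asks the registered entry instead of hardcoding per-provider logic.
function isModelCompatible(providerId: string, model: string): boolean {
  const entry = registrations.get(providerId);
  return entry ? entry.isModelCompatible(model) : false;
}

// Warnings are derived from capability flags, not from checks like `provider === 'codex'`.
function capabilityWarnings(providerId: string, required: string[]): string[] {
  const entry = registrations.get(providerId);
  if (!entry) {
    return [`Provider '${providerId}' is not registered`];
  }
  return required
    .filter((cap) => !entry.capabilities[cap])
    .map((cap) => `${entry.displayName} does not support '${cap}'`);
}
```

With this shape, a new provider gets correct warnings by declaring its capabilities, without touching the validator.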

Connection inventory:

From                 To                              Status    Notes
────                 ──                              ──────    ─────
registry.ts          ClaudeProvider / CodexProvider  new       Via registerBuiltinProviders()
server/index.ts      registry                        new       Bootstrap at startup
cli/cli.ts           registry                        new       Bootstrap at startup
config-loader.ts     registry                        new       Validates provider strings
model-validation.ts  registry                        new       Delegates compatibility checks
validator.ts         registry                        new       Capability-driven warnings
GET /api/providers   registry                        new       Returns ProviderInfo[]
Web UI dropdowns     /api/providers                  new       Dynamic provider lists
factory.ts           providers                       removed   Replaced by registry
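
For the GET /api/providers row, one plausible way to derive ProviderInfo[] from registry entries is sketched below. The ProviderInfo fields are an assumption (id, displayName, capabilities), and the helper names `toProviderInfo` and `listProviders` are hypothetical; the point is only that function-valued fields such as factory are dropped before serialization.

```typescript
// Hypothetical registration shape, matching the fields named in the summary.
interface ProviderRegistration {
  id: string;
  displayName: string;
  capabilities: Record<string, boolean>;
  isModelCompatible: (model: string) => boolean;
  factory: () => unknown;
}

// Assumed wire shape: only JSON-serializable metadata.
interface ProviderInfo {
  id: string;
  displayName: string;
  capabilities: Record<string, boolean>;
}

function toProviderInfo(entry: ProviderRegistration): ProviderInfo {
  return {
    id: entry.id,
    displayName: entry.displayName,
    capabilities: { ...entry.capabilities }, // copy so callers cannot mutate the registry
  };
}

// What a route handler could return as its JSON body.
function listProviders(entries: ProviderRegistration[]): ProviderInfo[] {
  return entries.map(toProviderInfo);
}

const sampleEntries: ProviderRegistration[] = [
  {
    id: "claude",
    displayName: "Claude (Anthropic)",
    capabilities: { featureA: true }, // placeholder capability name
    isModelCompatible: () => true,
    factory: () => ({}),
  },
];

const responseBody = JSON.stringify(listProviders(sampleEntries));
```

A UI dropdown can then render `displayName` as the label and `id` as the value for each returned item.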

Label Snapshot

  • Risk: risk: medium
  • Size: size: L
  • Scope: multi
  • Module: providers:registry, core:config, workflows:executor, server:api, web:settings, cli:setup

Change Metadata

  • Change type: feature
  • Primary scope: multi

Linked Issue

  • Related: Phase 2 of provider extraction (follows Phase 1 in #1185)

Validation Evidence (required)

bun run validate  # type-check + lint + format + tests — all pass
  • bun run type-check — all 10 packages pass
  • bun run lint --max-warnings 0 — 0 errors, 0 warnings
  • bun run format:check — all files pass
  • bun run test (per-package isolation) — all tests pass
  • Note: Running bun test with multiple test files in a single process triggers mock.module pollution (documented in CLAUDE.md). Always use bun run test for isolation.

Security Impact (required)

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? No
  • File system access scope changed? No

Compatibility / Migration

  • Backward compatible? Yes — existing 'claude' and 'codex' values in YAML configs continue to work unchanged
  • Config/env changes? No — existing config keys are preserved. DEFAULT_AI_ASSISTANT env var now validates against registry (errors on unknown provider instead of silently accepting)
  • Database migration needed? No
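
The DEFAULT_AI_ASSISTANT behavior change can be pictured with the sketch below. It assumes a hypothetical helper `resolveDefaultProvider` that receives the environment, the registered ids, and a fallback; the real config loader may be structured differently, and the error wording here only approximates the "not a registered provider" message.

```typescript
// Hypothetical helper: validates the env var against registered provider ids.
function resolveDefaultProvider(
  env: Record<string, string | undefined>,
  registeredIds: string[],
  fallback: string,
): string {
  const raw = env["DEFAULT_AI_ASSISTANT"];
  if (raw === undefined || raw === "") {
    return fallback; // unset: keep the existing default
  }
  if (registeredIds.indexOf(raw) === -1) {
    // Previously an unknown value was silently accepted; now it fails loudly.
    throw new Error(
      `DEFAULT_AI_ASSISTANT='${raw}' is not a registered provider. ` +
        `Available: ${registeredIds.join(", ")}`,
    );
  }
  return raw;
}
```

Existing values such as 'claude' and 'codex' pass through unchanged, while a typo like 'cluade' raises an error that lists the available providers.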

Human Verification (required)

  • Verified scenarios: Type check, lint, format, all test suites pass via bun run validate
  • Edge cases checked: Unknown provider in config (throws), unknown provider in env var (throws), model/provider compatibility via registry, idempotent bootstrap
  • What was not verified: Manual browser testing of Settings page and workflow builder dropdowns (requires running dev server + web UI)

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: Provider instantiation, config loading, workflow execution, web UI settings, CLI setup wizard, workflow resource validation
  • Potential unintended effects: DEFAULT_AI_ASSISTANT env var now fails loudly on unknown provider (was silently accepted before). Workflow YAML with provider: field now accepts any string (was z.enum(['claude','codex']))
  • Guardrails/monitoring for early detection: Unknown providers throw at config load time with clear error listing available providers. Registry bootstrap is called defensively in loadConfig() as well as entrypoints.
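
The defensive bootstrap mentioned in the guardrails is safe only if registration is idempotent. A minimal sketch of that property follows; registerBuiltinProviders() and loadConfig() are named in this PR, but their bodies and signatures here are assumptions.

```typescript
interface BuiltinEntry {
  id: string;
  displayName: string;
}

const BUILTINS: BuiltinEntry[] = [
  { id: "claude", displayName: "Claude (Anthropic)" },
  { id: "codex", displayName: "Codex (OpenAI)" },
];

const registry = new Map<string, BuiltinEntry>();

// Idempotent: calling this from the server, the CLI, and loadConfig() is harmless.
function registerBuiltinProviders(): void {
  for (const entry of BUILTINS) {
    if (!registry.has(entry.id)) {
      registry.set(entry.id, entry);
    }
  }
}

// Assumed signature; the real loadConfig() reads YAML and env rather than a string.
function loadConfig(provider: string): { provider: string } {
  registerBuiltinProviders(); // defensive bootstrap in case no entrypoint ran it
  if (!registry.has(provider)) {
    const available = Array.from(registry.keys()).join(", ");
    throw new Error(`'${provider}' is not a registered provider. Available: ${available}`);
  }
  return { provider };
}
```

Because a second call finds every id already present, repeated bootstraps never duplicate or overwrite entries.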

Rollback Plan (required)

  • Fast rollback command/path: git revert the three commits on this branch
  • Feature flags or config toggles: None — but the change is backward compatible (existing configs work unchanged)
  • Observable failure symptoms: Config load failures mentioning "not a registered provider", missing providers in UI dropdowns, workflow execution errors on provider lookup

Risks and Mitigations

  • Risk: DEFAULT_AI_ASSISTANT env var behavior change is breaking for users with typos in their env
    • Mitigation: Error message clearly names the unknown provider and lists available options. This is intentional — silent acceptance was the actual footgun.
  • Risk: Stale generated types if server isn't running during generate:types
    • Mitigation: Types regenerated as part of this PR from the updated OpenAPI spec.

Summary by CodeRabbit

  • New Features

    • GET /api/providers endpoint listing available AI providers and capabilities.
    • UI now loads provider options dynamically; added provider-listing hook for components.
  • Improvements

    • Configuration, setup, and validation now require/validate provider IDs against the registry; defaults derive from registered providers.
    • Capability-driven compatibility checks and warnings replace hardcoded provider assumptions.
  • Tests

    • New/updated tests covering registry, API responses, and registry-driven model validation.
  • Documentation

    • Docs updated to state provider IDs must match registered providers.
cutter bot commented just now

Cutter Summary

The Settings page assistant dropdown and the workflow builder's toolbar provider dropdown both list "Claude" and "Codex" as plain labels, without the parenthesized vendor names ("Claude (Anthropic)", "Codex (OpenAI)") that the registry-driven version would display.