ADR-0030: Voice Canonical Runtime Surface

Date: 2026-03-13
Status: Proposed
Author: Architect Agent
Supersedes: N/A

1. Context

The Jorvis voice capability is implemented across two runtime surfaces:

  1. the NestJS VoiceModule under analytics-platform/src/voice/**
  2. the standalone voice-gateway/ runtime used for Gemini Live bridge flows

Both surfaces are real and useful, but they are not equivalent:

  • VoiceModule owns provider adapters, fallback chains, REST-facing STT/TTS endpoints, shared cost tracking, and application-local voice services
  • voice-gateway owns a standalone realtime runtime optimized for Gemini Live websocket/audio bridging and is already included in local/prod deployment surfaces

Neither the repository nor the current docs define a single canonical production runtime surface for future voice work. That creates drift risk in:

  • route ownership
  • realtime contract evolution
  • auth and operator expectations
  • health/version checks
  • future deploy and rollback assumptions

Without a canonical runtime-surface decision, future voice work is likely to extend both paths independently, so the two surfaces drift further apart.

2. Decision

The standalone voice-gateway is the canonical production runtime surface for realtime voice interaction.

The NestJS VoiceModule remains the canonical internal application voice surface for:

  • shared provider adapters
  • fallback chains
  • REST-oriented STT/TTS endpoints
  • cost tracking and supporting services
  • application-local voice integration points

2.1 Canonical Boundary

Future work must treat these boundaries as authoritative:

  1. Standalone voice-gateway
    • canonical public/prod realtime ingress
    • websocket/audio-session runtime
    • runtime health surface for realtime voice
  2. NestJS VoiceModule
    • canonical internal/provider orchestration layer
    • REST-compatible transcription/speech surfaces
    • reusable application voice services
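
The boundary above can be sketched as a typed ownership map, which makes "who owns this concern" a lookup rather than a judgment call. This is an illustrative sketch only: the type names, concern labels, and helper functions are hypothetical and do not exist in the Jorvis codebase. Only the assignment of concerns to surfaces is taken from the list above.

```typescript
// Hypothetical identifiers: none of these names come from the Jorvis repo.
type Surface = "voice-gateway" | "VoiceModule";

type Concern =
  | "realtime-ingress"
  | "websocket-audio-session"
  | "realtime-health"
  | "provider-orchestration"
  | "rest-stt-tts"
  | "application-voice-services";

// Exactly one owner per concern, mirroring the boundary list in 2.1.
// Record<Concern, Surface> makes the compiler reject a missing concern.
const CANONICAL_OWNER: Record<Concern, Surface> = {
  "realtime-ingress": "voice-gateway",
  "websocket-audio-session": "voice-gateway",
  "realtime-health": "voice-gateway",
  "provider-orchestration": "VoiceModule",
  "rest-stt-tts": "VoiceModule",
  "application-voice-services": "VoiceModule",
};

// Returns the surface that is authoritative for the given concern.
function ownerOf(concern: Concern): Surface {
  return CANONICAL_OWNER[concern];
}

// Lists every concern assigned to one surface.
function concernsOwnedBy(surface: Surface): Concern[] {
  return (Object.keys(CANONICAL_OWNER) as Concern[]).filter(
    (concern) => CANONICAL_OWNER[concern] === surface,
  );
}
```

A map of this kind could back a review checklist or a lint rule, but this ADR does not require or authorize adding one.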

2.2 Non-Authorization Rule

This ADR does not authorize implementation by itself.

It does not issue GO for:

  • consolidating the two runtimes into one
  • removing one runtime
  • changing production routing
  • changing auth/session behavior
  • changing deployment topology

Any such work requires a fresh Stage 0 and fresh GO on the exact review head.
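
As a minimal sketch, the exclusion list above could be encoded so that a proposed change is checked against it mechanically. The change-kind labels and the function name are hypothetical and are not an existing Jorvis tool or process. The check only reports whether a change falls into one of the five categories listed above. It makes no claim about whether any other kind of change is authorized.

```typescript
// Hypothetical labels for the five categories this ADR explicitly
// does not authorize, plus a catch-all for anything else.
type ChangeKind =
  | "consolidate-runtimes"
  | "remove-runtime"
  | "change-production-routing"
  | "change-auth-or-session-behavior"
  | "change-deployment-topology"
  | "other";

// The five categories named in section 2.2.
const EXCLUDED_BY_THIS_ADR: ReadonlySet<ChangeKind> = new Set<ChangeKind>([
  "consolidate-runtimes",
  "remove-runtime",
  "change-production-routing",
  "change-auth-or-session-behavior",
  "change-deployment-topology",
]);

// True when at least one part of the proposed change is in a category
// that section 2.2 says needs a fresh Stage 0 and a fresh GO.
function touchesExcludedCategory(kinds: ChangeKind[]): boolean {
  return kinds.some((kind) => EXCLUDED_BY_THIS_ADR.has(kind));
}
```

For example, a proposal tagged with both "other" and "change-production-routing" would be flagged, because one of its parts is on the exclusion list.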

3. Alternatives Considered

  • Treat NestJS voice as the only canonical runtime. Rejected: it does not match the current state, in which a standalone realtime gateway already exists and is deployed.
  • Treat both surfaces as equal long-term primaries. Rejected: this preserves architectural ambiguity and future drift.
  • Merge both runtimes immediately. Rejected: too broad for the current architecture decision and not justified by a fresh problem statement.

4. Consequences

Positive

  • future voice work gets a clear primary production runtime surface
  • realtime and non-realtime voice evolution can be reasoned about separately
  • docs and deploy expectations become easier to normalize later

Negative

  • current dual-surface architecture remains in place for now
  • a later cleanup/consolidation decision is still required if Jorvis wants one canonical implementation path

5. References

  • analytics-platform/src/voice/voice.module.ts
  • analytics-platform/src/voice/audio.controller.ts
  • analytics-platform/src/voice/realtime.gateway.ts
  • voice-gateway/server.js
  • deploy/docker-compose.local.yml
  • deploy/docker-compose.prod.yml
  • docs/architecture/VOICE_PLATFORM.md