ADR 0002: Graph/State-Machine Orchestration for Text2SQL Pipeline

Status

Accepted

Context

Jorvis has a multi-step pipeline (context → SQL generation → validation → execution → synthesis → optional visualization/metadata). Imperative logic in a single "long service" hampers maintainability and increases the risk of invariant drift.

LangGraph illustrates the benefits of explicit orchestration: durable execution, human-in-the-loop, memory, and observability.

In Jorvis, we don't want to pull Python frameworks into the runtime. We need the pattern, not the dependency.

Decision

Refactor Text2SQL orchestration into an explicit state machine (graph) in TypeScript:

  • nodes = pipeline steps (resolve profile/dialect, build schema context, generate SQL, validate, execute, synthesize, optional repair, optional visualize)
  • edges = explicit transitions based on results (OK/ERROR/TIMEOUT/VALIDATION_FAIL)
  • state = minimal data set carried between steps (no PII/secrets).
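The nodes/edges split above can be sketched as a typed transition table. This is a minimal illustration, not the Jorvis implementation: the identifiers (Step, PipelineEvent, Next, transitions, nextStep) and the concrete routing are assumptions. In particular, the conditions that decide whether the optional repair and visualize steps run are simplified here.

```typescript
// Hypothetical names throughout; the ADR does not prescribe these identifiers.
type Step =
  | "resolve_profile"
  | "build_schema_context"
  | "generate_sql"
  | "validate"
  | "execute"
  | "synthesize"
  | "repair"
  | "visualize";

type PipelineEvent = "OK" | "ERROR" | "TIMEOUT" | "VALIDATION_FAIL";

// Terminal targets are modelled as plain strings next to the steps.
type Next = Step | "DONE" | "FAILED";

// Edges: each (step, event) pair maps to at most one target.
// Assumed routing: a validation failure goes to repair, repair loops back to validate.
const transitions: Record<Step, Partial<Record<PipelineEvent, Next>>> = {
  resolve_profile: { OK: "build_schema_context" },
  build_schema_context: { OK: "generate_sql" },
  generate_sql: { OK: "validate" },
  validate: { OK: "execute", VALIDATION_FAIL: "repair" },
  execute: { OK: "synthesize" },
  synthesize: { OK: "visualize" },
  repair: { OK: "validate" },
  visualize: { OK: "DONE" },
};

// Any (step, event) pair without an explicit edge fails closed.
function nextStep(step: Step, event: PipelineEvent): Next {
  return transitions[step][event] ?? "FAILED";
}
```

Keeping the edges in one table means an unlisted combination (for example a TIMEOUT during execute) ends the run instead of falling through to the next step.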

Scope / Non-goals

  • Do not add a new external orchestration framework in v1.
  • Do not change public API (OpenAI-compat endpoints) within this ADR.
  • Do not create a sidecar security gate.

Implementation sketch

  1. Introduce ConversationState type (request metadata + db_profile_id + target_dialect + sql_fingerprint + step outcomes).
  2. Extract each step into a pure-ish handler (input state → output state + event).
  3. Add a minimal state machine runner (switch/event loop) with well-defined stop conditions.
  4. Observability: log step, event, latency_ms, taxonomy, without raw SQL/PII.
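The four steps above could fit together roughly as follows. This is a sketch under stated assumptions: every identifier is hypothetical, the step set is reduced, handlers are synchronous for brevity (real handlers would most likely be async), and the taxonomy log field is left out. The stop conditions assumed here are a terminal target, a missing edge, or a step budget that bounds repair loops.

```typescript
// Sketch only: all identifiers below are illustrative assumptions, not Jorvis code.
type Step = "generate_sql" | "validate" | "repair" | "execute" | "synthesize";
type PipelineEvent = "OK" | "ERROR" | "TIMEOUT" | "VALIDATION_FAIL";
type Next = Step | "DONE" | "FAILED";

// One log/outcome record per executed step: no raw SQL, no PII.
interface StepOutcome {
  step: Step;
  event: PipelineEvent;
  latency_ms: number;
}

interface ConversationState {
  request_id: string;
  db_profile_id: string;
  target_dialect: string;
  sql_fingerprint?: string; // a hash of the SQL, never the raw text
  outcomes: StepOutcome[];
}

// A handler maps the input state to an output state plus the event
// that selects the outgoing edge.
type Handler = (state: ConversationState) => {
  state: ConversationState;
  event: PipelineEvent;
};

type Transitions = Record<Step, Partial<Record<PipelineEvent, Next>>>;

function run(
  start: Step,
  initial: ConversationState,
  handlers: Record<Step, Handler>,
  transitions: Transitions,
  maxSteps = 16, // stop condition: bounds repair loops
  log: (entry: StepOutcome) => void = () => {},
): { final: "DONE" | "FAILED"; state: ConversationState } {
  let state = initial;
  let step: Step = start;
  for (let i = 0; i < maxSteps; i++) {
    const startedAt = Date.now();
    const result = handlers[step](state);
    const outcome: StepOutcome = {
      step,
      event: result.event,
      latency_ms: Date.now() - startedAt,
    };
    log(outcome);
    state = { ...result.state, outcomes: [...result.state.outcomes, outcome] };
    // Missing edge means fail closed.
    const next: Next = transitions[step][result.event] ?? "FAILED";
    if (next === "DONE" || next === "FAILED") {
      return { final: next, state };
    }
    step = next;
  }
  // Step budget exhausted without reaching a terminal target.
  return { final: "FAILED", state };
}
```

Because each handler only sees and returns ConversationState, a recorded state can be fed back into run with a later start step to reproduce a failure at that step.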

Why (benefits)

  • Less drift: invariants live on transition boundaries.
  • Easier to add lanes (eval runner, repair, RAG context, viz) without breaking the core.
  • Better diagnostics: can reproduce state and stop at a specific step.

Risks

  • Refactoring can be "broad": needs phasing and regression coverage.
  • The refactor must align with the Text2SQL eval runner DoD (execution-based); without it there is no evidence that behavior is preserved.

DoD / Evidence

  • E2E smoke (AdventureWorks) + Text2SQL eval runner PASS.
  • Comparative metrics no worse than the pre-refactor baseline (exec_success_rate not lower, safety_violation_rate not higher).
  • Release-gate PASS.
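The comparative-metrics criterion could be checked mechanically along these lines. The metric names come from this ADR; the EvalMetrics shape, the notWorse helper, and the tolerance parameter are assumptions for illustration, and the values in the usage lines are made up.

```typescript
// Hypothetical shape: the ADR names the two metrics but not how they are stored.
interface EvalMetrics {
  exec_success_rate: number; // higher is better
  safety_violation_rate: number; // lower is better
}

// Returns true when the candidate is no worse than the baseline on both metrics.
// tolerance is an assumed knob for run-to-run noise; 0 means strict comparison.
function notWorse(
  baseline: EvalMetrics,
  candidate: EvalMetrics,
  tolerance = 0,
): boolean {
  return (
    candidate.exec_success_rate >= baseline.exec_success_rate - tolerance &&
    candidate.safety_violation_rate <= baseline.safety_violation_rate + tolerance
  );
}

// Illustrative values only, not measured results.
const baseline: EvalMetrics = { exec_success_rate: 0.9, safety_violation_rate: 0.01 };
const candidate: EvalMetrics = { exec_success_rate: 0.92, safety_violation_rate: 0.01 };
const gatePasses = notWorse(baseline, candidate);
```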