Federation Module Architecture
Namespace: src/data/federation/
Status: ✅ Production Ready (Task-067)
Key Service: FederatedQueryService
The Federation module enables "Chat with any Data" by allowing Jorvis to execute queries across multiple disparate data sources (e.g., PostgreSQL + Google Sheets + Snowflake) and merge the results in-memory.
🏗️ Core Architecture
The federation engine follows a Plan-Execute-Merge pipeline:
- Query Decomposition: A complex question is broken down into a FederatedQueryPlan.
- Topological Sort: Sub-queries are ordered based on dependencies.
- Parallel Execution: Independent sub-queries are executed concurrently.
- In-Memory Merge: Results are combined using a specified strategy.
graph TD
A[FederatedQueryPlan] --> B{Topological Sort}
B --> C[Batch 1: Independent Queries]
C --> D[Batch 2: Dependent Queries]
D --> E[Merge Strategy]
E --> F[FederatedResult]
1. FederatedQueryService
The main orchestrator that manages the execution lifecycle.
- Concurrency: Controlled by JORVIS_FEDERATION_MAX_PARALLEL (default: 5).
- Timeout: Fails fast if sources are unresponsive (JORVIS_FEDERATION_TIMEOUT_MS).
- Resilience: Handles partial failures if configured.
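To make the concurrency and resilience points concrete, here is a minimal sketch of a bounded-parallel runner. It is not the FederatedQueryService implementation: the names `runWithLimit` and `Settled` are hypothetical, and the sketch assumes that "partial failures" means a failed sub-query is recorded while the remaining ones keep running.

```typescript
// Hypothetical result wrapper: one entry per task, success or failure.
type Settled<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Runs the given tasks with at most `limit` in flight at any time.
// A failing task is captured in the result array instead of aborting the rest.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<Settled<T>[]> {
  const results: Settled<T>[] = new Array(tasks.length);
  let next = 0;

  // Each worker claims the next unstarted task until none remain.
  const worker = async (): Promise<void> => {
    while (next < tasks.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await tasks[index]() };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  };

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With `limit` set from JORVIS_FEDERATION_MAX_PARALLEL, a caller could inspect the returned array and decide whether a partial result is acceptable or the whole plan should fail.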
2. Execution Pipeline
- Logic: execute(plan, executor, options)
- Timeout Guard: Each sub-query is wrapped in a promise race with a timeout.
- Dependency Resolution: A topological sort ensures that sub-queries with upstream dependencies wait until the data they depend on is available.
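The timeout guard and dependency ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: `withTimeout` and `toBatches` are hypothetical helper names, and the batching uses a simple level-by-level variant of Kahn's algorithm, which may differ from the sort the service actually uses.

```typescript
interface SubQuery {
  id: string;
  connectionId: string;
  sql: string;
  dependencies?: string[];
}

// Races the work against a timer; rejects if the timer fires first.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Groups sub-queries into batches: every query in a batch depends only on
// queries from earlier batches, so a batch can run in parallel.
function toBatches(subQueries: SubQuery[]): SubQuery[][] {
  const done = new Set<string>();
  const batches: SubQuery[][] = [];
  let remaining = [...subQueries];

  while (remaining.length > 0) {
    const ready = remaining.filter((q) =>
      (q.dependencies ?? []).every((dep) => done.has(dep)),
    );
    if (ready.length === 0) {
      throw new Error("Dependency cycle or unknown dependency id in plan");
    }
    ready.forEach((q) => done.add(q.id));
    batches.push(ready);
    remaining = remaining.filter((q) => !done.has(q.id));
  }
  return batches;
}
```

An orchestrator built on these helpers would iterate over the batches in order and wrap each sub-query's promise in `withTimeout` before awaiting the batch.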
🔄 Merge Strategies
The module supports 4 strategies for combining data from different sources:
| Strategy | Description | Use Case |
|---|---|---|
| UNION | Combines rows from all sources, removing duplicates. | Merging "Sales" tables from US and EU databases. |
| CONCAT | Appends all rows, preserving duplicates. | Logging or raw data aggregation. |
| JOIN | Nested-loop join on common keys. | Enriching "Orders" (SQL) with "Customer Details" (CRM/API). |
| AGGREGATE | Sums numeric values across datasets. | Total revenue across different payment gateways. |
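The four strategies in the table can be illustrated with a small sketch. The function names and the `Row` type are hypothetical, and several details are assumptions the table does not specify: duplicates are detected by comparing all columns, JOIN is treated as an inner join with right-hand columns overwriting same-named left-hand ones, and AGGREGATE sums each numeric column across all rows.

```typescript
type Row = Record<string, unknown>;

// CONCAT: append all rows, keeping duplicates.
function concatRows(sets: Row[][]): Row[] {
  return ([] as Row[]).concat(...sets);
}

// Stable key for a row, independent of column order (assumed equality rule).
function rowKey(row: Row): string {
  return JSON.stringify(Object.keys(row).sort().map((k) => [k, row[k]]));
}

// UNION: like CONCAT, but rows with identical columns and values appear once.
function unionRows(sets: Row[][]): Row[] {
  const seen = new Set<string>();
  const out: Row[] = [];
  for (const row of concatRows(sets)) {
    const key = rowKey(row);
    if (!seen.has(key)) {
      seen.add(key);
      out.push(row);
    }
  }
  return out;
}

// JOIN: nested-loop inner join on the given key columns.
function nestedLoopJoin(left: Row[], right: Row[], keys: string[]): Row[] {
  const out: Row[] = [];
  for (const l of left) {
    for (const r of right) {
      if (keys.every((k) => l[k] === r[k])) out.push({ ...l, ...r });
    }
  }
  return out;
}

// AGGREGATE: sum every numeric column across all datasets.
function aggregateRows(sets: Row[][]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const row of concatRows(sets)) {
    for (const col of Object.keys(row)) {
      const value = row[col];
      if (typeof value === "number") totals[col] = (totals[col] ?? 0) + value;
    }
  }
  return totals;
}
```

A nested-loop join costs on the order of left rows times right rows, which is worth keeping in mind since the merge runs in memory.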
⚙️ Configuration
| Environment Variable | Default | Description |
|---|---|---|
| JORVIS_FEDERATION_ENABLED | false | Master toggle for the module. |
| JORVIS_FEDERATION_MAX_PARALLEL | 5 | Max concurrent DB connections. |
| JORVIS_FEDERATION_TIMEOUT_MS | 30000 | Global timeout for the entire plan. |
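As an illustration of how these variables might be read, here is a sketch of a config loader that applies the defaults from the table. `FederationConfig` and `loadFederationConfig` are hypothetical names, and the parsing rules (only the literal string "true" enables the module, and invalid or non-positive numbers fall back to the default) are assumptions rather than documented behavior.

```typescript
interface FederationConfig {
  enabled: boolean;
  maxParallel: number;
  timeoutMs: number;
}

// Takes the environment as a plain map so the function is easy to test;
// in a Node.js process the caller would pass process.env.
function loadFederationConfig(
  env: Record<string, string | undefined>,
): FederationConfig {
  // Parse a positive integer, falling back to the default on bad input.
  const toPositiveInt = (raw: string | undefined, fallback: number): number => {
    const n = Number.parseInt(raw ?? "", 10);
    return Number.isInteger(n) && n > 0 ? n : fallback;
  };

  return {
    enabled: env["JORVIS_FEDERATION_ENABLED"] === "true",
    maxParallel: toPositiveInt(env["JORVIS_FEDERATION_MAX_PARALLEL"], 5),
    timeoutMs: toPositiveInt(env["JORVIS_FEDERATION_TIMEOUT_MS"], 30000),
  };
}
```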
🧩 Data Structures
FederatedQueryPlan
interface FederatedQueryPlan {
  id: string;
  subQueries: SubQuery[];
  mergeStrategy: MergeStrategy;
}
SubQuery
interface SubQuery {
  id: string;
  connectionId: string; // Target Data Source
  sql: string;
  dependencies?: string[]; // IDs of queries that must finish first
}
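For orientation, a plan built from these interfaces might look like the sketch below. The `MergeStrategy` union is an assumed shape based on the four strategy names listed earlier, and the plan id, connection ids, SQL text, and the `missingDependencies` helper are all made up for illustration.

```typescript
// Assumed shape: the document names four strategies but does not show the type.
type MergeStrategy = "UNION" | "CONCAT" | "JOIN" | "AGGREGATE";

interface SubQuery {
  id: string;
  connectionId: string; // Target Data Source
  sql: string;
  dependencies?: string[]; // IDs of queries that must finish first
}

interface FederatedQueryPlan {
  id: string;
  subQueries: SubQuery[];
  mergeStrategy: MergeStrategy;
}

// Hypothetical plan: orders from one source, customer details from another.
const examplePlan: FederatedQueryPlan = {
  id: "plan-orders-with-customers",
  subQueries: [
    {
      id: "orders",
      connectionId: "postgres-main",
      sql: "SELECT customer_id, total FROM orders",
    },
    {
      id: "customers",
      connectionId: "crm-sheet",
      sql: "SELECT customer_id, name FROM customers",
      dependencies: ["orders"],
    },
  ],
  mergeStrategy: "JOIN",
};

// Lists dependency ids that do not match any sub-query in the plan.
function missingDependencies(plan: FederatedQueryPlan): string[] {
  const known = new Set(plan.subQueries.map((q) => q.id));
  const missing: string[] = [];
  for (const q of plan.subQueries) {
    for (const dep of q.dependencies ?? []) {
      if (!known.has(dep)) missing.push(dep);
    }
  }
  return missing;
}
```

Checking for unknown dependency ids before execution is one way to reject a malformed plan early, before any data source is contacted.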