Quickstart: 2-Agent Workflow for Phase Q Development

Version: 1.0
Date: 2026-01-19
Status: Production Ready (Blocked by Phase P)


Overview

The Phase Q automation system uses a 2-agent architecture:

  • Architect Agent (Claude Opus): Design ADRs, code review, error classification
  • Executor Agent (Gemini): Code generation, tests, documentation

Critical: Phase Q automation is blocked until Phase P (Task-102) is complete.


Prerequisites

1. Install Dependencies

# Python dependencies
pip install -r requirements-automation.txt

# Verify versions
python3 --version  # Python 3.11+
pip show langgraph langchain-anthropic langchain-google-genai

2. Setup API Keys

# Required keys
export ANTHROPIC_API_KEY="sk-ant-api03-..."  # Claude API key
export GOOGLE_API_KEY="AIza..."              # Gemini API key

# Verify keys
./scripts/verify_setup.sh

Important: System will not start without these keys.
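The key check performed by verify_setup.sh can be sketched in Python (a hypothetical helper; the real script may do more):

```python
import os

REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "GOOGLE_API_KEY")

def missing_keys(env) -> list:
    """Return the names of required API keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# Example: missing_keys(os.environ) lists whatever still needs exporting.
```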

3. Check Repository

# Check for uncommitted changes
./scripts/preflight.sh

# Check SSOT consistency
./scripts/detect-conflicts.sh

# Verify Phase P status
grep "Phase P" docs/handoff/HANDOFF_TO_NEXT_AGENT.md

Critical: If Phase P shows "BLOCKED", Phase Q automation cannot be run.
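The Phase P gate can also be checked programmatically. A minimal sketch, assuming the handoff document contains a line such as `Phase P: BLOCKED` or `Phase P: COMPLETE` (the exact wording is an assumption):

```python
import re

def phase_p_blocked(handoff_text: str) -> bool:
    """Return True if the handoff document marks Phase P as BLOCKED."""
    match = re.search(r"Phase P.*?(BLOCKED|COMPLETE)", handoff_text, re.IGNORECASE)
    # Fail closed: treat a missing or unrecognized status as blocked.
    return match is None or match.group(1).upper() == "BLOCKED"
```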


Workflow Modes

Mode 1: Single Task

Use Case: Execute a single Phase Q task with full control.

# Execute single task (creates isolated git branch)
python3 scripts/langgraph_workflow.py Task-105

# Monitor progress (the result JSON is written only when the task finishes)
tail -f artifacts/automation.log

# Check branch created
git branch | grep task-105

# Review generated code
git diff main..task-105-wikidata-sparql

Output:

  • Git branch: task-105-wikidata-sparql
  • Code: analytics-platform/src/mcp/tools/wikidata.tool.ts
  • Tests: analytics-platform/src/mcp/tools/wikidata.tool.spec.ts
  • Documentation: Updated in docs/
  • Result: artifacts/task_105_result.json

Next Steps:

  1. Review ADR in branch
  2. Run tests manually: npm --prefix analytics-platform test
  3. Create PR if quality gate passed
  4. Request Gatekeeper review

Mode 2: Multi-Task Pilot (Advanced)

Use Case: Batch execution of multiple Phase Q tasks (Q.2, Q.4-Q.7).

⚠️ Warning: Pilot mode creates 5+ git branches sequentially. Use only after successful Mode 1 test.

# Execute pilot (all automatable tasks)
python3 scripts/multi_task_runner.py

# Monitor progress
watch -n 5 'ls artifacts/task_*_result.json | wc -l'

# Check created branches
git branch | grep task-

Expected Branches:

  • task-105-wikidata-sparql
  • task-107-sportsdb-api
  • task-108-tmdb-movies
  • task-109-excel-upload
  • task-110-weather-data

Duration: ~2-4 hours (depends on API rate limits)

Rollback: System stops automatically if success rate <50%.
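The automatic-stop rule can be sketched as follows (illustrative only; `should_stop` and `load_results` are hypothetical names, not the runner's actual API):

```python
import json
from pathlib import Path

def should_stop(results: list) -> bool:
    """Stop the pilot when fewer than 50% of completed tasks succeeded."""
    if not results:
        return False
    successes = sum(1 for r in results if r.get("success"))
    return successes / len(results) < 0.5

def load_results(artifacts_dir: str = "artifacts") -> list:
    """Load every task_*_result.json produced so far."""
    return [json.loads(p.read_text())
            for p in sorted(Path(artifacts_dir).glob("task_*_result.json"))]
```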


Workflow Steps (Detailed)

Step 1: Architect Node (ADR Creation)

Agent: Claude Opus
Duration: 2-5 minutes
Input: Task ID (e.g., Task-105)
Output: ADR file in Markdown format

What Happens:

  1. Reads PHASE_Q_IMPLEMENTATION_PLAN.md for context
  2. Validates task against CLAUDE.md invariants
  3. Designs implementation approach
  4. Creates ADR with:
    • Problem statement
    • Proposed solution
    • Implementation plan
    • Testing strategy
    • Acceptance criteria

Success Criteria:

  • ADR contains all required sections
  • No conflicts with existing architecture
  • Follows NestJS + TypeScript patterns
  • Coverage ≥80% target set

Example ADR:

# ADR: WikiData SPARQL MCP Tool (Task-105)

## Problem
Need to query WikiData knowledge graph via SPARQL.

## Solution
Implement MCP tool with SPARQL endpoint wrapper.

## Implementation
- Create `wikidata.tool.ts` in `analytics-platform/src/mcp/tools/`
- Use `@modelcontextprotocol/sdk` for MCP integration
- SPARQL queries via `https://query.wikidata.org/sparql`
- Result caching (5min TTL)

## Testing
- Unit tests: SPARQL query builder
- Integration tests: Real WikiData queries
- Coverage: ≥80%

## Acceptance Criteria
- [ ] Tool registered in MCP server
- [ ] Query execution <2s (p95)
- [ ] Error handling for malformed SPARQL
- [ ] Documentation in README.md

Step 2: Executor Node (Code Generation)

Agent: Gemini
Duration: 3-7 minutes
Input: ADR + Task context
Output: TypeScript code + tests

What Happens:

  1. Parses ADR requirements
  2. Generates TypeScript code following style guide
  3. Creates Jest unit tests (≥80% coverage)
  4. Updates documentation
  5. Commits to feature branch

Code Structure:

analytics-platform/src/mcp/tools/
├── wikidata.tool.ts          # MCP tool implementation
├── wikidata.tool.spec.ts     # Unit tests
└── README.md                 # Updated documentation

Quality Standards:

  • Google TypeScript Style Guide
  • JSDoc comments on all public functions
  • No `any` types without justification
  • Proper error handling
  • Input validation

Step 3: Quality Gate (Verification)

Automated Checks:

  1. Lint: npm run lint (ESLint)
  2. Build: npm run build (TypeScript compilation)
  3. Test: npm test (Jest with coverage)
  4. Coverage: ≥80% line coverage enforcement

Duration: 2-5 minutes

Pass Criteria:

✅ Lint: 0 errors
✅ Build: Successful compilation
✅ Tests: 100% passing
✅ Coverage: 85% (≥80% required)
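The gate decision itself reduces to four booleans. A minimal sketch (assumed shape; the real checks are the npm commands listed above):

```python
from dataclasses import dataclass

@dataclass
class GateReport:
    lint_errors: int
    build_ok: bool
    tests_passed: bool
    line_coverage: float  # percent, e.g. 85.0

def gate_passed(report: GateReport, min_coverage: float = 80.0) -> bool:
    """All four checks must pass: lint, build, tests, coverage."""
    return (report.lint_errors == 0
            and report.build_ok
            and report.tests_passed
            and report.line_coverage >= min_coverage)
```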

Fail Handling:

  • Trivial errors (imports, unused vars): Auto-fix by Executor
  • Tactical errors (type mismatches, missing tests): Executor re-generates
  • Strategic errors (architectural issues): Escalate to Architect
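One way the Architect's classification could work is pattern matching on the error text. The patterns below are illustrative assumptions, not the actual rule set:

```python
# Hypothetical pattern lists; real classification is done by the Architect agent.
TRIVIAL_PATTERNS = ("is declared but its value is never read",
                    "Cannot find module",   # often just a missing import line
                    "Delete `")             # Prettier/ESLint formatting hints
TACTICAL_PATTERNS = ("Type '", "Expected", "Coverage")

def classify_error(message: str) -> str:
    """Map a Quality Gate error to trivial / tactical / strategic."""
    if any(p in message for p in TRIVIAL_PATTERNS):
        return "trivial"
    if any(p in message for p in TACTICAL_PATTERNS):
        return "tactical"
    return "strategic"  # anything unrecognized escalates to the Architect
```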

Step 4: Review Loop (Error Classification)

Max Iterations: 3-7 (adaptive based on complexity)

Flow:

  1. Quality Gate fails → Architect classifies error
  2. Trivial: Executor auto-fixes (import sorting, formatting)
  3. Tactical: Executor re-generates code section
  4. Strategic: Architect redesigns approach (rare)
  5. Quality Gate re-runs
  6. Repeat until pass OR max iterations reached

Example Loop:

Iteration 1: Test failure (missing mock) → Tactical → Executor adds mock
Iteration 2: Coverage 75% → Tactical → Executor adds edge case tests
Iteration 3: Coverage 82% → Pass ✅

Success Rate: 70%+ expected (based on design)

Rollback Trigger: If ≥5 iterations without progress, task fails.
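The loop above can be sketched with an injected gate function (illustrative only; the real workflow is implemented in `langgraph_workflow.py`):

```python
def review_loop(run_gate, fix, max_iterations: int = 7) -> dict:
    """Re-run the Quality Gate until it passes or iterations run out.

    run_gate() -> (passed: bool, errors: list[str])
    fix(errors) applies trivial/tactical fixes between iterations.
    """
    for iteration in range(1, max_iterations + 1):
        passed, errors = run_gate()
        if passed:
            return {"success": True, "iterations": iteration}
        fix(errors)
    return {"success": False, "iterations": max_iterations}
```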


Step 5: Output & PR Creation

Automated Output:

  • Git branch: task-{ID}-{feature-name}
  • Code + tests committed
  • Result JSON: artifacts/task_{ID}_result.json

Manual Steps (Human Required):

  1. Review ADR in branch

  2. Test code locally:

    git checkout task-105-wikidata-sparql
    npm --prefix analytics-platform test -- wikidata.tool.spec.ts
    npm --prefix analytics-platform run lint
    
  3. Create PR:

    gh pr create --title "feat(mcp): Add WikiData SPARQL tool (Task-105)" \
      --body "$(cat artifacts/task_105_adr.md)"
    
  4. Request Gatekeeper review in docs/agent_ops/GO_NO_GO.md


Safety & Rollback

Git Branch Isolation

Why: Isolated branches keep tasks from interfering with each other, and sequential execution avoids merge conflicts.

Mechanism:

  • Each task creates isolated branch from main
  • Branches never merge automatically
  • Human reviews all PRs before merge

Example:

main
├── task-105-wikidata-sparql  (PR #201)
├── task-107-sportsdb-api     (PR #202)
└── task-108-tmdb-movies      (PR #203)

Merge Order: Sequential (105 → 107 → 108) to avoid conflicts.


Rollback Scenarios

Scenario 1: Single Task Fails Quality Gate

Symptom: artifacts/task_105_result.json shows "success": false

Action:

# Delete failed branch
git branch -D task-105-wikidata-sparql

# Re-run with increased iterations
python3 scripts/langgraph_workflow.py Task-105 --max-iterations 10

Scenario 2: Pilot Mode Success Rate <50%

Symptom: Only 2/5 tasks passed Quality Gate

Action:

# Check results
cat artifacts/pilot_run_summary.json

# Manual execution of failed tasks
python3 scripts/langgraph_workflow.py Task-107  # Manual retry

Escalation: If failures persist, revert to manual development (no automation).


Scenario 3: Hallucination Detected (Gemini-Specific)

Symptom: Code references non-existent APIs or imports

Detection: Architect node validates imports against package.json

Action: Automatic re-generation with stricter prompt constraints.
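A minimal sketch of the import validation, assuming generated TypeScript is checked against the declared dependencies (hypothetical helper; the real Architect check may differ):

```python
import json
import re

def unknown_imports(source: str, package_json_text: str) -> list:
    """Return imported package names not declared in package.json."""
    pkg = json.loads(package_json_text)
    declared = set(pkg.get("dependencies", {})) | set(pkg.get("devDependencies", {}))
    imports = re.findall(r"""from ['"]([^'"]+)['"]""", source)
    bad = []
    for spec in imports:
        if spec.startswith("."):  # relative imports are resolved by the compiler
            continue
        # Scoped packages keep two path segments (@scope/name).
        parts = spec.split("/")
        name = "/".join(parts[:2]) if spec.startswith("@") else parts[0]
        if name not in declared:
            bad.append(name)
    return bad
```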


Emergency Stop

# Kill running automation
pkill -f langgraph_workflow.py

# Check orphaned branches
git branch | grep task-

# Delete orphaned branches
git branch | grep task- | xargs git branch -D

Monitoring & Debugging

Real-Time Progress

# Watch automation logs
tail -f artifacts/automation.log

# Check task status
jq '.status' artifacts/task_105_result.json

# Monitor API usage
grep "API call" artifacts/automation.log

Common Issues

Issue 1: API Rate Limits

Symptom: RateLimitError: Anthropic API rate limit exceeded

Fix:

# Wait 60 seconds
sleep 60

# Retry with exponential backoff
python3 scripts/langgraph_workflow.py Task-105 --retry

Issue 2: Coverage <80%

Symptom: Quality Gate fails with "Coverage: 75%"

Fix: Executor automatically adds tests in next iteration (up to 7 iterations).

Manual Override: Edit code locally if automation fails after 7 iterations.


Issue 3: Git Conflicts

Symptom: Branch creation or rebase fails due to conflicts with main

Fix:

# Rebase on latest main
git checkout task-105-wikidata-sparql
git rebase main

# Resolve conflicts manually
git mergetool

Success Criteria

Task Completion Checklist

  • Quality Gate passed (lint, build, test pass)
  • Coverage target ≥80% (enforced for new code, see Quality Gate)
  • Git branch created and pushed
  • ADR documentation in branch
  • Confidence score ≥30% (auto-blocks if <30%)
  • Result JSON shows "success": true
  • PR created with evidence
  • Gatekeeper review requested

Phase Q Completion Criteria

  • 8/8 tasks completed (Q.1-Q.8)
  • All PRs merged to main
  • Integration tests passing
  • Documentation updated
  • Demo scenarios validated

Documentation References

  • Architecture: docs/automation/ARCHITECTURE.md
  • Implementation Guide: docs/automation/IMPLEMENTATION_GUIDE.md
  • Decision Log: docs/automation/DECISION_LOG.md (10 ADRs)
  • Phase Q Plan: docs/agent_ops/plans/PHASE_Q_IMPLEMENTATION_PLAN.md
  • Enforcement Layers: AGENTS.md (Layer 6: Automation Workflow)

Support & Escalation

Automation Issues:

  1. Check artifacts/automation.log for errors
  2. Verify API keys: ./scripts/verify_setup.sh
  3. Re-run with debug flag: python3 scripts/langgraph_workflow.py Task-105 --debug

Strategic Issues:

  1. Escalate to Architect agent (human)
  2. Review ADR for architectural conflicts
  3. Update PHASE_Q_IMPLEMENTATION_PLAN.md if requirements changed

Emergency:

  • Stop automation: pkill -f langgraph_workflow.py
  • Rollback: git branch | grep task- | xargs git branch -D
  • Manual execution: Follow AGENTS.md manual workflow

Quickstart maintained by Automation Team. Last updated: 2026-01-19