4-Agent Architecture Implementation¶

Overview¶

The Statistics Agent Team now uses a 4-agent architecture with clear separation of concerns:

┌─────────────────────────────────────────────────────┐
│         Orchestration Agent (Eino/ADK)              │
│              Port 8003 / 8000                       │
└────────┬──────────────┬──────────────┬──────────────┘
         │              │              │
         ▼              ▼              ▼
┌────────────────┐ ┌──────────────┐ ┌────────────────┐
│   Research     │ │  Synthesis   │ │ Verification   │
│    Agent       │ │    Agent     │ │    Agent       │
│   Port 8001    │ │  Port 8004   │ │   Port 8002    │
│                │ │              │ │                │
│ - Web Search   │ │ - Fetch URLs │ │ - Verify URLs  │
│ - Find Sources │ │ - LLM Extract│ │ - Check Facts  │
│ - Filter       │ │ - Parse Stats│ │ - Validate     │
│   Reputable    │ │              │ │   Excerpts     │
└────────────────┘ └──────────────┘ └────────────────┘
        │                  │                  │
        ▼                  ▼                  ▼
   Serper/SerpAPI    Webpage Content    Source Validation

Agent Responsibilities¶

1. Research Agent (Port 8001)¶

Role: Source Discovery Technology: Web Search (Serper/SerpAPI) No LLM Required

Tasks: - Perform web searches via pkg/search service - Return URLs with metadata (title, snippet, domain) - Filter for reputable sources (.gov, .edu, research orgs) - Output: List of SearchResult objects

Files: - agents/research/main.go - Simplified to focus on search only - pkg/search/service.go - MetaSerp integration

API:

POST /research
{
  "topic": "climate change",
  "max_statistics": 20,
  "reputable_only": true
}

Response:
{
  "topic": "climate change",
  "candidates": [/* URLs as placeholder candidates */],
  "timestamp": "2025-12-13T10:30:00Z"
}

2. Synthesis Agent (Port 8004) ⭐ NEW¶

Role: Statistics Extraction Technology: ADK + LLM (Gemini/Claude/OpenAI/Ollama) LLM-Heavy

Tasks: - Fetch webpage content from URLs - Use LLM to intelligently analyze text and extract statistics - Extract numerical values, units, and context using structured prompts - Find verbatim excerpts containing statistics - Create candidate statistics with proper metadata - Output: List of CandidateStatistic objects

Files: - agents/synthesis/main.go - LLM-based extraction agent ✅ - Uses pkg/agent/base.go for shared LLM initialization - Full LLM integration with JSON output parsing - Supports all LLM providers via ADK

API:

POST /synthesize
{
  "topic": "climate change",
  "search_results": [
    {
      "url": "https://www.iea.org/...",
      "title": "Renewable Energy Report",
      "snippet": "...",
      "domain": "iea.org"
    }
  ],
  "min_statistics": 5,
  "max_statistics": 20
}

Response:
{
  "topic": "climate change",
  "candidates": [
    {
      "name": "Renewable energy growth",
      "value": 83,
      "unit": "%",
      "source": "iea.org",
      "source_url": "https://www.iea.org/...",
      "excerpt": "Renewable capacity grew by 83% in 2023..."
    }
  ],
  "sources_analyzed": 5,
  "timestamp": "2025-12-13T10:30:15Z"
}

3. Verification Agent (Port 8002)¶

Role: Fact Checking Technology: ADK + LLM (light usage) LLM-Light

Tasks: - Re-fetch source URLs - Verify excerpts exist verbatim in source - Check numerical values match - Flag hallucinations or mismatches - Output: VerificationResult objects with pass/fail

Files: - agents/verification/main.go - Unchanged

API: (Unchanged)

4. Orchestration Agent (Ports 8003/8000)¶

Role: Workflow Coordination Technology: Eino (recommended) or ADK No LLM (Eino) or LLM-driven (ADK)

Workflow: 1. Call Research Agent → get URLs 2. Call Synthesis Agent → extract statistics from URLs 3. Call Verification Agent → validate statistics 4. Retry logic if needed 5. Return verified statistics

Files: - agents/orchestration-eino/main.go - Eino version (deterministic) - agents/orchestration/main.go - ADK version (LLM-driven) - pkg/orchestration/eino.go - Shared Eino logic

Data Flow¶

User Request
     │
     ▼
┌─────────────────┐
│ Orchestration   │
└────────┬────────┘
         │
         │ 1. Search for sources
         ▼
┌─────────────────┐
│ Research Agent  │ ──► Returns: [{url, title, snippet, domain}, ...]
└────────┬────────┘
         │
         │ 2. Extract statistics from URLs
         ▼
┌─────────────────┐
│ Synthesis Agent │ ──► Returns: [{name, value, unit, source_url, excerpt}, ...]
└────────┬────────┘
         │
         │ 3. Verify statistics
         ▼
┌─────────────────┐
│ Verification    │ ──► Returns: [{statistic, verified: true/false, reason}, ...]
└────────┬────────┘
         │
         ▼
    Verified Statistics

Models (pkg/models/statistic.go)¶

New Models Added:¶

// SearchResult - Output from Research Agent
type SearchResult struct {
    URL      string
    Title    string
    Snippet  string
    Domain   string
    Position int
}

// SynthesisRequest - Input to Synthesis Agent
type SynthesisRequest struct {
    Topic         string
    SearchResults []SearchResult
    MinStatistics int
    MaxStatistics int
}

// SynthesisResponse - Output from Synthesis Agent
type SynthesisResponse struct {
    Topic           string
    Candidates      []CandidateStatistic
    SourcesAnalyzed int
    Timestamp       time.Time
}

Implementation Status¶

✅ Completed¶

Synthesis Agent Created (agents/synthesis/main.go)
ADK integration
Webpage fetching
Basic regex extraction (placeholder)
HTTP API on port 8004
Research Agent Refactored (agents/research/main.go)
Removed LLM/ADK dependencies
Focus on search only
Returns SearchResult objects
Reputable source filtering
Models Updated (pkg/models/statistic.go)
SearchResult model
SynthesisRequest/Response models
Configuration Updated (pkg/config/config.go)
Added SynthesisAgentURL (port 8004)

✅ Completed (Updated)¶

Orchestration Update - Both Eino and ADK orchestration updated for 4-agent workflow
Eino Orchestration (pkg/orchestration/eino.go) ✅
- Added synthesis agent call between research and verification
- Updated workflow graph with nodeSynthesis
- Added SynthesisState type
- Added callSynthesisAgent() helper method
ADK Orchestration (agents/orchestration/main.go) ✅
- Added HTTP client for synthesis agent
- Updated orchestrate() method to call all 4 agents in sequence
- Added callSynthesisAgent() helper method
Docker Configuration ✅
Added synthesis agent to Dockerfile (build and copy binary)
Updated docker-compose.yml with port 8004
Added synthesis agent to docker-entrypoint.sh (startup and shutdown)
Exposed all 4 ports: 8001, 8002, 8003, 8004
Makefile Updates ✅
Added run-synthesis target
Updated run-all to include synthesis agent
Updated run-all-eino to include synthesis agent
Updated build target to build synthesis binary

✅ Recently Completed¶

LLM Integration & Code Refactoring ✅
Created pkg/agent/base.go - Shared agent base with common LLM initialization
Refactored Synthesis Agent to use proper LLM-based extraction (not regex)
Refactored Verification Agent to use shared base agent
Eliminated code duplication across agents
Full multi-LLM support (Gemini/Claude/OpenAI/Ollama) via unified interface

⏳ TODO¶

Documentation Updates
Update API examples to show 4-agent workflow with LLM extraction

Benefits of 4-Agent Architecture¶

✅ Separation of Concerns - Each agent has one job ✅ Better Caching - Research results can be reused ✅ Parallel Processing - Synthesize multiple URLs concurrently ✅ Cost Optimization - Only use LLM where needed (synthesis) ✅ Easier Testing - Mock each agent independently ✅ Scalability - Scale synthesis agents based on load ✅ Flexibility - Can run all in-process or as microservices

Next Steps¶

To complete the 4-agent architecture:

Update orchestration agents to call synthesis
Update Docker/Makefile for deployment
Enhance synthesis agent with full LLM integration
Update all documentation
Test end-to-end workflow

Port Mapping¶

Agent	Port	Role
Orchestration (ADK)	8000	Workflow (LLM-driven)
Research	8001	Search (no LLM)
Verification	8002	Validation (LLM-light)
Orchestration (Eino)	8003	Workflow (deterministic)
Synthesis	8004	Extraction (LLM-heavy) ⭐