LLM Integration & Multi-Provider Support

Overview

The Statistics Agent Team now features full LLM-based extraction with support for multiple LLM providers through a unified interface. The refactored codebase eliminates duplication and provides a clean, maintainable architecture.

Supported LLM Providers

All agents support the following LLM providers:

Provider           Model                        Configuration                        Integration
Gemini (Default)   gemini-2.0-flash-exp         GEMINI_API_KEY or GOOGLE_API_KEY     Google ADK (native) ✅
Claude             claude-3-5-sonnet-20241022   CLAUDE_API_KEY or ANTHROPIC_API_KEY  MetaLLM adapter ✅
OpenAI             gpt-4o-mini                  OPENAI_API_KEY                       MetaLLM adapter ✅
Ollama             llama3.2                     OLLAMA_URL (local)                   MetaLLM adapter ✅

Architecture

Multi-Provider Support via MetaLLM

The system uses two integration paths:

  1. Gemini: Direct via Google ADK (native)
  2. Claude, OpenAI, Ollama: Via metallm adapter (pkg/llm/adapters/metallm_adapter.go)

The metallm adapter implements the ADK model.LLM interface, allowing seamless multi-provider support.

Design Note: The pkg/llm/adapters/ directory is self-contained and can be moved to the metallm repository as pkg/adk/ for broader reuse.
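
As a rough illustration of the two paths, a model factory along the following lines could hand back either the native ADK model or the adapter. This is a minimal sketch: the NewModel, newGeminiModel, and newMetaLLMAdapter names are hypothetical, and the real model.LLM interface is defined by the ADK, not reproduced here.

package llm

import (
    "context"
    "fmt"
)

// LLM stands in for the ADK model.LLM interface; only the shape
// needed by this sketch is shown.
type LLM interface {
    GenerateContent(ctx context.Context, prompt string) (string, error)
}

// NewModel selects the integration path by provider name.
func NewModel(provider, modelName string) (LLM, error) {
    switch provider {
    case "gemini":
        return newGeminiModel(modelName) // native Google ADK path
    case "claude", "openai", "ollama":
        return newMetaLLMAdapter(provider, modelName) // metallm adapter path
    default:
        return nil, fmt.Errorf("unknown LLM provider %q", provider)
    }
}

// Stubs standing in for the real constructors.
func newGeminiModel(name string) (LLM, error)              { return nil, fmt.Errorf("sketch only") }
func newMetaLLMAdapter(provider, name string) (LLM, error) { return nil, fmt.Errorf("sketch only") }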

Shared Base Agent (pkg/agent/base.go)

The new BaseAgent struct provides common functionality for all LLM-powered agents:

type BaseAgent struct {
    Cfg          *config.Config
    Client       *http.Client
    Model        model.LLM
    ModelFactory *llm.ModelFactory
}

Features:

  • ✅ Unified LLM initialization across all agents
  • ✅ Shared HTTP client with configurable timeouts
  • ✅ Common URL fetching with size limits (sketched below)
  • ✅ Centralized logging helpers
  • ✅ Zero code duplication
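
For illustration, the shared fetch helper could look roughly like the following. This is a sketch, not the actual pkg/agent/base.go code: the struct is trimmed to one field, and the third parameter is assumed to be a size limit in megabytes (matching the FetchURL call shown in the Verification Agent section below).

package agent

import (
    "context"
    "fmt"
    "io"
    "net/http"
)

// BaseAgent is trimmed here to the one field this sketch needs.
type BaseAgent struct {
    Client *http.Client
}

// FetchURL fetches a page body, capping the read at maxMB megabytes.
func (b *BaseAgent) FetchURL(ctx context.Context, url string, maxMB int64) (string, error) {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return "", err
    }
    resp, err := b.Client.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return "", fmt.Errorf("fetch %s: status %d", url, resp.StatusCode)
    }
    body, err := io.ReadAll(io.LimitReader(resp.Body, maxMB*1024*1024))
    if err != nil {
        return "", err
    }
    return string(body), nil
}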

Synthesis Agent - LLM-Based Extraction

The Synthesis Agent now uses intelligent LLM analysis instead of regex patterns:

Before (Regex-based):

// Simple pattern matching for statistics
patterns := []string{
    `(\d+\.?\d*)\s*%`,                    // Percentages
    `(\d+\.?\d*)\s*(million|billion)`,    // Large numbers
}

After (LLM-based):

// Use LLM to extract statistics with structured prompt
prompt := fmt.Sprintf(`Analyze the following webpage content and extract numerical statistics related to "%s".

For each statistic found, provide:
1. name: A brief descriptive name
2. value: The numerical value (as a number, not string)
3. unit: The unit of measurement
4. excerpt: The verbatim excerpt from the text

Return valid JSON array...`, topic)

response := sa.Model.GenerateContent(ctx, llmReq, false)

Benefits:

  • ✅ Understands context and semantics
  • ✅ Extracts complex statistics, not just simple patterns
  • ✅ Handles various formats and units intelligently
  • ✅ Returns structured JSON output
  • ✅ Includes verbatim excerpts for verification
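
The extracted statistics can then be unmarshaled into a typed slice. The struct below follows the field names in the prompt; the actual type in the synthesis agent may differ.

package main

import (
    "encoding/json"
    "fmt"
)

// Statistic mirrors the JSON fields requested in the extraction prompt.
type Statistic struct {
    Name    string  `json:"name"`
    Value   float64 `json:"value"`
    Unit    string  `json:"unit"`
    Excerpt string  `json:"excerpt"`
}

// parseStatistics decodes the LLM's JSON array output.
func parseStatistics(llmOutput string) ([]Statistic, error) {
    var stats []Statistic
    if err := json.Unmarshal([]byte(llmOutput), &stats); err != nil {
        return nil, fmt.Errorf("parse statistics: %w", err)
    }
    return stats, nil
}

Fallbacks for malformed output, such as stripping code fences, are covered under Troubleshooting below.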

Verification Agent - Refactored

The Verification Agent now uses the shared base:

Before:

type VerificationAgent struct {
    cfg      *config.Config
    client   *http.Client
    adkAgent agent.Agent
}

func (va *VerificationAgent) fetchSourceContent(...) {
    // Custom HTTP fetching code
}

After:

type VerificationAgent struct {
    *agentbase.BaseAgent
    adkAgent agent.Agent
}

// Use shared method
sourceContent, err := va.FetchURL(ctx, candidate.SourceURL, 1)

Configuration

Environment Variables

# LLM Provider Selection
LLM_PROVIDER=gemini  # Options: gemini, claude, openai, ollama

# API Keys (provide based on chosen provider)
GEMINI_API_KEY=your_gemini_key_here
CLAUDE_API_KEY=your_claude_key_here
OPENAI_API_KEY=your_openai_key_here

# For Ollama (local LLM)
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2

# Optional: Override default models
LLM_MODEL=gemini-2.0-flash-exp  # Or claude-3-5-sonnet-20241022, gpt-4, etc.
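
As a hedged sketch of how these variables might map onto config.Config (the field and function names here are illustrative, not the real ones):

package config

import "os"

// Config holds the LLM-related settings; fields are illustrative.
type Config struct {
    LLMProvider string
    LLMModel    string // empty means use the provider's default model
    APIKey      string
    OllamaURL   string
}

func getenv(key, fallback string) string {
    if v := os.Getenv(key); v != "" {
        return v
    }
    return fallback
}

// Load reads the environment variables documented above.
func Load() *Config {
    return &Config{
        LLMProvider: getenv("LLM_PROVIDER", "gemini"),
        LLMModel:    os.Getenv("LLM_MODEL"),
        APIKey:      getenv("GEMINI_API_KEY", os.Getenv("GOOGLE_API_KEY")),
        OllamaURL:   getenv("OLLAMA_URL", "http://localhost:11434"),
    }
}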

Provider-Specific Setup

Gemini (Default)

export LLM_PROVIDER=gemini
export GEMINI_API_KEY=your_api_key_here

  • ✅ Fast and cost-effective
  • ✅ gemini-2.0-flash-exp optimized for speed
  • ✅ Good JSON output reliability

Claude

export LLM_PROVIDER=claude
export CLAUDE_API_KEY=your_api_key_here

  • ✅ Excellent reasoning capabilities
  • ✅ Good for complex extraction tasks

OpenAI

export LLM_PROVIDER=openai
export OPENAI_API_KEY=your_api_key_here

  • ✅ GPT-4 for highest quality
  • ⚠️ Higher cost

Ollama (Local)

export LLM_PROVIDER=ollama
export OLLAMA_URL=http://localhost:11434
export OLLAMA_MODEL=llama3.2

  • ✅ Free, runs locally
  • ✅ No API key required
  • ⚠️ Slower; realistically requires a GPU

Code Organization

Before Refactor

agents/synthesis/main.go    - 373 lines (duplicated LLM init, HTTP client, fetching)
agents/verification/main.go - 220 lines (duplicated LLM init, HTTP client, fetching)

After Refactor

pkg/agent/base.go           - 95 lines (shared functionality)
agents/synthesis/main.go    - 360 lines (focused on synthesis logic)
agents/verification/main.go - 185 lines (focused on verification logic)

Improvements:

  • 🎯 Single source of truth for LLM initialization
  • 🎯 Consistent HTTP client configuration
  • 🎯 Shared URL fetching with proper error handling
  • 🎯 Easier to add new agents
  • 🎯 Easier to maintain and update

Testing

Build All Agents

make build

Run Individual Agents

# Synthesis Agent (LLM-based extraction)
make run-synthesis

# Verification Agent
make run-verification

# Full workflow
make run-all-eino

Test LLM Extraction

# Start synthesis agent
PORT=8004 make run-synthesis

# Test extraction
curl -X POST http://localhost:8004/synthesize \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "renewable energy",
    "search_results": [
      {
        "url": "https://www.iea.org/reports/renewables-2023",
        "title": "Renewables 2023",
        "snippet": "Renewable capacity additions reach record levels",
        "domain": "iea.org"
      }
    ],
    "min_statistics": 3,
    "max_statistics": 10
  }'
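
A successful call should return the extracted statistics as JSON. The shape below is illustrative only, inferred from the extraction-prompt fields; the envelope key and values are placeholders, not captured output.

{
  "statistics": [
    {
      "name": "Example renewable capacity figure",
      "value": 123.4,
      "unit": "GW",
      "excerpt": "verbatim sentence from the source page"
    }
  ]
}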

Performance Considerations

LLM Provider Speed Comparison

Provider            Avg Latency   Cost (per 1M tokens)   Quality
Gemini 2.0 Flash    ~500ms        $0.075                 ⭐⭐⭐⭐
Claude 3.5 Sonnet   ~1.5s         $3.00                  ⭐⭐⭐⭐⭐
GPT-4               ~2.0s         $10.00                 ⭐⭐⭐⭐⭐
Ollama (local)      ~3-5s         Free                   ⭐⭐⭐

Recommendation: Use Gemini 2.0 Flash for production (best speed/cost/quality balance)

Token Usage

Synthesis Agent:

  • Input: ~2,000 tokens per webpage (8,000-character limit)
  • Output: ~200 tokens (JSON array of statistics)
  • Total per page: ~2,200 tokens

Cost Example (Gemini):

  • 10 pages analyzed = 22,000 tokens
  • Cost: 22,000 tokens × $0.075 / 1,000,000 tokens ≈ $0.00165 per request

Migration Guide

For New Agents

To add LLM support to a new agent:

import agentbase "github.com/agentplexus/stats-agent-team/pkg/agent"

type MyAgent struct {
    *agentbase.BaseAgent
    // ... other fields
}

func NewMyAgent(cfg *config.Config) (*MyAgent, error) {
    base, err := agentbase.NewBaseAgent(cfg, 30) // 30 second timeout
    if err != nil {
        return nil, err
    }

    return &MyAgent{
        BaseAgent: base,
    }, nil
}

// Use base.Model for LLM calls
// Use base.FetchURL() for HTTP requests
// Use base.LogInfo() for logging
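
Because BaseAgent is embedded, its helpers are promoted onto the new agent. A hedged usage example follows; the FetchURL size argument (assumed MB) and a printf-style LogInfo are assumptions:

func (a *MyAgent) Analyze(ctx context.Context, url string) error {
    body, err := a.FetchURL(ctx, url, 1) // 1 MB cap, per the shared helper
    if err != nil {
        return err
    }
    a.LogInfo("fetched %d bytes from %s", len(body), url)
    // a.Model is available here for LLM calls through the ADK interface.
    return nil
}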

Troubleshooting

LLM Errors

"failed to create model" - Check API key is set: echo $GEMINI_API_KEY - Verify provider is correct: echo $LLM_PROVIDER

"LLM generation failed" - Check API key has sufficient quota - Verify network connectivity - Try a different provider

JSON Parsing Errors

The synthesis agent handles malformed JSON by:

  1. Attempting direct JSON parsing
  2. Stripping markdown code fences (```json)
  3. Extracting the JSON array from the LLM response

If parsing still fails, inspect the raw LLM output for format problems.
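
A plausible shape for that fallback chain, as a sketch (the real helper in the synthesis agent may differ):

package main

import (
    "encoding/json"
    "strings"
)

// extractJSONArray tries raw parsing first, then strips markdown code
// fences, then falls back to the outermost [...] span in the response.
func extractJSONArray(raw string) (string, bool) {
    candidates := []string{raw}

    trimmed := strings.TrimSpace(raw)
    trimmed = strings.TrimPrefix(trimmed, "```json")
    trimmed = strings.TrimPrefix(trimmed, "```")
    trimmed = strings.TrimSuffix(trimmed, "```")
    candidates = append(candidates, strings.TrimSpace(trimmed))

    if start, end := strings.Index(raw, "["), strings.LastIndex(raw, "]"); start >= 0 && end > start {
        candidates = append(candidates, raw[start:end+1])
    }

    for _, c := range candidates {
        if json.Valid([]byte(c)) {
            return c, true
        }
    }
    return "", false
}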

Future Enhancements

  • [ ] Support for Anthropic native API (in addition to ADK)
  • [ ] Streaming responses for faster synthesis
  • [ ] Caching of LLM responses to reduce costs
  • [ ] Fine-tuned models for statistics extraction
  • [ ] Batch processing for multiple URLs
