Finance Agent implements a sophisticated Retrieval-Augmented Generation (RAG) architecture that combines semantic routing, multi-source data retrieval, and iterative self-improvement to deliver accurate financial analysis.

Architecture Overview

The agent system orchestrates access to three specialized data source tools:
  1. Earnings Transcript Search - Hybrid vector + keyword search over quarterly earnings calls
  2. SEC 10-K Filings Agent - Specialized retrieval agent for annual SEC filings
  3. Tavily News Search - Real-time web search for breaking news
                              AGENT PIPELINE
 ═══════════════════════════════════════════════════════════════════════

 ┌──────────┐    ┌───────────────────┐    ┌──────────────────────────┐
 │ Question │───►│ Question Analyzer │───►│  Semantic Data Routing   │
 └──────────┘    │  (LLM via config) │    │                          │
                 │                   │    │  • Earnings Transcripts  │
                 │ Extracts:         │    │  • SEC 10-K Filings      │
                 │ • Tickers         │    │  • Real-Time News        │
                 │ • Time periods    │    │  • Hybrid (multi-source) │
                 │ • Intent          │    └────────────┬─────────────┘
                 └───────────────────┘                 │

                 ┌─────────────────────────────────────────────────────┐
                 │              RESEARCH PLANNING                       │
                 │  Agent generates reasoning: "I need to find..."     │
                 └────────────────────────┬────────────────────────────┘

                 ┌─────────────────────────────────────────────────────┐
                 │                  RETRIEVAL LAYER                     │
                 │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
                 │  │  Earnings   │  │  SEC 10-K   │  │   Tavily    │  │
                 │  │ Transcripts │  │   Filings   │  │    News     │  │
                 │  │             │  │             │  │             │  │
                 │  │ Vector DB   │  │ Section     │  │  Live API   │  │
                 │  │ + Hybrid    │  │ Routing +   │  │             │  │
                 │  │   Search    │  │ Reranking   │  │             │  │
                 │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  │
                 └─────────┴───────────┬────┴────────────────┴─────────┘
                                       │ ▲
                                       │ │ Re-query with
                                       │ │ follow-up questions
                                       ▼ │
                 ┌─────────────────────────────────────────────────────┐
                 │               ITERATIVE IMPROVEMENT                  │
                 │                                                      │
                 │    ┌──────────┐    ┌──────────┐    ┌──────────┐     │
                 │    │ Generate │───►│ Evaluate │───►│ Iterate? │─────┼───┐
                 │    │  Answer  │    │ Quality  │    │          │     │   │
                 │    └──────────┘    └──────────┘    └──────────┘     │   │
                 │                                         │ NO        │   │ YES
                 └─────────────────────────────────────────┼───────────┘   │
                                                           ▼               │
                                                    ┌─────────────┐        │
                                                    │   ANSWER    │        │
                                                    │ + Citations │        │
                                                    └─────────────┘        │
                                                           ▲               │
                                                           └───────────────┘

Key Architectural Concepts

  • Semantic routing - Routes to data sources based on question intent, not keywords; the LLM analyzes what type of information would best answer the question.
  • Transparent reasoning - The agent explains its reasoning before searching (“I need to find…”), making the research approach transparent and structured.
  • Multi-source retrieval - Combines earnings transcripts, SEC filings, and news based on the question’s requirements.
  • Self-evaluation - Evaluates answer quality and iterates until confidence thresholds are met, ensuring comprehensive responses.
  • Adaptive depth - Configurable iteration limits (2-10 iterations) and quality thresholds (70-95%) based on question complexity.
  • Search-optimized queries - Generates keyword phrases optimized for semantic search, not verbose questions, for better RAG retrieval.

The Six-Stage Pipeline

Every question flows through a carefully orchestrated six-stage pipeline:

Stage 1: Setup & Initialization

Initializes RAG components and loads configuration:
  • Initialize search engine and response generator
  • Load available quarters from database
  • Set up streaming event handlers
  • Configure iteration limits based on question complexity
# From agent/rag/rag_agent.py
self.search_engine = SearchEngine(self.config, self.database_manager)
self.response_generator = ResponseGenerator(self.config, self.openai_api_key)
self.tavily_service = TavilyService()
self.sec_service = SECFilingsService(self.database_manager, self.config)

Stage 2: Combined Reasoning + Analysis

A single LLM call (via ReasoningPlanner) performs comprehensive question analysis:
  • Extract entities: Company tickers ($AAPL, $MSFT)
  • Detect time references: “Q4 2024”, “last 3 quarters”, “latest”
  • Semantic routing: Choose data source based on intent
  • Detect answer mode: direct, standard, or detailed
  • Explain research approach: 2-3 sentence reasoning statement
  • Validate question: Reject off-topic or invalid questions
  • Preserve temporal phrases: Exact time references (no resolution yet)
Example analysis output:
{
  "reasoning": "The user is asking about META's AI-related capital expenditure commentary across the last 3 quarters. I'll search earnings transcripts for management's statements on AI infrastructure investments and forward-looking capex guidance.",
  "tickers": ["META"],
  "time_refs": ["last 3 quarters"],
  "topic": "AI capital expenditures commentary",
  "question_type": "specific_company",
  "data_sources": ["earnings_transcripts"],
  "answer_mode": "standard",
  "is_valid": true,
  "confidence": 0.95
}
Why combine reasoning + analysis? This single LLM call is faster than two separate calls and produces more coherent results because the reasoning drives the analysis.
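Before the pipeline proceeds, the analysis payload can be validated against a fixed schema. A minimal sketch, assuming the JSON shape shown above; the `QuestionAnalysis` dataclass and `parse_analysis` helper are illustrative, not the project's actual schema:

```python
# Hypothetical sketch: parse and validate the combined analysis JSON.
import json
from dataclasses import dataclass
from typing import List

@dataclass
class QuestionAnalysis:
    reasoning: str
    tickers: List[str]
    time_refs: List[str]
    topic: str
    question_type: str
    data_sources: List[str]
    answer_mode: str
    is_valid: bool
    confidence: float

def parse_analysis(raw: str) -> QuestionAnalysis:
    """Parse the LLM's JSON response and reject invalid questions early."""
    analysis = QuestionAnalysis(**json.loads(raw))
    if not analysis.is_valid:
        raise ValueError("Question rejected as out of scope")
    return analysis
```

Failing fast here means an off-topic question never reaches the retrieval layer.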

Stage 2.1: Search Planning

Resolves temporal references to specific quarters and builds declarative searches:
  • Resolve time references: “latest” → get_last_n_quarters_for_company(ticker, 1)
  • Company-specific quarters: Each ticker gets its own most recent quarters
  • Build search queries: Optimized for each data source (transcripts, 10-K, news)
  • Return reasoning string: Streamed to frontend for transparency
Quarter resolution uses company-specific database queries:
SELECT DISTINCT year, quarter 
FROM transcript_chunks
WHERE ticker = %s 
ORDER BY year DESC, quarter DESC
This ensures each company gets its own most recent quarters, not a global “latest”.
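The per-company resolution step can be sketched as follows. This is a sketch only: the real `get_last_n_quarters_for_company` takes a ticker and runs the query above against the database; here it takes pre-fetched `(year, quarter)` rows, sorted newest-first as that query returns them:

```python
# Sketch: resolve "last N quarters" from one company's own rows.
from typing import List, Tuple

def get_last_n_quarters_for_company(rows: List[Tuple[int, int]], n: int) -> List[str]:
    """Take the N most recent quarters from rows sorted newest-first."""
    return [f"Q{quarter} {year}" for year, quarter in rows[:n]]

# Each ticker resolves "last 3 quarters" against its own rows, so
# companies with different fiscal calendars stay correct:
meta_rows = [(2025, 1), (2024, 4), (2024, 3), (2024, 2)]
print(get_last_n_quarters_for_company(meta_rows, 3))
# → ['Q1 2025', 'Q4 2024', 'Q3 2024']
```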
Stage 3: Multi-Source Retrieval

Specialized data source searches execute in parallel:

News Search (if needs_latest_news=true):
  • Query Tavily API for real-time news
  • Format with [N1], [N2] citation markers
  • Include publication dates and URLs
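The [N1], [N2] marker scheme can be illustrated with a small formatter; the article field names (`title`, `published`, `url`) are assumptions for this example:

```python
# Illustrative sketch of numbering news results as [N1], [N2], ...
from typing import Dict, List

def format_news_citations(articles: List[Dict[str, str]]) -> str:
    """Number each article and attach its publication date and URL."""
    return "\n".join(
        f"[N{i}] {a['title']} ({a['published']}) {a['url']}"
        for i, a in enumerate(articles, start=1)
    )

articles = [
    {"title": "Meta raises capex outlook", "published": "2025-01-29",
     "url": "https://example.com/meta-capex"},
]
print(format_news_citations(articles))
# → [N1] Meta raises capex outlook (2025-01-29) https://example.com/meta-capex
```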
SEC 10-K Retrieval (if data_source="10k"):
  • Invoke specialized retrieval agent for annual filings
  • Planning-driven sub-question generation
  • LLM-based section routing (Item 1, Item 7, Item 8, etc.)
  • Hybrid search with cross-encoder reranking
  • Iterative retrieval (up to 5 iterations)
  • Format with [10K1], [10K2] citation markers
Current limitation: 10-K only for now. Support for 10-Q (quarterly) and 8-K (current events) filings is under development.
Hybrid vector + keyword search over earnings transcripts:
  • Single-ticker: Direct search with quarter filtering
  • Multi-ticker: Parallel search per company
  • Hybrid scoring: 70% vector similarity + 30% keyword matching
  • Deduplication: Remove duplicate chunks across searches
# From agent/rag/search_engine.py:73-78
def search_similar_chunks(self, query: str, max_results: int = None, 
                         target_quarter: str = None) -> List[Dict[str, Any]]:
    """
    Hybrid search combining:
    - Vector search: 70% weight (semantic similarity via pgvector)
    - Keyword search: 30% weight (TF-IDF)
    """

Stage 4: Initial Answer Generation

Generates the first answer using all retrieved context:
  • Single ticker: generate_openai_response() with company-specific context
  • Multiple tickers: generate_multi_ticker_response() with cross-company synthesis
  • Maintains period metadata: Preserves quarter information (“Q1 2025”, “FY 2024”)
  • Includes all figures: Every financial metric from all sources

Stage 5: Iterative Improvement

The agent evaluates and improves the answer through iteration:
  1. Evaluate Quality - Score the answer on completeness, specificity, accuracy, and clarity (0-100 scale).
  2. Check Reasoning Goals - Verify whether the research goals from Stage 2 reasoning were met.
  3. Generate Follow-up Keywords - Create search-optimized keyword phrases (not verbose questions) for missing information.
  4. Parallel Quarter Search - Search ALL target quarters in parallel with each keyword phrase.
  5. Request Additional Sources - The agent may request additional news or transcript searches if gaps remain.
  6. Regenerate Answer - Build an improved answer with the expanded context.
Stop conditions:
  • Confidence ≥ threshold (varies by answer mode: 70-95%)
  • Max iterations reached (2-10 depending on mode)
  • Agent decides answer is sufficient
  • No follow-up keyword phrases generated
Answer Mode Configuration:
  Mode          Iterations   Confidence   When Used
  direct        2            70%          Quick factual lookups
  standard      3            80%          Default balanced analysis
  detailed      4            90%          Comprehensive research
  deep_search   10           95%          Exhaustive search (reserved)
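The stop conditions and the mode table combine into a simple control loop. A sketch, with `evaluate()` and `improve()` standing in for the real LLM evaluation and regeneration calls:

```python
# Sketch of Stage 5 stop-condition logic using the mode settings above.
ANSWER_MODES = {
    "direct":      {"max_iterations": 2,  "confidence_threshold": 0.70},
    "standard":    {"max_iterations": 3,  "confidence_threshold": 0.80},
    "detailed":    {"max_iterations": 4,  "confidence_threshold": 0.90},
    "deep_search": {"max_iterations": 10, "confidence_threshold": 0.95},
}

def iterate_answer(answer, mode, evaluate, improve):
    """Improve the answer until confidence clears the mode's threshold,
    no follow-up keyword phrases remain, or iterations run out."""
    cfg = ANSWER_MODES[mode]
    for _ in range(cfg["max_iterations"]):
        confidence, followup_keywords = evaluate(answer)
        if confidence >= cfg["confidence_threshold"] or not followup_keywords:
            break
        answer = improve(answer, followup_keywords)
    return answer
```

The "agent decides answer is sufficient" condition is folded into `evaluate()` here: returning no follow-up keywords ends the loop.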

Stage 6: Final Response Assembly

Assembles and streams the final response:
  • Stream final answer with citations
  • Include all source attributions (transcripts, 10-K, news)
  • Return metadata (confidence, chunks used, timing)
  • Update conversation memory for follow-up questions

Key Components

Core Files

  File                        Description
  __init__.py                 Public API — exports Agent, RAGAgent, create_agent()
  agent_config.py             Agent configuration and iteration settings
  prompts.py                  Centralized LLM prompt templates
  rag/rag_agent.py            Orchestration engine with pipeline stages
  rag/question_analyzer.py    LLM-based semantic routing
  rag/reasoning_planner.py    Combined reasoning + analysis

Data Source Tools

  File                                        Tool               Description
  rag/search_engine.py                        Transcript Search  Hybrid vector + keyword search
  rag/sec_filings_service_smart_parallel.py   10-K Agent         Planning-driven parallel retrieval
  rag/tavily_service.py                       News Search        Real-time news via Tavily API

Supporting Components

  File                          Description
  rag/response_generator.py     LLM response generation and evaluation
  rag/database_manager.py       PostgreSQL/pgvector operations
  rag/conversation_memory.py    Multi-turn conversation state
  rag/config.py                 RAG configuration

Streaming Events

The agent streams real-time progress to the frontend:
  Event Type            Description
  progress              Generic progress updates
  analysis              Question analysis complete
  reasoning             Agent’s research planning statement
  news_search           News search results
  10k_search            10-K SEC search results
  iteration_start       Beginning of iteration N
  agent_decision        Agent’s quality assessment
  iteration_followup    Follow-up questions being searched
  iteration_search      New chunks found
  iteration_complete    Iteration finished
  result                Final answer with citations
  rejected              Question rejected (out of scope)
  error                 Error occurred
Example reasoning event:
{
  "type": "reasoning",
  "message": "The user is asking about Microsoft's cloud strategy...",
  "step": "planning",
  "data": {
    "reasoning": "Full reasoning statement..."
  }
}
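A frontend consumer of this stream can dispatch on the type field. An illustrative sketch: the stream below is stubbed as a plain list, and the field access follows the event shapes shown above:

```python
# Illustrative event-stream consumer; dispatches on the "type" field.
def handle_events(events):
    """Process streamed agent events; return the final answer text."""
    final = None
    for event in events:
        if event["type"] == "reasoning":
            print("planning:", event["message"])
        elif event["type"] == "result":
            final = event["message"]
        elif event["type"] == "error":
            raise RuntimeError(event["message"])
        # other event types (progress, iteration_*) can update UI state
    return final

stream = [
    {"type": "reasoning", "message": "Searching META transcripts..."},
    {"type": "result", "message": "META raised AI capex guidance... [1]"},
]
print(handle_events(stream))
```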

Next Steps

Semantic Routing

Learn how the agent chooses the right data sources

RAG Pipeline

Deep dive into the retrieval and generation process

Data Sources

Explore the three specialized data source tools
