The components directory contains reusable, self-contained LangChain building blocks that implement specific retrieval and generation sub-tasks. Feature pipelines compose these components rather than reimplementing the same logic.

Available components

  • AgenticRouter: LLM-based decision-making for agentic RAG pipelines
  • ContextCompressor: LLM-based context compression for token optimization
  • QueryEnhancer: multi-query, HyDE, and step-back prompting

AgenticRouter

An LLM-based decision-making component for agentic RAG pipelines. Uses ChatGroq and LangChain’s PromptTemplate for structured decision-making.

State machine

The router implements a three-state state machine:
  • search: Retrieve more documents from the vector store. Selected when no documents have been retrieved yet or when reflection identified information gaps.
  • reflect: Evaluate and improve the current answer. Selected when documents exist but the answer’s quality, completeness, or grounding is uncertain.
  • generate: Produce the final answer. Selected when sufficient information has been gathered or when max_iterations is reached (forced fallback).

Implementation

src/vectordb/langchain/components/agentic_router.py
import json
from typing import Any

from langchain_core.prompts import PromptTemplate
from langchain_groq import ChatGroq

class AgenticRouter:
    """Route queries to search, reflect, or generate actions using LLM reasoning.
    
    The router implements an agentic decision-making pattern where an LLM evaluates
    the current pipeline state and determines the optimal next action.
    """

    ROUTING_TEMPLATE = """You are a query routing agent. Given a query and optional current answer, decide what action to take next.

Current State:
- Query: {query}
- Has Retrieved Documents: {has_documents}
- Current Answer: {current_answer}
- Iteration: {iteration}/{max_iterations}

Your task is to decide ONE of the following actions:
1. 'search': Retrieve documents from vector database (choose this if you need more information)
2. 'reflect': Verify and improve the current answer (choose this to validate answer quality)
3. 'generate': Create final answer (choose this when you have enough information)

Return a JSON object with this exact format:
{{"action": "search|reflect|generate", "reasoning": "brief explanation"}}

Do NOT include any other text. Return ONLY the JSON object."""

    def __init__(self, llm: ChatGroq) -> None:
        """Initialize AgenticRouter with a LangChain LLM instance.
        
        Args:
            llm: ChatGroq instance for routing decisions. Should be configured
                with low temperature (0.0-0.3) for consistent routing.
        """
        self.llm = llm

    def route(
        self,
        query: str,
        has_documents: bool = False,
        current_answer: str | None = None,
        iteration: int = 1,
        max_iterations: int = 3,
    ) -> dict[str, Any]:
        """Route a query to the appropriate action.
        
        Args:
            query: The user's original query text.
            has_documents: Whether documents have been retrieved.
            current_answer: The answer generated so far, if any.
            iteration: Current iteration number (1-indexed).
            max_iterations: Maximum iterations allowed.
        
        Returns:
            Dictionary with 'action' and 'reasoning' keys.
        
        Raises:
            ValueError: If LLM response is invalid JSON or contains invalid action.
        """
        # Enforce iteration limit as safety mechanism
        if iteration >= max_iterations:
            return {
                "action": "generate",
                "reasoning": f"Reached maximum iterations ({max_iterations})",
            }

        answer_str = current_answer if current_answer else "No answer yet"

        # Construct prompt using LangChain's PromptTemplate
        prompt = PromptTemplate(
            template=self.ROUTING_TEMPLATE,
            input_variables=[
                "query",
                "has_documents",
                "current_answer",
                "iteration",
                "max_iterations",
            ],
        )
        formatted_prompt = prompt.format(
            query=query,
            has_documents=has_documents,
            current_answer=answer_str,
            iteration=iteration,
            max_iterations=max_iterations,
        )

        # Invoke LLM to get routing decision
        response = self.llm.invoke(formatted_prompt)
        response_text = response.content.strip()

        # Parse JSON response
        try:
            decision = json.loads(response_text)
        except json.JSONDecodeError as e:
            raise ValueError(f"Invalid JSON from router: {response_text}") from e

        # Validate response structure
        if "action" not in decision or "reasoning" not in decision:
            raise ValueError(f"Router response missing required fields: {decision}")

        # Validate action value
        action = decision["action"].lower().strip()
        if action not in ("search", "reflect", "generate"):
            raise ValueError(
                f"Invalid action: {action}. Must be 'search', 'reflect', or 'generate'"
            )

        return {
            "action": action,
            "reasoning": decision["reasoning"],
        }

Usage

from langchain_groq import ChatGroq
from vectordb.langchain.components import AgenticRouter

llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0.0)
router = AgenticRouter(llm)

# Initial routing - should suggest 'search'
decision = router.route("What is quantum computing?", has_documents=False)
# {"action": "search", "reasoning": "No documents retrieved yet"}

# After retrieval - may suggest 'reflect' or 'generate'
decision = router.route(
    "What is quantum computing?",
    has_documents=True,
    current_answer="Quantum computing uses qubits...",
    iteration=2,
    max_iterations=3,
)

QueryEnhancer

Generates improved retrieval queries from the user’s original input using ChatGroq and PromptTemplate.

Strategies

  • Multi-query: Generates 5 alternative phrasings of the original query. Returns a list of up to 5 query strings; the original query is NOT included. Best for: Simple factual queries where different phrasings might match different documents.
  • HyDE: Generates a hypothetical 2-3 sentence document answer. Returns [original_query, hypothetical_answer]. Best for: Very short queries or when query and document distributions differ significantly.
  • Step-back: Generates 3 broader context questions. Returns [step_back_1, step_back_2, step_back_3, original_query]. Best for: Complex questions requiring background knowledge.

Implementation

src/vectordb/langchain/components/query_enhancer.py
from langchain_core.prompts import PromptTemplate
from langchain_groq import ChatGroq

class QueryEnhancer:
    """Generate multiple query perspectives for enhanced retrieval.
    
    Implements three complementary strategies: multi-query generation,
    HyDE, and step-back prompting.
    """

    MULTI_QUERY_TEMPLATE = """You are an AI language model assistant. Your task is to generate 5 different search queries that would help answer the given question. Provide only the queries, one per line, without numbering or bullet points.

Original question: {query}

Alternative queries:"""

    HYDE_TEMPLATE = """You are an AI language model assistant. Your task is to generate a hypothetical document that would answer the given question. Write a brief, focused response (2-3 sentences) that directly answers the question.

Question: {query}

Hypothetical document:"""

    STEP_BACK_TEMPLATE = """You are an AI language model assistant. Your task is to generate 3 step-back questions that would provide broader context for answering the given question. These are more general, foundational questions. Provide only the questions, one per line, without numbering.

Original question: {query}

Step-back questions:"""

    def __init__(self, llm: ChatGroq) -> None:
        """Initialize QueryEnhancer with a LangChain LLM instance.
        
        Args:
            llm: ChatGroq instance for query generation. Recommended temperature: 0.3-0.7.
        """
        self.llm = llm

    def generate_multi_queries(self, query: str) -> list[str]:
        """Generate alternative query formulations.
        
        Args:
            query: The original user query text.
        
        Returns:
            List of alternative query strings (up to 5). Original query NOT included.
        """
        prompt = PromptTemplate(
            template=self.MULTI_QUERY_TEMPLATE,
            input_variables=["query"],
        )
        formatted_prompt = prompt.format(query=query)

        response = self.llm.invoke(formatted_prompt)

        queries = response.content.strip().split("\n")
        queries = [q.strip() for q in queries if q.strip()]

        return queries[:5]

    def generate_hyde_queries(self, query: str) -> list[str]:
        """Generate hypothetical document for HyDE-based retrieval.
        
        Args:
            query: The original user query text.
        
        Returns:
            List containing [original_query, hypothetical_answer].
        """
        prompt = PromptTemplate(
            template=self.HYDE_TEMPLATE,
            input_variables=["query"],
        )
        formatted_prompt = prompt.format(query=query)

        response = self.llm.invoke(formatted_prompt)
        hyde_response = response.content.strip()

        return [query, hyde_response]

    def generate_step_back_queries(self, query: str) -> list[str]:
        """Generate step-back questions for broader context retrieval.
        
        Args:
            query: The original user query text.
        
        Returns:
            List of [step_back_q1, step_back_q2, step_back_q3, original_query].
        """
        prompt = PromptTemplate(
            template=self.STEP_BACK_TEMPLATE,
            input_variables=["query"],
        )
        formatted_prompt = prompt.format(query=query)

        response = self.llm.invoke(formatted_prompt)

        step_back_queries = response.content.strip().split("\n")
        step_back_queries = [q.strip() for q in step_back_queries if q.strip()]

        return step_back_queries[:3] + [query]

    def generate_queries(self, query: str, mode: str = "multi_query") -> list[str]:
        """Generate enhanced queries based on the specified mode.
        
        Args:
            query: The original user query text.
            mode: Enhancement mode ('multi_query', 'hyde', or 'step_back').
        
        Returns:
            List of enhanced query strings.
        
        Raises:
            ValueError: If mode is not recognized.
        """
        if mode == "multi_query":
            return self.generate_multi_queries(query)
        if mode == "hyde":
            return self.generate_hyde_queries(query)
        if mode == "step_back":
            return self.generate_step_back_queries(query)

        raise ValueError(
            f"Unknown mode: {mode}. Must be 'multi_query', 'hyde', or 'step_back'"
        )

Usage

from langchain_groq import ChatGroq
from vectordb.langchain.components import QueryEnhancer

llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0.3)
enhancer = QueryEnhancer(llm)

# Multi-query generation
queries = enhancer.generate_queries("What is photosynthesis?", mode="multi_query")
# ['Define photosynthesis', 'How do plants make food', ...]

# HyDE generation
queries = enhancer.generate_queries("Explain neural networks", mode="hyde")
# ['Explain neural networks', 'Neural networks are computational models...']

# Step-back prompting
queries = enhancer.generate_queries("What is backpropagation?", mode="step_back")
# ['What is machine learning?', 'How do neural networks learn?', ...]
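The enhanced queries are typically fanned out to a retriever and the results merged with duplicates removed. A minimal sketch, where retrieve_multi is a hypothetical helper (not part of the components package) and retriever is assumed to be any object with an invoke(query) method returning documents that expose page_content, such as a LangChain vector-store retriever:

```python
def retrieve_multi(queries: list[str], retriever, k: int = 4) -> list:
    """Hypothetical helper: run each enhanced query through the retriever
    and merge results, deduplicating on page_content and keeping the
    first k unique documents in encounter order."""
    seen: set[str] = set()
    merged = []
    for q in queries:
        for doc in retriever.invoke(q):
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                merged.append(doc)
    return merged[:k]
```

Deduplication matters here because alternative phrasings of the same question often retrieve overlapping document sets.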

LLM configuration

All LangChain components use ChatGroq from langchain-groq. Recommended settings:
from langchain_groq import ChatGroq

# For routing (deterministic)
routing_llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0.0)

# For query generation (diverse)
generation_llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0.3)
Set the GROQ_API_KEY environment variable or pass api_key directly to ChatGroq.

When to use components directly

  • When building a custom pipeline that does not match the existing feature module templates
  • When experimenting with one pipeline stage (testing different compression strategies with a fixed retriever)
  • When combining components from different feature modules into a custom pipeline

Common pitfalls

  • Over-composing before baseline validation: Build and validate the simplest pipeline first, then add components.
  • Inconsistent LLM temperature: Use low temperature (0.0) for routing decisions and higher (0.3-0.7) for creative tasks like query generation.
  • Not logging routing decisions: All components log at INFO level. Set LOG_LEVEL=DEBUG to see full prompts and responses for debugging routing and compression behavior.
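The LOG_LEVEL convention can be wired to the standard-library logging module. A sketch under stated assumptions: configure_component_logging is a hypothetical helper, and the logger name mirroring the package path is an assumption, not a documented API.

```python
import logging
import os

def configure_component_logging() -> logging.Logger:
    """Map the LOG_LEVEL environment variable onto stdlib logging.

    Falls back to INFO when the variable is unset or unrecognized.
    force=True resets any handlers installed earlier in the process.
    """
    name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, name, logging.INFO)
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        force=True,
    )
    # Assumed logger name mirroring the package path
    return logging.getLogger("vectordb.langchain.components")
```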

Next steps

Agentic RAG

See AgenticRouter in action with the full agentic RAG pipeline

Semantic search

Build your first semantic search pipeline

Hybrid search

Combine dense and sparse retrieval with ResultMerger
