跳转至

Agentic RAG: Let The Agent Decide How To Search

Basic RAG usually does this: receive a question, retrieve once, pass results to the model, answer.

That works for simple questions. Travel planning often needs more: check weather, search rainy-day tea places, estimate route, and maybe rewrite the query if evidence is weak. You also want to know which evidence supports which claim.

Agentic RAG puts retrieval tools and an evidence ledger inside an agent loop.

One Sentence

Agentic RAG turns one-shot RAG into retrieval as an agent action, so the model decides when to search, what to search, whether evidence is enough, and what goes into the ledger.

What Breaks Without It

Problem What it looks like Risk
One search only Fast Bad query means bad answer
Retrieval and answer are tied together Simple flow Cannot search again from new evidence
No evidence ledger Has citations Claims and evidence may not match

What This Pattern Changes

Who Owns
Model Chooses search or final
Retrieval tool Returns doc ids and snippets
Evidence ledger Stores evidence actually used
Python Executes actions, deduplicates, limits rounds, traces

Retrieved text is not trusted instructions. It is evidence to inspect.

Walk Through One Trace

Round Action Observation Next
1 search(capital of France) Hit document paris Evidence is enough
2 final Paris is the capital of France. [paris] Stop

A travel task may run longer: search(weather)search(indoor tea places)search(route)final with evidence.

Flow

flowchart TD
  Q["Question"] --> S["State: question + ledger"]
  S --> M["Model chooses action"]
  M -->|search| R["Retrieve"]
  R --> L["Update evidence ledger"]
  L --> S
  M -->|final| A["Answer + citations"]
  S --> B["Budget / stagnation checks"]
  B -->|limit hit| X["Stop"]

Code Walk

The example exposes search as an agent action:

model = MockLLM(
    [
        '{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}',
        '{"type":"final","answer":"Paris is the capital of France. [paris]"}',
    ]
)

Full example:

from __future__ import annotations

import json
from pathlib import Path

from agent_patterns_lab.patterns.agentic_rag import agentic_rag
from agent_patterns_lab.runtime import Document, MockLLM, SimpleSearchIndex, Tracer


def load_docs(path: Path) -> list[Document]:
    docs: list[Document] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        docs.append(Document(doc_id=obj["doc_id"], text=obj["text"]))
    return docs


def main() -> None:
    tracer = Tracer()
    docs = load_docs(Path("data/mini_corpus.jsonl"))
    index = SimpleSearchIndex(docs)

    model = MockLLM(
        [
            '{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}',
            '{"type":"final","answer":"Paris is the capital of France. [paris]"}',
        ]
    )

    result = agentic_rag(model, question="What is the capital of France?", index=index, tracer=tracer)
    print(result.answer)

    trace_path = tracer.export_jsonl(Path(".traces") / "41_agentic_rag.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()

Run:

UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/41_agentic_rag.py

Nearby Patterns

Pattern Who decides next Use when
Retrieval Loop Search loop only Query needs repair
ReAct Any tool action loop Tools are broader than retrieval
Agentic RAG ReAct + retrieval + ledger Evidence must be auditable
STORM Section-level retrieval and writing Long articles/reports

When To Use It

  • The question needs multiple searches.
  • Claims must trace to doc ids.
  • Retrieval may be incomplete and query needs repair.
  • Search behavior should be traced and evaluated.

When Not To Use It

  • One retrieval is enough.
  • You do not need citations or a ledger.
  • Sources are untrusted and you have no isolation strategy.
  • Cost/latency cannot support an open loop.

Costs And Common Failures

Failure Symptom Fix
Retrieval injection Document tells model to ignore rules Treat retrieved text as untrusted
Fake citation Claim does not match cited doc Run claim-evidence checks
Over-searching Many rounds, no new evidence Add budget and stagnation checks
Dirty ledger Duplicate or conflicting evidence Deduplicate and tag source/time

Agentic RAG fits knowledge tasks that converge through search.

For long articles, read STORM. For post-generation claim checking, read CoVe.

References