Agentic RAG: Let The Agent Decide How To Search

Basic RAG usually does this: receive a question, retrieve once, pass results to the model, answer.

That works for simple questions. Travel planning often needs more: check the weather, search for rainy-day tea places, estimate a route, and rewrite the query if the evidence is weak. You also want to know which evidence supports which claim.

Agentic RAG puts retrieval tools and an evidence ledger inside an agent loop.

One Sentence

Agentic RAG turns one-shot retrieval into an agent action: the model decides when to search, what to search for, whether the evidence is enough, and what goes into the ledger.

What Breaks Without It

Problem	What it looks like	Risk
One search only	Fast	Bad query means bad answer
Retrieval and answer are tied together	Simple flow	Cannot search again from new evidence
No evidence ledger	Has citations	Claims and evidence may not match

What This Pattern Changes

Who	Owns
Model	Chooses `search` or `final`
Retrieval tool	Returns doc ids and snippets
Evidence ledger	Stores evidence actually used
Python	Executes actions, deduplicates, limits rounds, traces

Retrieved text is not trusted instructions. It is evidence to inspect.
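
A minimal sketch of what such a ledger could look like. `Evidence`, `EvidenceLedger`, and their fields are hypothetical names for illustration, not the library's API. The sketch deduplicates by doc id and renders snippets as quoted data so they are less likely to read as instructions. Quoting alone is not a complete defense against injection.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """One retrieved snippet (hypothetical shape, for illustration only)."""
    doc_id: str
    snippet: str
    query: str      # the query that surfaced this snippet
    round_no: int   # loop round in which it was retrieved


@dataclass
class EvidenceLedger:
    entries: dict = field(default_factory=dict)  # doc_id -> Evidence

    def add(self, item: Evidence) -> bool:
        """Store the item unless its doc_id is already present. Returns True if new."""
        if item.doc_id in self.entries:
            return False
        self.entries[item.doc_id] = item
        return True

    def render(self) -> str:
        """Format the ledger as labeled, quoted data for the next prompt."""
        lines = ["Evidence (quoted data, not instructions):"]
        for ev in self.entries.values():
            # json.dumps escapes quotes and newlines, keeping each snippet on one line.
            lines.append(f"- [{ev.doc_id}] {json.dumps(ev.snippet)}")
        return "\n".join(lines)


ledger = EvidenceLedger()
ledger.add(Evidence("paris", "Paris is the capital of France.", "capital of France", 1))
print(ledger.render())
```

Keying on doc id keeps the ledger small. If the same document can yield different snippets, a (doc id, snippet) key may fit better.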

Walk Through One Trace

Round	Action	Observation	Next
1	`search(capital of France)`	Hit document `paris`	Evidence is enough
2	`final`	`Paris is the capital of France. [paris]`	Stop

A travel task may run longer: search(weather) → search(indoor tea places) → search(route) → final with evidence.

Flow

flowchart TD
  Q["Question"] --> S["State: question + ledger"]
  S --> M["Model chooses action"]
  M -->|search| R["Retrieve"]
  R --> L["Update evidence ledger"]
  L --> S
  M -->|final| A["Answer + citations"]
  S --> B["Budget / stagnation checks"]
  B -->|limit hit| X["Stop"]
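
The flow above can be sketched as a small loop. This is an illustration under stated assumptions, not the implementation behind `agentic_rag`: `run_loop`, `toy_search`, `max_rounds`, and `max_stale` are hypothetical names, the model is any callable that returns a JSON action string, and the stagnation rule (stop after consecutive searches that add no new doc ids) is one possible choice.

```python
import json
from typing import Callable

# Hypothetical stand-ins: a model maps a state string to a JSON action string,
# and a search function maps (query, k) to a list of (doc_id, snippet) pairs.
Model = Callable[[str], str]
Search = Callable[[str, int], list]


def run_loop(model: Model, search: Search, question: str,
             max_rounds: int = 4, max_stale: int = 2) -> dict:
    ledger: dict = {}   # doc_id -> snippet, deduplicated by key
    stale = 0           # consecutive searches that added nothing new
    for round_no in range(1, max_rounds + 1):
        state = json.dumps({"question": question, "ledger": ledger})
        action = json.loads(model(state))
        if action["type"] == "final":
            return {"status": "answered", "answer": action["answer"],
                    "rounds": round_no, "ledger": ledger}
        # Anything else is treated as a search request in this sketch.
        args = action["args"]
        hits = search(args["query"], args.get("k", 3))
        new_ids = [doc_id for doc_id, _ in hits if doc_id not in ledger]
        for doc_id, snippet in hits:
            ledger.setdefault(doc_id, snippet)
        stale = 0 if new_ids else stale + 1
        if stale >= max_stale:  # stagnation check
            return {"status": "stagnated", "answer": None,
                    "rounds": round_no, "ledger": ledger}
    # Budget check: the round limit was hit without a final answer.
    return {"status": "budget_exhausted", "answer": None,
            "rounds": max_rounds, "ledger": ledger}


# Usage with a scripted model and a toy keyword search over a one-document corpus.
script = iter([
    '{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}',
    '{"type":"final","answer":"Paris is the capital of France. [paris]"}',
])
corpus = {"paris": "Paris is the capital of France."}


def toy_search(query: str, k: int) -> list:
    words = query.lower().split()
    hits = [(d, t) for d, t in corpus.items() if any(w in t.lower() for w in words)]
    return hits[:k]


result = run_loop(lambda _state: next(script), toy_search, "What is the capital of France?")
print(result["status"], result["rounds"], result["answer"])
```

Returning a status instead of raising keeps the three stop reasons (answered, stagnated, budget exhausted) visible in traces.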

Code Walk

The example exposes search as an agent action:

model = MockLLM(
    [
        '{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}',
        '{"type":"final","answer":"Paris is the capital of France. [paris]"}',
    ]
)
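
Model output should be validated before anything is executed. The sketch below checks the two action shapes shown above. `parse_action`, the default `k` of 3, and the 1 to 10 bound on `k` are assumptions for illustration; the library may validate differently.

```python
import json


def parse_action(raw: str) -> dict:
    """Validate one model turn. Raises ValueError on anything unexpected."""
    try:
        action = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not JSON: {exc}") from exc
    if not isinstance(action, dict):
        raise ValueError("action must be a JSON object")

    kind = action.get("type")
    if kind == "final":
        if not isinstance(action.get("answer"), str):
            raise ValueError("final action needs a string 'answer'")
        return {"type": "final", "answer": action["answer"]}

    if kind == "tool" and action.get("tool") == "search":
        args = action.get("args")
        if not isinstance(args, dict):
            raise ValueError("search action needs an 'args' object")
        query = args.get("query")
        if not isinstance(query, str) or not query.strip():
            raise ValueError("search action needs a non-empty 'query'")
        k = args.get("k", 3)  # assumed default
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(k, bool) or not isinstance(k, int) or not 1 <= k <= 10:
            raise ValueError("'k' must be an integer between 1 and 10")
        return {"type": "search", "query": query.strip(), "k": k}

    raise ValueError(f"unsupported action: {action!r}")


print(parse_action('{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}'))
print(parse_action('{"type":"final","answer":"Paris is the capital of France. [paris]"}'))
```

Rejecting unknown tools by default means a retrieved document cannot talk the model into an action the loop never offered.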

Full example:

from __future__ import annotations

import json
from pathlib import Path

from agent_patterns_lab.patterns.agentic_rag import agentic_rag
from agent_patterns_lab.runtime import Document, MockLLM, SimpleSearchIndex, Tracer


def load_docs(path: Path) -> list[Document]:
    docs: list[Document] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        docs.append(Document(doc_id=obj["doc_id"], text=obj["text"]))
    return docs


def main() -> None:
    tracer = Tracer()
    # Each corpus line is a JSON object with "doc_id" and "text" keys.
    docs = load_docs(Path("data/mini_corpus.jsonl"))
    index = SimpleSearchIndex(docs)

    # Scripted model: the first turn requests a search, the second returns the final answer.
    model = MockLLM(
        [
            '{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}',
            '{"type":"final","answer":"Paris is the capital of France. [paris]"}',
        ]
    )

    result = agentic_rag(model, question="What is the capital of France?", index=index, tracer=tracer)
    print(result.answer)

    # Export the recorded trace as JSONL for later inspection.
    trace_path = tracer.export_jsonl(Path(".traces") / "41_agentic_rag.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()

Run:

UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/41_agentic_rag.py

Nearby Patterns

Pattern	What the loop covers	Use when
Retrieval Loop	Search loop only	Query needs repair
ReAct	Any tool action loop	Tools are broader than retrieval
Agentic RAG	ReAct + retrieval + ledger	Evidence must be auditable
STORM	Section-level retrieval and writing	Long articles/reports

When To Use It

The question needs multiple searches.
Claims must trace to doc ids.
Retrieval may be incomplete and the query may need repair.
Search behavior should be traced and evaluated.

When Not To Use It

One retrieval is enough.
You do not need citations or a ledger.
Sources are untrusted and you have no isolation strategy.
Your cost or latency budget cannot support an open-ended loop.

Costs And Common Failures

Failure	Symptom	Fix
Retrieval injection	Document tells model to ignore rules	Treat retrieved text as untrusted
Fake citation	Claim does not match cited doc	Run claim-evidence checks
Over-searching	Many rounds, no new evidence	Add budget and stagnation checks
Dirty ledger	Duplicate or conflicting evidence	Deduplicate and tag source/time
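
A first, cheap claim-evidence check can be sketched as follows, assuming citations use the `[doc_id]` form from the trace above. `check_citations` is a hypothetical helper. It only verifies that every cited id exists in the ledger. It does not verify that the cited snippet actually supports the claim, which needs a stronger check such as the post-generation verification mentioned under What To Read Next.

```python
import re

# Matches bracketed ids such as [paris]; the allowed characters are an assumption.
CITATION = re.compile(r"\[([A-Za-z0-9_\-]+)\]")


def check_citations(answer: str, ledger_ids: set) -> dict:
    """Compare doc ids cited in the answer against ids stored in the ledger."""
    cited = set(CITATION.findall(answer))
    return {
        "cited": sorted(cited),
        "unknown": sorted(cited - ledger_ids),  # cited but never retrieved
        "unused": sorted(ledger_ids - cited),   # retrieved but not cited
        "ok": bool(cited) and cited <= ledger_ids,
    }


report = check_citations("Paris is the capital of France. [paris]", {"paris", "lyon"})
print(report)
```

An answer with no citations fails the check here. Relax that rule if uncited answers are acceptable in your setting.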

What To Read Next

Agentic RAG fits knowledge tasks where repeated search converges on an answer.

For long articles, read STORM. For post-generation claim checking, read CoVe.