Agentic RAG: Let The Agent Decide How To Search
Basic RAG usually does this: receive a question, retrieve once, pass results to the model, answer.
That works for simple questions. Travel planning often needs more: check weather, search rainy-day tea places, estimate route, and maybe rewrite the query if evidence is weak. You also want to know which evidence supports which claim.
Agentic RAG puts retrieval tools and an evidence ledger inside an agent loop.
One Sentence
Agentic RAG turns one-shot RAG into retrieval as an agent action, so the model decides when to search, what to search, whether evidence is enough, and what goes into the ledger.
What Breaks Without It
| Problem | What it looks like | Risk |
|---|---|---|
| One search only | Fast | Bad query means bad answer |
| Retrieval and answer are tied together | Simple flow | Cannot search again from new evidence |
| No evidence ledger | Has citations | Claims and evidence may not match |
What This Pattern Changes
| Who | Owns |
|---|---|
| Model | Chooses search or final |
| Retrieval tool | Returns doc ids and snippets |
| Evidence ledger | Stores evidence actually used |
| Python | Executes actions, deduplicates, limits rounds, traces |
Retrieved text is not trusted instructions. It is evidence to inspect.
Walk Through One Trace
| Round | Action | Observation | Next |
|---|---|---|---|
| 1 | search(capital of France) |
Hit document paris |
Evidence is enough |
| 2 | final |
Paris is the capital of France. [paris] |
Stop |
A travel task may run longer: search(weather) → search(indoor tea places) → search(route) → final with evidence.
Flow
flowchart TD
Q["Question"] --> S["State: question + ledger"]
S --> M["Model chooses action"]
M -->|search| R["Retrieve"]
R --> L["Update evidence ledger"]
L --> S
M -->|final| A["Answer + citations"]
S --> B["Budget / stagnation checks"]
B -->|limit hit| X["Stop"]
Code Walk
The example exposes search as an agent action:
model = MockLLM(
[
'{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}',
'{"type":"final","answer":"Paris is the capital of France. [paris]"}',
]
)
Full example:
from __future__ import annotations
import json
from pathlib import Path
from agent_patterns_lab.patterns.agentic_rag import agentic_rag
from agent_patterns_lab.runtime import Document, MockLLM, SimpleSearchIndex, Tracer
def load_docs(path: Path) -> list[Document]:
docs: list[Document] = []
for line in path.read_text(encoding="utf-8").splitlines():
if not line.strip():
continue
obj = json.loads(line)
docs.append(Document(doc_id=obj["doc_id"], text=obj["text"]))
return docs
def main() -> None:
tracer = Tracer()
docs = load_docs(Path("data/mini_corpus.jsonl"))
index = SimpleSearchIndex(docs)
model = MockLLM(
[
'{"type":"tool","tool":"search","args":{"query":"capital of France","k":2}}',
'{"type":"final","answer":"Paris is the capital of France. [paris]"}',
]
)
result = agentic_rag(model, question="What is the capital of France?", index=index, tracer=tracer)
print(result.answer)
trace_path = tracer.export_jsonl(Path(".traces") / "41_agentic_rag.jsonl")
print(f"[trace] {trace_path}")
if __name__ == "__main__":
main()
Run:
UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/41_agentic_rag.py
Nearby Patterns
| Pattern | Who decides next | Use when |
|---|---|---|
| Retrieval Loop | Search loop only | Query needs repair |
| ReAct | Any tool action loop | Tools are broader than retrieval |
| Agentic RAG | ReAct + retrieval + ledger | Evidence must be auditable |
| STORM | Section-level retrieval and writing | Long articles/reports |
When To Use It
- The question needs multiple searches.
- Claims must trace to doc ids.
- Retrieval may be incomplete and query needs repair.
- Search behavior should be traced and evaluated.
When Not To Use It
- One retrieval is enough.
- You do not need citations or a ledger.
- Sources are untrusted and you have no isolation strategy.
- Cost/latency cannot support an open loop.
Costs And Common Failures
| Failure | Symptom | Fix |
|---|---|---|
| Retrieval injection | Document tells model to ignore rules | Treat retrieved text as untrusted |
| Fake citation | Claim does not match cited doc | Run claim-evidence checks |
| Over-searching | Many rounds, no new evidence | Add budget and stagnation checks |
| Dirty ledger | Duplicate or conflicting evidence | Deduplicate and tag source/time |
What To Read Next
Agentic RAG fits knowledge tasks that converge through search.
For long articles, read STORM. For post-generation claim checking, read CoVe.