Retrieval Loop: Search Until Evidence Is Enough
If the travel assistant searches once, a bad query can sink the answer. The user asks for a "Hangzhou tea culture route", but the query is just "Hangzhou attractions", so retrieval returns generic places.
Retrieval Loop is not a full agent. It adds a small loop to RAG: propose a query, retrieve, decide whether evidence is enough, and either answer or search again.
One Sentence
Retrieval Loop turns one-shot retrieve-then-answer into query → retrieve → decide → answer/retry, so simple RAG can repair weak retrieval.
What Breaks Without It
| Problem | What it looks like | Risk |
|---|---|---|
| One retrieval only | Fast | Bad query means bad answer |
| No evidence sufficiency check | Has context | Model may answer from weak evidence |
| Hits are not recorded | Has citations | Hard to replay why those docs were used |
What This Pattern Changes
| Who | Owns |
|---|---|
| Model | Proposes query and decides done/not done |
| Retriever | Returns snippets and doc ids |
| Python | Controls rounds, stores hits, writes trace |
It is narrower than Agentic RAG: the only loop action is retrieval.
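The Python row in the table above (controls rounds, stores hits, writes trace) can be sketched as a small record-keeping layer. This is a minimal sketch, not the library's `Tracer`: the names `Hit`, `RoundRecord`, and `trace_lines` are hypothetical and exist only for this illustration. The idea is that each round's query and doc ids are written down so a run can be replayed later.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class Hit:
    """One retrieved snippet (hypothetical shape for this sketch)."""
    doc_id: str
    snippet: str


@dataclass
class RoundRecord:
    """What one loop round did: the query it ran and what came back."""
    round_no: int
    query: str
    hits: list[Hit] = field(default_factory=list)


def trace_lines(records: list[RoundRecord]) -> list[str]:
    """Serialize one JSON object per round, JSONL style."""
    return [json.dumps(asdict(record), ensure_ascii=False) for record in records]


records = [
    RoundRecord(1, "Paris capital", [Hit("paris", "Paris is the capital of France.")]),
]
lines = trace_lines(records)
print(lines[0])
```

Keeping the query next to the doc ids it returned is what makes it possible to answer "why were those docs used" after the fact.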
Walk Through One Trace
| Round | Model action | Retrieval result | Next |
|---|---|---|---|
| 1 | query: Paris capital | Hit document paris | Evidence is enough |
| 2 | done + answer | Paris is the capital of France. [paris] | Stop |
The travel version is similar: search "Hangzhou tea culture easy walking" first; if the results are generic, retry with a more specific query, such as one for a rainy-day tea route.
Flow
```mermaid
flowchart TD
    Q["Question"] --> M["Model proposes query"]
    M --> R["Retrieve"]
    R --> H["Store hits"]
    H --> D{"Enough evidence?"}
    D -->|No| M
    D -->|Yes| A["Answer + cite doc ids"]
```
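The flow above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation of `retrieval_loop` in the example library: `run_retrieval_loop`, `keyword_search`, `scripted_model`, and the one-document `CORPUS` are hypothetical names made up for this sketch. It also assumes the "enough evidence?" decision and the next query come from the same model call, which either returns a query or returns done plus an answer.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical hit shape for this sketch: (doc_id, snippet).
Hit = tuple[str, str]

CORPUS = {"paris": "Paris is the capital of France."}


def keyword_search(query: str) -> list[Hit]:
    """Toy retriever: return docs containing any word of the query."""
    words = query.lower().split()
    return [
        (doc_id, text)
        for doc_id, text in CORPUS.items()
        if any(word in text.lower() for word in words)
    ]


def scripted_model(question: str, hits: list[Hit]) -> dict:
    """Stand-in for a real model: search once, then answer from the first hit."""
    if not hits:
        return {"query": "Paris capital"}
    doc_id, text = hits[0]
    return {"done": True, "answer": f"{text} [{doc_id}]"}


def run_retrieval_loop(
    question: str,
    propose: Callable[[str, list[Hit]], dict],
    search: Callable[[str], list[Hit]],
    max_rounds: int = 3,
) -> str | None:
    """Propose -> retrieve -> store hits -> decide, bounded by max_rounds."""
    hits: list[Hit] = []
    for _ in range(max_rounds):
        step = propose(question, hits)
        if step.get("done"):
            return step.get("answer")
        hits.extend(search(step["query"]))  # store hits for the next decision
    return None  # round limit reached without enough evidence


answer = run_retrieval_loop(
    "What is the capital of France?", scripted_model, keyword_search
)
print(answer)  # prints: Paris is the capital of France. [paris]
```

Returning `None` when the round limit is hit keeps "no answer" distinct from a weakly supported answer; the caller decides what to do next.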
Code Walk
The example uses a tiny corpus:
```python
docs = load_docs(Path("data/mini_corpus.jsonl"))
index = SimpleSearchIndex(docs)
```
The model proposes a query, then marks the run done:
```python
model = MockLLM(
    [
        '{"query":"Paris capital"}',
        '{"done": true, "answer": "Paris is the capital of France. [paris]"}',
    ]
)
```
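The two scripted replies above are JSON strings, so the loop has to tell a "search again" reply from a "done" reply. How the library's `retrieval_loop` does this is not shown here; the following is only a sketch of one way to do it, and `parse_step` is a hypothetical helper name. It assumes exactly the two reply shapes used in the mock.

```python
from __future__ import annotations

import json


def parse_step(raw: str) -> tuple[str, str]:
    """Return ("search", query) or ("answer", text); raise ValueError otherwise."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply is not valid JSON: {raw!r}") from exc
    if not isinstance(obj, dict):
        raise ValueError("model reply must be a JSON object")

    # A done reply must carry the final answer text.
    if obj.get("done") is True:
        answer = obj.get("answer")
        if not isinstance(answer, str) or not answer.strip():
            raise ValueError("done reply needs a non-empty 'answer'")
        return ("answer", answer)

    # Otherwise the reply must propose a query for the next round.
    query = obj.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("reply needs either 'query' or done plus 'answer'")
    return ("search", query)


print(parse_step('{"query":"Paris capital"}'))
```

Rejecting malformed replies early means a bad model output ends the run with a clear error instead of triggering a retrieval on an empty query.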
Full example:
```python
from __future__ import annotations

import json
from pathlib import Path

from agent_patterns_lab.patterns.retrieval_loop import retrieval_loop
from agent_patterns_lab.runtime import Document, MockLLM, SimpleSearchIndex, Tracer


def load_docs(path: Path) -> list[Document]:
    docs: list[Document] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        docs.append(Document(doc_id=obj["doc_id"], text=obj["text"]))
    return docs


def main() -> None:
    tracer = Tracer()
    docs = load_docs(Path("data/mini_corpus.jsonl"))
    index = SimpleSearchIndex(docs)

    # 1) propose query 2) decide done+answer
    model = MockLLM(
        [
            '{"query":"Paris capital"}',
            '{"done": true, "answer": "Paris is the capital of France. [paris]"}',
        ]
    )

    result = retrieval_loop(
        model,
        question="What is the capital of France?",
        index=index,
        rounds=2,
        tracer=tracer,
    )
    print(result.answer)

    trace_path = tracer.export_jsonl(Path(".traces") / "40_retrieval_loop.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()
```
Run:
```shell
UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/40_retrieval_loop.py
```
Nearby Patterns
| Pattern | Who decides next | Use when |
|---|---|---|
| One-shot RAG | Code retrieves once | Query is obvious |
| Retrieval Loop | Model decides whether to search again | Query may need repair |
| Agentic RAG | Model can choose retrieval as an action | Retrieval is one tool in an agent loop |
| STORM | Sections retrieve separately | You are writing an article/report |
When To Use It
- One retrieval often misses.
- The query needs rewriting.
- You want to trace each retrieval round.
- A full agent loop would be too much.
When Not To Use It
- One retrieval is enough.
- Retrieval quality is poor; more search just adds noise.
- You need multiple tool types; use Agentic RAG or ReAct.
- You cannot enforce a round limit; an unbounded loop can search forever.
Costs And Common Failures
| Failure | Symptom | Fix |
|---|---|---|
| Query churn | Many rounds, no new evidence | Add a round limit and stagnation checks |
| Fake citation | Cites unrelated doc | Require citations to come from stored hits |
| Dirty context | All results are appended | Keep top-k and summaries |
| Query too broad | Generic docs only | Ask for specific query fields |
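Three of the fixes in the table above are easy to sketch as small guards. The helper names `unknown_citations`, `is_stagnant`, and `keep_top_k` are hypothetical, and the numeric relevance score in `keep_top_k` is an assumption (the retriever described here is only said to return snippets and doc ids). The citation check assumes answers cite documents as `[doc_id]`, as in the example answer.

```python
from __future__ import annotations

import re


def unknown_citations(answer: str, hit_ids: set[str]) -> set[str]:
    """Doc ids cited as [id] in the answer that were never retrieved."""
    cited = set(re.findall(r"\[([^\[\]]+)\]", answer))
    return cited - hit_ids


def is_stagnant(seen_ids: set[str], new_ids: set[str]) -> bool:
    """True when a round returned no doc id that was not already stored."""
    return not (new_ids - seen_ids)


def keep_top_k(scored_hits: list[tuple[str, float]], k: int) -> list[tuple[str, float]]:
    """Deduplicate by doc id (keeping the best score), then keep the k highest."""
    best: dict[str, float] = {}
    for doc_id, score in scored_hits:
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:k]


# A citation outside the stored hits is flagged.
print(unknown_citations("See [paris] and [lyon].", {"paris"}))  # prints: {'lyon'}
```

A loop could stop early when `is_stagnant` returns true for a round, and reject or retry an answer when `unknown_citations` is non-empty.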
What To Read Next
Retrieval Loop is the small loop inside RAG.
If retrieval is one action among many, read Agentic RAG. For long reports, read STORM.