
Retrieval Loop: Search Until Evidence Is Enough

If the travel assistant searches once, a bad query can sink the answer. The user asks for a "Hangzhou tea culture route", but the query is just "Hangzhou attractions", so retrieval returns generic places.

Retrieval Loop is not a full agent. It adds a small loop to RAG: propose a query, retrieve, decide whether evidence is enough, and either answer or search again.

One Sentence

Retrieval Loop turns one-shot retrieve-then-answer into query → retrieve → decide → answer/retry, so simple RAG can repair weak retrieval.

What Breaks Without It

| Problem | What it looks like | Risk |
| --- | --- | --- |
| One retrieval only | Fast | Bad query means bad answer |
| No evidence sufficiency check | Has context | Model may answer from weak evidence |
| Hits are not recorded | Has citations | Hard to replay why those docs were used |

What This Pattern Changes

| Who | Owns |
| --- | --- |
| Model | Proposes query and decides done/not done |
| Retriever | Returns snippets and doc ids |
| Python | Controls rounds, stores hits, writes trace |

It is narrower than Agentic RAG: the only loop action is retrieval.
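The "writes trace" responsibility can be illustrated with a few lines of standard-library Python. This is only a sketch: `write_trace`, `read_trace`, `demo`, and the record fields (`round`, `query`, `doc_ids`, `action`) are hypothetical names chosen for illustration, not the format the lab's `Tracer` actually produces.

```python
import json
import tempfile
from pathlib import Path


def write_trace(records: list[dict], path: Path) -> Path:
    """Write one JSON object per line so each round can be replayed later."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_trace(path: Path) -> list[dict]:
    """Load the trace back, skipping blank lines."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def demo() -> list[dict]:
    """Round-trip a two-round trace through a temporary file."""
    # Hypothetical record shape: one entry per model turn.
    records = [
        {"round": 1, "query": "Paris capital", "doc_ids": ["paris"]},
        {"round": 2, "action": "done"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = write_trace(records, Path(tmp) / "traces" / "demo.jsonl")
        return read_trace(path)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

Storing the query next to the doc ids it returned is what makes it possible to replay why a document ended up in the answer.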

Walk Through One Trace

| Round | Model action | Retrieval result | Next |
| --- | --- | --- | --- |
| 1 | query: Paris capital | Hit document paris | Evidence is enough |
| 2 | done + answer | Paris is the capital of France. [paris] | Stop |

The travel version is similar: search "Hangzhou tea culture easy walking" first; if the results are still generic, retry with a more specific query, such as a rainy-day tea route.

Flow

flowchart TD
  Q["Question"] --> M["Model proposes query"]
  M --> R["Retrieve"]
  R --> H["Store hits"]
  H --> D{"Enough evidence?"}
  D -->|No| M
  D -->|Yes| A["Answer + cite doc ids"]
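The flow above can be sketched in plain Python. This is a minimal sketch, not the lab's `retrieval_loop` implementation: `Hit`, `LoopResult`, `run_loop`, `retrieve`, `scripted_model`, and the two-document corpus are hypothetical, and it assumes that one round means one model turn.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Hit:
    doc_id: str
    text: str


@dataclass
class LoopResult:
    answer: Optional[str] = None
    hits: list = field(default_factory=list)    # every Hit stored so far
    trace: list = field(default_factory=list)   # one dict per model turn


# Hypothetical two-document corpus, made up for this sketch.
CORPUS = {
    "west_lake": "West Lake is a famous Hangzhou attraction.",
    "longjing": "Longjing village offers tea culture walks in Hangzhou.",
}


def retrieve(query: str) -> list:
    """Toy retriever: a doc matches if it contains every query word."""
    words = set(query.lower().split())
    hits = []
    for doc_id, text in CORPUS.items():
        tokens = set(text.lower().replace(".", "").split())
        if words <= tokens:
            hits.append(Hit(doc_id, text))
    return hits


def scripted_model(question: str, hits: list) -> dict:
    """Stand-in for an LLM: broad query first, then a repaired one."""
    ids = {h.doc_id for h in hits}
    if "longjing" in ids:
        return {"done": True, "answer": "Walk the Longjing village tea route. [longjing]"}
    if not hits:
        return {"query": "hangzhou attraction"}   # weak first query
    return {"query": "hangzhou tea culture"}      # more specific retry


def run_loop(question: str, model: Callable, retriever: Callable, max_rounds: int = 3) -> LoopResult:
    result = LoopResult()
    for round_no in range(1, max_rounds + 1):
        action = model(question, result.hits)
        if action.get("done"):
            # The model judged the evidence sufficient: answer and stop.
            result.answer = action["answer"]
            result.trace.append({"round": round_no, "action": "done"})
            return result
        new_hits = retriever(action["query"])
        result.hits.extend(new_hits)              # store hits
        result.trace.append(
            {"round": round_no, "query": action["query"],
             "doc_ids": [h.doc_id for h in new_hits]}
        )
    return result  # round budget exhausted; answer stays None


if __name__ == "__main__":
    outcome = run_loop("Plan a Hangzhou tea culture route", scripted_model, retrieve)
    print(outcome.answer)
    print(outcome.trace)
```

The round cap lives in Python rather than in the prompt, so a model that never says done still terminates; the caller sees `answer is None` and can fall back to a refusal or a one-shot answer.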

Code Walk

The example uses a tiny corpus:

docs = load_docs(Path("data/mini_corpus.jsonl"))
index = SimpleSearchIndex(docs)

The model proposes a query, then marks the run done:

model = MockLLM(
    [
        '{"query":"Paris capital"}',
        '{"done": true, "answer": "Paris is the capital of France. [paris]"}',
    ]
)
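The two scripted replies have different shapes: one carries a `query`, the other carries `done` and `answer`. A loop has to tell them apart before acting. The sketch below shows one way to do that with the standard `json` module; `Action` and `parse_reply` are hypothetical names, and how the lab's `retrieval_loop` parses replies is not shown in this page.

```python
import json
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "search" or "answer"
    query: str = ""
    answer: str = ""


def parse_reply(raw: str) -> Action:
    """Turn one raw model reply into a search or answer action."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply is not valid JSON: {raw!r}") from exc
    if not isinstance(obj, dict):
        raise ValueError("model reply must be a JSON object")

    if obj.get("done") is True:
        answer = obj.get("answer")
        if not isinstance(answer, str) or not answer.strip():
            raise ValueError("a done reply needs a non-empty 'answer'")
        return Action(kind="answer", answer=answer)

    query = obj.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("reply needs a non-empty 'query' or done=true")
    return Action(kind="search", query=query.strip())


if __name__ == "__main__":
    print(parse_reply('{"query":"Paris capital"}'))
    print(parse_reply('{"done": true, "answer": "Paris is the capital of France. [paris]"}'))
```

Rejecting malformed replies early keeps a bad model turn from silently becoming an empty search.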

Full example:

from __future__ import annotations

import json
from pathlib import Path

from agent_patterns_lab.patterns.retrieval_loop import retrieval_loop
from agent_patterns_lab.runtime import Document, MockLLM, SimpleSearchIndex, Tracer


def load_docs(path: Path) -> list[Document]:
    docs: list[Document] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        docs.append(Document(doc_id=obj["doc_id"], text=obj["text"]))
    return docs


def main() -> None:
    tracer = Tracer()
    docs = load_docs(Path("data/mini_corpus.jsonl"))
    index = SimpleSearchIndex(docs)

    # 1) propose query  2) decide done+answer
    model = MockLLM(
        [
            '{"query":"Paris capital"}',
            '{"done": true, "answer": "Paris is the capital of France. [paris]"}',
        ]
    )

    result = retrieval_loop(model, question="What is the capital of France?", index=index, rounds=2, tracer=tracer)
    print(result.answer)

    trace_path = tracer.export_jsonl(Path(".traces") / "40_retrieval_loop.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()

Run:

UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/40_retrieval_loop.py

Nearby Patterns

| Pattern | Who decides next | Use when |
| --- | --- | --- |
| One-shot RAG | Code retrieves once | Query is obvious |
| Retrieval Loop | Model decides whether to search again | Query may need repair |
| Agentic RAG | Model can choose retrieval as an action | Retrieval is one tool in an agent loop |
| STORM | Sections retrieve separately | You are writing an article/report |

When To Use It

  • One retrieval often misses.
  • The query needs rewriting.
  • You want to trace each retrieval round.
  • A full agent loop would be too much.

When Not To Use It

  • One retrieval is enough.
  • Retrieval quality is poor; more search just adds noise.
  • You need multiple tool types; use Agentic RAG or ReAct.
  • You cannot enforce a round limit; an unbounded loop can search forever.

Costs And Common Failures

| Failure | Symptom | Fix |
| --- | --- | --- |
| Query churn | Many rounds, no new evidence | Add a round limit and stagnation checks |
| Fake citation | Cites unrelated doc | Require the answer to cite only retrieved hits |
| Dirty context | All results are appended | Keep top-k and summaries |
| Query too broad | Generic docs only | Ask for specific query fields |
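Two of these fixes are easy to sketch in code: a stagnation check for query churn and a citation check for fake citations. The helpers below are hypothetical (`cited_ids`, `unknown_citations`, `is_stagnant` are not part of the lab), and they assume citations are written as bracketed doc ids such as `[paris]`, as in the example answer.

```python
import re


def cited_ids(answer: str) -> set[str]:
    """Collect doc ids written as [doc_id] in the answer text."""
    return set(re.findall(r"\[([A-Za-z0-9_\-]+)\]", answer))


def unknown_citations(answer: str, hit_ids: set[str]) -> set[str]:
    """Cited ids that were never retrieved; non-empty means a fake citation."""
    return cited_ids(answer) - hit_ids


def is_stagnant(seen_ids: set[str], new_ids: list[str]) -> bool:
    """True when a round returned no doc id that was not already stored."""
    return not (set(new_ids) - seen_ids)


if __name__ == "__main__":
    answer = "Paris is the capital of France. [paris]"
    print(unknown_citations(answer, {"paris"}))   # nothing unknown
    print(is_stagnant({"paris"}, ["paris"]))      # same doc again
```

A loop could stop early when `is_stagnant` is true for a round, and reject or retry an answer when `unknown_citations` is non-empty.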

Retrieval Loop is the small loop inside RAG.

If retrieval is one action among many, read Agentic RAG. For long reports, read STORM.
