STORM: Retrieve By Section, Then Write The Article
For a long report, one-shot RAG gets messy. Dumping every source into a single prompt and asking for a polished article rarely works: structure drifts, evidence mixes across sections, and citations become hard to audit.
STORM starts with an outline, retrieves per section, writes per section, then assembles the article.
One Sentence
STORM turns retrieve-everything-then-write into outline → section retrieval → section writing → editing, so long-form structure and evidence stay manageable.
What Breaks Without It
| Problem | What it looks like | Risk |
|---|---|---|
| One-shot article | Fast | Loose structure |
| All sources in one context | Lots of information | Evidence crosses section boundaries |
| No final editor pass | Content exists | Repetition and inconsistent voice |
What This Pattern Changes
| Who | Owns |
|---|---|
| Outline step | Defines sections |
| Section retriever | Retrieves only for that section |
| Section writer | Writes from section evidence |
| Editor / assembler | Merges, deduplicates, harmonizes |
Walk Through One Trace
| Stage | Action | Output |
|---|---|---|
| 1 | Generate sections | Agent Loop, RAG |
| 2 | Retrieve for Agent Loop | agent_loop document |
| 3 | Write Agent Loop section | Cited section |
| 4 | Retrieve for RAG | rag document |
| 5 | Write RAG section | Cited section |
| 6 | Assemble | Final article |
Flow
flowchart TD
T["Topic"] --> O["Create outline"]
O --> S["Pick section"]
S --> Q["Generate section query"]
Q --> R["Retrieve section evidence"]
R --> W["Write section"]
W --> N{"More sections?"}
N -->|Yes| S
N -->|No| A["Assemble / edit article"]
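The flow above can be sketched as a plain loop. This is a minimal sketch, not the lab's `storm_write_article`: the function name `storm_sketch`, its callable parameters, and the toy corpus are all assumptions made for illustration, and the toy stand-ins replace real model and retriever calls.

```python
from typing import Callable

Evidence = list[tuple[str, str]]  # (doc_id, text) pairs; an assumed shape


def storm_sketch(
    topic: str,
    make_outline: Callable[[str], list[str]],
    make_query: Callable[[str, str], str],
    retrieve: Callable[[str], Evidence],
    write_section: Callable[[str, Evidence], str],
    assemble: Callable[[str, list[tuple[str, str]]], str],
) -> str:
    """Outline first, then retrieve and write per section, then assemble."""
    drafts: list[tuple[str, str]] = []
    for section in make_outline(topic):        # Pick section
        query = make_query(topic, section)     # Generate section query
        evidence = retrieve(query)             # Retrieve section evidence
        drafts.append((section, write_section(section, evidence)))  # Write section
    return assemble(topic, drafts)             # Assemble / edit article


# Toy stand-ins (hypothetical) so the sketch runs without a model or index.
CORPUS = {
    "agent_loop": "Agent loops iterate decide/act/observe.",
    "rag": "RAG grounds answers by retrieving docs.",
}


def toy_retrieve(query: str) -> Evidence:
    # Exact match on the doc id; a real retriever would rank by relevance.
    return [
        (doc_id, text)
        for doc_id, text in CORPUS.items()
        if doc_id.replace("_", " ") == query
    ]


article = storm_sketch(
    "Agent patterns",
    make_outline=lambda topic: ["Agent Loop", "RAG"],
    make_query=lambda topic, section: section.lower(),
    retrieve=toy_retrieve,
    write_section=lambda section, ev: " ".join(
        f"{text} [{doc_id}]" for doc_id, text in ev
    ),
    assemble=lambda topic, drafts: "\n".join(
        f"## {name}\n{body}" for name, body in drafts
    ),
)
print(article)
```

Each section writer only ever sees the evidence retrieved for its own query, which is the property the pattern relies on.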
Code Walk
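The example scripts the mock model's responses in call order: the outline first, then a query and a draft for each section, then the assembled article: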
model = MockLLM(
    [
        '{"sections":["Agent Loop","RAG"]}',
        '{"query":"agent loop"}',
        "Agent loops iterate decide/act/observe. [agent_loop]",
        '{"query":"RAG retrieval augmented generation"}',
        "RAG grounds answers by retrieving docs. [rag]",
        "Final article:\n## Agent Loop\n...\n## RAG\n...",
    ]
)
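The order of the scripted responses matters because each one answers a specific call. The sketch below illustrates that idea with a hypothetical `ScriptedModel` class and `complete` method; it is not the lab's `MockLLM`, whose interface is not shown here, and the prompts are placeholders.

```python
import json


class ScriptedModel:
    """Hypothetical stand-in that returns canned responses in call order."""

    def __init__(self, responses: list[str]) -> None:
        self._responses = list(responses)
        self._next = 0

    def complete(self, prompt: str) -> str:
        # The prompt is ignored; only the call order selects the response.
        if self._next >= len(self._responses):
            raise RuntimeError("script exhausted: more calls than responses")
        response = self._responses[self._next]
        self._next += 1
        return response


model = ScriptedModel(
    [
        '{"sections":["Agent Loop","RAG"]}',
        '{"query":"agent loop"}',
        "Agent loops iterate decide/act/observe. [agent_loop]",
        '{"query":"RAG retrieval augmented generation"}',
        "RAG grounds answers by retrieving docs. [rag]",
        "Final article:\n## Agent Loop\n...\n## RAG\n...",
    ]
)

# Consume the script in the same order a STORM-style pipeline would.
sections = json.loads(model.complete("outline"))["sections"]
queries, drafts = [], []
for section in sections:
    queries.append(json.loads(model.complete(f"query for {section}"))["query"])
    drafts.append(model.complete(f"write {section}"))
final = model.complete("assemble")
```

If the pipeline made its calls in a different order, the JSON parse would fail on a prose response, which is a cheap way to notice that the script and the pipeline have drifted apart.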
Full example:
from __future__ import annotations

import json
from pathlib import Path

from agent_patterns_lab.patterns.storm import storm_write_article
from agent_patterns_lab.runtime import Document, MockLLM, SimpleSearchIndex, Tracer


def load_docs(path: Path) -> list[Document]:
    """Load one Document per non-empty line of a JSONL file."""
    docs: list[Document] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        obj = json.loads(line)
        docs.append(Document(doc_id=obj["doc_id"], text=obj["text"]))
    return docs


def main() -> None:
    tracer = Tracer()
    index = SimpleSearchIndex(load_docs(Path("data/mini_corpus.jsonl")))

    # Scripted responses, in call order: outline, then a query and a draft
    # for each section, then the assembled article.
    model = MockLLM(
        [
            '{"sections":["Agent Loop","RAG"]}',
            '{"query":"agent loop"}',
            "Agent loops iterate decide/act/observe. [agent_loop]",
            '{"query":"RAG retrieval augmented generation"}',
            "RAG grounds answers by retrieving docs. [rag]",
            "Final article:\n## Agent Loop\n...\n## RAG\n...",
        ]
    )

    article = storm_write_article(model, topic="Agent patterns", index=index, tracer=tracer)
    print(article.article)

    # Export the trace so each stage can be inspected afterwards.
    trace_path = tracer.export_jsonl(Path(".traces") / "56_storm.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()
Run:
UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/56_storm.py
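The example reads `data/mini_corpus.jsonl`, and `load_docs` expects one JSON object per line with `doc_id` and `text` keys. The sketch below writes and re-reads a file of that shape in a temporary directory. The two doc ids match the citations used in the example; the text values are placeholders, not the contents of the real corpus.

```python
import json
import tempfile
from pathlib import Path

# Placeholder records; only the key names are taken from load_docs.
records = [
    {"doc_id": "agent_loop", "text": "Placeholder text about agent loops."},
    {"doc_id": "rag", "text": "Placeholder text about RAG."},
]

with tempfile.TemporaryDirectory() as tmp:
    corpus_path = Path(tmp) / "mini_corpus.jsonl"
    # One JSON object per line, newline-terminated.
    corpus_path.write_text(
        "\n".join(json.dumps(record) for record in records) + "\n",
        encoding="utf-8",
    )
    # Read it back the same way load_docs does, skipping blank lines.
    parsed = [
        json.loads(line)
        for line in corpus_path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]

print([record["doc_id"] for record in parsed])
```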
Nearby Patterns
| Pattern | How it works | Use when |
|---|---|---|
| Retrieval Loop | Query loop for one question | Short answer |
| Agentic RAG | Agent searches dynamically | Medium-complex QA |
| STORM | Section-level search and writing | Articles, reports, reviews |
| CoVe | Claims are verified | Facts need post-write checking |
When To Use It
- The output is an article, report, or review.
- Section structure matters.
- Evidence should be scoped per section.
- A final editor pass is useful.
When Not To Use It
- The user wants a short answer.
- Budget cannot support many retrieval and writing calls.
- Citations do not matter.
- The outline is unclear; human planning should happen first.
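The budget point above can be made concrete with a rough count. This sketch assumes the call pattern of the scripted example: one outline call, one query call and one writing call per section, and one assembly call, plus one retrieval per section. The function name `storm_call_budget` is hypothetical, and a real implementation may make more calls (retries, multiple queries per section, extra editing passes).

```python
def storm_call_budget(num_sections: int) -> dict[str, int]:
    """Estimate calls for one article under the assumed call pattern."""
    if num_sections < 0:
        raise ValueError("num_sections must be non-negative")
    # 1 outline + (1 query + 1 draft) per section + 1 assembly
    llm_calls = 1 + 2 * num_sections + 1
    return {"llm_calls": llm_calls, "retrievals": num_sections}


print(storm_call_budget(2))   # {'llm_calls': 6, 'retrievals': 2}
print(storm_call_budget(10))  # {'llm_calls': 22, 'retrievals': 10}
```

For two sections this gives six model calls, which matches the six scripted responses in the example.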
Costs And Common Failures
| Failure | Symptom | Fix |
|---|---|---|
| Shallow outline | Every section is generic | Review outline before writing |
| Evidence cross-talk | Section A cites Section B sources | Keep per-section ledgers |
| Context explosion | All docs go into final prompt | Summarize per section |
| Fake citation | Citation does not exist | Require retriever doc ids |
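Two of the fixes in the table, per-section ledgers and requiring retriever doc ids, can be combined into one check. The sketch below assumes citations appear as `[doc_id]` in the section text, as in the example's scripted responses; the `check_citations` helper and the ledger layout are hypothetical.

```python
import re

# Matches bracketed ids such as [agent_loop]; the id alphabet is an assumption.
CITATION = re.compile(r"\[([A-Za-z0-9_]+)\]")


def check_citations(section_text: str, retrieved_ids: set[str]) -> list[str]:
    """Return cited ids that were not retrieved for this section."""
    return [
        doc_id
        for doc_id in CITATION.findall(section_text)
        if doc_id not in retrieved_ids
    ]


# Ledger: which doc ids the retriever returned for each section.
ledger = {"Agent Loop": {"agent_loop"}, "RAG": {"rag"}}

drafts = {
    "Agent Loop": "Agent loops iterate decide/act/observe. [agent_loop]",
    # Deliberately bad draft: one cross-section id and one id that does not exist.
    "RAG": "RAG grounds answers by retrieving docs. [rag] [agent_loop] [made_up]",
}

problems = {name: check_citations(text, ledger[name]) for name, text in drafts.items()}
print(problems)  # {'Agent Loop': [], 'RAG': ['agent_loop', 'made_up']}
```

The same check catches both failure modes: an id from another section's ledger is cross-talk, and an id in no ledger at all is a fake citation.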
What To Read Next
STORM is a retrieval workflow for long-form writing.
For dynamic QA, read Agentic RAG. For claim verification after writing, read CoVe.