LATS: Search Over Candidate Answers
Some tasks should not stop at the first answer. A travel assistant may propose several itineraries: West Lake first, Tea Museum first, or minimal walking. To choose well, it needs candidates, scores, and pruning.
LATS (Language Agent Tree Search) style patterns treat reasoning as search: expand candidates, score them, keep the top K, and repeat until a depth or budget limit is reached.
One Sentence
LATS replaces a single answer with a search loop (expand candidates → score → keep the beam → repeat), spending extra compute to select a better candidate.
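One round of that loop can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the library's implementation: `search_step`, `toy_expand`, and `toy_score` are hypothetical names, and the toy functions stand in for the proposer and evaluator LLM calls.

```python
import heapq
from typing import Callable


def search_step(
    beam: list[str],
    expand: Callable[[str], list[str]],
    score: Callable[[str], float],
    beam_size: int,
) -> list[tuple[float, str]]:
    """Expand every beam entry, score the children, keep the top beam_size."""
    children = [child for parent in beam for child in expand(parent)]
    scored = [(score(child), child) for child in children]
    # Tuples compare by score first; ties fall back to comparing the strings.
    return heapq.nlargest(beam_size, scored)


# Hypothetical stand-ins for the proposer and the evaluator.
def toy_expand(parent: str) -> list[str]:
    return [f"{parent} > West Lake", f"{parent} > Tea Museum"]


def toy_score(candidate: str) -> float:
    return 8.0 if candidate.endswith("Tea Museum") else 3.0


kept = search_step(["start"], toy_expand, toy_score, beam_size=1)
print(kept)  # [(8.0, 'start > Tea Museum')]
```

Calling `search_step` again on the kept candidates gives the next level of the search.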
What Breaks Without It
| Problem | What it looks like | Risk |
|---|---|---|
| One candidate only | Fast | Gets stuck on a weak route |
| Candidates are not scored | Many options | No way to choose |
| Search has no budget | Looks smart | Token cost and latency explode |
What This Pattern Changes
| Who | Owns |
|---|---|
| Proposer | Generates candidates |
| Evaluator | Scores candidates |
| Search controller | Controls depth, branch factor, beam size |
| Python | Stores candidate tree and trace |
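The last row, where Python owns the candidate tree and the trace, might look roughly like the sketch below. The `Node` class and the event dictionaries are assumptions made for illustration; the library's own data structures are not shown in this page and may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity; parent/child links are cyclic
class Node:
    text: str
    score: Optional[float] = None
    parent: Optional["Node"] = field(default=None, repr=False)
    children: list["Node"] = field(default_factory=list)

    def add_child(self, text: str) -> "Node":
        child = Node(text=text, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list[str]:
        """Return the texts from the root down to this node."""
        steps = []
        node = self
        while node is not None:
            steps.append(node.text)
            node = node.parent
        return steps[::-1]


# The trace is a plain list of events, appended as the search progresses.
trace: list[dict] = []

root = Node("Write the best draft.")
drafts = [root.add_child(text) for text in ("draft A", "draft B")]
trace.append({"event": "expand", "parent": root.text,
              "children": [d.text for d in drafts]})

for draft, value in zip(drafts, (3, 8)):
    draft.score = value
    trace.append({"event": "score", "candidate": draft.text, "score": value})

best = max(drafts, key=lambda node: node.score)
trace.append({"event": "keep", "candidate": best.text})
print(best.path())  # ['Write the best draft.', 'draft B']
```

Keeping parent links makes it cheap to recover the full path behind the winning candidate, which is what a trace reader usually wants to see.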
Walk Through One Trace
| Stage | Output |
|---|---|
| Expand | draft A, draft B |
| Score | A=3, B=8 |
| Keep | Keep draft B |
| Final | Best candidate is B |
For travel, candidates can be routes; scores can measure walking effort, weather fit, and preference coverage.
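As a rough illustration, a rule-based route scorer could combine those three signals as below. Everything here is made up for the sketch: the `Route` fields, the weights, the 10 km cap, and the sample numbers are assumptions, not values from the library or from real itineraries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    walking_km: float
    outdoor_share: float  # fraction of the day spent outdoors, 0..1
    stops: frozenset[str]


def score_route(
    route: Route,
    *,
    rain_probability: float,
    wanted: frozenset[str],
    max_walking_km: float = 10.0,
) -> float:
    """Score a route from 0 to 10; higher is better. Weights are illustrative."""
    walking = 1.0 - min(route.walking_km / max_walking_km, 1.0)
    weather = 1.0 - route.outdoor_share * rain_probability
    coverage = len(route.stops & wanted) / len(wanted) if wanted else 1.0
    return round(10 * (0.3 * walking + 0.3 * weather + 0.4 * coverage), 2)


wanted = frozenset({"West Lake", "Tea Museum"})
routes = [
    Route("West Lake first", 8.0, 0.8, frozenset({"West Lake", "Tea Museum"})),
    Route("Tea Museum first", 5.0, 0.4, frozenset({"Tea Museum", "West Lake"})),
    Route("Minimal walking", 2.0, 0.2, frozenset({"Tea Museum"})),
]

scores = {
    r.name: score_route(r, rain_probability=0.5, wanted=wanted) for r in routes
}
best = max(routes, key=lambda r: scores[r.name])
print(scores)
print(best.name)  # Tea Museum first
```

A deterministic scorer like this can sit next to an LLM evaluator, or replace it where the criteria are easy to compute.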
Flow
```mermaid
flowchart TD
    T["Task"] --> E["Expand candidates"]
    E --> S["Score"]
    S --> K["Keep top-K"]
    K --> D{"Depth/budget reached?"}
    D -->|No| E
    D -->|Yes| O["Best candidate"]
```
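The flow above, including the depth and budget check, can be sketched as a small loop. This is not the code behind `lats_beam_search`; `beam_search`, `max_calls`, and the toy functions are hypothetical, and the sketch assumes one call per expansion and one call per scored child.

```python
from typing import Callable


def beam_search(
    task: str,
    expand: Callable[[str], list[str]],
    score: Callable[[str], float],
    *,
    depth: int,
    branch_factor: int,
    beam_size: int,
    max_calls: int,
) -> tuple[str, float, int]:
    """Return (best candidate, its score, calls spent)."""
    calls = 0
    beam = [(0.0, task)]
    for _ in range(depth):
        scored = []
        for _, parent in beam:
            if calls >= max_calls:
                break
            children = expand(parent)[:branch_factor]
            calls += 1  # one proposer call per expanded parent
            for child in children:
                if calls >= max_calls:
                    break
                scored.append((score(child), child))
                calls += 1  # one evaluator call per scored child
        if not scored:
            break  # budget ran out before any child was scored
        beam = sorted(scored, reverse=True)[:beam_size]
    best_score, best = beam[0]
    return best, best_score, calls


# Toy stand-ins: three children per parent, score counts the letter "b".
def toy_expand(parent: str) -> list[str]:
    return [f"{parent}/a", f"{parent}/b", f"{parent}/c"]


def toy_score(candidate: str) -> float:
    return float(candidate.count("b"))


print(beam_search("plan", toy_expand, toy_score,
                  depth=2, branch_factor=2, beam_size=1, max_calls=10))
# ('plan/b/b', 2.0, 6)
print(beam_search("plan", toy_expand, toy_score,
                  depth=2, branch_factor=2, beam_size=1, max_calls=2))
# ('plan/a', 0.0, 2)
```

The second call shows the cost of a tight budget: the search stops after scoring a single child and returns whatever it has.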
Code Walk
The proposer returns two candidates, and the evaluator returns one score for each:

```python
proposer = MockLLM(['{"candidates":["draft A","draft B"]}'])
evaluator = MockLLM(['{"score": 3}', '{"score": 8}'])
```
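To see why draft B wins, the scripted replies can be parsed by hand with the standard library. This sketch assumes the evaluator replies are consumed in the same order as the candidates; how the library pairs them internally is not shown here.

```python
import json

# The same JSON strings that the mock proposer and evaluator are scripted with.
proposer_reply = '{"candidates":["draft A","draft B"]}'
evaluator_replies = ['{"score": 3}', '{"score": 8}']

candidates = json.loads(proposer_reply)["candidates"]
scores = [json.loads(reply)["score"] for reply in evaluator_replies]

# Assumption: the i-th score belongs to the i-th candidate.
scored = dict(zip(candidates, scores))
best = max(scored, key=scored.get)

print(scored)  # {'draft A': 3, 'draft B': 8}
print(best)    # draft B
```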
Search parameters stay in Python:
```python
result = lats_beam_search(
    proposer,
    evaluator,
    task="Write the best draft.",
    depth=1,
    branch_factor=2,
    beam_size=1,
    tracer=tracer,
)
```
Full example:
```python
from __future__ import annotations

from pathlib import Path

from agent_patterns_lab.patterns.lats import lats_beam_search
from agent_patterns_lab.runtime import MockLLM, Tracer


def main() -> None:
    tracer = Tracer()
    # Scripted replies: one proposer response and two evaluator scores.
    proposer = MockLLM(['{"candidates":["draft A","draft B"]}'])
    evaluator = MockLLM(['{"score": 3}', '{"score": 8}'])

    result = lats_beam_search(
        proposer,
        evaluator,
        task="Write the best draft.",
        depth=1,
        branch_factor=2,
        beam_size=1,
        tracer=tracer,
    )
    print({"best": result.best, "score": result.score})

    # Export the recorded trace for later inspection.
    trace_path = tracer.export_jsonl(Path(".traces") / "54_lats.jsonl")
    print(f"[trace] {trace_path}")


if __name__ == "__main__":
    main()
```
Run:

```bash
UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/54_lats.py
```
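The example exports its trace as JSONL, meaning one JSON object per line. The schema that `Tracer` writes is not shown on this page, so the sketch below only illustrates the file format with the standard library; `write_jsonl` and the event fields are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(events: list[dict], path: Path) -> Path:
    """Write one JSON object per line, creating parent folders as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        for event in events:
            fh.write(json.dumps(event, ensure_ascii=False) + "\n")
    return path


# Hypothetical events mirroring the walk-through trace.
events = [
    {"stage": "expand", "candidates": ["draft A", "draft B"]},
    {"stage": "score", "candidate": "draft A", "score": 3},
    {"stage": "score", "candidate": "draft B", "score": 8},
    {"stage": "keep", "candidate": "draft B"},
]

with tempfile.TemporaryDirectory() as tmp:
    trace_path = write_jsonl(events, Path(tmp) / ".traces" / "demo.jsonl")
    lines = trace_path.read_text(encoding="utf-8").splitlines()
    loaded = [json.loads(line) for line in lines]

print(len(lines))  # 4
```

Because each line is a complete JSON object, a trace can be inspected with ordinary line-oriented tools or loaded one event at a time.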
Nearby Patterns
| Pattern | Who decides next | Use when |
|---|---|---|
| Voting | Complete answers vote | Short normalizable answers |
| Plan & Solve | One plan executes | Plan quality is stable |
| LATS | Search controller keeps candidates | Multiple solution paths matter |
| Self-Discovery | Strategy modules are selected | Choosing the strategy matters most |
When To Use It
- A single generation is unreliable.
- You have a usable evaluator.
- The task is worth extra tokens.
- Candidates can be improved or expanded.
When Not To Use It
- There is no scoring signal.
- Budget or latency is tight.
- A simple checker is enough.
- The search space is large and cannot be pruned.
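A quick call count helps decide whether the budget is acceptable before running anything. The sketch below assumes one proposer call per expanded parent and one evaluator call per child; `beam_calls` and `unpruned_calls` are hypothetical helpers, and real implementations may batch calls differently.

```python
def beam_calls(depth: int, branch_factor: int, beam_size: int) -> int:
    """Upper bound on LLM calls when only beam_size candidates survive each level."""
    calls, parents = 0, 1
    for _ in range(depth):
        children = parents * branch_factor
        calls += parents + children  # expansions plus evaluations
        parents = min(beam_size, children)
    return calls


def unpruned_calls(depth: int, branch_factor: int) -> int:
    """Same count when every child is kept and expanded again."""
    calls, parents = 0, 1
    for _ in range(depth):
        children = parents * branch_factor
        calls += parents + children
        parents = children
    return calls


print(beam_calls(depth=1, branch_factor=2, beam_size=1))  # 3
print(beam_calls(depth=4, branch_factor=3, beam_size=2))  # 28
print(unpruned_calls(depth=4, branch_factor=3))           # 160
```

With a beam the cost grows linearly with depth, while the unpruned tree grows geometrically.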
Costs And Common Failures
| Failure | Symptom | Fix |
|---|---|---|
| Weak evaluator | Bad candidate scores high | Use tests, rules, or multiple judges |
| Search explosion | Too many nodes | Limit depth, branch factor, and beam size |
| Similar candidates | No real exploration | Increase diversity or constraints |
| Reward hacking | Candidates flatter the evaluator | Spot-check with rules or humans |
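Two of these fixes are easy to sketch with the standard library: taking the median of several judges so that one unreliable judge cannot dominate, and dropping near-duplicate candidates before scoring. The judge functions, the 0.9 similarity threshold, and the sample strings are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from statistics import median
from typing import Callable

Judge = Callable[[str], float]


def aggregate_score(candidate: str, judges: list[Judge]) -> float:
    """Median of several judges; a single outlier cannot move it far."""
    return median(judge(candidate) for judge in judges)


def drop_near_duplicates(candidates: list[str], threshold: float = 0.9) -> list[str]:
    """Keep a candidate only if it differs enough from everything kept so far."""
    kept: list[str] = []
    for cand in candidates:
        if all(
            SequenceMatcher(None, cand.lower(), other.lower()).ratio() < threshold
            for other in kept
        ):
            kept.append(cand)
    return kept


# Hypothetical judges: two rule checks and one judge that rewards flattery.
def covers_stops(c: str) -> float:
    return 8.0 if "Tea Museum" in c and "West Lake" in c else 2.0


def is_concise(c: str) -> float:
    return 7.0 if len(c) <= 40 else 3.0


def easily_flattered(c: str) -> float:
    return 10.0 if "best" in c.lower() else 5.0


JUDGES = [covers_stops, is_concise, easily_flattered]
flattering = "Simply the best plan you will ever see, trust me"
honest = "Tea Museum, then West Lake"

print(aggregate_score(flattering, JUDGES))  # 3.0
print(aggregate_score(honest, JUDGES))      # 7.0

unique = drop_near_duplicates([
    "West Lake, then Tea Museum",
    "West Lake, then Tea Museum.",
    "Tea Museum, then West Lake",
])
print(unique)  # ['West Lake, then Tea Museum', 'Tea Museum, then West Lake']
```

The flattering candidate still gets a 10 from one judge, but the median follows the two rule checks instead.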
What To Read Next
LATS fits tasks that justify the cost of search and have a usable scoring signal.
For short-answer sampling, read Voting. For strategy selection, read Self-Discovery.