Maker-Checker: Draft, Check, Revise
The travel assistant can write an itinerary, but "sounds like a guide" does not mean it is good. It may be vague, omit packing advice, or ignore "easy walking".
The smallest useful fix is not a complicated multi-agent system. Split the job into two roles: one makes, one checks.
One Sentence
Maker-Checker turns one-shot generation into draft → check → revise, so quality criteria become an executable feedback loop.
What Breaks Without It
| Problem | What it looks like | Risk |
|---|---|---|
| Direct final answer | Fast | Vague, incomplete, uncontrolled |
| Same prompt self-checks | Simple | It may justify its own draft |
| No pass/fail standard | Sounds reviewed | You cannot tell whether it passed |
What This Pattern Changes
| Who | Owns |
|---|---|
| Maker | Drafts an answer |
| Checker | Applies a rubric and returns pass/fail feedback |
| Python | Controls rounds, sends feedback back, stops at a limit |
Checker output should be structured:
{"passed": false, "feedback": "Too vague. Add concrete details."}
Walk Through One Trace
| Round | Maker draft | Checker feedback | Next |
|---|---|---|---|
| 1 | draft v1 (bad) |
Too vague; add details | Revise |
| 2 | draft v2 (better) |
Pass | Return |
Flow
flowchart TD
T["Task"] --> M["Maker drafts"]
M --> C["Checker reviews"]
C -->|fail + feedback| M
C -->|pass| O["Final answer"]
C -->|round limit| X["Stop with best available result"]
Code Walk
The maker produces two versions:
maker = MockLLM(["draft v1 (bad)", "draft v2 (better)"])
The checker rejects the first and accepts the second:
checker = MockLLM(
[
'{"passed": false, "feedback": "Too vague. Add concrete details."}',
'{"passed": true, "feedback": ""}',
]
)
Full example:
from __future__ import annotations
from pathlib import Path
from agent_patterns_lab.patterns.maker_checker import maker_checker
from agent_patterns_lab.runtime import MockLLM, Tracer
def main() -> None:
tracer = Tracer()
maker = MockLLM(["draft v1 (bad)", "draft v2 (better)"])
checker = MockLLM(
[
'{"passed": false, "feedback": "Too vague. Add concrete details."}',
'{"passed": true, "feedback": ""}',
]
)
out = maker_checker(maker, checker, task="Write a short project summary.", max_rounds=2, tracer=tracer)
print(out)
trace_path = tracer.export_jsonl(Path(".traces") / "30_maker_checker.jsonl")
print(f"[trace] {trace_path}")
if __name__ == "__main__":
main()
Run:
UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/30_maker_checker.py
Nearby Patterns
| Pattern | Who decides next | Use when |
|---|---|---|
| Maker-Checker | Checker passes or fails | You have quality criteria |
| Voting | Majority or judge selects | Answers are short and normalizable |
| CoVe | Claims are extracted and verified | Factual claims matter |
| Reflexion | Lessons are stored | Similar failures repeat |
When To Use It
- You can write a rubric.
- Draft quality is unstable but feedback improves it.
- Risk is moderate.
- You want traceable review feedback.
When Not To Use It
- There is no executable quality standard.
- Maker and checker share the same blind spot.
- One answer is enough.
- Fact verification matters more; use CoVe or retrieval.
Costs And Common Failures
| Failure | Symptom | Fix |
|---|---|---|
| Vague checker | "Looks good" | Use rubric and structured JSON |
| Endless revision | Never passes | Add max_rounds |
| Unusable feedback | "Make it better" | Require concrete missing items |
| Shared hallucination | Both believe a false fact | Add tool verification or CoVe |
What To Read Next
Maker-Checker makes unstable quality visible and repairable.
For factual claims, read CoVe. For sampling variance, read Voting.