Agent Patterns Lab
This site follows one thread:
When does a normal chatbot break, and what structure should you add next?
So we do not start with names like ReAct, RAG, or multi-agent orchestration. We start with one LLM API call: the user sends text, the model returns text. Then the failures appear: it forgets context, returns prose that code cannot parse, lacks live facts, or turns a fixed workflow into a growing if-else tree.
Agent design patterns grow from those failures.
How to Read
The site now has three parts.
flowchart TD
A["Part 1: chatbot to agent"] --> B["Part 2: choose by failure"]
B --> C["Part 3: add safety and evaluation"]
Part 1: Chatbot to Agent
Read these 6 chapters in order. Each chapter adds exactly one layer:
| Chapter | What it owns | Why it comes here |
|---|---|---|
| 00: One API Call | LLM APIs, providers, message formats | First make the request boundary clear. |
| 01: Conversation History | Keeping the current session coherent | Users refer to earlier turns. |
| 02: Structured Output | JSON, parsers, repair retries | Code cannot reliably consume prose. |
| 03: Tool Calling | Turning Python functions into tools | The model should not guess weather or facts. |
| 04: Workflow | Letting code control fixed steps | Not every task needs an agent. |
| 05: Agent Loop | Choosing the next step from observations | Use a loop only when observations change the next action. |
After this section, the boundary between chatbot and agent should be visible.
Part 2: Choose by Failure
Do not choose patterns by name. Choose by what is broken.
For example:
- Plausible answers are often wrong: read Maker-Checker, CoVe, or Voting.
- One retrieval pass misses evidence: read Retrieval Loop or Agentic RAG.
- Plans expire mid-task: read Planner-Executor-Replanner.
- Tool calls have dependencies: read ReWOO or LLM Compiler.
- One agent owns too much: read Manager-Worker, Agents-as-Tools, Handoff, or Group Chat.
The full decision table is in Choose a Pattern.
Part 3: Safety and Evaluation
When tools affect the real world — booking, paying, emailing, deleting files — intelligence is not enough.
The supporting section keeps four pages:
- Policy: which tools are allowed, and which arguments are out of bounds.
- Guardrails: runtime checks for unsafe input, output, or actions.
- HITL: human approval for high-risk actions.
- Eval Harness: fixed tasks for checking whether a change regressed behavior.
These pages are not the first step, but real agents need them.
Page Responsibilities
Each page now has one job:
- Home explains the thread of the whole guide.
- Start Here explains why chapters 00–05 are in this order.
- Choose a Pattern routes concrete failures to concrete patterns.
- Each pattern page explains one pattern: what failure it fixes, how it flows, and when not to use it.
- Safety and evaluation pages come last: permissions, tripwires, human approval, and regression checks.
Start
If you are reading from scratch, start with Start Here.
If you already know the failure you are facing, jump to Choose a Pattern.