
Agent Patterns Lab

This site follows one thread:

When does a normal chatbot break, and what structure should you add next?

So we do not start with names like ReAct, RAG, or multi-agent orchestration. We start with one LLM API call: the user sends text, the model returns text. Then the failures appear: it forgets context, returns prose that code cannot parse, lacks live facts, or turns a fixed workflow into a growing if-else tree.

Agent design patterns grow from those failures.
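As a rough illustration of one such failure, the sketch below treats the model as a plain text-in, text-out function, hits the "prose that code cannot parse" case, and recovers with a single repair retry. The names `fake_model` and `ask_for_json`, and the prompt wording, are hypothetical stand-ins for illustration; they are not any provider's API.

```python
import json
from typing import Callable


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for one LLM API call: text in, text out."""
    if "Return only JSON" in prompt:
        return '{"city": "Paris", "days": 3}'
    # Without an explicit instruction, the "model" answers in prose.
    return "Sure! You want three days in Paris."


def ask_for_json(model: Callable[[str], str], prompt: str, max_repairs: int = 1) -> dict:
    """Call the model; if the reply is not valid JSON, retry with a repair hint."""
    reply = model(prompt)
    for attempt in range(max_repairs + 1):
        try:
            return json.loads(reply)
        except json.JSONDecodeError:
            if attempt == max_repairs:
                raise  # out of repair attempts: surface the parse failure
            reply = model(prompt + "\nReturn only JSON with keys city and days.")
    raise AssertionError("unreachable")


result = ask_for_json(fake_model, "Plan a trip: 3 days in Paris.")
print(result)
```

The first reply is prose, so parsing fails; the retry adds an explicit format instruction. A real chapter on structured output would go further (schemas, validation), but the shape of the failure is the same.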

How to Read

The site now has three parts.

flowchart TD
  A["Part 1: chatbot to agent"] --> B["Part 2: choose by failure"]
  B --> C["Part 3: add safety and evaluation"]

Part 1: Chatbot to Agent

Read these six chapters in order. Each chapter adds exactly one layer:

| Chapter | What it owns | Why it comes here |
| --- | --- | --- |
| 00: One API Call | LLM APIs, providers, message formats | First make the request boundary clear. |
| 01: Conversation History | Keeping the current session coherent | Users refer to earlier turns. |
| 02: Structured Output | JSON, parsers, repair retries | Code cannot reliably consume prose. |
| 03: Tool Calling | Turning Python functions into tools | The model should not guess weather or facts. |
| 04: Workflow | Letting code control fixed steps | Not every task needs an agent. |
| 05: Agent Loop | Choosing the next step from observations | Use a loop only when observations change the next action. |

After this section, the boundary between chatbot and agent should be visible.
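One way to picture that boundary is the minimal sketch below: in a workflow, code fixes the steps; in an agent loop, each observation feeds the choice of the next action. Everything here is an assumption made for illustration: `get_weather` is a fake tool, and `fake_policy` stands in for the model's decision.

```python
from typing import Callable


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather service."""
    return f"{city}: 18C, light rain"


TOOLS: dict[str, Callable[[str], str]] = {"get_weather": get_weather}


def workflow(city: str) -> str:
    """Workflow: the steps are fixed in code; nothing is decided at run time."""
    report = get_weather(city)
    return f"Forecast for {report}"


def fake_policy(observations: list[str]) -> tuple[str, str]:
    """Hypothetical stand-in for the model choosing the next action."""
    if not observations:
        return ("get_weather", "Oslo")
    if "rain" in observations[-1]:
        return ("finish", "Pack a raincoat.")
    return ("finish", "No special gear needed.")


def agent_loop(choose: Callable[[list[str]], tuple[str, str]], max_steps: int = 5) -> str:
    """Agent loop: each observation can change what happens next."""
    observations: list[str] = []
    for _ in range(max_steps):
        action, arg = choose(observations)
        if action == "finish":
            return arg
        observations.append(TOOLS[action](arg))
    return "Stopped: step limit reached."


print(workflow("Oslo"))
print(agent_loop(fake_policy))
```

The step limit is a deliberate design choice in this sketch: a loop that lets observations steer the next action also needs a bound so it cannot run forever.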

Part 2: Choose by Failure

Do not choose patterns by name. Choose by what is broken.

For example:

  • Plausible answers are often wrong: read Maker-Checker, CoVe, or Voting.
  • One retrieval pass misses evidence: read Retrieval Loop or Agentic RAG.
  • Plans expire mid-task: read Planner-Executor-Replanner.
  • Tool calls have dependencies: read ReWOO or LLM Compiler.
  • One agent owns too much: read Manager-Worker, Agents-as-Tools, Handoff, or Group Chat.

The full decision table is in Choose a Pattern.
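The idea of routing by failure rather than by name can be sketched as a simple lookup. The mapping below contains only the examples listed above, not the full decision table, and `suggest_patterns` is a hypothetical helper name.

```python
# Failure symptom -> patterns to read, taken from the examples above.
PATTERNS_BY_FAILURE: dict[str, list[str]] = {
    "plausible answers are often wrong": ["Maker-Checker", "CoVe", "Voting"],
    "one retrieval pass misses evidence": ["Retrieval Loop", "Agentic RAG"],
    "plans expire mid-task": ["Planner-Executor-Replanner"],
    "tool calls have dependencies": ["ReWOO", "LLM Compiler"],
    "one agent owns too much": [
        "Manager-Worker", "Agents-as-Tools", "Handoff", "Group Chat",
    ],
}


def suggest_patterns(failure: str) -> list[str]:
    """Return the patterns listed for a failure, or an empty list if unknown."""
    return PATTERNS_BY_FAILURE.get(failure.strip().lower(), [])


print(suggest_patterns("Plans expire mid-task"))
```

An unknown failure returns an empty list rather than a guess, which mirrors the advice: if nothing is visibly broken, there is no pattern to add yet.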

Part 3: Safety and Evaluation

When tools affect the real world (booking, paying, emailing, deleting files), intelligence is not enough.

This supporting section has four pages:

  • Policy: which tools are allowed, and which arguments are out of bounds.
  • Guardrails: runtime checks for unsafe input, output, or actions.
  • HITL: human approval for high-risk actions.
  • Eval Harness: fixed tasks for checking whether a change regressed behavior.

These pages are not the first step, but real agents need them.
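To make the first and third items concrete, here is a minimal sketch of a policy check followed by a human-approval gate in front of tool execution. The tool names, the price limit, and the `approve` callback are all hypothetical assumptions for illustration, not the guide's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    name: str
    args: dict


# Hypothetical policy: which tools are allowed, which need approval,
# and one argument bound.
ALLOWED_TOOLS = {"search_flights", "book_flight"}
HIGH_RISK_TOOLS = {"book_flight"}
MAX_PRICE = 500


def check_policy(call: ToolCall) -> bool:
    """Policy: reject unknown tools and out-of-bounds arguments."""
    if call.name not in ALLOWED_TOOLS:
        return False
    if call.name == "book_flight" and call.args.get("price", 0) > MAX_PRICE:
        return False
    return True


def execute(call: ToolCall, approve: Callable[[ToolCall], bool]) -> str:
    """Run the policy check first, then ask a human before high-risk actions."""
    if not check_policy(call):
        return "blocked by policy"
    if call.name in HIGH_RISK_TOOLS and not approve(call):
        return "rejected by human"
    return f"executed {call.name}"  # a real system would invoke the tool here


print(execute(ToolCall("delete_files", {}), approve=lambda c: True))
print(execute(ToolCall("book_flight", {"price": 300}), approve=lambda c: False))
print(execute(ToolCall("search_flights", {"to": "Oslo"}), approve=lambda c: False))
```

In this sketch the policy check runs before the approval step, so a human is never asked to approve a call that the policy would reject anyway.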

Page Responsibilities

Each page now has one job:

  • Home explains the thread of the whole guide.
  • Start Here explains why chapters 00–05 are in this order.
  • Choose a Pattern routes concrete failures to concrete patterns.
  • Each pattern page explains one pattern: what failure it fixes, how it flows, and when not to use it.
  • Safety and evaluation pages come last: permissions, tripwires, human approval, and regression checks.

Start

If you are reading from scratch, start with Start Here.

If you already know the failure you are facing, jump to Choose a Pattern.