HITL: Ask a Human Before Risky Actions
HITL means Human-in-the-Loop. The idea is simple:
The agent may prepare an action, but a human approves high-risk actions before execution.
A travel assistant can recommend attractions automatically. But booking, paying, canceling orders, or sending passport information should not happen silently.
What It Fixes
Some mistakes cannot be repaired afterward:
- paying the wrong amount
- canceling an irreversible booking
- emailing the wrong person
- sending sensitive data to a third party
Policy and guardrails can block some cases, but they cannot confirm the user's intent. HITL makes confirmation part of the flow.
Flow
```mermaid
flowchart TD
    A["Agent proposes risky action"] --> R["Create approval request"]
    R --> H["Human reviews reason, args, and impact"]
    H -->|approve| T["Execute tool"]
    H -->|deny| P["Change plan or ask user"]
    T --> O["Record result"]
```
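The flow can also be read as a single function. The sketch below is one possible shape, not a prescribed API: `run_with_approval`, `ask_human`, `execute`, and the `log` list are all hypothetical names introduced for illustration. It also records denials, which goes slightly beyond the chart but matches the advice later on to log them.

```python
from typing import Callable


def run_with_approval(
    tool: str,
    args: dict,
    reason: str,
    ask_human: Callable[[dict], bool],       # hypothetical: returns True on approve
    execute: Callable[[str, dict], object],  # hypothetical: runs the tool
    log: list,
) -> dict:
    """Follow the flowchart: request -> human review -> execute or re-plan -> record."""
    request = {"tool": tool, "args": args, "reason": reason}
    if ask_human(request):
        outcome = {"status": "executed", "result": execute(tool, args)}
    else:
        # Denied: nothing runs; the caller should change the plan or ask the user.
        outcome = {"status": "denied", "result": None}
    record = {**request, **outcome}
    log.append(record)
    return record


log: list = []
fake_execute = lambda tool, args: f"{tool} ok"  # stand-in for a real tool call

approved_run = run_with_approval(
    "book_ticket", {"price": 680}, "requires_payment",
    ask_human=lambda request: True, execute=fake_execute, log=log,
)
denied_run = run_with_approval(
    "cancel_order", {"order_id": "A1"}, "irreversible",
    ask_human=lambda request: False, execute=fake_execute, log=log,
)
print(approved_run["status"], denied_run["status"])  # executed denied
```

Passing `ask_human` and `execute` as parameters keeps the approval logic testable without a real UI or a real booking API.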
Minimal Code Shape
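```python
import hashlib
import json


class NeedsApproval(Exception):
    """Raised when an action needs human approval before it runs."""

    def __init__(self, request: dict):
        super().__init__(f"approval required: {request['id']}")
        self.request = request


# IDs of approved requests (in memory only in this sketch).
approved: set[str] = set()


def require_approval(tool: str, args: dict, reason: str) -> None:
    # Built-in hash() of a str changes between processes, so use a stable digest.
    payload = json.dumps(args, sort_keys=True, default=str).encode()
    request_id = f"{tool}:{hashlib.sha256(payload).hexdigest()[:16]}"
    if request_id not in approved:
        raise NeedsApproval({
            "id": request_id,
            "tool": tool,
            "args": args,
            "reason": reason,
        })
```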
Check before execution:
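```python
def show_to_human(request: dict) -> None:
    # Placeholder: replace with your approval UI (CLI prompt, chat message, web form).
    print(f"Approval needed: {request}")


try:
    require_approval("book_ticket", {"price": 680}, "requires_payment")
except NeedsApproval as error:
    show_to_human(error.request)
```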
In a real system, approvals must be persisted. Otherwise the approval is lost when the process restarts or the agent retries in a new session, and the same request raises NeedsApproval again.
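As a rough illustration of persistence, the sketch below keeps approved request IDs in a JSON file. `ApprovalStore`, its methods, the file name, and the example IDs are assumptions made for this example. A production system would more likely use a database and might make approvals expire or be single-use.

```python
import json
import tempfile
from pathlib import Path


class ApprovalStore:
    """Hypothetical file-backed set of approved request IDs."""

    def __init__(self, path: Path):
        self.path = path

    def _load(self) -> set:
        if not self.path.exists():
            return set()
        return set(json.loads(self.path.read_text()))

    def approve(self, request_id: str) -> None:
        ids = self._load()
        ids.add(request_id)
        self.path.write_text(json.dumps(sorted(ids)))

    def is_approved(self, request_id: str) -> bool:
        return request_id in self._load()


store_path = Path(tempfile.mkdtemp()) / "approvals.json"
ApprovalStore(store_path).approve("book_ticket:abc123")

# A fresh instance stands in for a restarted process: the approval survives.
restarted = ApprovalStore(store_path)
print(restarted.is_approved("book_ticket:abc123"))  # True
print(restarted.is_approved("cancel_order:zzz"))    # False
```

This only works if request IDs are identical across processes, so the ID must be derived from a stable digest of the tool and arguments rather than from anything randomized per run.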
What the Human Must See
Do not show only “approve or deny.” Include:
- which tool will run
- the arguments
- why approval is needed
- what could be affected
- what happens if the action is denied
The human is not a rubber stamp. If the impact is unclear, approval is not meaningful.
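One way to make those five items hard to forget is to give each a required field. The following is a minimal sketch; `ApprovalRequest`, `impact`, `if_denied`, and `render` are hypothetical names, and the example impact text is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ApprovalRequest:
    tool: str       # which tool will run
    args: dict      # the arguments
    reason: str     # why approval is needed
    impact: str     # what could be affected
    if_denied: str  # what happens if the action is denied

    def render(self) -> str:
        """Plain-text summary for the reviewer."""
        arg_text = ", ".join(f"{key}={value!r}" for key, value in self.args.items())
        return "\n".join([
            f"Tool: {self.tool}",
            f"Arguments: {arg_text}",
            f"Reason: {self.reason}",
            f"Impact: {self.impact}",
            f"If denied: {self.if_denied}",
        ])


request = ApprovalRequest(
    tool="book_ticket",
    args={"price": 680},
    reason="requires_payment",
    impact="Charges 680 to the saved payment method",
    if_denied="No booking is made; the agent suggests alternatives",
)
summary = request.render()
print(summary)
```

Because every field is required, a request without an impact statement fails at construction time instead of reaching the reviewer half empty.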
Use It When
- The action pays, books, cancels, deletes, or sends messages.
- Compliance requires human confirmation.
- The agent is in a staged rollout and needs limited autonomy.
- A guardrail triggers and needs judgment.
Avoid It When
Do not require approval for low-risk, read-only tasks. Too many approvals create fatigue, and fatigued reviewers approve dangerous requests out of habit.
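The gating rule can be as small as a set lookup. This sketch assumes a hand-maintained set of high-risk tool names and a boolean passed in from a guardrail; `HIGH_RISK_TOOLS`, `needs_approval`, and the tool names are hypothetical.

```python
# Hypothetical tool names: the point is the split, not the specific names.
HIGH_RISK_TOOLS = {"book_ticket", "pay", "cancel_order", "delete_file", "send_email"}


def needs_approval(tool: str, guardrail_triggered: bool = False) -> bool:
    """Gate only high-risk actions, plus anything a guardrail flagged."""
    return tool in HIGH_RISK_TOOLS or guardrail_triggered


print(needs_approval("search_attractions"))                            # False
print(needs_approval("book_ticket"))                                   # True
print(needs_approval("search_attractions", guardrail_triggered=True))  # True
```

Read-only tools never appear in the set, so they run without interrupting the user unless a guardrail asks for judgment.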
Common Failure Modes
| Mistake | Result | Fix |
|---|---|---|
| Approving every step | Approval fatigue | Gate only high-risk actions |
| Empty approval requests | Humans cannot judge | Include reason, args, and impact |
| No resume protocol | Approved actions get stuck on retry | Persist approvals and design interrupt/resume |
| Assuming approval is automatic | HITL becomes theater | Log denials and plan changes |
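The last two rows can be sketched together: store the paused action under its request ID, then resume it once a decision arrives and log the decision either way. Everything here (`pending`, `decision_log`, `interrupt`, `resume`, the request IDs) is a hypothetical in-memory illustration, not a full protocol.

```python
from typing import Callable, Optional

# Hypothetical in-memory stores; a real system would persist both.
pending: dict = {}
decision_log: list = []


def interrupt(request_id: str, tool: str, args: dict) -> None:
    """Pause: remember the exact action so it can be resumed later."""
    pending[request_id] = {"tool": tool, "args": args}


def resume(
    request_id: str,
    approved: bool,
    execute: Callable[[str, dict], object],
) -> Optional[object]:
    """Resume: run the stored action if approved; log the decision either way."""
    action = pending.pop(request_id)
    decision_log.append({"id": request_id, "approved": approved})
    if not approved:
        return None  # caller should change the plan or ask the user
    return execute(action["tool"], action["args"])


interrupt("req-1", "book_ticket", {"price": 680})
interrupt("req-2", "cancel_order", {"order_id": "A1"})
booked = resume("req-1", approved=True, execute=lambda tool, args: f"{tool} done")
cancelled = resume("req-2", approved=False, execute=lambda tool, args: f"{tool} done")
print(booked, cancelled)  # book_ticket done None
```

Resuming from the stored action, rather than asking the agent to propose it again, means the human approves exactly the arguments that will run.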
Next
Policy, Guardrails, and HITL make one run safer. To see whether changes regress behavior, read Eval Harness.