Magentic Orchestration: Delegate, Watch Progress, Change Strategy
The Failure It Fixes
A travel assistant gets a broad request: “Plan a ten-day family trip to Japan under $3,000, keep the pace relaxed, and also check visas, transport, hotels, and rainy-day options.” This kind of task is hard to decompose correctly on the first try. You may start with cities, then discover flights break the budget. You may plan Kyoto first, then learn the child mainly wants Universal Studios.
Fixed plans become brittle. Manager-Worker helps with delegation, but what if the delegation itself is wrong? Magentic Orchestration adds a higher-level loop: watch progress, delegate to specialists, record results, and change strategy when the system is stuck.
One-Sentence Version
Replace “plan once, execute forever” with “read the ledger, delegate one narrow task, record the result, check for stalls, then decide again.”
The Naive Version
plan = planner.complete(task)
result = execute_all(plan)
This code trusts the first plan too much. For open-ended tasks, the first plan is often a guess. The real question is whether the runtime can notice “we made no progress” and force a different move.
What Magentic Adds
This pattern is heavier than ordinary multi-agent orchestration. It needs:
orchestrator: decides whether to delegate or finalize.Specialist: handles narrow tasks such as calculation, search, writing, or checking.messages: a simplified task ledger in this repo.stall_limit: repeated identical delegation triggersSTALL DETECTED.RunLimits: a hard cap on total loop steps.
Flow
flowchart TD
U["User task"] --> L["Task ledger / messages"]
L --> O["orchestrator chooses next move"]
O -->|delegate| S["specialist runs narrow task"]
S --> R["result written back"]
R --> D{"same delegation repeated?"}
D -->|no| O
D -->|yes| X["inject STALL DETECTED"]
X --> O
O -->|final| A["Final answer"]
Trace Walkthrough
The example still uses 3+4, but the point is stall detection:
- orchestrator returns
{"type":"delegate","agent":"calc","task":"Compute 3+4"}. calcreturns7; Python writes the delegation and result back intomessages.- orchestrator returns the exact same delegation again.
- Python detects the repeat and injects
STALL DETECTED, telling the orchestrator to change strategy or finish. - orchestrator returns
{"type":"final","answer":"3+4=7."}.
In a travel assistant, “stuck” might mean repeatedly searching the same city, failing to find a budget-feasible route, or looping on the same preference trade-off. The value is not “more agents.” The value is admitting that the last move did not advance the task.
Code
from __future__ import annotations
from pathlib import Path
from agent_patterns_lab.patterns.magentic_orchestration import Specialist, run_magentic_orchestration
from agent_patterns_lab.runtime import MockLLM, RunLimits, Tracer
def main() -> None:
tracer = Tracer()
orchestrator = MockLLM(
[
'{"type":"delegate","agent":"calc","task":"Compute 3+4"}',
'{"type":"delegate","agent":"calc","task":"Compute 3+4"}',
'{"type":"final","answer":"3+4=7."}',
]
)
specialists = [
Specialist(
name="calc",
description="Arithmetic specialist.",
model=MockLLM(["7", "7"]),
)
]
out = run_magentic_orchestration(
orchestrator,
specialists,
task="Compute 3+4.",
limits=RunLimits(max_steps=5),
stall_limit=1,
tracer=tracer,
)
print(out)
trace_path = tracer.export_jsonl(Path(".traces") / "65_magentic_orchestration.jsonl")
print(f"[trace] {trace_path}")
if __name__ == "__main__":
main()
Run it:
UV_CACHE_DIR=.uv_cache PYTHONPATH=src uv run --no-sync python examples/65_magentic_orchestration.py
What to Notice in the Code
orchestratorreturns structured actions:delegateorfinal.delegate_key = (action.agent, action.task)checks whether the same work was assigned again.- specialist output is written back as a
toolmessage. - when repeated delegation reaches
stall_limit, Python appendsSTALL DETECTED. run_loopenforces the step limit; the model cannot run forever by itself.
Boundaries to Decide
- Ledger contents: task state, decisions, tool results, and open questions.
- Ledger writers: only the orchestrator, or specialists too.
- Stall definition: repeated delegation, no new artifact, oscillating plans, failing tools.
- Stall response: re-split the task, switch tools, switch agents, narrow scope, or ask a human.
- Budget: cap steps, tokens, tool calls, and wall-clock time.
Use It When
- The task is open-ended and the first plan is likely to be wrong.
- Intermediate results should change later steps.
- You need an audit trail for why the system continued, reassigned, or stopped.
Avoid It When
- The path is fixed. Workflow or Manager-Worker is cheaper and easier to test.
- The task is small. A ledger and stall detector can be overkill.
- You lack trace, budget, and stop rules. Then this becomes expensive improvisation.
Common Failure Modes
- Ledger trash pile: log only information that affects the next decision.
- Fake progress: every cycle should produce a checkable artifact, decision, file, or verification.
- Crude stall detection: repeated actions are not always wrong; make retry limits explicit.
- Permission bypass: specialists must not gain extra access just because they were delegated to.
Nearby Patterns
- Manager-Worker: fixed delegation; Magentic changes delegation based on progress.
- Group Chat: discussion among agents; Magentic is closer to a scheduler.
- ReAct: a single-agent thought/action loop; Magentic is a multi-agent loop with a task ledger.
- Planning / Replanning: Magentic puts replanning inside the runtime loop.
References
- Azure Architecture Center — Magentic orchestration: https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/ai-agent-design-patterns
- Microsoft Agent Framework — Magentic orchestration: https://learn.microsoft.com/en-us/agent-framework/user-guide/workflows/orchestrations/magentic
- Fourney et al. (2024), Magentic-One: https://arxiv.org/abs/2411.04468