Agent Loop

Plan → call tool → observe → reflect → loop. The control-flow shape of every modern LLM agent. LangGraph, AutoGen, custom kubebuilder operators.

Plan. Call tool. Observe. Reflect. Repeat, under trajectory bounds. The control-flow shape every agent inherits. Status: STUB, to be promoted to OUTLINE in Y5 Phase 48.

What this pattern is

The agent loop is the control-flow shape every modern LLM agent inherits. Each iteration, the agent plans (decides what to do based on the goal + history); calls a tool (a function, an MCP server, an HTTP endpoint); observes the tool’s output; reflects (updates its understanding); and decides whether to terminate or continue. The loop runs under trajectory bounds (max iterations, max wall-clock, max cost) so that it cannot run unbounded. LangGraph is the canonical OSS framework (it uses an explicit state machine model); AutoGen is a parallel framework with multi-agent emphasis; custom agent runtimes built directly on Anthropic’s or OpenAI’s tool-use APIs are also common in production.
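A minimal sketch of the loop described above, in Python. Everything in it is hypothetical: `plan` stands in for the LLM call, `TOOLS` is a toy registry, and a single `max_iterations` bound stands in for the fuller set of trajectory bounds. It is not the API of LangGraph, AutoGen, or any vendor SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda query: f"result for {query}",
}

@dataclass
class Step:
    tool: str
    arg: str
    observation: str

@dataclass
class State:
    goal: str
    history: list[Step] = field(default_factory=list)
    answer: Optional[str] = None

def plan(state: State) -> Optional[tuple[str, str]]:
    """Stand-in for the LLM: pick the next (tool, arg), or None to stop."""
    if state.history:  # one observation is enough for this toy goal
        return None
    return ("lookup", state.goal)

def run_agent(goal: str, max_iterations: int = 5) -> State:
    state = State(goal=goal)
    for _ in range(max_iterations):      # trajectory bound
        action = plan(state)             # plan
        if action is None:               # decide to terminate
            state.answer = state.history[-1].observation if state.history else ""
            break
        tool, arg = action
        observation = TOOLS[tool](arg)   # call tool, observe
        state.history.append(Step(tool, arg, observation))  # reflect: update state
    return state
```

If the bound is exhausted before `plan` returns None, `answer` stays `None`, which lets a caller tell "ran out of budget" apart from "finished".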

The pattern composes with tool-use (tools are how the agent interacts with the world), mcp-protocol (MCP is the open standard for exposing tools), and evals (trajectory evals score whether the agent reached the goal efficiently).

The agent loop is control-loops applied to LLMs. The same pattern that runs Kubernetes reconciliation drives LLM agent behavior. The controller (LLM) observes the state (context so far + tool outputs), decides on an action (which tool to call), executes it, observes the new state, and iterates. The parallel is precise — same conceptual shape, different actuators. This is one reason SRE and platform engineers pick up agent design quickly; the underlying pattern is familiar.

The pattern’s central operational concern is trajectory bounds. Without them, LLM agents can loop indefinitely: calling tools that don’t help, exploring dead ends, spinning on unproductive plans. Bounds are the safety mechanism: max iterations, max wall-clock, max cost, max tool calls. Every production agent needs them. Setting them correctly is workload-specific: too tight and legitimate tasks fail; too loose and runaway agents consume resources. Observability of trajectory shape (how many iterations do real tasks take?) informs bound tuning.
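One way to make the four bounds concrete is a small budget object that the runtime consults once per iteration. This is a sketch under assumptions: the class name, field names, and default limits are illustrative, not taken from any framework, and real limits would come from observed trajectories.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TrajectoryBudget:
    # Limits (illustrative defaults; tune per workload).
    max_iterations: int = 20
    max_seconds: float = 120.0
    max_cost_usd: float = 1.00
    max_tool_calls: int = 40
    # Running totals.
    iterations: int = 0
    tool_calls: int = 0
    cost_usd: float = 0.0
    started_at: float = field(default_factory=time.monotonic)

    def record(self, tool_calls: int = 0, cost_usd: float = 0.0) -> None:
        """Account for one completed loop iteration."""
        self.iterations += 1
        self.tool_calls += tool_calls
        self.cost_usd += cost_usd

    def exceeded(self) -> Optional[str]:
        """Return the name of the first bound that was hit, or None."""
        if self.iterations >= self.max_iterations:
            return "max_iterations"
        if time.monotonic() - self.started_at >= self.max_seconds:
            return "max_seconds"
        if self.cost_usd >= self.max_cost_usd:
            return "max_cost_usd"
        if self.tool_calls >= self.max_tool_calls:
            return "max_tool_calls"
        return None
```

Returning the name of the bound that tripped, rather than a bare boolean, gives the observability signal needed for tuning: a workload that regularly stops on `max_iterations` tells a different story from one that stops on `max_cost_usd`.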

Trajectory evals are the pattern’s equivalent of tests. Given a goal and a trajectory (sequence of tool calls + observations + final answer), was the agent’s behavior good? Did it reach the goal? Did it use tools efficiently? Did it produce the right final output? Evaluating trajectories, not just final outputs, is what separates well-designed agents from ones that succeed by luck. Trajectory eval is a specialization emerging in 2025-2026; the tooling is still maturing.
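A trajectory eval can be sketched as a function over the recorded tool calls plus the final answer. The names below (`ToolCall`, `Trajectory`, `score_trajectory`) and the exact-match and step-budget checks are assumptions chosen for illustration; real evals often use graded or model-based scoring instead.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    tool: str
    arg: str

@dataclass
class Trajectory:
    calls: list[ToolCall]
    final_answer: str

def score_trajectory(traj: Trajectory, expected_answer: str, step_budget: int) -> dict:
    """Score the path as well as the outcome."""
    reached_goal = traj.final_answer.strip() == expected_answer.strip()
    # Identical (tool, arg) pairs beyond the first occurrence count as waste.
    repeated_calls = len(traj.calls) - len(set(traj.calls))
    within_budget = len(traj.calls) <= step_budget
    return {
        "reached_goal": reached_goal,
        "steps": len(traj.calls),
        "repeated_calls": repeated_calls,
        "within_budget": within_budget,
        "passed": reached_goal and within_budget and repeated_calls == 0,
    }
```

A trajectory that lands on the right answer after repeating the same call several times scores `reached_goal` true but `passed` false, which is the "succeeded by luck" case that a final-output check alone would miss.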

Concrete instances in the wild

  • LangGraph. OSS agent framework using an explicit state-machine model. Common in production.
  • AutoGen (Microsoft). OSS multi-agent framework. Popular for agent-collaboration patterns.
  • CrewAI. OSS framework focused on role-based multi-agent workflows.
  • Anthropic Claude Agent SDK. Direct SDK for building agents on the Claude API.
  • OpenAI Assistants API. OpenAI’s managed agent runtime.
  • basecamp agent runtime (Y5 Phase 48). Custom runtime on Anthropic API + MCP + kubebuilder operator.
  • AWS Bedrock Agents. AWS-managed agent runtime.
  • GCP Vertex AI Agent Builder. GCP equivalent.
  • Google Agent Development Kit (ADK, 2025+). Google’s newer agent framework.
  • Semantic Kernel (Microsoft). Framework for orchestrating LLM + tool workflows.
  • Devin, Cursor Composer, Claude Code. Product-level agent implementations. Public docs reveal design patterns.

Why this pattern matters

Static LLM applications (single prompt in → single response out) hit a ceiling. Complex tasks require multiple steps: gather information, reason about it, take action, observe results, iterate. Without an agent loop, applications either try to do everything in one prompt (which fails for anything complex) or hard-code multi-step workflows (which don’t handle unexpected situations). The agent loop gives LLMs the ability to work through complex tasks flexibly.

The pattern also matches how humans work on complex tasks. Break the problem into steps. Do one step. Observe the result. Adjust. Continue. LLMs given this loop structure produce dramatically better results on complex tasks than LLMs given one big prompt. The loop provides the framework; the LLM provides the intelligence within each step.

For SRE and platform engineering specifically, agent loops enable AIOps patterns that were previously impossible. Alert triage agents that read alerts, retrieve similar past incidents, propose remediation. Deployment agents that apply changes, observe outcomes, decide whether to continue. Cost-optimization agents that analyze usage, propose changes, execute approved ones. Each is an agent loop with domain-specific tools.

The trajectory-bound discipline matters because agents can genuinely go wrong in ways single-shot LLMs can’t. A confused agent can loop forever calling the same tool. A misconfigured agent can burn thousands of dollars in tool calls before someone notices. Bounds and observability aren’t optional — they’re what makes agents safe to run in production.

For basecamp specifically, the agent loop is the mechanism underneath AIOps (Y5 Phase 50). Alerts arrive as inputs. The agent iterates: retrieve context (RAG), reason about causes, propose actions, execute safe ones through platform-ctl, pause for approval on destructive ones. This is a real agent loop operating on real infrastructure — the culmination of the /root stack.
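The split between executing safe actions and pausing on destructive ones can be sketched as a small dispatch gate. This is an illustration only: `Action`, `dispatch`, and the two callables are hypothetical names, and the real interface of platform-ctl is not shown in this entry, so `execute` is just a placeholder for it.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Action:
    name: str
    destructive: bool

def dispatch(
    action: Action,
    execute: Callable[[Action], str],            # placeholder for the real executor
    request_approval: Callable[[Action], bool],  # placeholder for a human approval step
) -> str:
    """Run safe actions directly; run destructive ones only after approval."""
    if action.destructive and not request_approval(action):
        return "paused: awaiting approval"
    return execute(action)
```

Keeping the gate outside the model's control is the point of the design: the agent can propose a destructive action, but the runtime, not the LLM, decides whether it runs.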

Modern agent frameworks are converging on similar designs: explicit state machines (LangGraph), message-passing between agents (AutoGen), tool-first design (Anthropic Claude Agent SDK). Framework choice matters less than framework understanding, because each embodies the same underlying pattern in slightly different vocabulary. Learning one deeply transfers to the others.

The failure modes to know: agents that plan but never act (over-thinking); agents that act without planning (thrashing); agents that loop without progress (need cycle detection); agents that succeed at the wrong task (unclear goal specification); agents that violate approval gates (bad capability scoping). Each has known mitigations, but building production agents means engineering for these failure modes explicitly.
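Of the mitigations listed, cycle detection is the easiest to sketch. The version below assumes a simple definition of "no progress": the same tool, called with the same argument, returning the same observation a set number of times. The class name and the default threshold of 3 are illustrative, not drawn from any framework.

```python
from collections import Counter

class CycleDetector:
    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self._seen: Counter = Counter()

    def observe(self, tool: str, arg: str, observation: str) -> bool:
        """Record one step; return True once the same step has recurred too often."""
        key = (tool, arg, observation)
        self._seen[key] += 1
        return self._seen[key] >= self.max_repeats
```

Including the observation in the key matters: polling a tool whose output is changing counts as progress, while getting the same answer back repeatedly does not.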

Depth progression

STUB     ← you are here.
OUTLINE  Promoted when Y5 Phase 48 stands up LangGraph + MCP servers on basecamp.
DEEP     Promoted after Y5 Phase 50 — agent loops operating in AIOps with
         measured trajectories, real incidents triaged, real approvals gated.

Preview: what OUTLINE will answer

When Y5 Phase 48 promotes this entry to OUTLINE, it will name:

  • PROBLEM. How do you enable LLMs to work through complex multi-step tasks with tools and observations?
  • PRINCIPLES. Loop: plan, act, observe, reflect. Trajectory bounds prevent runaway. Tool interface constrains what the agent can do. Observability of trajectory shape enables tuning. Goals should be crisp; actions should be scoped.
  • TRADE-OFFS. Explicit state machines (LangGraph — clear, more setup) vs message-passing (AutoGen — flexible, harder to debug). Framework-based (LangGraph, AutoGen — batteries included) vs custom (control, more work). Single-agent (simple) vs multi-agent (powerful, coordination cost). Tight bounds (safe, limits scope) vs loose (flexible, cost risk).
  • TOOLS (time-stamped as of 2026-06): LangGraph, AutoGen, CrewAI, Anthropic Claude Agent SDK, OpenAI Assistants API, AWS Bedrock Agents, GCP Vertex AI Agent Builder, Google ADK, Semantic Kernel, custom runtimes.

The DEEP promotion, after Y5 Phase 50 with AIOps operating real incidents, will add MASTERY (operating agents on basecamp), COMPARE (LangGraph vs custom runtime vs Assistants API), OPERATE (a specific agent behavior tuning event), and CONTRIBUTE (a LangGraph or AutoGen documentation improvement or public case study).

Canonical references

  • LangGraph documentation. Free at langchain-ai.github.io/langgraph.
  • AutoGen documentation. Free at microsoft.github.io/autogen.
  • Anthropic’s “Building Effective Agents” guide. Free at anthropic.com.
  • Weng, Lilian, “LLM Powered Autonomous Agents” (2023). Free at lilianweng.github.io.
  • OpenAI Assistants API documentation. Free at platform.openai.com.

Cross-references