Tool Use

LLMs invoking functions, APIs, and other tools as part of an agent loop. Function-calling schemas, tool registries, capability gating. The mechanism that makes agents useful.

The agent isn’t just a text generator. It calls functions. Queries APIs. Executes commands. Tool use is what makes the loop do something.

Status: STUB; to be promoted to OUTLINE in Y5 Phase 48.

What this pattern is

Tool use is the mechanism by which LLM agents interact with the world beyond their context window. A tool is a function or API the agent can invoke: read a file, query a database, call an HTTP endpoint, execute shell commands. The exchange has three steps. First, the model receives a tool registry (schemas describing each tool’s name, parameters, and return type) in its context, typically through the request’s tool definitions or the system prompt. Second, the LLM emits a structured function call (structured-outputs) selecting a tool and its arguments. Third, the agent runtime executes the tool and returns the output to the next LLM turn. Anthropic’s tool use and OpenAI’s function calling are the major API-side implementations; MCP (mcp-protocol) is the open protocol for declaring tools cross-vendor.
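The registry, call, and execute steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's API: the Tool dataclass, the read_file_stub handler, and the shape of the call JSON ("name" plus "arguments") are all hypothetical, and the model turn is simulated with a literal string.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    description: str
    parameters: dict[str, Any]   # JSON-Schema-style description of the arguments
    handler: Callable[..., Any]  # the code the runtime actually executes


def read_file_stub(path: str) -> str:
    # Hypothetical stand-in so the sketch has no side effects.
    return f"<contents of {path}>"


REGISTRY: dict[str, Tool] = {
    "read_file": Tool(
        name="read_file",
        description="Return the text contents of one file.",
        parameters={
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
        handler=read_file_stub,
    )
}


def schemas_for_model(registry: dict[str, Tool]) -> list[dict]:
    """What the model sees: names, descriptions, parameter schemas. No handlers."""
    return [
        {"name": t.name, "description": t.description, "parameters": t.parameters}
        for t in registry.values()
    ]


def execute_tool_call(registry: dict[str, Tool], call_json: str) -> dict:
    """Run one structured call emitted by the model; package the result for the next turn."""
    call = json.loads(call_json)
    tool = registry.get(call["name"])
    if tool is None:
        return {"tool": call["name"], "is_error": True, "content": "unknown tool"}
    try:
        content = tool.handler(**call["arguments"])
    except Exception as exc:  # report failures to the model instead of crashing the loop
        return {"tool": tool.name, "is_error": True, "content": str(exc)}
    return {"tool": tool.name, "is_error": False, "content": content}


# Simulated model output: a structured call selecting a tool and its arguments.
model_output = '{"name": "read_file", "arguments": {"path": "README.md"}}'
result = execute_tool_call(REGISTRY, model_output)
```

The point of the split is that the model only ever sees schemas_for_model(REGISTRY); the handlers stay on the runtime side, so the runtime decides what actually runs.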

Tool use raises critical safety questions, and each has a standard control: capability allowlisting (the agent can only invoke tools in its declared scope), approval gates (destructive tools require human confirmation), trajectory bounds (a cap on tool calls that prevents unbounded loops), and audit (every tool invocation is logged with the agent’s stated reasoning).
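Three of these controls (allowlisting, trajectory bounds, audit) fit naturally into one wrapper around tool invocation. The sketch below is one possible arrangement, not a reference design; GuardedRuntime, ToolDenied, and the audit entry fields are hypothetical names chosen for illustration. Approval gates are left out here because they need a human in the loop.

```python
import time
from dataclasses import dataclass, field


class ToolDenied(Exception):
    """Raised when a call is refused by the allowlist or the call budget."""


@dataclass
class GuardedRuntime:
    tools: dict      # tool name -> callable
    allowlist: set   # names this agent is allowed to call
    max_calls: int = 10
    audit_log: list = field(default_factory=list)
    calls_made: int = 0

    def invoke(self, name: str, arguments: dict, reasoning: str):
        # Audit first, so denied attempts are recorded as well as successful ones.
        entry = {"ts": time.time(), "tool": name, "arguments": arguments,
                 "reasoning": reasoning, "outcome": None}
        self.audit_log.append(entry)

        # Capability allowlist: the tool must be both registered and in scope.
        if name not in self.allowlist or name not in self.tools:
            entry["outcome"] = "denied: not in allowlist"
            raise ToolDenied(f"{name} is outside this agent's scope")

        # Trajectory bound: a hard cap on executed calls per run.
        if self.calls_made >= self.max_calls:
            entry["outcome"] = "denied: call budget exhausted"
            raise ToolDenied("call budget exhausted")

        self.calls_made += 1
        result = self.tools[name](**arguments)
        entry["outcome"] = "ok"
        return result


runtime = GuardedRuntime(
    tools={"get_metric": lambda name: 42, "delete_db": lambda name: "gone"},
    allowlist={"get_metric"},   # delete_db is registered but out of scope
    max_calls=2,
)
value = runtime.invoke("get_metric", {"name": "cpu"}, reasoning="check load")
```

Logging before the checks is a deliberate choice in this sketch: a denied attempt to call an out-of-scope tool is often the most interesting line in the audit trail.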

The pattern is what makes LLM agents actually useful. A pure text-generating LLM can produce advice about how to solve a problem; a tool-using LLM can solve the problem directly. Reading a customer’s account state, updating a database, sending an email, provisioning infrastructure — all of these require tool use. The gap between “chatbot that gives advice” and “agent that takes action” is entirely tool use.

The design of tool interfaces matters enormously. Good tool interfaces are narrow (one obvious purpose), typed (clear parameters), documented (schema descriptions the LLM can reason about), and side-effect-honest (the LLM knows what will happen when it calls the tool). Bad tool interfaces are broad (“do stuff with this parameter”), untyped (string in, string out), undocumented (the LLM guesses semantics), and side-effect-hidden (calling looks safe, isn’t). The former produce reliable agents; the latter produce unpredictable ones.
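To make the contrast concrete, here are two tool definitions written as JSON-Schema-style Python dicts, plus a small lint function that checks the mechanically checkable properties. Everything here is an illustrative assumption: the tool names are invented, and the "side_effects" field is a hypothetical convention, not part of any vendor's schema. Narrowness is a judgment call and is not something this lint can detect.

```python
GOOD_TOOL = {
    "name": "restart_service",
    "description": (
        "Restart one named service in one environment. "
        "Causes a brief outage of that service."
    ),
    "side_effects": "destructive",  # hypothetical field: none | reversible | destructive
    "parameters": {
        "type": "object",
        "properties": {
            "service": {
                "type": "string",
                "description": "Service name, for example 'checkout'.",
            },
            "environment": {
                "type": "string",
                "enum": ["staging", "production"],
                "description": "Environment in which to restart the service.",
            },
        },
        "required": ["service", "environment"],
    },
}

BAD_TOOL = {
    "name": "do_action",
    "description": "",
    "parameters": {"type": "object", "properties": {"input": {"type": "string"}}},
}


def lint_tool(tool: dict) -> list[str]:
    """Return a list of interface problems; an empty list means none were found."""
    problems = []
    if not tool.get("description", "").strip():
        problems.append("missing description")
    if tool.get("side_effects") not in {"none", "reversible", "destructive"}:
        problems.append("side effects not declared")
    properties = tool.get("parameters", {}).get("properties", {})
    for param, spec in properties.items():
        if "type" not in spec:
            problems.append(f"parameter '{param}' is untyped")
        if not spec.get("description"):
            problems.append(f"parameter '{param}' is undocumented")
    return problems


good_problems = lint_tool(GOOD_TOOL)
bad_problems = lint_tool(BAD_TOOL)
```

A check like this could run in CI over a tool registry, so that an undocumented or side-effect-hidden tool fails review before an agent ever sees it.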

Concrete instances in the wild

  • Anthropic Claude tool use. Native API support with typed schemas. Well-documented for both simple tools and complex agent workflows.
  • OpenAI function calling. The same underlying pattern, exposed through OpenAI’s API.
  • basecamp agent tools (Y5 Phase 48). MCP-based tools exposing platform-ctl, ops-handbook RAG, telemetry queries.
  • LangChain Tools. Framework abstraction over tool definitions. Wraps many providers.
  • LlamaIndex Tools. Similar framework-level tool abstraction.
  • OpenAI Assistants API code interpreter and file search. Managed tools built into the API.
  • Anthropic Computer Use. Tool for screen manipulation. Enables agent control of GUIs.
  • AWS Bedrock Agents actions. AWS-managed tool integration.
  • Claude Code tools. Deeply-engineered tool set: file operations, bash, web fetch, agent spawning. Public tool schemas.
  • Cursor Composer tools. Similar rich tool set for code-editing agents.
  • Semantic Kernel plugins (formerly called skills). Microsoft’s tool abstraction.

Why this pattern matters

LLMs by themselves can’t do most things you want production applications to do. They can’t fetch fresh data. They can’t modify state. They can’t call APIs. They can’t execute code. Tool use is what bridges this gap — the LLM decides what to do; the tool runtime executes it. Without tool use, LLM applications are limited to producing text from text. With tool use, they can be actual agents that operate on the world.

The pattern also determines what agents can safely do. Tools define the action surface. If a tool exists to delete production databases, the agent can potentially delete production databases. If the only tools are read-only, the agent can’t modify state, although it can still expose whatever data those tools let it read. Designing the tool set is designing the agent’s capabilities. This is one reason MCP servers and internal tool libraries are load-bearing infrastructure: they’re the mechanism through which platform teams control what agents can do.

For SRE and platform engineering, tool use enables agent-driven operations. Read alerts. Query metrics. Retrieve runbooks. Execute changes through the platform’s API. Each is a tool the platform team provides to the agent. The tools compose into agent workflows. The platform team retains control by controlling the tool surface.

Approval gates are the pattern’s safety mechanism for destructive operations. A tool that deletes resources doesn’t execute immediately — it emits a proposal, waits for human approval, then executes if approved. The agent proposes; the human authorizes; the tool runtime executes. This preserves human control over consequential actions while allowing agents to do the reasoning and proposing.
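The propose, authorize, execute flow can be sketched as a small state machine. This is a minimal in-memory illustration, assuming hypothetical names throughout (ApprovalGate, Proposal, list_volumes, delete_volume); a real gate would persist proposals and authenticate the approver.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Proposal:
    tool: str
    arguments: dict
    reasoning: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"  # pending -> rejected, or pending -> approved -> executed


class ApprovalGate:
    def __init__(self, tools: dict, destructive):
        self.tools = tools                   # tool name -> callable
        self.destructive = set(destructive)  # names that require human approval
        self.proposals = {}

    def request(self, tool: str, arguments: dict, reasoning: str) -> dict:
        """Called by the agent. Safe tools run now; destructive ones only create a proposal."""
        if tool not in self.destructive:
            return {"status": "executed", "result": self.tools[tool](**arguments)}
        proposal = Proposal(tool, arguments, reasoning)
        self.proposals[proposal.id] = proposal
        return {"status": "pending", "proposal_id": proposal.id}

    def decide(self, proposal_id: str, approved: bool, approver: str) -> dict:
        """Called by a human. Executes only on approval, and at most once."""
        proposal = self.proposals[proposal_id]
        if proposal.status != "pending":
            raise ValueError(f"proposal already {proposal.status}")
        if not approved:
            proposal.status = "rejected"
            return {"status": "rejected", "approver": approver}
        proposal.status = "approved"
        result = self.tools[proposal.tool](**proposal.arguments)
        proposal.status = "executed"
        return {"status": "executed", "approver": approver, "result": result}


deleted = []  # stands in for real infrastructure state


def list_volumes():
    return ["vol-1", "vol-2"]


def delete_volume(volume_id):
    deleted.append(volume_id)
    return f"deleted {volume_id}"


gate = ApprovalGate(
    {"list_volumes": list_volumes, "delete_volume": delete_volume},
    destructive=["delete_volume"],
)
listing = gate.request("list_volumes", {}, reasoning="inventory")
pending = gate.request("delete_volume", {"volume_id": "vol-1"}, reasoning="volume is orphaned")
```

At this point nothing has been deleted: the agent holds only a proposal id, and the volume is removed only when a human calls gate.decide(pending["proposal_id"], approved=True, approver=...).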

For basecamp specifically, tool use through MCP is what makes AIOps work. platform-ctl is exposed as an MCP server. The ops-handbook RAG is exposed as an MCP server. Telemetry queries are exposed as an MCP server. The AIOps agent composes these tools to triage incidents and propose remediations. Without tool use, AIOps would be a smart chatbot; with tool use, it’s an actual operational assistant.

The failure modes to know: overly broad tool permissions (agents doing more than intended); tools without approval gates for destructive operations (accidental damage); tool errors that the LLM misinterprets (cascading failures); tool call loops (agents calling the same tool repeatedly without progress); prompt injection through tool outputs (untrusted data reaching the LLM as instructions). Each has known mitigations, but adopting tool use means engineering for these failure modes.
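Two of these mitigations are small enough to sketch: detecting repeated identical calls, and packaging tool output as labeled, untrusted data with an explicit error flag. Both helpers below (LoopDetector, wrap_tool_output) and the message fields are hypothetical. Labeling output as untrusted does not stop prompt injection on its own; it only gives the rest of the pipeline something to act on.

```python
import json
from collections import Counter


class LoopDetector:
    """Flags an agent that keeps issuing the exact same call without making progress."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def check(self, name: str, arguments: dict) -> bool:
        """Return True if the call may proceed, False once it has repeated too often."""
        # Canonical JSON, so argument order does not hide a repeat.
        key = (name, json.dumps(arguments, sort_keys=True))
        self.seen[key] += 1
        return self.seen[key] <= self.max_repeats


def wrap_tool_output(name: str, content, is_error: bool = False) -> dict:
    """Package tool output as data: flagged as untrusted, with errors marked explicitly."""
    return {
        "role": "tool",
        "tool": name,
        "is_error": is_error,  # explicit flag, so a failure is not read as a result
        "trusted": False,      # tool output is never treated as instructions
        "content": str(content),
    }


detector = LoopDetector(max_repeats=2)
decisions = [detector.check("get_metric", {"name": "cpu"}) for _ in range(3)]
message = wrap_tool_output("get_metric", "timeout after 5s", is_error=True)
```

Here the third identical get_metric call is refused, while a call with different arguments would still be allowed, which keeps the check from blocking legitimate polling of different resources.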

Modern tooling is standardizing rapidly. MCP is emerging as the cross-vendor protocol for tool declaration. Framework abstractions (LangChain Tools, LlamaIndex Tools) provide portability. Anthropic and OpenAI both provide first-class tool support with similar semantics. The gap between building tools for one LLM provider and another is narrowing.

Depth progression

STUB     ← you are here.
OUTLINE  Promoted when Y5 Phase 48 wires tool-use into the agent runtime.
DEEP     Promoted after Y5 Phase 50 — AIOps agents invoking real platform-ctl
         actions under approval gates, with full audit.

Preview: what OUTLINE will answer

When Y5 Phase 48 promotes this entry to OUTLINE, it will name:

  • PROBLEM. How do you give LLM agents the ability to act on the world safely and effectively?
  • PRINCIPLES. Tools are typed, narrow, documented, side-effect-honest. Capability allowlists scope what the agent can do. Approval gates protect destructive operations. Full audit of every tool invocation. Compose tools rather than building one giant tool.
  • TRADE-OFFS. Rich tool set (powerful agent, larger action surface) vs narrow (safe, limited). Auto-execute (fast) vs approval-gated (safe). Vendor-specific (Anthropic tool use, OpenAI function calling) vs MCP-based (portable). Framework tools (LangChain, LlamaIndex — abstract) vs native (direct control).
  • TOOLS (time-stamped as of 2026-06): Anthropic Claude tool use, OpenAI function calling, MCP-based tools (basecamp default), LangChain Tools, LlamaIndex Tools, OpenAI Assistants tools, Anthropic Computer Use, AWS Bedrock Agents actions, Semantic Kernel skills.

The DEEP promotion, after Y5 Phase 50 with AIOps invoking real platform-ctl actions, will add MASTERY (operating tool-use in production agents), COMPARE (Anthropic tool use vs OpenAI function calling vs MCP), OPERATE (a specific approval gate scenario or capability scoping decision), and CONTRIBUTE (an MCP server or documentation improvement).

Canonical references

  • Anthropic tool use documentation. Free at docs.anthropic.com.
  • OpenAI function calling documentation. Free at platform.openai.com.
  • MCP specification. Free at modelcontextprotocol.io.
  • Anthropic’s “Building Effective Agents” guide. Free.
  • LangChain and LlamaIndex tool documentation. Free.

Cross-references