LLM Fundamentals: Part 8 -- Tool Use
This is Part 8 of the LLM Fundamentals series.
In Post 7, I showed how structured output gives you guaranteed JSON shapes from the model. Reliable structure solved one problem: getting data out in a format code can consume. The model still could not do anything itself. It could describe actions, suggest next steps, and recommend API calls, but it could not make them. Tool use closes that gap by letting the model request actions that your code executes.
What Tool Use Is
Tool use is a contract between your application and the model. You define what operations are available and what shape their inputs take. Claude decides when and how to call them based on the conversation. Your code runs the operation and returns the result. Claude never executes anything on its own: it emits a structured request, and you fulfill it.
I think of tools as a typed callback interface. If you have written event handlers, webhook receivers, or plugin systems, you already understand the pattern. You register capabilities, something else decides when to invoke them, and you handle the execution.
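That callback pattern can be sketched in a few lines. The handler names and their canned return values here are illustrative stand-ins, not part of any API:

```python
# A minimal dispatch table in the callback spirit described above.
# Handlers are plain callables keyed by tool name; the stub outputs
# are placeholders for real lookups.
TOOL_HANDLERS = {
    "get_weather": lambda location, unit="fahrenheit": f"72°F in {location}",
    "get_time": lambda timezone: f"09:00 in {timezone}",
}

def dispatch(name, tool_input):
    """Map a requested tool call onto real code.

    The model only ever sees the result you send back; it never
    touches the handlers themselves.
    """
    return TOOL_HANDLERS[name](**tool_input)
```

Registration, invocation, and execution stay in your hands; the model only supplies the name and arguments.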
Defining a Tool
Every tool definition has three parts: a name, a description, and an input_schema that follows JSON Schema. Here is a weather lookup tool:
```python
tools = [
    {
        "name": "get_weather",
        "description": (
            "Get the current weather in a given location. Returns "
            "temperature and conditions. Use when the user asks about "
            "current weather anywhere in the world."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit, defaults to fahrenheit",
                },
            },
            "required": ["location"],
        },
    }
]
```

Notice that the description does more than name the tool. It explains what the tool returns, when to use it, and what the parameters mean. Anthropic’s documentation is explicit about this: detailed descriptions are the single most important factor in tool performance. I default to writing a few sentences per description, covering what the tool does, when it should be used, and any limitations. A vague one-liner like “Gets weather” forces Claude to guess at behavior, and guessing leads to misuse.
How the Cycle Works
Sending tools to Claude changes the response structure. Instead of just text, Claude can return tool_use content blocks alongside (or instead of) its text response. Each tool_use block contains an id, the tool name, and the input arguments matching your schema. When this happens, the response comes back with stop_reason: "tool_use" instead of "end_turn".
Here is the full cycle in practice:
```python
import anthropic

client = anthropic.Anthropic()

# Step 1: Send the request with tools
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather in Denver?"}
    ],
)

# Step 2: Claude responds with a tool_use block
# response.stop_reason == "tool_use"
# response.content includes:
# {
#   "type": "tool_use",
#   "id": "toolu_01A09q90qw90lq917835lq9",
#   "name": "get_weather",
#   "input": {"location": "Denver, CO", "unit": "fahrenheit"}
# }

# Step 3: Execute the tool and return the result
tool_use = response.content[-1]
weather_data = call_weather_api(tool_use.input["location"])
# e.g. "72°F, sunny with light winds"

followup = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather in Denver?"},
        {"role": "assistant", "content": response.content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use.id,
                    "content": weather_data,
                }
            ],
        },
    ],
)

# Step 4: Claude uses the result to answer naturally
# "It's currently 72°F and sunny in Denver with light winds."
```

You send tool definitions in the request. Claude decides to call one and returns a tool_use block. You execute the actual operation in your codebase, then send the result back as a tool_result block in the next message. Claude incorporates that result into a natural language response. Four steps, and the model never touched your weather API directly.
Error Handling That Keeps the Loop Alive
Tools fail. APIs time out, databases go down, permissions get revoked. When your tool execution hits an error, you do not need to abandon the conversation. Setting is_error: true on the tool_result block tells Claude the call failed and gives it the error message to work with:
```json
{
    "type": "tool_result",
    "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
    "content": "ConnectionError: weather service returned HTTP 500",
    "is_error": true
}
```

Claude will typically explain the failure to the user or try an alternative approach. I write descriptive error messages here rather than generic “failed” strings, because Claude can only recover intelligently if it understands what went wrong. “Rate limit exceeded, retry after 60 seconds” gives the model something to work with. “Error” does not.
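One way to keep that behavior consistent is to wrap every tool execution in a helper that catches exceptions and builds the tool_result block either way. This is a sketch of that pattern; `run_tool` and its signature are my own, not part of the SDK:

```python
def run_tool(tool_use_id, fn, *args):
    """Execute a tool and wrap the outcome as a tool_result block.

    On failure, set is_error=True and pass along a descriptive
    message so the model has something concrete to recover from.
    """
    try:
        result = fn(*args)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": str(result),
        }
    except Exception as exc:
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            # Exception type plus message, e.g.
            # "ConnectionError: weather service returned HTTP 500"
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }
```

The conversation keeps moving in both cases; only the content of the result block changes.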
Client Tools vs Server Tools
Not all tools require your code to run them. Anthropic draws a clear line between client tools and server tools.
Client tools are what I have been describing: you define the schema, Claude requests a call, your code executes, you return the result. Every custom integration, every database query, every internal API call goes through this path.
Server tools like web_search, code_execution, and web_fetch run on Anthropic’s infrastructure. You enable them in the request and the server handles everything. No tool_result needed from your side, because execution happens internally before the response reaches you. I use web search this way when I need Claude to look something up without building my own search integration.
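Enabling a server tool is just a typed entry in the tools list, with no input_schema to write. A sketch of the request payload, assuming the versioned type string from Anthropic's docs at the time of writing (check the current docs before relying on it):

```python
# Request payload enabling Anthropic's web_search server tool.
# Built as a dict here for illustration; in practice you would pass
# these keyword arguments to client.messages.create(**request).
request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "tools": [{
        "type": "web_search_20250305",  # versioned server-tool type
        "name": "web_search",
        "max_uses": 3,                  # cap searches per request
    }],
    "messages": [
        {"role": "user", "content": "What changed in Python 3.13?"}
    ],
}
# No tool_result round-trip: the search runs on Anthropic's servers
# before the response ever reaches your code.
```

Compare that with the client-tool definition earlier: there is no schema to maintain and no execution step on your side.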
A third category, Anthropic-schema tools like bash and text_editor, gives you the best of both worlds. Anthropic publishes the schema and Claude has been trained on thousands of successful trajectories using these exact signatures, so it calls them more reliably than a custom equivalent. But your code still handles execution.
Parallel Tool Calls
By default, Claude can return multiple tool_use blocks in a single response. If a user asks “What is the weather in Denver and what time is it in Tokyo?”, Claude might call both get_weather and get_time simultaneously rather than sequentially. You execute all the tools, bundle the results into a single user message with multiple tool_result blocks, and send them back together.
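The bundling step can be factored into a small helper. This sketch treats content blocks as dicts for clarity (the SDK returns objects with the same fields), and `handlers` is a hypothetical dispatch table mapping tool names to callables:

```python
def bundle_tool_results(blocks, handlers):
    """Run every tool_use block in a response and pack the results
    into a single user message with one tool_result per call."""
    results = []
    for block in blocks:
        if block["type"] != "tool_use":
            continue  # skip text blocks
        output = handlers[block["name"]](**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # pairs result with its request
            "content": str(output),
        })
    # All results travel back together in one user turn.
    return {"role": "user", "content": results}
```

The key constraint is that every tool_use id gets a matching tool_result, and they all return in the same message.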
For simpler orchestration, set disable_parallel_tool_use: true inside the tool_choice parameter. With tool_choice: auto, that combination limits Claude to at most one tool per turn. With tool_choice: any or tool_choice: tool, the same flag forces exactly one tool. I reach for this when tools have side effects that depend on execution order, because parallel calls remove the ability to control sequencing.
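The flag rides inside the tool_choice object itself. A minimal sketch of the parameter:

```python
# Serial tool use: Claude still decides *whether* to call a tool
# (type "auto"), but the flag caps it at one tool_use block per turn.
tool_choice = {
    "type": "auto",
    "disable_parallel_tool_use": True,
}
# Passed like any other request parameter:
# client.messages.create(..., tools=tools, tool_choice=tool_choice)
```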
Controlling When Tools Get Called
tool_choice gives you four options for steering tool selection:
- `auto` (default) lets Claude decide whether to call a tool or respond directly.
- `any` forces Claude to call one of the provided tools, but Claude picks which one.
- `tool` forces a specific tool by name, which is the approach I showed in Post 7 for structured extraction.
- `none` prevents tool use entirely, even when tools are defined in the request.
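Each option maps onto a small object passed as the tool_choice request parameter; `get_weather` here is the example tool from earlier:

```python
# The four tool_choice shapes as they appear in the request:
auto_choice = {"type": "auto"}                         # Claude decides (default)
any_choice  = {"type": "any"}                          # must call some tool
one_choice  = {"type": "tool", "name": "get_weather"}  # must call this exact tool
none_choice = {"type": "none"}                         # text response only
```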
In practice, auto handles most situations well. I reserve forced tool use for pipelines where I know the model should always call a specific function and never respond with just text.
Descriptions Are the Interface
Tool descriptions outweigh schemas in production. A perfect JSON Schema with a one-line description performs worse than a loose schema with a thorough description. Claude uses the description to decide whether a tool is appropriate for the current request, what the parameters mean in context, and how to interpret the result.
Adding a new tool to a project means spending more time on the description than on the schema. Explain what the tool does, when to use it, when not to use it, what it returns, and any caveats about its behavior. If Claude is misusing a tool or calling it at the wrong time, the first fix is always the description, not the schema.
What This Unlocks
Tool use transforms the model from a text generator into something that can interact with the world through your code. Any external system your application touches, whether an API, database, queue, or file system, becomes reachable through a tool definition. Because tool inputs follow JSON Schema, the arguments are always structured and validated, building directly on the guarantees from Post 7.
A single tool call is still one request-response pair. Real utility comes from chaining multiple tool calls together: search for data, analyze it, write a result, verify it. That chain of decisions, where the model keeps calling tools until a task is complete, is the agentic loop. The next post covers the agentic loop, where tool use becomes autonomous multi-step execution.