LangGraph Tool Calling: ToolNode, Parallel Tools, and Custom Tools
Wire real capabilities into LangGraph agents: @tool decorator, ToolNode, bind_tools, parallel execution, and error handling.
Abstract Algorithms
TLDR: Wire @tool, ToolNode, and bind_tools into LangGraph for agents that call APIs at runtime.
The Stale Knowledge Problem: Why LLMs Need Runtime Tools
Your agent confidently tells you the current stock price of NVIDIA. It's from its training data, six months out of date. The model doesn't know what it doesn't know. It has no concept of "today." Every fact it gives you was frozen at training cutoff, and it will recite that frozen snapshot with the same unwavering confidence it uses to state that water is wet.
This is the fundamental limitation of a bare LLM in production: it is a lookup table, not a live system. It can reason, summarize, and plan. What it cannot do on its own is fetch a live API response, run a shell command, query a database, or check whether a flight has been delayed. For agentic applications (systems that must act on real-world state), this is a hard blocker.
Tools are the solution. A tool is simply a Python function with a well-defined signature that the LLM can choose to invoke at runtime. The LLM doesn't execute the function itself; it emits a structured instruction ("call get_stock_price with ticker='NVDA'"), and your graph's execution layer runs the function and feeds the result back. The LLM then reasons over the live result.
LangGraph provides a clean, composable infrastructure for this entire loop: the @tool decorator to define tools, bind_tools() to register them with any LLM, ToolNode to execute them, and tools_condition to route the graph based on whether the model wants to call a tool or is ready to respond. Together, these four pieces turn a static LLM into an agent that interacts with the real world.
Tool Fundamentals: @tool, Schemas, and the bind_tools() Pattern
Before wiring tools into a graph, you need to understand what a "tool" is from LangGraph's perspective: a Python callable decorated with @tool that carries enough metadata for an LLM to decide when and how to call it.
Defining a Tool with @tool
The @tool decorator from langchain_core.tools does three things. It wraps your function, uses the docstring as the tool description (what the model reads to decide whether to call it), and derives the JSON schema for the function's arguments from Python type hints.
from langchain_core.tools import tool
@tool
def get_stock_price(ticker: str) -> str:
"""
Fetch the current stock price for a given ticker symbol.
Returns a formatted string with the latest price.
"""
# In production, call a real market data API here
prices = {"NVDA": "118.42", "AAPL": "213.07", "TSLA": "172.30"}
price = prices.get(ticker.upper(), "unknown")
return f"{ticker.upper()} is currently trading at ${price}"
The docstring is not decoration: it's the model's only window into what this tool does. Write it like documentation for a smart but uninformed colleague. Mention what the tool accepts, what it returns, and when it should (or shouldn't) be called.
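Under the hood, the decorator is doing roughly this kind of introspection. The sketch below is illustrative, not LangChain's actual implementation: derive_tool_metadata is a hypothetical helper showing how a name, description, and argument schema fall out of an ordinary function.

```python
import inspect
from typing import get_type_hints

def derive_tool_metadata(fn):
    """Hypothetical helper: build a minimal tool spec from a plain function."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # the args schema covers inputs only
    return {
        "name": fn.__name__,                      # function name -> tool name
        "description": inspect.getdoc(fn) or "",  # docstring -> tool description
        "args": {param: t.__name__ for param, t in hints.items()},
    }

def get_stock_price(ticker: str) -> str:
    """Fetch the current stock price for a given ticker symbol."""
    return f"{ticker.upper()}: $118.42"

spec = derive_tool_metadata(get_stock_price)
print(spec["name"])  # get_stock_price
print(spec["args"])  # {'ticker': 'str'}
```

This is why a missing type hint or empty docstring degrades tool selection: the model literally never sees anything else about your function.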
Custom Schemas with Pydantic for Complex Inputs
For tools that require structured or multi-field inputs, you can define the input schema explicitly using a Pydantic BaseModel. This gives you validation, default values, and richer descriptions per field.
from pydantic import BaseModel, Field
from langchain_core.tools import tool
class WebSearchInput(BaseModel):
query: str = Field(description="The search query string")
max_results: int = Field(default=5, description="Maximum number of results to return")
@tool("web_search", args_schema=WebSearchInput)
def web_search(query: str, max_results: int = 5) -> str:
"""Search the web for current information on a topic."""
# In production: call Tavily, SerpAPI, or Brave Search here
return f"Top {max_results} results for '{query}': [result1, result2, ...]"
LLM-Agnostic Binding with bind_tools()
Once your tools are defined, you attach them to any LangChain-compatible LLM using bind_tools(). This method sends the tool schemas along with every request, so the model knows what tools exist and can emit tool_calls in its response when appropriate.
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
tools = [get_stock_price, web_search]
# OpenAI
llm_openai = ChatOpenAI(model="gpt-4o").bind_tools(tools)
# Anthropic: same API, different model
llm_anthropic = ChatAnthropic(model="claude-3-5-sonnet-20241022").bind_tools(tools)
# Groq (fast inference)
from langchain_groq import ChatGroq
llm_groq = ChatGroq(model="llama-3.3-70b-versatile").bind_tools(tools)
The key insight is that bind_tools() is LLM-agnostic: you swap the model class without changing the tool definitions or the graph structure. This is one of LangGraph's most important architectural properties: tool-augmented agents are portable across providers.
| LLM Provider | Class | Notes |
| --- | --- | --- |
| OpenAI | ChatOpenAI | GPT-4o natively excellent at tool selection |
| Anthropic | ChatAnthropic | Claude 3.5 Sonnet reliable for structured calls |
| Groq | ChatGroq | Low-latency; use with Llama/Mistral |
| Ollama (local) | ChatOllama | Works with Llama3-based models |
Wiring Tools into LangGraph: ToolNode, Conditional Routing, and the Tool Loop
With tools defined and bound to the LLM, the next step is wiring them into the graph. This requires four components: a state, an agent node, a ToolNode, and conditional routing between them.
State and the Agent Node
The graph state holds the conversation as a list of messages. The agent node calls the LLM with the current state and appends the model's response.
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition
class AgentState(TypedDict):
messages: Annotated[list, add_messages]
def agent_node(state: AgentState):
"""Call the LLM with the current message history."""
response = llm_openai.invoke(state["messages"])
return {"messages": [response]}
ToolNode: The Prebuilt Executor
ToolNode is a prebuilt LangGraph node that reads the tool_calls field from the last AIMessage in the state, executes each referenced tool, and appends ToolMessage results back to the state. You don't need to write dispatch logic yourself.
tool_node = ToolNode(tools) # Pass the same list you used in bind_tools()
Conditional Routing with tools_condition
tools_condition is a prebuilt routing function that inspects the last message in state. If it contains tool_calls, it returns "tools" to route to ToolNode. If there are no tool calls, it returns END, signalling the agent is done.
graph_builder = StateGraph(AgentState)
graph_builder.add_node("agent", agent_node)
graph_builder.add_node("tools", tool_node)
graph_builder.set_entry_point("agent")
graph_builder.add_conditional_edges(
"agent",
tools_condition, # Routes to "tools" or END
)
graph_builder.add_edge("tools", "agent") # After tools run, go back to agent
graph = graph_builder.compile()
The add_edge("tools", "agent") line is what creates the loop: after tools execute and results are added to state, control returns to the agent node for the next reasoning step. The loop exits when the model chooses to respond directly without calling any tools.
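Stripped of LangGraph types, the loop this wiring encodes is small. The sketch below simulates it in plain Python; fake_model, the dict-shaped messages, and the TOOLS registry are illustrative stand-ins, not LangGraph APIs.

```python
def fake_model(messages):
    """Stand-in LLM: request a tool on the first turn, answer on the second."""
    already_called = any(m["role"] == "tool" for m in messages)
    if already_called:
        return {"role": "ai", "content": "NVDA is at $118.42.", "tool_calls": []}
    return {"role": "ai", "content": "",
            "tool_calls": [{"name": "get_stock_price", "args": {"ticker": "NVDA"}}]}

def get_stock_price(ticker):
    return f"{ticker}: $118.42"

TOOLS = {"get_stock_price": get_stock_price}

messages = [{"role": "user", "content": "What is NVDA trading at?"}]
while True:
    reply = fake_model(messages)        # agent node: call the model
    messages.append(reply)
    if not reply["tool_calls"]:         # tools_condition: no calls -> END
        break
    for call in reply["tool_calls"]:    # ToolNode: execute, append results
        result = TOOLS[call["name"]](**call["args"])
        messages.append({"role": "tool", "content": result})

print(messages[-1]["content"])  # NVDA is at $118.42.
```

The while loop is the add_edge("tools", "agent") cycle; the break is tools_condition routing to END.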
Deep Dive: How LangGraph Executes Tool Calls
The Internals
When the LLM with bound tools generates a response that includes tool calls, it returns an AIMessage with a tool_calls field. Each entry in this list is a dict with three keys: id (a unique call identifier), name (the tool function name), and args (a dict of parsed arguments).
# Example AIMessage.tool_calls structure:
[
{
"id": "call_abc123",
"name": "get_stock_price",
"args": {"ticker": "NVDA"}
},
{
"id": "call_def456",
"name": "web_search",
"args": {"query": "NVIDIA Q4 2025 earnings", "max_results": 3}
}
]
ToolNode iterates this list and dispatches each call to the matching function by name. It uses the tool registry it was initialized with, the same tools list passed to ToolNode(tools). After execution, it wraps each result in a ToolMessage that carries the matching tool_call_id so the LLM can correlate each result to the call that produced it.
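The dispatch step can be sketched in plain Python. execute_tool_calls below is an illustrative stand-in for what ToolNode does, fed with the tool_calls structure shown above:

```python
def get_stock_price(ticker: str) -> str:
    return f"{ticker}: $118.42"

def web_search(query: str, max_results: int = 5) -> str:
    return f"Top {max_results} results for '{query}'"

# Registry built from the same tool list passed to ToolNode(tools)
registry = {"get_stock_price": get_stock_price, "web_search": web_search}

def execute_tool_calls(tool_calls):
    """Illustrative dispatch: look up each tool by name, run it, tag the result."""
    results = []
    for call in tool_calls:
        output = registry[call["name"]](**call["args"])
        # tool_call_id lets the LLM correlate each result with its request
        results.append({"tool_call_id": call["id"], "content": output})
    return results

calls = [
    {"id": "call_abc123", "name": "get_stock_price", "args": {"ticker": "NVDA"}},
    {"id": "call_def456", "name": "web_search",
     "args": {"query": "NVIDIA Q4 2025 earnings", "max_results": 3}},
]
for msg in execute_tool_calls(calls):
    print(msg["tool_call_id"], "->", msg["content"])
```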
Parallel execution: When a single AIMessage contains multiple tool_calls, ToolNode executes them concurrently using Python's ThreadPoolExecutor. This means if your agent asks for get_stock_price("NVDA") and web_search("NVIDIA news") in the same turn, both calls happen simultaneously. The state receives all ToolMessage results before the agent node is invoked again.
State threading: LangGraph's add_messages reducer merges messages by id: an incoming message whose id already exists replaces the old entry, and new ids are appended. Because each ToolMessage carries the original tool_call_id, the message list remains coherent even with multiple parallel results arriving at once.
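A merge-by-id reducer of this shape can be sketched in a few lines (merge_messages is illustrative, not LangGraph's implementation):

```python
def merge_messages(existing, incoming):
    """Append new messages; replace any existing message that shares an id."""
    by_id = {m["id"]: i for i, m in enumerate(existing)}
    merged = list(existing)
    for msg in incoming:
        if msg["id"] in by_id:
            merged[by_id[msg["id"]]] = msg   # same id: replace in place
        else:
            merged.append(msg)               # new id: append
    return merged

state = [{"id": "1", "content": "hello"}]
state = merge_messages(state, [
    {"id": "1", "content": "hello, edited"},  # replaces the original id "1"
    {"id": "2", "content": "world"},          # appended
])
print([m["content"] for m in state])  # ['hello, edited', 'world']
```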
Performance Analysis
Latency composition: A single tool-calling round trip adds at least three network hops: the initial LLM call (to decide to use a tool), the tool execution (calling the external API), and a second LLM call (to interpret the result). For a typical setup with OpenAI and an external search API, this means 1-3 seconds per round trip under normal conditions.
Parallel vs. sequential: If the agent calls two tools in a single step (parallel), total latency is max(tool_A_latency, tool_B_latency) plus two LLM calls. If the tools were called in two sequential turns, the latency would be tool_A + tool_B plus three LLM calls. Designing prompts that encourage parallel tool calls for independent sub-tasks is a meaningful optimization.
Timeout handling: ToolNode does not enforce timeouts by default. For tools that call external APIs, run the call through a concurrent.futures executor with an explicit timeout (catching the resulting TimeoutError inside the tool body), or use httpx with explicit timeout= parameters. Unhandled slow tools will block the graph thread indefinitely.
| Scenario | LLM Calls | Tool Calls | Approx. Latency |
| --- | --- | --- | --- |
| Direct answer (no tools) | 1 | 0 | ~0.8s |
| Single tool call | 2 | 1 | ~2.0s |
| Two tools, sequential | 3 | 2 | ~3.5s |
| Two tools, parallel (one step) | 2 | 2 | ~2.2s |
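The parallel saving is easy to demonstrate with a plain ThreadPoolExecutor; the sleeps below are mock stand-ins for API latency, and the timeout on result() shows the explicit-timeout guard in action.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_tool(name, delay):
    """Mock tool whose sleep stands in for external API latency."""
    time.sleep(delay)
    return f"{name} done"

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(slow_tool, "stock_price", 0.1),
               pool.submit(slow_tool, "web_search", 0.3)]
    # Explicit timeout: a hung tool raises TimeoutError instead of blocking forever
    results = [f.result(timeout=5) for f in futures]
elapsed = time.perf_counter() - start

print(results)                # ['stock_price done', 'web_search done']
print(round(elapsed, 1))      # ~0.3: max(0.1, 0.3), not the 0.4 sum
```

Wall-clock time tracks the slower call, not the sum, which is exactly the max() vs sum distinction in the table above.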
The Tool Calling Loop: Graph Diagram
The core pattern is a tight loop between an agent node and a tool executor, gated by a conditional router:
flowchart TD
A([User Input]) --> B[Agent Node\nLLM + bind_tools]
B --> C{tools_condition}
C -- has tool_calls --> D[ToolNode\nExecute tools]
C -- no tool_calls --> E([Final Response])
D --> B
style A fill:#e8f4f8,stroke:#2196f3
style E fill:#e8f5e9,stroke:#4caf50
style C fill:#fff3e0,stroke:#ff9800
style D fill:#fce4ec,stroke:#e91e63
The agent loop: the LLM reasons → calls tools if needed → receives results → reasons again. The cycle exits when the model produces a direct answer.
Every iteration of the loop passes the full message history to the LLM, so the model always has context on what tools it called and what results they returned. This is how multi-step reasoning works: the agent sees its own prior actions and builds on them.
Real-World Applications: How Production Agents Use Tool Calling
Financial Research Assistants
A hedge fund assistant needs to answer: "Should we increase our NVIDIA position given today's macro environment?" A bare LLM would hallucinate a confident answer based on stale training data. With tools, the agent calls get_stock_price, get_earnings_data, and web_search in parallel, receives live data, then synthesizes a grounded analysis.
Input: "Evaluate NVIDIA given today's market."
Process: parallel tool calls → stock price + recent news + analyst ratings fetched live
Output: Structured memo citing today's actual data, not training-cutoff prices.
Customer Support Bots with CRM Access
A support agent needs to check an order's shipping status. Without tools, it tells the customer to "check the website." With a get_order_status(order_id: str) tool bound to the CRM API, the agent can answer "Your order #4829 shipped yesterday and arrives Thursday" by pulling the live record in real time.
Scaling note: In production deployments, tool calls are typically logged as audit events. The tool_call_id in each ToolMessage gives you a correlation key to trace exactly which tool was called, with what arguments, and what it returned, which is essential for debugging and compliance.
Code Execution Agents
Agents equipped with a python_repl tool can run generated code and feed the result back to the LLM for interpretation. This pattern powers tools like Jupyter AI and OpenAI's Code Interpreter. The critical safety requirement is sandboxing: tools that execute arbitrary code must run in isolated containers with restricted syscalls.
Trade-offs and Failure Modes: When Tool Calling Breaks
Hallucinated Tool Calls
LLMs occasionally invent tool call arguments that violate the schema, or call a tool with plausible-sounding but wrong input (e.g., passing "NVIDIA" instead of "NVDA" to a ticker lookup). Pydantic args_schema catches schema violations, but semantic errors still slip through. Mitigation: validate arguments inside the tool body and return structured error messages the model can recover from.
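A minimal sketch of that mitigation, using a mock ticker list: the tool validates its input and returns a structured, recoverable error instead of raising.

```python
KNOWN_TICKERS = {"NVDA", "AAPL", "TSLA"}  # mock allow-list for illustration

def get_stock_price(ticker: str) -> str:
    symbol = ticker.strip().upper()
    if symbol not in KNOWN_TICKERS:
        # Structured error the model can act on, e.g. "NVIDIA" -> retry with "NVDA"
        return (f"ERROR: '{ticker}' is not a known ticker symbol. "
                f"Use the exchange symbol (e.g. NVDA for NVIDIA).")
    return f"{symbol} is currently trading at $118.42"

print(get_stock_price("NVIDIA"))  # ERROR: 'NVIDIA' is not a known ticker symbol...
print(get_stock_price("nvda"))    # NVDA is currently trading at $118.42
```

Because the error comes back as a normal tool result, the model can read it and self-correct on the next turn instead of the graph crashing.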
API Failures and Transient Errors
If get_stock_price raises an exception, the default ToolNode wraps it in an error message and feeds it back to the agent. This prevents graph crashes but risks infinite retry loops: the model may keep trying the same failing tool. Use handle_tool_errors=True (the ToolNode default) and implement retry limits at the tool level.
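A tool-level retry cap can be sketched with a small decorator; with_retries is hypothetical, not a LangChain or LangGraph API.

```python
import functools

def with_retries(max_attempts=3):
    """Hypothetical decorator: retry transient failures, then give up gracefully."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:  # e.g. a transient API failure
                    last_error = exc
            # Structured give-up message the model can reason about
            return f"ERROR: tool failed after {max_attempts} attempts ({last_error})"
        return wrapper
    return decorator

calls = {"count": 0}

@with_retries(max_attempts=3)
def flaky_api():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("upstream timeout")
    return "ok"

result = flaky_api()
print(result)           # ok (succeeds on the third attempt)
print(calls["count"])   # 3
```

Capping retries inside the tool keeps the agent loop's recursion budget for reasoning steps rather than burning it on repeated failures.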
Infinite Tool Loops
An agent can get stuck in a loop: call tool → get result → call tool again in the next step, indefinitely. Guard against this with a recursion_limit on the graph:
graph.invoke({"messages": [HumanMessage(content="research NVDA")]},
config={"recursion_limit": 10})
Latency Creep at Scale
Every tool call adds a network round trip. A five-step agent chain with three tools per step can easily exceed 30 seconds of wall-clock time. Design agents to batch tool calls in a single step (parallel execution) wherever tools are independent, and use streaming (graph.stream()) to provide intermediate feedback to the user.
| Failure Mode | Root Cause | Mitigation |
| --- | --- | --- |
| Hallucinated args | LLM schema misinterpretation | Pydantic args_schema + tool-level validation |
| API error loop | Transient failures not caught | try/except inside tool; explicit error return string |
| Infinite tool loop | No exit condition | recursion_limit in graph config |
| Slow response | Sequential tool calls | Design prompts to encourage parallel calls |
| Stale data edge case | Tool cache not invalidated | TTL-based caching in tool body |
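The TTL-caching mitigation from the last row can be sketched in a few lines; cached_fetch and fetch_price are hypothetical helpers, not library APIs.

```python
import time

_cache = {}

def cached_fetch(key, fetch_fn, ttl_seconds=60.0):
    """Serve a cached value while fresh; refetch once it passes its TTL."""
    now = time.monotonic()
    entry = _cache.get(key)
    if entry and now - entry[0] < ttl_seconds:
        return entry[1]                 # fresh: serve cached value
    value = fetch_fn()                  # stale or missing: refetch
    _cache[key] = (now, value)
    return value

hits = {"n": 0}

def fetch_price():
    hits["n"] += 1                      # counts real upstream calls
    return "NVDA: $118.42"

cached_fetch("NVDA", fetch_price, ttl_seconds=0.05)
cached_fetch("NVDA", fetch_price, ttl_seconds=0.05)   # cached, no refetch
time.sleep(0.06)
cached_fetch("NVDA", fetch_price, ttl_seconds=0.05)   # expired, refetches
print(hits["n"])  # 2
```

The TTL trades freshness for latency: short enough that the agent never quotes stale prices, long enough to absorb repeated calls within one reasoning loop.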
Decision Guide: ToolNode vs Custom Tool Execution vs LangChain AgentExecutor
| Situation | Recommendation |
| --- | --- |
| Use ToolNode when | You want the standard agent loop with automatic parallel execution, error handling, and message threading. It covers 90% of production use cases. |
| Use custom tool execution when | You need fine-grained control: custom retry policies per tool, dynamic tool selection at runtime, or tool output post-processing before it hits the LLM. |
| Use LangChain AgentExecutor when | You're prototyping or working with an existing LangChain-based codebase that predates LangGraph. AgentExecutor is simpler but less composable and harder to debug. |
| Avoid ToolNode when | Your tools are stateful and must execute in a strict sequence with intermediate graph decisions between each call. Use separate nodes per tool with explicit edges instead. |
| Edge case: tool-as-node pattern | For tools that trigger multi-step subgraphs (e.g., a "research" tool that itself spawns a RAG pipeline), implement them as a LangGraph subgraph node rather than a @tool function. |
Practical Example: Market Research Agent with Parallel Tool Calls
This example demonstrates the full tool-calling loop: tool definition, LLM binding, ToolNode wiring, and parallel execution in a single runnable agent. The market research scenario was chosen because it requires live data from multiple independent sources simultaneously, exactly the case where LangGraph's parallel ToolNode execution pays off over a sequential loop. As you read through the code, watch for step 2 in the execution trace: the LLM emits two tool_calls in a single AIMessage, which ToolNode executes concurrently; that is the parallel tool calling pattern in action.
Here is the complete market research agent with three tools (web_search, get_stock_price, and summarize_findings), with parallel execution and graceful error handling.
from typing import Annotated
from typing_extensions import TypedDict
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition
# ── Tool definitions ──────────────────────────────────────────────
@tool
def web_search(query: str) -> str:
"""Search the web for recent news and analysis on a topic."""
# Replace with Tavily or SerpAPI in production
return f"[Search results for '{query}']: Market analysts expect strong Q1. Supply constraints easing."
@tool
def get_stock_price(ticker: str) -> str:
"""Get the current stock price for a publicly traded company by ticker symbol."""
mock_prices = {"NVDA": "118.42", "AAPL": "213.07", "MSFT": "415.30"}
price = mock_prices.get(ticker.upper(), "Price unavailable")
return f"{ticker.upper()}: ${price}"
@tool
def summarize_findings(raw_data: str) -> str:
"""Condense multiple data points into a brief investment summary."""
    return f"Summary: Based on provided data ({raw_data[:80]}...), outlook is cautiously positive."
tools = [web_search, get_stock_price, summarize_findings]
# ── LLM with bound tools ──────────────────────────────────────────
llm = ChatOpenAI(model="gpt-4o", temperature=0).bind_tools(tools)
# ── State ─────────────────────────────────────────────────────────
class ResearchState(TypedDict):
messages: Annotated[list, add_messages]
# ── Agent node ────────────────────────────────────────────────────
def agent_node(state: ResearchState):
response = llm.invoke(state["messages"])
return {"messages": [response]}
# ── Graph assembly ────────────────────────────────────────────────
tool_node = ToolNode(tools)
builder = StateGraph(ResearchState)
builder.add_node("agent", agent_node)
builder.add_node("tools", tool_node)
builder.set_entry_point("agent")
builder.add_conditional_edges("agent", tools_condition)
builder.add_edge("tools", "agent")
graph = builder.compile()
# ── Run ───────────────────────────────────────────────────────────
result = graph.invoke(
{"messages": [HumanMessage(
content="Research NVIDIA (NVDA): get the current price and recent news. Then summarize."
)]},
config={"recursion_limit": 10}
)
for msg in result["messages"]:
print(f"[{msg.__class__.__name__}] {msg.content[:120]}")
What happens at runtime:
1. The agent calls the LLM with the user's question.
2. The LLM returns an AIMessage with two simultaneous tool_calls: get_stock_price(ticker="NVDA") and web_search(query="NVIDIA recent news").
3. ToolNode executes both calls in parallel and adds two ToolMessage results.
4. The agent calls the LLM again with the full updated state. The LLM now calls summarize_findings with the combined results.
5. ToolNode runs the summary tool.
6. The agent calls the LLM one final time. The model produces a direct answer with no more tool calls, and tools_condition routes to END.

Because the two tool_calls arrive in a single AIMessage (step 2), the stock price and news search run simultaneously rather than back to back, shaving roughly a second off the total latency.
LangChain Tools Hub: Prebuilt Tools You Can Use Today
You don't need to build every tool from scratch. LangChain's langchain_community package ships dozens of production-ready tool integrations:
# Tavily web search (recommended for agents)
from langchain_community.tools.tavily_search import TavilySearchResults
search = TavilySearchResults(max_results=3)
# Python REPL for code execution
from langchain_experimental.tools import PythonREPLTool
repl = PythonREPLTool()
# Wikipedia lookup
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
wiki = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
# DuckDuckGo search (no API key required)
from langchain_community.tools import DuckDuckGoSearchRun
ddg = DuckDuckGoSearchRun()
# Combine prebuilt and custom tools seamlessly
all_tools = [search, repl, wiki, get_stock_price]
llm_with_tools = ChatOpenAI(model="gpt-4o").bind_tools(all_tools)
tool_node = ToolNode(all_tools)
Prebuilt tools follow the same @tool-like interface (they expose .name, .description, and .args_schema), so they drop into a ToolNode without any adaptation. Mix them freely with your custom @tool functions.
For a full deep-dive on building LangChain applications with memory and chains, see LangChain Development Guide.
Lessons Learned
1. Docstrings are the model's tool selection logic. A vague docstring like "Get price" will lead the model to call the wrong tool or skip it entirely. Write one-sentence-clear descriptions: what it does, what it takes, what it returns, and when not to use it.
2. Don't let exceptions crash your graph silently. The default ToolNode catches exceptions and returns an error string, but you should handle errors inside the tool body too. Return a structured error message (e.g., "ERROR: Ticker NFLXX not found") so the model can decide to retry with a corrected argument rather than hallucinating a response.
3. Parallel tool calls are the cheapest latency optimization. If you find your agent doing two tool calls across two separate turns where the calls are independent, that's a prompt design problem, not a LangGraph limitation. Instruct the model in the system prompt to batch independent lookups into a single step.
4. Don't use AgentExecutor for new projects. LangGraph's graph-based architecture gives you full observability (every state transition is a traceable checkpoint), controllable loops (recursion limit, human-in-the-loop), and composability (subgraphs, parallel branches). AgentExecutor is a black box by comparison.
5. Test tools independently before wiring them into the graph. Call your @tool-decorated function directly in a unit test to verify its output shape before the LLM ever sees it. A tool that returns a dict when the model expects a string is a silent failure source.
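A minimal sketch of such a unit test, mirroring the mock price logic from earlier in plain Python for illustration (with a real @tool-decorated function, you would call get_stock_price.invoke({"ticker": "nvda"}) instead):

```python
def get_stock_price(ticker: str) -> str:
    """Mock tool body matching the earlier example."""
    prices = {"NVDA": "118.42", "AAPL": "213.07", "TSLA": "172.30"}
    price = prices.get(ticker.upper(), "unknown")
    return f"{ticker.upper()} is currently trading at ${price}"

def test_known_ticker_returns_string():
    result = get_stock_price("nvda")
    assert isinstance(result, str)     # the model expects a string, not a dict
    assert "NVDA" in result and "118.42" in result

def test_unknown_ticker_is_graceful():
    # Unknown tickers should degrade gracefully, never raise
    assert "unknown" in get_stock_price("ZZZZ")

test_known_ticker_returns_string()
test_unknown_ticker_is_graceful()
print("all tool tests passed")
```

Verifying the output shape before the LLM ever sees it catches the silent failure mode described above at the cheapest possible point.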
TLDR: Summary and Key Takeaways
- The capability gap is real: LLMs are frozen at training cutoff; tools give them runtime access to live systems.
- @tool is your entry point: Decorate any Python function; the docstring becomes the model's tool description; type hints become the argument schema.
- bind_tools() is LLM-agnostic: The same tool list works with OpenAI, Anthropic, Groq, and Ollama; swap the model class without touching your tools or graph.
- ToolNode handles the heavy lifting: It dispatches tool calls, executes them in parallel when the LLM emits multiple calls in a single step, and threads results back into state as ToolMessage objects.
- tools_condition creates the loop: Route to ToolNode when tool calls are present; route to END when the model is ready to respond directly.
- Guard against failure modes: Set recursion_limit, handle exceptions inside tool bodies, and validate arguments with Pydantic args_schema.
- The memorable rule: An LLM without tools is a knowledgeable advisor who has never left the library. With tools, it becomes an agent that picks up the phone.
Practice Quiz

1. What does the @tool decorator use to generate the tool's JSON argument schema?
- A) The function's return type annotation
- B) The function's parameter type hints
- C) A separate schema= keyword argument
- D) The tool's docstring

Correct Answer: B

2. Your LangGraph agent calls get_stock_price and web_search in the same AIMessage. How does ToolNode execute them by default?
- A) Sequentially, in the order they appear in tool_calls
- B) Randomly, depending on the Python GIL scheduler
- C) In parallel, using a thread pool
- D) Only the first call runs; the second is queued for the next turn

Correct Answer: C

3. You set recursion_limit=10 in the graph config, but your agent keeps calling a failing tool. After how many total node invocations will LangGraph stop the graph?
- A) After 10 tool calls specifically
- B) After 10 total node invocations across the entire graph
- C) After 10 round trips between the agent node and ToolNode
- D) Never; recursion_limit only affects subgraphs

Correct Answer: B

4. (Open-ended) You're building an agent that fetches data from five different APIs, but only two of those APIs are relevant for any given user query. How would you design the tool definitions, the system prompt, and the graph routing so the agent consistently picks the right two tools without calling the unnecessary three? Consider the trade-offs between tool description clarity, schema constraints, and graph-level filtering.
Related Posts
- LangChain Development Guide - The LangChain foundation: chains, memory, and prompts before you reach LangGraph
- AI Agents Explained: When LLMs Start Using Tools - Conceptual grounding for why agents need tools and how the ReAct loop works
- Multi-Step AI Agents: The Power of Planning - How agents decompose complex tasks and chain tool calls across multiple reasoning steps