From LangChain to LangGraph: When Agents Need State Machines
LangChain's AgentExecutor handles simple loops — but stateful branching, long-running tasks, and human-in-the-loop require LangGraph's graph model.
Abstract Algorithms

TLDR: LangChain's `AgentExecutor` is a solid starting point — but it has five hard limits (no branching, no pause/resume, no parallelism, no human-in-the-loop, no crash recovery). LangGraph replaces the implicit loop with an explicit graph, unlocking every one of those capabilities. This post explains the mental model, compares both approaches in code, and helps you decide when to upgrade.
📖 When a Working Agent Hits a Wall
You built a LangChain agent. It queries a knowledge base, calls a few tools, and returns useful answers. It works. Then the product requirements arrive.
"We need it to ask for user approval before it sends the email."
You patch it with a flag. Then the next requirement:
"If the search tool returns nothing useful, fall back to a different source."
You add a try/except. Then:
"We need it to research two sources at the same time instead of sequentially."
Now you are fighting the framework. The agent loop in AgentExecutor was never designed for any of these patterns. You are bolting conditional logic onto a linear pipe and discovering what every team eventually discovers: a loop is not a graph.
This post is the bridge between the two halves of the "Agentic AI with LangGraph" series. You understand LangChain — chains, tools, agents, the ReAct loop. Now you are going to understand why LangGraph exists, where the boundaries are, and how to move from one mental model to the other.
🔍 Five Places Where AgentExecutor Runs Out of Road
AgentExecutor implements a ReAct loop: reason, act (pick a tool), observe (read the output), repeat. That loop covers a large fraction of real-world agent tasks — but five patterns break it reliably.
| Breaking Point | What You Wanted | What AgentExecutor Does |
| --- | --- | --- |
| Conditional branching | "If tool A returns an error, try tool B instead" | Runs tools in the LLM's chosen order; the loop cannot fork |
| Pause and resume | "Wait for a human to approve before continuing" | Has no concept of mid-execution suspension |
| Parallel execution | "Run the web search and the database lookup at the same time" | All tool calls are sequential — one finishes before the next starts |
| Human-in-the-loop | "Show the draft to the user and incorporate their edits" | The loop runs to completion autonomously; no external injection point |
| Durable persistence | "If the server restarts mid-run, pick up where we left off" | State lives only in RAM for the duration of run() |
None of these is a bug in LangChain. AgentExecutor was designed as a general-purpose, easy-to-use agent loop. It handles the 80% case elegantly. The other 20% — stateful, branching, long-running workflows — requires a fundamentally different execution model.
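Stripped of framework code, the ReAct loop that AgentExecutor implements is essentially a while-loop. A minimal stdlib-only sketch, where `llm_decide` and the tool table are hypothetical stand-ins rather than LangChain APIs:

```python
# Minimal sketch of the ReAct loop: reason -> act -> observe -> repeat.
# llm_decide is a hypothetical stand-in for the LLM's reasoning step.

def llm_decide(scratchpad: list) -> dict:
    """Pretend-LLM: call the search tool once, then finish."""
    if not scratchpad:
        return {"action": "search_web", "input": "climate change"}
    return {"action": "finish", "output": "Final answer based on observations."}

TOOLS = {"search_web": lambda q: f"[results for {q}]"}

def react_loop(user_input: str, max_steps: int = 5) -> str:
    scratchpad = []  # grows with each (action, observation) pair
    for _ in range(max_steps):
        decision = llm_decide(scratchpad)
        if decision["action"] == "finish":
            return decision["output"]
        # Act: run the chosen tool. Observe: record its output.
        observation = TOOLS[decision["action"]](decision["input"])
        scratchpad.append((decision["action"], observation))
    return "Stopped: max steps reached."

print(react_loop("Research climate change."))
```

The five limits are visible right in this shape: execution is strictly sequential, there is no place to suspend and resume, and the only control flow available is "loop again or finish."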
⚙️ The State Machine Mental Model: From Recipes to Flowcharts
The key mental shift is this:
LangChain chain = a recipe. Execute step 1, then step 2, then step 3. Perfect for linear transformations.
LangGraph StateGraph = a flowchart. Start at a node, evaluate where to go next, possibly loop back, possibly branch, eventually reach a terminal state.
A state machine has three ingredients: nodes (discrete states or processing steps), edges (transitions between nodes), and shared state (the data all nodes can read and write). Every node receives the current state, does work, and returns partial updates. Edges can be unconditional ("always go to node B after node A") or conditional ("look at the state and decide which node to go to next").
This is not an exotic computer-science concept. It is what you already draw on a whiteboard when planning complex workflows. LangGraph turns that whiteboard diagram into executable Python.
| Concept | LangGraph Term | Everyday Analogy |
| --- | --- | --- |
| Discrete processing step | Node (Python function) | A team member doing their part |
| Shared memory everyone reads | State (TypedDict) | The whiteboard the whole team writes on |
| Always-on transition | Normal edge | "When Alice is done, always pass to Bob" |
| Decision point | Conditional edge | "If the report is approved, go to Bob; if rejected, go back to Alice" |
| Task completion | END constant | The team signs off the whiteboard |
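The three ingredients can be shown with no framework at all. A sketch in plain Python, with node names mirroring the analogy table above (illustrative only, not LangGraph APIs):

```python
# A state machine in plain Python: nodes, edges, shared state.

state = {"report": "", "approved": False, "attempts": 0}

def alice(s):                        # drafting step
    n = s["attempts"] + 1
    return {"report": f"draft v{n}", "attempts": n}

def reviewer(s):                     # decision point: approves on 2nd attempt
    return {"approved": s["attempts"] >= 2}

def bob(s):                          # final step
    return {"report": s["report"] + " (finalized)"}

def next_node(current, s):           # the edges, as a routing function
    if current == "alice":
        return "reviewer"            # normal edge: always Alice -> reviewer
    if current == "reviewer":
        return "bob" if s["approved"] else "alice"   # conditional edge
    return "END"

nodes = {"alice": alice, "reviewer": reviewer, "bob": bob}
node = "alice"
while node != "END":
    state.update(nodes[node](state))   # merge partial update into shared state
    node = next_node(node, state)

print(state["report"])   # → draft v2 (finalized)
```

Every node returns only the fields it touched, and the router decides where to go next by reading the merged state — exactly the contract LangGraph formalizes.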
📊 Visualizing the Contrast: Linear Loop vs. Branching Graph
The diagrams below show the same "research and draft an email" task implemented in each model.
```mermaid
graph TD
    subgraph LC["🔗 LangChain AgentExecutor — Linear ReAct Loop"]
        direction TB
        A1([User Input]) --> B1[LLM Reasons]
        B1 --> C1{Tool Needed?}
        C1 -->|Yes| D1[Call Tool]
        D1 --> B1
        C1 -->|No| E1([Final Output])
    end
    style A1 fill:#e3f2fd,stroke:#1976D2,color:#000
    style E1 fill:#e8f5e9,stroke:#388E3C,color:#000
    style B1 fill:#fff8e1,stroke:#F9A825,color:#000
    style C1 fill:#fce4ec,stroke:#C62828,color:#000
    style D1 fill:#f3e5f5,stroke:#6A1B9A,color:#000
```

```mermaid
graph TD
    subgraph LG["🕸️ LangGraph StateGraph — Explicit Branching Graph"]
        direction TB
        S([START]) --> R["router_node<br/>Decide: research or draft?"]
        R -->|needs_research| T["research_node<br/>Search sources"]
        R -->|ready_to_draft| W["draft_node<br/>Write email draft"]
        T --> H{"human_approval_node<br/>Approve or reject?"}
        H -->|approved| W
        H -->|rejected| T
        W --> E([END])
    end
    style S fill:#e3f2fd,stroke:#1976D2,color:#000
    style E fill:#e8f5e9,stroke:#388E3C,color:#000
    style R fill:#fff8e1,stroke:#F9A825,color:#000
    style T fill:#f3e5f5,stroke:#6A1B9A,color:#000
    style H fill:#fce4ec,stroke:#C62828,color:#000
    style W fill:#e8f5e9,stroke:#2E7D32,color:#000
```
The top diagram has one decision point — "tool needed?" — and no branching between tools. The bottom diagram has named nodes for each responsibility, a conditional edge that branches between research and drafting based on state, and a human approval node that can loop back to research if the draft is rejected. Neither diagram is inherently "better" — the LangGraph graph is more code and more setup, but it handles requirements the linear loop structurally cannot.
🧠 Deep Dive: LangGraph's Graph Primitives vs. LangChain's Chain Abstraction
LangChain's core abstraction is the Runnable — a composable unit with invoke, stream, and batch methods. Chains are built by piping Runnables together: prompt | llm | parser. This is expressive and concise for sequential pipelines.
LangGraph's core abstraction is the StateGraph — a directed graph where nodes are Python functions and edges encode the routing logic. The key differences from a LangChain chain:
| LangChain chain | LangGraph StateGraph |
| --- | --- |
| `prompt \| llm \| parser` | `add_node()` + `add_edge()` + `compile()` |
| Implicit data flow | Explicit typed State dict |
| Linear execution | Conditional/cyclic execution |
| Stateless between calls | Persistent state per run (+ checkpoints) |
| No suspend/resume | `interrupt()` for mid-run pauses |
The StateGraph requires you to declare your state schema upfront as a TypedDict. Every node receives that schema as input and returns a partial update — only the fields it touched. LangGraph merges those updates into the running state before routing to the next node.
```python
import operator
from typing import Annotated
from typing_extensions import TypedDict

class EmailAgentState(TypedDict):
    topic: str                                  # set once by the caller
    research_results: str                       # written by research_node
    draft: str                                  # written by draft_node
    human_feedback: str                         # written by human_approval_node
    approved: bool                              # routing flag
    retry_count: Annotated[int, operator.add]   # accumulates with operator.add
```
The Annotated[int, operator.add] on retry_count is LangGraph's reducer syntax. Rather than overwriting the field, each node's partial update is added to the existing value. This is how messages lists accumulate across steps without any bookkeeping code.
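To see exactly what the reducer annotation changes, here is a stdlib-only simulation of the merge step. This reproduces the documented semantics, not LangGraph's internal code:

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

class State(TypedDict):
    draft: str                                  # plain field: overwritten
    retry_count: Annotated[int, operator.add]   # reducer field: accumulated

def merge(state: dict, update: dict) -> dict:
    """Apply one node's partial update, honoring per-field reducers."""
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        meta = getattr(hints[key], "__metadata__", ())  # (operator.add,) or ()
        merged[key] = meta[0](state[key], value) if meta else value
    return merged

s = {"draft": "v1", "retry_count": 0}
s = merge(s, {"draft": "v2", "retry_count": 1})   # draft overwritten, count += 1
s = merge(s, {"retry_count": 1})                  # count += 1 again
print(s)   # → {'draft': 'v2', 'retry_count': 2}
```

The same mechanism with `Annotated[list, operator.add]` is what makes `messages` lists append rather than replace.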
The Internals: Superstep Execution and State Merging
When you call StateGraph.compile(), LangGraph validates the graph (checks for disconnected nodes, unreachable END, missing entry points) and produces a Pregel-style executor. At each superstep, the executor:
- Identifies which node(s) are currently active.
- Calls each active node with the current full state dict.
- Collects the partial-update dicts returned by each node.
- Merges each field using its annotated reducer — `operator.add` appends; unannotated fields overwrite.
- Evaluates all conditional edges against the newly merged state to determine the next active node(s).
- Serializes the merged state to the attached checkpointer (if any) before the next superstep begins.
This merge-before-route guarantee means conditional edges always see a fully consistent state — there is no race condition between a node writing a field and the router reading it, even when parallel branches (via Send) write different fields in the same superstep.
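The merge-before-route discipline can be sketched in a few lines. This is a deliberate simplification of the Pregel model, not LangGraph's actual executor:

```python
import operator

def superstep(state, active_nodes, reducers):
    """One superstep: run every active node, merge ALL updates, then route."""
    updates = [node(state) for node in active_nodes]   # run active nodes
    merged = dict(state)
    for update in updates:                             # merge via reducers
        for key, value in update.items():
            reduce = reducers.get(key)
            merged[key] = reduce(merged[key], value) if reduce else value
    return merged   # routers only ever see this fully merged state

# Two parallel branches write different fields in the same superstep.
web = lambda s: {"sources": ["web hit"]}
db  = lambda s: {"sources": ["db row"], "db_done": True}

state = {"sources": [], "db_done": False}
state = superstep(state, [web, db], reducers={"sources": operator.add})
print(state)   # → {'sources': ['web hit', 'db row'], 'db_done': True}
```

Because routing only happens after the merge, a conditional edge can never observe a half-written state, even with parallel branches.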
Performance Considerations: When Graph Overhead Is Worth It
LangGraph adds measurable overhead compared to a direct LCEL chain:
| Operation | LCEL Chain | LangGraph StateGraph |
| --- | --- | --- |
| State serialization | None | After each node (JSON) |
| Routing evaluation | None | Conditional function call per superstep |
| Graph compilation | None (no compile step) | One-time at startup (~5–20 ms) |
| Checkpoint persistence | None | Optional; negligible for text state |
For the typical agent (state = text + a few flags), serialization adds under a millisecond per superstep. The overhead only becomes material when state carries large payloads (raw document text, embeddings). The mitigation: store large payloads externally (vector DB, object storage) and keep only a short reference key in the state dict.
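Here is that mitigation in miniature: state carries only a short key, and the payload lives elsewhere. The in-memory `DOC_STORE` is a hypothetical stand-in for a vector DB or object storage:

```python
# Keep large payloads out of graph state: store a reference key instead.
import uuid

DOC_STORE = {}   # hypothetical stand-in for a vector DB / object storage

def put_document(text: str) -> str:
    """Store the payload externally and return a short key."""
    key = str(uuid.uuid4())
    DOC_STORE[key] = text
    return key

# The checkpointed state stays tiny and cheap to serialize:
doc_key = put_document("...200 pages of raw document text...")
state = {"topic": "quarterly report", "doc_ref": doc_key}

def summarize_node(state: dict) -> dict:
    """A node dereferences the key only when it needs the payload."""
    text = DOC_STORE[state["doc_ref"]]
    return {"summary": text[:20] + "..."}   # illustrative stub
```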
The practical rule: compile once at startup, reuse the compiled graph object. Never compile inside a request handler — the validation pass runs every time and adds tens of milliseconds unnecessarily.
🧪 The Same Task, Two Ways: Research + Draft an Email
The LangChain AgentExecutor Version
```python
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.tools import tool
from langchain import hub

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

@tool
def search_web(query: str) -> str:
    """Search the web for information on a topic."""
    # Replace with a real search integration (Tavily, Serper, etc.)
    return f"[Search results for: {query}] — Key facts retrieved."

@tool
def send_email(recipient: str, subject: str, body: str) -> str:
    """Send an email. USE THIS ONLY AFTER RESEARCH IS COMPLETE."""
    # In reality: sends the email. Here: simulates it.
    return f"Email sent to {recipient} with subject '{subject}'."

tools = [search_web, send_email]
prompt = hub.pull("hwchase17/react")  # standard ReAct prompt
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# This works — but you cannot pause for approval before send_email runs,
# cannot branch to a fallback if search_web returns empty,
# and cannot recover if the process crashes mid-run.
result = executor.invoke({
    "input": "Research climate change impacts and draft an email to the team."
})
print(result["output"])
```
This is clean and gets the job done. The LLM will call search_web, read the results, then decide to call send_email. But notice: there is no point in this code where you can intercept the agent before it sends the email. The loop runs to completion. Adding an approval step means restructuring the entire calling code.
The LangGraph Version — With Human Approval and Retry Logic
```python
import operator
from typing import Annotated
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# ── 1. Typed state schema ────────────────────────────────────────────────────
class EmailAgentState(TypedDict):
    topic: str
    research_results: str
    draft: str
    human_feedback: str
    approved: bool
    retry_count: Annotated[int, operator.add]

# ── 2. Nodes — each is a plain Python function ───────────────────────────────
def research_node(state: EmailAgentState) -> dict:
    """Fetch research on the topic. Falls back gracefully if results are thin."""
    response = llm.invoke([
        HumanMessage(content=f"Summarize key facts about: {state['topic']}")
    ])
    return {"research_results": response.content}

def draft_node(state: EmailAgentState) -> dict:
    """Draft the email body using research results."""
    context = state["research_results"]
    feedback = state.get("human_feedback", "")
    prompt = f"Write a professional team email about {state['topic']}.\nResearch: {context}"
    if feedback:
        prompt += f"\nPrevious feedback to incorporate: {feedback}"
    response = llm.invoke([HumanMessage(content=prompt)])
    return {"draft": response.content, "approved": False}

def human_approval_node(state: EmailAgentState) -> dict:
    """
    In production: suspend here and wait for a real human API call.
    For this demo: auto-approve after the first retry to avoid an infinite loop.
    """
    if state["retry_count"] > 0:
        # Simulate approval after one retry
        return {"approved": True, "human_feedback": ""}
    # Simulate a rejection on first pass with feedback
    return {
        "approved": False,
        "human_feedback": "Please make the tone more concise.",
        "retry_count": 1,  # operator.add accumulates this
    }

def send_email_node(state: EmailAgentState) -> dict:
    """Send the approved draft — only reachable after human_approval_node approves."""
    print(f"\n✅ Email sent!\n\n{state['draft']}")
    return {}

# ── 3. Routing functions ─────────────────────────────────────────────────────
def route_after_approval(state: EmailAgentState) -> str:
    """Branch on approval flag — either send or loop back to drafting."""
    return "send_email_node" if state["approved"] else "draft_node"

# ── 4. Build the graph ───────────────────────────────────────────────────────
builder = StateGraph(EmailAgentState)
builder.add_node("research_node", research_node)
builder.add_node("draft_node", draft_node)
builder.add_node("human_approval_node", human_approval_node)
builder.add_node("send_email_node", send_email_node)

builder.set_entry_point("research_node")
builder.add_edge("research_node", "draft_node")
builder.add_edge("draft_node", "human_approval_node")
builder.add_conditional_edges(
    "human_approval_node",
    route_after_approval,
    {"send_email_node": "send_email_node", "draft_node": "draft_node"},
)
builder.add_edge("send_email_node", END)

# ── 5. Compile with a checkpointer for crash recovery ────────────────────────
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)

# ── 6. Run it ────────────────────────────────────────────────────────────────
config = {"configurable": {"thread_id": "email-run-001"}}
result = graph.invoke(
    {"topic": "climate change team update", "retry_count": 0,
     "research_results": "", "draft": "", "human_feedback": "", "approved": False},
    config=config,
)
```
This is more code — but every requirement that broke the AgentExecutor version is now handled natively:
- Conditional branching: `route_after_approval` reads state and branches to either send or redraft.
- Human-in-the-loop: `human_approval_node` is the interrupt point; in production, swap the stub for a real `interrupt()` call that suspends the graph.
- Retry logic: `human_approval_node` → `draft_node` → `human_approval_node` is a proper loop with a `retry_count` guard.
- Crash recovery: `MemorySaver` persists the full state after each node; swap it for a `SqliteSaver` or `PostgresSaver` for production durability.
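What the production `interrupt()` swap buys you can be simulated with nothing but a checkpoint dict. LangGraph's real `interrupt()`/`Command` API differs in detail; this stdlib-only sketch only shows the control-flow shape of suspend-and-resume:

```python
# CHECKPOINTS stands in for a checkpointer (MemorySaver, SqliteSaver, ...);
# the Interrupted exception stands in for LangGraph's interrupt().
CHECKPOINTS = {}

class Interrupted(Exception):
    """Raised to suspend the run until a human responds."""

def run(thread_id, state=None, human_input=None):
    # Resume from the saved checkpoint for this thread, if one exists.
    state = CHECKPOINTS.get(thread_id, state)
    if human_input is not None:
        state["human_feedback"] = human_input
    if "human_feedback" not in state:
        CHECKPOINTS[thread_id] = state        # persist state, then suspend
        raise Interrupted(f"awaiting approval for {thread_id!r}")
    return f"sent: {state['draft']} (feedback: {state['human_feedback']})"

try:
    run("email-run-001", {"draft": "Team update v1"})
except Interrupted as e:
    print(e)   # run is suspended; the process could even restart here

# Later — resume the same thread with the human's response:
print(run("email-run-001", human_input="approved"))
# → sent: Team update v1 (feedback: approved)
```

The key property: because state is persisted before the suspension, "later" can be hours away and on a different process, which is exactly what a durable checkpointer enables.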
🌍 Real-World Applications: Where the Graph Model Earns Its Complexity
The recipe-vs-flowchart analogy holds in practice. Consider three production patterns that LangGraph handles and AgentExecutor cannot:
Document approval pipelines — A legal review agent drafts a clause, routes it through compliance review, routes approved clauses to the final document, and loops rejected clauses back for revision. Each "route to compliance vs. route back for revision" decision is a conditional edge reading a review_status field in state.
Multi-source research with fallback — A researcher first queries an internal database node. A conditional edge checks results_found; if false, it routes to a web search node. Both paths converge at a synthesis node. Fallback logic that AgentExecutor cannot express inside its loop is a single conditional edge in LangGraph.
Long-running batch processing with checkpointing — An agent processes 200 documents one by one, writing progress to state after each. If the server restarts at document 147, the graph resumes from the last checkpoint — no re-processing, no data loss.
⚖️ Trade-offs and Failure Modes: When LangGraph Adds Overhead and When It Doesn't
A common misconception: "LangGraph replaces LangChain." It does not. LangGraph is built on top of LangChain. LCEL chains run inside LangGraph nodes. Every tool decorator, every ChatOpenAI wrapper, every output parser you already know is still in use.
The relationship is:
```
LangGraph StateGraph
├── Node: research_node
│   └── LangChain LCEL: prompt | llm | StrOutputParser()
├── Node: draft_node
│   └── LangChain LCEL: draft_prompt | llm | StrOutputParser()
└── Node: send_email_node
    └── LangChain Tool: @tool send_email(...)
```
LangChain handles what each node does. LangGraph handles when each node runs and where execution goes next. They are complementary layers.
Failure modes to design against when adopting LangGraph:
| Failure Mode | What Goes Wrong | Mitigation |
| --- | --- | --- |
| Overengineering | Wrapping a single LLM call in a four-node graph | Use LCEL for stateless transformations; add LangGraph only when you need loops or branches |
| Infinite loop | A conditional edge never routes to END | Add a retry_count field; force-terminate after N iterations |
| State explosion | Appending every intermediate result bloats state | Design field lifetimes; clear working fields between phases |
| Schema migration pain | Removing a state field breaks saved checkpoints | Treat TypedDict like a DB schema — add carefully, remove with a migration plan |
| Cold-start cost | compile() re-runs on every request | Compile once at startup; reuse the compiled graph object |
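The infinite-loop mitigation from the table, as code: a routing function with a retry ceiling. Field names match the `EmailAgentState` example above; in a real graph, the `"END"` string would be mapped to the `END` constant in the `add_conditional_edges` mapping:

```python
# Routing function with an exit guard: forces termination after MAX_RETRIES.
MAX_RETRIES = 3

def route_after_approval(state: dict) -> str:
    if state["approved"]:
        return "send_email_node"
    if state["retry_count"] >= MAX_RETRIES:
        return "END"   # give up instead of looping forever
    return "draft_node"

assert route_after_approval({"approved": True, "retry_count": 0}) == "send_email_node"
assert route_after_approval({"approved": False, "retry_count": 1}) == "draft_node"
assert route_after_approval({"approved": False, "retry_count": 3}) == "END"
```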
Reach for LangGraph only when you need the graph layer — not as a wholesale replacement for LCEL.
🧭 Decision Guide: Which Tool for Which Task?
| Scenario | Recommended Approach |
| --- | --- |
| Simple Q&A, single LLM call | LangChain LCEL (`prompt \| llm \| parser`) |
| Multi-step pipeline, no branching | LangChain LCEL with chained steps |
| Multi-turn chatbot with memory | LangChain + `RunnableWithMessageHistory` |
| Tool-using agent, 2–5 tools, linear | LangChain AgentExecutor |
| Conditional branching on tool results | LangGraph |
| Human approval / interrupt and resume | LangGraph |
| Retry logic with state across attempts | LangGraph |
| Long-running tasks needing checkpointing | LangGraph |
| Multi-agent coordination | LangGraph |
The trigger: the moment your agent needs to decide where to go next based on what it just learned, you have outgrown a pipeline and need a graph.
🛠️ LangGraph: The Open-Source Project and Its Architecture
LangGraph is an open-source Python library from LangChain, Inc., released under the MIT license.
- GitHub: github.com/langchain-ai/langgraph
- Install: `pip install langgraph langchain-openai`
- Current stable: 0.2.x (check PyPI before pinning)
The architecture has three layers:
| Layer | What It Does | Key Class/API |
| --- | --- | --- |
| Graph definition | Declare nodes, edges, state schema | StateGraph, TypedDict |
| Checkpointing | Persist state after each step | MemorySaver, SqliteSaver, PostgresSaver |
| Platform (optional) | Hosted runtime, REST API, visual debugger | LangGraph Platform (managed) |
The OSS langgraph package is everything you need for local development and self-hosted production. LangGraph Platform adds the managed deployment layer — useful when you need a hosted interrupt API, a web-based graph debugger, or auto-scaling, but entirely optional.
Under the hood, LangGraph's executor uses a superstep model inspired by Google's Pregel framework: each step activates one or more nodes, collects their partial state updates, merges them via reducers, evaluates conditional edges, and advances to the next superstep. This gives every node a clean, consistent view of state and makes the execution reproducible from any saved checkpoint.
For a full hands-on introduction to LangGraph's primitives, see LangGraph 101: Building Your First Stateful Agent.
📚 Lessons Learned from Making the Switch
Teams migrating from AgentExecutor to LangGraph consistently surface the same lessons:
Don't graph everything. If a task is genuinely linear — prompt, tool, answer — keep it in LCEL. Adding LangGraph overhead to a three-node linear chain solves no problem and adds indirection.
Name your nodes after responsibilities, not actions. `human_approval_node` is better than `check_node`. When you read the graph visualisation six months later, you will thank yourself.

Every loop needs an exit guard. A `retry_count` field in state with a ceiling checked by the conditional edge is the minimum viable protection against an infinite loop when LLM outputs are inconsistent.

State schema changes are migrations. Once you have saved checkpoints in production, removing or renaming a state field is a breaking change. Treat the `TypedDict` like a database schema — add carefully, remove with a migration plan.

Compile once, invoke many. Call `builder.compile()` at application startup and reuse the compiled graph object. Re-compiling on every request runs graph validation again and adds unnecessary latency.
📌 TLDR: Summary & Key Takeaways
- AgentExecutor covers the 80% case — a linear ReAct loop is right for most simple tool-using agents.
- Five patterns break AgentExecutor: conditional branching, pause/resume, parallel execution, human-in-the-loop, and durable persistence.
- LangGraph replaces the implicit loop with an explicit graph: named nodes, typed shared state, and conditional edges that you control.
- The mental model shift: from a recipe (execute steps in order) to a flowchart (evaluate state, decide where to go next).
- LangGraph uses LangChain — LCEL chains live inside LangGraph nodes; the two are complementary, not competing.
- Decision trigger: the moment your agent needs to decide its next step based on runtime output, you need a graph.
- Start small: migrate the one workflow that is already fighting AgentExecutor, not your entire codebase at once.
One-liner to remember: If your agent draws a linear arrow, use LangChain. If it draws a diamond, use LangGraph.
🔭 What's Next in the "Agentic AI with LangGraph" Series
This post is the conceptual bridge. Every post that follows dives into a specific LangGraph capability:
| Post | What You'll Build |
| --- | --- |
| LangGraph 101: Building Your First Stateful Agent | StateGraph, typed state, nodes, edges — your first runnable agent |
| LangGraph ReAct Agent Pattern | Replicate AgentExecutor's ReAct loop inside a LangGraph graph |
| LangGraph Tool Calling: ToolNode and Custom Tools | ToolNode, bind_tools(), and writing custom tool nodes |
| Human-in-the-Loop Workflows with LangGraph | interrupt(), Command, update_state() — pause and resume |
| LangGraph Memory and State Persistence | MemorySaver, SqliteSaver, cross-session memory |
| Streaming Agent Responses with LangGraph | stream(), astream(), token-level streaming to UIs |
| Multi-Agent Supervisor Pattern in LangGraph | Supervisor + specialist sub-agents wired as a graph |
| LangGraph Deployment: LangServe and Production | Deploy your graph as a REST API with LangGraph Platform |
Read them in order for the full progression, or jump directly to the capability you need.
📝 Practice Quiz
1. Which of the following is a genuine structural limitation of `AgentExecutor` that LangGraph is designed to solve?
   - A) It cannot call more than five tools in a single run
   - B) It cannot pause mid-run to wait for human approval
   - C) It requires OpenAI and does not work with other LLM providers

   Correct Answer: B

2. A developer wants to build an agent that calls a search tool, shows the results to a user for approval, incorporates their feedback, and then drafts a report. Which LangGraph feature makes the human approval step possible?
   - A) `operator.add` reducer on the messages field
   - B) `add_conditional_edges()` routing to an `END` node
   - C) A checkpointer that persists state, enabling mid-run suspension and resume

   Correct Answer: C

3. In LangGraph, the `Annotated[int, operator.add]` type annotation on a state field means:
   - A) The field is read-only and cannot be updated by any node
   - B) Each node's partial update is added to the existing value instead of overwriting it
   - C) The field is automatically incremented by LangGraph after every node call

   Correct Answer: B

4. (Open-ended — no single correct answer) You have a LangGraph agent that loops between a `draft_node` and a `human_approval_node` until a draft is approved. What is the most important safety measure to add to the state schema, and why?

   Correct Answer: Add a `retry_count` integer field (accumulated with `operator.add`) and a conditional edge that routes to `END` (with a failure message) once the count exceeds a maximum. Without a ceiling, an LLM that consistently produces rejected drafts will spin the graph indefinitely, consuming tokens and compute without ever terminating.
🔗 Related Posts
- LangChain Development Guide: Chains, Tools, and Agents in Practice — The LangChain building blocks this post assumes you know: LCEL chains, prompts, output parsers, memory, and AgentExecutor.
- AI Agents Explained: When LLMs Start Using Tools — A deep dive into tool-calling agents, the ReAct loop, and why the loop model has limits — the direct precursor to this post.
- LangGraph 101: Building Your First Stateful Agent — The next post in the series: hands-on `StateGraph`, TypedDict state, nodes, and edges from scratch.
