LangGraph 101: Building Your First Stateful Agent
Learn LangGraph from scratch: StateGraph, typed state, nodes, edges, and your first LLM-agnostic stateful agent.
Abstract Algorithms
TLDR: LangGraph adds state, branching, and loops to LLM chains: build stateful agents with graphs, nodes, and typed state.
The Stateless Chain Problem: Why Your Agent Forgets Everything
You built a LangChain chain that answers questions. Then you tried to make it ask a follow-up. The chain had no idea what it just said.
This is the stateless chain problem, and it trips up almost every developer who moves beyond basic LangChain tutorials. Here is what the failure looks like:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
llm = ChatOpenAI(model="gpt-4o-mini")
chain = (
    ChatPromptTemplate.from_template("Answer this: {question}")
    | llm
    | StrOutputParser()
)
answer = chain.invoke({"question": "What is photosynthesis?"})
# Works great
follow_up = chain.invoke({"question": "Can you give me an example?"})
# Fails: the chain has no idea what "it" refers to (no memory, no context)
Every invoke call is a brand-new, fully isolated conversation. The chain cannot loop back to refine an answer, cannot branch based on whether it needs to search the web, and cannot accumulate information across steps. The moment the call returns, all intermediate state evaporates.
This is not a bug in LangChain; it is a deliberate design. LCEL chains are composable pipelines optimized for single-pass transformations.
"But wait β doesn't LangChain have memory?" Yes. LangChain offers
ConversationBufferMemory,ConversationSummaryMemory, andRunnableWithMessageHistoryto persist conversation turns across calls. These solve the chat history problem well β a multi-turn chatbot that remembers what was said. See LangChain Memory: Conversation History and Summarization for the full walkthrough.But LangChain memory only tracks messages. It cannot track arbitrary typed state (a running tally, a set of retrieved documents, a retry counter). It cannot branch the execution path based on what the LLM just returned. It cannot pause, wait for human approval, then resume from where it stopped. The moment your agent needs any of those β you need a different abstraction entirely.
That is where LangGraph comes in. Rather than bolt memory onto a pipeline, LangGraph models the entire agent as a graph where a typed state object flows through nodes. Any node can read and write any part of the state. Edges can route execution conditionally. The graph can loop. And the full state is checkpointed after every step.
LangGraph was built for exactly this: shared typed state persists between steps, nodes run conditionally, and the whole graph can loop until a goal is satisfied. The same question-and-answer use case, now as a stateful LangGraph agent:
# State flows through every node; no step ever starts from a blank slate
# Nodes branch, loop, or terminate based on what earlier nodes wrote to state
# The agent "remembers" everything that happened in prior steps
That mental shift, from a linear pipe to a stateful graph, is what this post is about. By the end you will have a working research assistant agent you can run locally in under five minutes.
LangGraph's Mental Model: Graphs, State, Nodes, and Edges
Before writing any code, you need three clear ideas in your head. LangGraph is built from exactly these three concepts, nothing more.
State is a Python dictionary typed with TypedDict. Every node in the graph can read it and write partial updates to it. Think of it as the agent's shared whiteboard that persists for the entire duration of one run. You define what goes on it (messages, flags, search results, a step counter) and you control how each field evolves.
Nodes are ordinary Python functions. Each node receives the current state as input, does some work (calls an LLM, runs a search, formats a result), and returns a dictionary containing only the fields it wants to update. Nodes are completely decoupled from each other; they only interact through the shared state.
Edges are the directed connections between nodes. A normal edge always routes from node A to node B. A conditional edge calls a routing function that inspects the current state and returns the name of the next node to visit. A special END constant marks the terminal node; reaching it stops the run.
| Concept | What it is | Everyday analogy |
|---|---|---|
| StateGraph | The graph container | A whiteboard session |
| State (TypedDict) | Shared memory | Sticky notes everyone can read |
| Node | A Python function | A team member who reads and updates the notes |
| Normal edge | Always-on connection | "When Alice is done, always go to Bob" |
| Conditional edge | Decision point | "If the notes say SEARCH, go to Carol; otherwise go to Dave" |
| END | Terminal marker | The team declares the task complete |
This model is deliberately minimal. You write plain Python functions. LangGraph handles passing state between them, evaluating conditional routes, serializing checkpoints, and deciding which node runs next. There is no hidden orchestration magic to debug.
Building Blocks: TypedDict State, Nodes, and How Edges Connect Them
Let's make everything above concrete with runnable code. Here is the minimum viable LangGraph agent: a two-node graph that greets a user and then summarizes the greeting.
Step 1: Define your state schema with TypedDict
from typing import Annotated
from typing_extensions import TypedDict
import operator
class AgentState(TypedDict):
    messages: Annotated[list[str], operator.add]
    # operator.add means "append new values; never overwrite the whole list"
The Annotated[list[str], operator.add] syntax is the first thing that surprises most developers. When a node returns {"messages": ["Hello!"]}, LangGraph does not replace the entire messages list; it appends to it. This is called a reducer. Without a reducer, the default behaviour is a simple overwrite. Using operator.add means your message history accumulates automatically across every node call without any extra bookkeeping code.
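To see the reducer semantics without running a graph, here is the same merge expressed in plain Python, a standalone sketch that needs nothing but the standard library:

```python
import operator

# With a reducer annotation, LangGraph merges each node's partial update
# into the existing value by calling reducer(current_value, node_update).
current = ["Hi! How can I help you today?"]
node_update = ["[Summary]: Hi! How can I help you today?"]

with_reducer = operator.add(current, node_update)  # list + list concatenates
without_reducer = node_update                      # default behaviour: overwrite

print(with_reducer)     # both messages survive, in order
print(without_reducer)  # only the latest update survives
```

The same mechanism extends to any binary merge function: a custom callable that deduplicates, caps list length, or sums counters can stand in for operator.add.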
Step 2: Write node functions
def greet_node(state: AgentState) -> dict:
    """Add a greeting to the message list."""
    return {"messages": ["Hi! How can I help you today?"]}

def summarize_node(state: AgentState) -> dict:
    """Summarize everything that has been said so far."""
    history = " | ".join(state["messages"])
    return {"messages": [f"[Summary]: {history}"]}
Both functions share the same contract: accept a state dict, return a partial state dict. That is the entire node interface.
Step 3: Wire nodes together and compile
from langgraph.graph import StateGraph, END
builder = StateGraph(AgentState)
builder.add_node("greeter", greet_node)
builder.add_node("summarizer", summarize_node)
builder.set_entry_point("greeter") # where the graph starts
builder.add_edge("greeter", "summarizer") # always go from greeter to summarizer
builder.add_edge("summarizer", END) # then stop
graph = builder.compile() # validates the graph and produces an executable
result = graph.invoke({"messages": []})
print(result["messages"])
# ['Hi! How can I help you today?', '[Summary]: Hi! How can I help you today?']
Five method calls on the StateGraph builder. That is the complete lifecycle: add_node, set_entry_point, add_edge, compile, invoke. Every LangGraph agent, no matter how large, is composed from exactly these primitives.
Conditional edges add decision-making. The routing function returns a string key that selects the next node:
def should_search(state: AgentState) -> str:
    """Return the name of the next node based on current state."""
    if state.get("needs_search"):
        return "search_node"
    return "answer_node"

builder.add_conditional_edges(
    "router_node",   # which node makes this decision
    should_search,   # the routing function
    {
        "search_node": "search_node",  # key-to-node-name mapping
        "answer_node": "answer_node",
    },
)
The routing function is just Python; you can inspect any field in state to make the decision.
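Because the router is a plain function of state, it can be unit-tested in complete isolation, with no graph, no LLM, and no framework imports. A quick sketch (should_search is restated so the snippet runs standalone):

```python
def should_search(state: dict) -> str:
    """Return the name of the next node based on current state."""
    if state.get("needs_search"):
        return "search_node"
    return "answer_node"

# Exercise every branch of the router before wiring it into a graph.
assert should_search({"needs_search": True}) == "search_node"
assert should_search({"needs_search": False}) == "answer_node"
assert should_search({}) == "answer_node"  # a missing flag falls through safely
print("router tests passed")
```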
Deep Dive: How LangGraph Compiles and Executes Your Graph
The Internals: Pregel-Style Execution Under the Hood
When you call StateGraph.compile(), LangGraph validates the graph structure (checks for disconnected nodes, missing entry points, unreachable END) and produces an executor that runs your graph using a superstep model inspired by Google's Pregel framework.
At each superstep, LangGraph identifies the currently active nodes, runs them, collects their partial state updates, and merges those updates into the shared state dict before advancing to the next step. The merge strategy for each field is controlled by its annotated reducer: operator.add for append, or overwrite by default for unannotated fields.
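The merge phase can be illustrated in a few lines of plain Python. This is a toy model of the documented behaviour, not LangGraph's actual internals:

```python
import operator

# Fields listed here merge via their reducer; everything else is overwritten.
reducers = {"messages": operator.add}

def merge_update(state: dict, update: dict) -> dict:
    """Fold one node's partial update into the shared state."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)  # e.g. list append
        else:
            merged[key] = value  # default: the node's value simply wins
    return merged

state = {"messages": ["hello"], "needs_search": False}
state = merge_update(state, {"messages": ["world"], "needs_search": True})
print(state)  # {'messages': ['hello', 'world'], 'needs_search': True}
```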
LangGraph also serializes the full state after each superstep. This is not just for debugging; it is what enables checkpointing. Attach a MemorySaver or a database-backed checkpointer and your agent can pause mid-run, be restarted from any saved point, or support human-in-the-loop approval workflows where the graph literally suspends and waits for external input before continuing.
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)
# Each run gets a thread_id; state is persisted per thread
config = {"configurable": {"thread_id": "user-session-42"}}
result = graph.invoke({"messages": []}, config=config)
Performance Analysis: invoke() vs stream() and When to Go Async
graph.invoke() runs the entire graph to completion and returns the final state dictionary. This is the simplest calling mode β use it when you only care about the end result.
graph.stream() yields a partial state snapshot after each node completes. Use this when you want to stream intermediate progress to a UI or log what each node produced in real time:
for step in graph.stream({"messages": []}):
    node_name = list(step.keys())[0]
    print(f"Node '{node_name}' completed: {step[node_name]}")
For concurrent production workloads, switch to the async variants to avoid blocking threads:
# Async invoke
result = await graph.ainvoke({"messages": []})
# Async stream
async for step in graph.astream({"messages": []}):
print(step)
State serialization overhead is negligible for typical agent state (a few hundred bytes of text). It becomes significant only when state carries large binary payloads like raw document text or embeddings. The pattern to avoid this: store large payloads externally (object storage, vector DB) and keep only a short reference key in the state dict.
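A minimal sketch of that reference-key pattern, using an in-memory dict as a stand-in for object storage or a vector DB (the store and the doc_id scheme are illustrative, not a LangGraph API):

```python
# Stand-in for S3, GCS, or a vector DB; only short keys enter graph state.
document_store: dict[str, str] = {}

def save_document(doc_id: str, text: str) -> str:
    document_store[doc_id] = text
    return doc_id  # a few bytes in state instead of the full payload

def retrieve_node(state: dict) -> dict:
    raw_text = "pretend this is 500 KB of scraped document text..."
    return {"doc_ref": save_document("doc-001", raw_text)}

def answer_node(state: dict) -> dict:
    text = document_store[state["doc_ref"]]  # rehydrate only where needed
    return {"answer": f"Synthesized from {len(text)} characters of context"}

state = {}
state.update(retrieve_node(state))
state.update(answer_node(state))
print(state["doc_ref"], "->", state["answer"])
```

Checkpoints then serialize the short doc_ref instead of the payload, keeping per-superstep overhead flat no matter how much you retrieve.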
Your First Agent: A Research Assistant That Can Search
Now let's build the worked example. The agent takes a user question, decides whether it needs to search the web before answering, runs a search if needed, and synthesizes the final response. Here is the graph structure:
graph TD
    A([User Question]) --> B[router_node\nDecide: search needed?]
    B -->|needs_search = true| C[search_node\nFetch results]
    B -->|needs_search = false| D[answer_node\nSynthesize answer]
    C --> D
    D --> E([END])
    style A fill:#e3f2fd,stroke:#1976D2,color:#000
    style E fill:#e8f5e9,stroke:#388E3C,color:#000
    style B fill:#fff8e1,stroke:#F57F17,color:#000
    style C fill:#f3e5f5,stroke:#7B1FA2,color:#000
    style D fill:#e8f5e9,stroke:#2E7D32,color:#000
The router LLM inspects the question and sets a needs_search flag. If true, the search node runs and injects results into state. Either way, the answer node synthesizes the final response from whatever is in state.
import operator
from typing import Annotated
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, BaseMessage
from langgraph.graph import StateGraph, END
# --- 1. State schema ---
class ResearchState(TypedDict):
    messages: Annotated[list[BaseMessage], operator.add]
    search_results: str
    needs_search: bool
# --- 2. LLM (swap freely; see the LLM-agnostic section below) ---
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# --- 3. Node: decide whether to search ---
def router_node(state: ResearchState) -> dict:
    question = state["messages"][-1].content
    prompt = (
        f"Does answering this question require current web information?\n"
        f"Question: {question}\n"
        f"Reply with only: YES or NO"
    )
    response = llm.invoke([HumanMessage(content=prompt)])
    return {"needs_search": "YES" in response.content.upper()}
# --- 4. Node: run the search ---
def search_node(state: ResearchState) -> dict:
    question = state["messages"][-1].content
    # Replace this stub with TavilySearch, Serper, or Bing API in production
    results = f"[Search stub] Top results for: '{question}' - relevant facts retrieved."
    return {"search_results": results}
# --- 5. Node: synthesize the answer ---
def answer_node(state: ResearchState) -> dict:
    context_messages = list(state["messages"])
    if state.get("search_results"):
        context_messages.append(
            HumanMessage(content=f"Use this context to answer: {state['search_results']}")
        )
    context_messages.append(HumanMessage(content="Now answer the original question."))
    response = llm.invoke(context_messages)
    return {"messages": [AIMessage(content=response.content)]}
# --- 6. Routing function ---
def route_search(state: ResearchState) -> str:
    return "search_node" if state.get("needs_search") else "answer_node"
# --- 7. Build and compile the graph ---
builder = StateGraph(ResearchState)
builder.add_node("router_node", router_node)
builder.add_node("search_node", search_node)
builder.add_node("answer_node", answer_node)
builder.set_entry_point("router_node")
builder.add_conditional_edges("router_node", route_search, {
    "search_node": "search_node",
    "answer_node": "answer_node",
})
builder.add_edge("search_node", "answer_node")
builder.add_edge("answer_node", END)
graph = builder.compile()
# --- 8. Run it ---
result = graph.invoke({
    "messages": [HumanMessage(content="What is the current price of Bitcoin?")],
    "search_results": "",
    "needs_search": False,
})
print(result["messages"][-1].content)
Install dependencies with:
pip install langgraph langchain-openai langchain-core
The full route → search → synthesize cycle is expressed in under 60 lines of plain Python. Every piece of logic lives in a named function. The graph wires them together.
Real-World Applications: Where Stateful Graphs Show Up in Production
LangGraph is not a toy framework. Stateful graphs power several real production patterns:
Customer support routing: A support bot classifies incoming messages. Simple FAQs route directly to an answer node; complex issues enter a multi-turn diagnostic loop that gathers details across several exchanges before escalating or resolving. State tracks the conversation history, the issue category, and an escalation flag across every turn without external session storage.
Code generation with self-repair: An agent generates code, runs a linter node, reads the linter output from state, and conditionally loops back to the LLM to fix errors. The loop exits only when the linter passes or a maximum-attempts counter is reached. Expressing this cycle with a plain LCEL chain would require custom retry logic outside the chain; with LangGraph it is a single conditional edge.
Long-form document research: A document researcher fans out to query multiple sources in parallel (using LangGraph's Send API for map-reduce), accumulates retrieved snippets into state, deduplicates, and synthesizes a final report. The graph coordinates the fan-out and fan-in without any custom orchestration code.
Human-in-the-loop review workflows: A content agent flags borderline items and pauses the graph at a human_review node. A human approves or rejects via an API call that resumes the graph from the persisted checkpoint. This pattern (pause, external action, resume) is trivial with LangGraph's checkpointing support and essentially impossible with stateless chains.
Trade-offs and Failure Modes: When LangGraph Helps and When It Adds Complexity
LangGraph solves real problems, but adding it to every project introduces overhead that is not always justified. Here is an honest look at both sides.
When LangGraph genuinely helps:
- Your agent needs to loop until a condition is met (retry, quality check, refinement cycle).
- Your agent needs to branch based on LLM output or tool results at runtime.
- You need state to persist across service restarts, user sessions, or human approval steps.
- You are coordinating multiple specialized sub-agents as separate graph nodes.
Failure modes to design against:
| Failure Mode | What Goes Wrong | Mitigation |
|---|---|---|
| Infinite loop | A conditional edge never routes to END | Add a step_count field; force-terminate after N iterations |
| State explosion | Appending every intermediate result bloats state | Design field lifetimes; clear working fields between phases |
| Overengineering | A single LLM call wrapped in a four-node graph | Use LCEL for stateless transformations; add LangGraph only when you need loops or branches |
| Cold-start cost | compile() re-runs on every request | Compile once at startup; reuse the compiled graph object |
| Unbounded retries | A self-repair loop that keeps failing | Cap retries; write a fallback node that returns a graceful error |
The simplest heuristic: if your agent makes exactly one LLM call and returns a result, use LCEL. If it ever needs to decide what to do next, use LangGraph.
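The infinite-loop and unbounded-retry mitigations from the table fit in a few lines. A sketch with illustrative field names (step_count, lint_passed), not anything LangGraph prescribes:

```python
MAX_STEPS = 5  # hard ceiling on repair attempts

def repair_node(state: dict) -> dict:
    # ...attempt a fix here, then bump the counter (overwrite-style update)...
    return {"step_count": state.get("step_count", 0) + 1}

def route_repair(state: dict) -> str:
    """Routing function for a self-repair loop with a guaranteed exit."""
    if state.get("lint_passed"):
        return "done"                     # normal exit: the code is clean
    if state.get("step_count", 0) >= MAX_STEPS:
        return "done"                     # safety exit: never loop forever
    return "repair_node"                  # otherwise, loop back and retry
```

Wired up with add_conditional_edges, "done" would map to a fallback node (or END) and "repair_node" back to the repair step.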
Decision Guide: StateGraph vs LCEL Chain vs LangChain AgentExecutor
| Situation | Recommendation |
|---|---|
| Single LLM call, no memory needed | LCEL Chain (`prompt \| llm \| parser`) |
| Multi-step pipeline, no branching or loops | LCEL Chain with sequential steps |
| Multi-turn chatbot, conversation history only | LangChain memory (`RunnableWithMessageHistory`) |
| Agent needs to branch OR loop at runtime | LangGraph StateGraph |
| Human-in-the-loop pause and resume | LangGraph with checkpointing |
| Simple tool-using agent, no custom flow control | LangChain AgentExecutor |
| Multi-agent coordination with shared state | LangGraph multi-agent graph |
| Prototype that must ship in 30 minutes | LCEL Chain; migrate to LangGraph once you hit a loop |
The key signal is decision-making inside the loop. If your agent needs to choose what to do next based on what it just learned, you have a graph. Everything else is a pipeline.
Practical Example: Making the Graph LLM-Agnostic
One of LangGraph's cleanest design decisions is that nodes are plain Python functions with no opinion about which LLM you use. Swapping providers requires changing exactly one line: the llm assignment at the top. This scenario was chosen because it is the most common "day 2" question new LangGraph users ask, and showing it is a one-liner is the best answer. As you read through the three provider options, watch how every other line (the node functions, the StateGraph, the compile() call) stays identical across all three; that invariance is LangGraph's LLM-agnostic promise made concrete.
# Option 1: OpenAI (best default for production)
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Option 2: Anthropic Claude (great for long-context reasoning)
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-3-haiku-20240307", temperature=0)

# Option 3: Ollama (fully local, zero API cost, great for development)
from langchain_ollama import ChatOllama
llm = ChatOllama(model="llama3.2", temperature=0)

# The graph, nodes, state schema, and edges are IDENTICAL for all three.
To make this swap clean inside node functions, inject llm via a closure at graph construction time:
def make_router_node(llm):
    """Factory that binds a specific LLM into the router node."""
    def router_node(state: ResearchState) -> dict:
        question = state["messages"][-1].content
        prompt = f"Does answering this require web search?\nQuestion: {question}\nReply YES or NO"
        response = llm.invoke([HumanMessage(content=prompt)])
        return {"needs_search": "YES" in response.content.upper()}
    return router_node
# At graph build time, pick your LLM and inject it:
builder.add_node("router_node", make_router_node(llm))
Install only the provider you need:
pip install langchain-openai # OpenAI
pip install langchain-anthropic # Anthropic
pip install langchain-ollama # Ollama (requires a local Ollama server)
Zero graph changes. One-line provider swap. Adopt this pattern from day one; it makes testing much easier (swap in a cheap local model for unit tests) and keeps your graph code provider-neutral.
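That testing benefit is concrete: inject a fake LLM through the same factory and the node logic runs with no network and no API key. The FakeLLM and FakeMessage classes below are minimal stand-ins, not LangChain types:

```python
class FakeMessage:
    """Duck-types the .content attribute of a chat message or response."""
    def __init__(self, content: str):
        self.content = content

class FakeLLM:
    """Duck-types .invoke() and always returns a canned reply."""
    def __init__(self, canned_reply: str):
        self.canned_reply = canned_reply
    def invoke(self, messages):
        return FakeMessage(self.canned_reply)

def make_router_node(llm):
    """Same factory shape as above, typed loosely so the fake slots in."""
    def router_node(state: dict) -> dict:
        # The real version builds a YES/NO prompt from the last message.
        response = llm.invoke(state["messages"])
        return {"needs_search": "YES" in response.content.upper()}
    return router_node

router = make_router_node(FakeLLM("YES"))
print(router({"messages": [FakeMessage("What is the BTC price?")]}))
# {'needs_search': True}
```

The node under test never knows it talked to a fake, which is exactly the point of injecting the LLM via a closure rather than importing it at module scope.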
LangGraph: The OSS Framework Behind the Graph
LangGraph is an open-source library from LangChain, Inc., purpose-built for stateful, multi-actor LLM applications. It sits on top of the LangChain ecosystem but has no hard dependency on LCEL chains; you can use it with any LLM client and plain Python functions.
- GitHub: github.com/langchain-ai/langgraph
- Install: `pip install langgraph`
- Current stable version: 0.2.x (verify on PyPI before pinning)
- Community: LangChain Discord (`#langgraph` channel), GitHub Discussions
LangGraph Cloud (the managed hosted layer) adds a deployment runtime, a visual graph debugger, and a REST API on top of your compiled graph. The core langgraph Python package is entirely free and open source; everything in this post requires only the OSS library.
For a full walkthrough of the LangChain building blocks (chains, prompts, output parsers, memory) that pair with LangGraph, see LangChain Development Guide.
Lessons Learned
Building your first stateful agent surfaces several lessons that are not obvious from the documentation:
- Compile once, invoke many times. Call `builder.compile()` once at application startup and store the returned graph object. Re-compiling on every request adds unnecessary latency and re-runs the graph validation step each time.
- Every loop needs an exit condition. Add a `step_count` field to your state and a safety edge that routes to `END` once the counter exceeds a maximum. Without it, a misbehaving LLM response will spin the graph indefinitely.
- State schema is your API contract. The `TypedDict` you define is the shared interface for every node in the graph. Add fields deliberately and document their lifecycle. Removing a field later requires migrating any saved checkpoints; treat it like a database schema change.
- Start with `invoke`, add `stream` for UX. Get the logic working first with synchronous `invoke`. Streaming is easy to layer on once the graph behaviour is correct; it is a one-word change from `invoke` to `stream`.
- Small, focused graphs are easier to maintain. Resist the temptation to add every feature to a single mega-graph. Build separate, focused graphs for separate use cases. Wire them together with sub-graphs if needed, but design each graph to do one thing well.
TLDR: Summary and Key Takeaways
- The root problem: Bare LCEL chains are stateless; they cannot loop, branch, or retain anything beyond conversation history. LangChain memory handles chat history, but not typed agent state, conditional routing, or durable checkpoints.
- LangGraph's solution: A `StateGraph` where typed state flows through Python function nodes connected by normal or conditional edges.
- The three primitives: State (TypedDict with reducers), Nodes (Python functions), Edges (normal + conditional + END).
- The compile/invoke lifecycle: `add_node` → `set_entry_point` → `add_edge` → `compile` → `invoke`/`stream`.
- LLM-agnostic by design: Swap `ChatOpenAI` for `ChatAnthropic` or `ChatOllama` with a single line; the graph never changes.
- Use it when: Your agent needs to branch based on output or loop until a condition is met.
- Skip it when: Your agent makes a single LLM call; a plain LCEL chain is lighter and sufficient.

Next step: Install LangGraph (`pip install langgraph langchain-openai`), paste the research assistant example above, and run it. Swap the LLM. Add a node. Remove it. That hands-on loop is the fastest way to internalize the mental model.
Practice Quiz
What is the primary limitation of a plain LCEL chain that LangGraph is designed to solve?

- A) It cannot call OpenAI models
- B) It is stateless: it cannot loop, branch, or share context between steps
- C) It runs too slowly for production workloads

Correct Answer: B

In a LangGraph `StateGraph`, what does annotating a state field with `operator.add` do?

- A) Adds a numeric counter to the field on each node update
- B) Appends new values to the existing list instead of overwriting it
- C) Marks the field as read-only so nodes cannot modify it

Correct Answer: B

Which `StateGraph` method must you call before invoking the graph?

- A) `builder.run()`
- B) `builder.start()`
- C) `builder.compile()`

Correct Answer: C
Related Posts
- LangChain Development Guide: The LangChain building blocks (chains, prompts, output parsers, memory) that pair with LangGraph. Start here if you are new to LangChain.
- AI Agents Explained: When LLMs Start Using Tools. Understand the ReAct loop and tool-using agents before adding graph structure on top.
- Multi-Step AI Agents: The Power of Planning. How agents decompose goals into ordered steps: the planning layer that a LangGraph graph can then execute.
Written by
Abstract Algorithms
@abstractalgorithms