Human-in-the-Loop Workflows with LangGraph: Interrupts, Approvals, and Async Execution
Pause LangGraph agents mid-run for human approval: interrupt(), Command, update_state(), and async resume patterns.
Abstract Algorithms

TLDR: Pause LangGraph agents mid-run with interrupt(), get human approval, resume with Command.
The Autonomous Agent Risk: When Acting Without Permission Goes Wrong
Your autonomous coding agent refactored the authentication module while you were in a meeting. It looked right to the LLM. It broke production.
This is not a hypothetical. As LangGraph agents gain access to real tools (GitHub APIs, database write operations, email senders, cloud billing), each step they take can have irreversible real-world consequences. An agent that deletes the wrong S3 bucket, commits a breaking change to main, or triggers a $2,000 API bill does not stop to ask for permission. It just acts.
Human-in-the-Loop (HITL) is the architectural answer to this problem. Instead of letting the agent run to completion unchecked, you insert deliberate pause points where a human can inspect the proposed action, approve it, reject it, or correct it before execution continues.
| Mode | When It Acts | Who Can Stop It | Suitable For |
| --- | --- | --- | --- |
| Fully Autonomous | Immediately | Nobody | Low-stakes, reversible tasks |
| Interrupt-on-Action | Before irreversible steps | Human at decision points | API calls, file writes, deployments |
| Step-by-Step Approval | After every node | Human at each step | Sensitive pipelines, audited workflows |
LangGraph provides first-class support for all three modes through interrupt(), NodeInterrupt, Command, and update_state(). This post walks through each piece, shows you how they connect, and ends with a complete PR review agent that pauses before applying any code change.
HITL Fundamentals: interrupt(), NodeInterrupt, and the Checkpointer Requirement
Three primitives make human-in-the-loop possible in LangGraph. Understanding what each one does, and what it requires, prevents the most common configuration mistakes.
interrupt(): The Primary Pause Mechanism
interrupt() is a function you call inside a node to pause the graph and return control to the caller with a payload. The graph stops at exactly that line. The return value of interrupt() is whatever the human sends back when they resume.
```python
from langgraph.types import interrupt

def approval_node(state: AgentState):
    # Graph pauses here; the payload is sent to the caller
    human_decision = interrupt({
        "question": "Approve this action?",
        "proposed_action": state["proposed_action"],
    })
    # Execution resumes here after the human responds
    return {"approved": human_decision == "approve"}
```
interrupt() is the recommended pattern in current LangGraph versions. It reads naturally (pause, collect input, continue) and integrates cleanly with the async execution model.
NodeInterrupt: The Older, Exception-Based Pattern
NodeInterrupt is a special exception you raise from inside a node. The graph catches it, stores the interrupt payload, and halts. The human then calls update_state() to inject their response into the graph state, and re-invokes the graph with None (instead of a Command) to resume.
```python
from langgraph.errors import NodeInterrupt

def review_node(state: AgentState):
    if not state.get("human_reviewed"):
        # Older pattern: raise an exception to interrupt
        raise NodeInterrupt(f"Please review proposed changes: {state['proposed_action']}")
    return {"status": "reviewed"}
```
When to use each:
| | interrupt() | NodeInterrupt |
| --- | --- | --- |
| Resume mechanism | Command(resume=value) | update_state() + re-invoke |
| Return value in node | The human's response directly | Read from state after update |
| Recommended | Yes (current LangGraph) | Legacy; still supported |
The Checkpointer: Non-Negotiable
Neither pattern works without a checkpointer. When interrupt() halts the graph, the entire state snapshot (which node was running, what values are in every state key, and where in the node body execution stopped) must be persisted somewhere so it can be restored when the human responds.
Without a checkpointer, the graph has no memory between the pause and the resume. LangGraph raises an error if an interrupt fires in a graph compiled without one.
```python
from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()  # In-memory; use SqliteSaver or a Redis/Postgres saver in production
graph = builder.compile(checkpointer=checkpointer)
```
Every invocation also requires a thread ID in the config so the checkpointer knows which conversation's state to restore:
```python
config = {"configurable": {"thread_id": "pr-review-session-42"}}
graph.invoke(initial_state, config)
```
Building an Approval Workflow: Pause, Inspect, Approve, Resume
The full HITL lifecycle has four phases. Here is the complete flow in code.
```mermaid
graph TD
    A([Start: invoke graph]) --> B[analyze_node runs]
    B --> C[approval_node hits interrupt]
    C --> D{Checkpointer saves\nfrozen state}
    D --> E([Return __interrupt__ to caller])
    E --> F[Human reviews payload]
    F --> G{Decision}
    G -- approve --> H[invoke Command resume=approve]
    G -- reject --> I[invoke Command resume=reject]
    H --> J[Checkpointer restores state]
    I --> J
    J --> K[approval_node resumes;\ninterrupt returns decision]
    K --> L[apply_node executes]
    L --> M([Graph completes])
```
The checkpointer (steps D and J) is the invisible bridge that makes both invoke() calls part of the same workflow.
Phase 1: Invoke the graph (first run).
```python
config = {"configurable": {"thread_id": "deploy-approval-001"}}
result = graph.invoke({"task": "deploy service to production"}, config)
```
The graph runs until it hits interrupt(). At that point, execution halts and result contains the interrupt payload under result["__interrupt__"]:
```python
# result["__interrupt__"] == [
#     Interrupt(value={"question": "Approve deploy?", "service": "auth-api"}, ...)
# ]
interrupt_payload = result["__interrupt__"][0].value
```
Phase 2: Surface the question to the user.
Your application layer (CLI, web UI, Slack bot) presents interrupt_payload to the human. This is pure application code; LangGraph has no opinion on how you collect the human's response.
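As a concrete illustration of this application-layer step, here is a small hypothetical helper that turns an interrupt payload into a CLI prompt. The field names match the payloads used in this post, but the function itself is not part of LangGraph:

```python
def render_approval_prompt(payload: dict) -> str:
    """Format an interrupt payload for display in a CLI or chat message.

    Pure application code: LangGraph only supplies the payload dict;
    how it is shown to the human is entirely up to you.
    """
    lines = [payload.get("question", "Approve this action?")]
    action = payload.get("proposed_action")
    if action:
        lines.append(f"  proposed: {action}")
    lines.append("Reply 'approve' or 'reject'.")
    return "\n".join(lines)

# Example: payload extracted from result["__interrupt__"][0].value
prompt = render_approval_prompt(
    {"question": "Approve deploy?", "proposed_action": "deploy auth-api"}
)
print(prompt)
```

The same helper works unchanged whether the human answers over stdin, a Slack thread, or a web form.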
Phase 3: Resume with Command.
```python
from langgraph.types import Command

# Human typed "approve" in the UI
resumed = graph.invoke(Command(resume="approve"), config)
```
Command(resume=user_input) tells the graph: the human responded with this value; resume from where you paused and give this back as the return value of interrupt().
Phase 4: Execution continues from the frozen point.
The node that called interrupt() receives "approve" as the return value, evaluates it, and the graph continues to the next node as normal.
Key insight: the same config (same thread_id) must be used for both the initial invoke and the resume. This is how the checkpointer knows which frozen state to restore.
Deep Dive: How LangGraph Freezes and Resumes Graph Execution
The Internals
When interrupt() is called inside a node, three things happen in sequence:
1. Signal to the executor. interrupt() raises a special internal exception (GraphInterrupt) that the LangGraph executor catches at the top of the execution loop. This unwinds the call stack cleanly without corrupting state.
2. Checkpoint capture. Before returning to the caller, the executor serializes the full state snapshot, including the exact values of all state keys at the moment of interruption, and writes it to the checkpointer under the current thread_id. The checkpoint also records which node interrupted and the interrupt payload.
3. Control returned to the caller. The graph's invoke() returns to your application code with the interrupt payload visible. The graph is now "frozen": persisted in the checkpointer, waiting.
When Command(resume=value) arrives:
- The executor looks up the thread's latest checkpoint in the checkpointer.
- It restores the node that interrupted and re-enters the node's function body from the top. This time, interrupt() returns value immediately instead of pausing again.
- The node completes normally and the graph advances.
This re-entry model means the code before interrupt() in your node function runs twice: once on the way to the pause, and once on the way back. Keep any side effects (API calls, database writes) after the interrupt() call, not before.
```python
def deploy_node(state: AgentState):
    # Safe: pure computation before the interrupt (re-runs on resume, so keep it idempotent)
    proposed = build_deploy_plan(state["task"])

    # Graph pauses here; the code above runs again on resume
    decision = interrupt({"plan": proposed})

    # Safe: side effects AFTER the interrupt execute only once, post-resume
    if decision == "approve":
        call_deploy_api(proposed)  # Only called post-resume
    return {"deployed": decision == "approve"}
```
Performance Analysis
| Concern | Detail | Mitigation |
| --- | --- | --- |
| Human latency | Graphs can be paused for seconds, hours, or days | Use persistent checkpointers (SQLite, Redis, Postgres), not MemorySaver |
| Checkpoint storage cost | Each interrupt serializes full state; large states (long message histories) grow fast | Trim state before interrupt; store only the delta needed for resume |
| Timeout risk | If the human never responds, the thread is stranded | Implement a background job that expires stale threads after N hours |
| Concurrent interrupts | Multiple threads interrupted simultaneously require unique thread IDs and isolated checkpoint rows | Use UUIDs for thread IDs; never share them across users |
MemorySaver is suitable only for development and testing. In production, use SqliteSaver for single-process deployments, or LangGraph Cloud's built-in Postgres checkpointer for distributed multi-instance setups.
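As a sketch of the swap, assuming the langgraph-checkpoint-sqlite package is installed and reusing the builder and initial_state from the snippets above (from_conn_string returns a context manager in current versions):

```python
# Sketch: replacing MemorySaver with a persistent SQLite checkpointer.
from langgraph.checkpoint.sqlite import SqliteSaver

# The checkpointer is valid for the lifetime of the `with` block
with SqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)
    config = {"configurable": {"thread_id": "pr-review-session-42"}}
    result = graph.invoke(initial_state, config)
    # ... the interrupt fires; the frozen state survives a process
    # restart because it lives in checkpoints.db, not in RAM
```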
The Interrupt-Resume Cycle: Full Lifecycle in One Diagram
```mermaid
sequenceDiagram
    participant App as Application Layer
    participant Graph as LangGraph Executor
    participant Node as Agent Node
    participant CP as Checkpointer
    participant Human as Human Reviewer
    App->>Graph: invoke(initial_state, config)
    Graph->>Node: execute analyze_node()
    Node-->>Graph: returns state update
    Graph->>Node: execute approval_node()
    Node->>Graph: interrupt(proposed_action)
    Graph->>CP: save_checkpoint(thread_id, state_snapshot)
    Graph-->>App: return {"__interrupt__": [payload]}
    App->>Human: display(payload["proposed_action"])
    Human-->>App: "approve" / "reject" / edited_value
    App->>Graph: invoke(Command(resume=decision), config)
    Graph->>CP: restore_checkpoint(thread_id)
    Graph->>Node: re-enter approval_node(); interrupt() returns decision
    Node-->>Graph: returns {"approved": True}
    Graph->>Node: execute apply_node()
    Node-->>Graph: returns {"applied": True}
    Graph-->>App: final state
```
The diagram makes two non-obvious points explicit: the checkpointer is the bridge between the two invoke() calls, and approval_node is re-entered from its beginning on resume; it does not continue from a saved instruction pointer.
Real-World Applications: Where Human-in-the-Loop Is Non-Negotiable
Autonomous DevOps Pipelines
A LangGraph agent monitors production metrics, detects anomalies, and generates a runbook of remediation steps. Steps like "restart service" might auto-execute, but "scale down the database replica" triggers an interrupt(). The on-call engineer receives the proposed change in Slack, approves or rejects it with one click, and the agent resumes.
- Input: Prometheus alert payload, current service topology
- Interrupt payload: {"action": "scale_down", "target": "db-replica-3", "reason": "p99 latency normal"}
- Human decision: Approve or reject, with a comment added via update_state()
Financial Transaction Approval
An expense automation agent categorizes invoices and schedules payments. Payments under $500 auto-approve; anything above that interrupts the workflow and routes to the finance manager's approval queue. The graph is paused indefinitely until the manager acts, potentially overnight, thanks to the persistent checkpointer.
Legal and Compliance Document Generation
An AI drafts contract clauses based on deal terms. Before finalizing, the graph pauses at every clause that modifies liability language. A lawyer reviews each proposed clause in a web UI, edits the text via update_state(), and marks it approved. The agent then formats the final document with the reviewed text.
These three patterns share a common architecture: the agent handles the mechanical labor (analysis, proposal generation, formatting) while the human handles the judgment calls (approve, edit, reject). LangGraph's HITL primitives make this division of responsibility a first-class design choice rather than an afterthought.
Trade-offs and Failure Modes: Deadlocked Graphs, Stale State, and Human Latency
Performance vs. Safety
Adding an interrupt to a workflow introduces unbounded latency. A fully autonomous graph completes in seconds; a human-gated one can sit frozen for hours. If your system has SLAs, you need to decide which nodes are worth gating. The rule of thumb: interrupt on irreversible, high-blast-radius actions only, not on every step.
Failure Mode 1: The Deadlocked Graph
A thread is interrupted and nobody resumes it. The checkpointer holds the frozen state indefinitely. In production, implement a TTL-based expiry job that scans for threads not resumed within a threshold (e.g., 24 hours) and marks them as abandoned. Without this, you accumulate orphaned threads that silently consume checkpoint storage.
Failure Mode 2: Stale State on Late Resume
The LLM proposed changes based on a codebase snapshot from 09:00. The human approves at 17:00. In the meantime, three other PRs merged. The proposed changes now conflict. HITL does not automatically re-validate the proposal against the new world state β you must build that check into your apply_changes node explicitly.
```python
def apply_changes(state: AgentState):
    if state["approved"]:
        # Re-validate: is the proposal still valid in the current state?
        if is_proposal_stale(state["proposed_changes"]):
            return {"applied": False, "error": "state changed while waiting for approval"}
        execute_changes(state["proposed_changes"])
    return {"applied": state["approved"]}
```
Failure Mode 3: Multiple Interrupts in One Thread
If a graph has two interrupt() calls in sequence, each one halts and resumes independently, in order. This is intentional: LangGraph queues them. But if your UI assumes only one interrupt per thread, the second interrupt will surface unexpectedly. Design your UI to handle any number of sequential interrupts on the same thread ID.
Mitigation Summary
| Failure Mode | Mitigation |
| --- | --- |
| Deadlocked graph | TTL expiry job; dashboard to surface stalled threads |
| Stale state | Re-validate proposal after resume, before executing |
| Unexpected second interrupt | UI must handle the interrupt queue, not assume a single pause |
| Lost checkpoint (MemorySaver restart) | Use a persistent checkpointer (SQLite/Redis/Postgres) in all non-dev environments |
Decision Guide: Full Autonomy vs Interrupt-on-Action vs Step-by-Step Approval
| Situation | Recommendation |
| --- | --- |
| Use full autonomy when | Actions are reversible (read-only queries, draft creation, summarization) and failure cost is low |
| Use interrupt-on-action when | Specific nodes perform irreversible external calls (API writes, deploys, payments); the rest of the graph can run freely |
| Use step-by-step approval when | The domain is regulated (legal, medical, financial), every output needs an audit trail, or the agent is new and untested in production |
| Avoid HITL entirely when | Sub-second latency is required (real-time inference, streaming) or human availability cannot be guaranteed (overnight batch jobs) |
| Edge cases | Parallel branches with multiple interrupt() calls require careful thread management; one interrupt per branch per execution step |
Practical Example: PR Review Agent That Asks Before Applying Changes
This agent demonstrates the complete HITL lifecycle in a scenario where the stakes are high enough to justify every pause: code diffs that, if applied incorrectly, break production. The PR review scenario maps cleanly to the three primitives covered in this post (interrupt() for pausing, Command(resume=...) for resuming, and update_state() for human correction), each triggered at a distinct point in the workflow. As you read through the nodes, watch for the interrupt() call inside human_approval and trace how the payload flows back as the return value when the graph resumes; that handoff is the mechanism that makes the entire pattern work.
```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import interrupt, Command
from langgraph.checkpoint.memory import MemorySaver
from langchain_openai import ChatOpenAI

# --- State schema ---
class PRReviewState(TypedDict):
    pr_number: int
    diff: str
    analysis: str
    proposed_changes: list[str]
    approved: bool
    applied: bool

llm = ChatOpenAI(model="gpt-4o")

# --- Node 1: Analyze the diff ---
def analyze_pr(state: PRReviewState) -> dict:
    response = llm.invoke(
        f"Analyze this code diff and identify specific issues:\n{state['diff']}"
    )
    return {"analysis": response.content}

# --- Node 2: Generate concrete change proposals ---
def propose_changes(state: PRReviewState) -> dict:
    response = llm.invoke(
        f"Based on this analysis, write 3 specific, actionable code changes:\n{state['analysis']}"
    )
    changes = [line for line in response.content.split("\n") if line.strip()]
    return {"proposed_changes": changes[:3]}

# --- Node 3: Interrupt for human approval ---
def human_approval(state: PRReviewState) -> dict:
    # Graph pauses here; the payload goes to the caller
    decision = interrupt({
        "message": "Review proposed changes. Reply 'approve' to proceed or 'reject' to cancel.",
        "pr_number": state["pr_number"],
        "proposed_changes": state["proposed_changes"],
    })
    return {"approved": decision.strip().lower() == "approve"}

# --- Node 4: Apply approved changes ---
def apply_changes(state: PRReviewState) -> dict:
    if not state["approved"]:
        print(f"PR #{state['pr_number']}: changes rejected by reviewer.")
        return {"applied": False}
    # Re-validate before committing (guard against stale state)
    print(f"Applying {len(state['proposed_changes'])} changes to PR #{state['pr_number']}:")
    for change in state["proposed_changes"]:
        print(f"  - {change}")
    # In production: call the GitHub API, run git apply, post a review comment, etc.
    return {"applied": True}

# --- Build the graph ---
builder = StateGraph(PRReviewState)
builder.add_node("analyze_pr", analyze_pr)
builder.add_node("propose_changes", propose_changes)
builder.add_node("human_approval", human_approval)
builder.add_node("apply_changes", apply_changes)

builder.add_edge(START, "analyze_pr")
builder.add_edge("analyze_pr", "propose_changes")
builder.add_edge("propose_changes", "human_approval")
builder.add_edge("human_approval", "apply_changes")
builder.add_edge("apply_changes", END)

checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```
Running the agent, Phase 1 (invoke and receive interrupt):
```python
config = {"configurable": {"thread_id": "pr-42-review"}}
sample_diff = "- def login(username, password):\n+ def login(username: str, password: str) -> bool:"

result = graph.invoke(
    {"pr_number": 42, "diff": sample_diff, "approved": False, "applied": False},
    config,
)

# The graph paused at human_approval
payload = result["__interrupt__"][0].value
print(payload["message"])           # "Review proposed changes..."
print(payload["proposed_changes"])  # ["Add type annotations", ...]
```
Phase 2 (human reviews and resumes):
```python
# Simulating a human typing "approve" in a UI
resumed = graph.invoke(Command(resume="approve"), config)

print(resumed["approved"])  # True
print(resumed["applied"])   # True
```
The thread ID "pr-42-review" binds the two invoke() calls together. The checkpointer restores the frozen state from Phase 1, interrupt() returns "approve" inside human_approval, and the graph completes through apply_changes.
LangGraph's update_state(): Editing Agent Memory Before Resuming
Command(resume=value) answers the interrupt but leaves the rest of the state unchanged. Sometimes the human doesn't just want to approve; they want to fix what the agent got wrong. That's what graph.update_state() is for.
```python
# After the graph interrupts, inspect the current state snapshot
snapshot = graph.get_state(config)
print(snapshot.values["proposed_changes"])
# ["1. Add type hints", "2. Remove unused import", "3. Rename variable x to user_id"]

# The human wants to override change #3 before approving
corrected_changes = [
    "1. Add type hints",
    "2. Remove unused import",
    "3. Extract login logic into a dedicated AuthService class",  # Human edited this
]

graph.update_state(
    config,
    {"proposed_changes": corrected_changes},
)

# Now resume: the graph will apply the corrected list, not the LLM's original
resumed = graph.invoke(Command(resume="approve"), config)
```
update_state() writes directly into the persisted checkpoint under the given thread_id. The next time the graph reads state["proposed_changes"], it gets the human's corrected version, not the LLM's original.
What you can edit:
| State key type | Can update_state() modify it? | Notes |
| --- | --- | --- |
| Simple values (str, int, bool) | Yes | Replaces the value directly |
| Lists with Annotated[list, operator.add] | Yes | Appends; pass new items only |
| Plain lists | Yes | Replaces the whole list |
| LangGraph message history | Yes | Pass {"messages": [HumanMessage(...)]} |
This makes update_state() a powerful tool not just for HITL corrections, but for injecting external context into a running agent, for example appending a new document retrieved from a database while the agent is paused.
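Continuing the snippet above (same graph and config), injecting a retrieved document could look like this sketch. It assumes the state schema has a messages key wired to LangGraph's add_messages reducer, which the PR review example does not; treat the key name as an assumption:

```python
from langchain_core.messages import HumanMessage

# While the graph is paused, push a freshly retrieved document
# into the message history via the checkpointed state
new_context = HumanMessage(
    content="Reference doc: AuthService migration guide (retrieved 17:02)"
)
graph.update_state(config, {"messages": [new_context]})

# On resume, the agent sees the injected message as part of its history
resumed = graph.invoke(Command(resume="approve"), config)
```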
Lessons Learned
1. Always persist your checkpointer before going to production.
MemorySaver lives in RAM. A process restart during an interrupt wipes all frozen thread states. Use SqliteSaver or a Redis/Postgres backend in any environment where the human might take longer than the process uptime.
2. Keep code before interrupt() idempotent.
Because interrupt() causes the node function to re-execute from the top on resume, any side effects (writes, API calls) placed before interrupt() will run twice. Pure computation before, side effects after.
3. Design for the "never resumes" case.
Not every human responds promptly, or at all. Build a thread expiry mechanism. Query the checkpointer for threads whose updated_at is older than your TTL and either auto-reject them or notify the human again.
4. One thread ID per independent conversation. Thread IDs must be unique per user session. Never reuse a thread ID across different users or different task instances. Collisions mean one user's approval resumes another's graph.
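A minimal way to enforce this is to mint the thread ID rather than accept one from the caller. Embedding the user ID is an assumption for auditability, not a LangGraph requirement:

```python
from uuid import uuid4

def new_thread_config(user_id: str) -> dict:
    """Mint a collision-free thread ID for one approval session.

    uuid4 guarantees the session part is unique; prefixing the
    user_id makes ownership obvious when scanning stranded threads.
    """
    return {"configurable": {"thread_id": f"{user_id}-{uuid4()}"}}

c1 = new_thread_config("alice")
c2 = new_thread_config("alice")
# Same user, two sessions, two distinct threads
print(c1["configurable"]["thread_id"] != c2["configurable"]["thread_id"])  # True
```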
5. Don't interrupt on reversible steps. HITL is a cost: human time, latency, and coordination overhead. Reserve it for the actions that truly warrant it: irreversible, high-blast-radius, or regulated steps. A graph that interrupts on every LLM call trains humans to rubber-stamp approvals, defeating the purpose.
TLDR: Summary and Key Takeaways
TLDR: Pause LangGraph agents mid-run with interrupt(), get human approval, resume with Command.
- interrupt(payload) pauses graph execution inside a node and returns the payload to the caller; the node re-enters from the top on resume.
- Command(resume=value) is the mechanism to resume a paused graph; the value becomes the return of interrupt().
- A checkpointer is mandatory: without one, the frozen state cannot be persisted and HITL will not work.
- update_state() lets humans correct the agent's state before resuming, not just approve or reject it.
- Stale state is a real failure mode: validate the proposal against current world state after every resume, not just before the interrupt.
- Interrupt-on-action (pause only at destructive operations) is the practical sweet spot between full autonomy and step-by-step human approval.
- The memorable rule: agents decide what to do; humans decide whether to let it happen.
Practice Quiz
Which function do you call inside a LangGraph node to pause execution and return a payload to the caller?

- A) NodeInterrupt(payload)
- B) interrupt(payload)
- C) graph.pause(payload)

Correct Answer: B
You have a LangGraph agent that interrupts for human approval. After the interrupt fires, you restart your Python process. When the human responds, what happens?

- A) The graph resumes normally because the state is in memory
- B) The graph raises a GraphInterrupt error and discards the state
- C) The graph cannot resume because MemorySaver state was lost on restart

Correct Answer: C
A human reviewer wants to change one of the agent's proposed values before approving. Which tool do they use?

- A) graph.update_state(config, {"key": new_value})
- B) Command(resume={"key": new_value})
- C) Re-invoke the graph from the beginning with corrected input

Correct Answer: A
Open-ended challenge: You are designing a CI/CD agent that runs 20 steps: unit tests, lint, build, integration tests, and a final production deploy. Which of these should trigger an interrupt()? Consider the trade-offs between safety, latency, and human fatigue, and explain how you would decide which steps deserve a human gate versus which should run autonomously.
Related Posts
- Multistep AI Agents: The Power of Planning
- AI Agents Explained: When LLMs Start Using Tools
- AI Architecture Patterns: Routing, Planning, Memory, and Evaluation

Written by Abstract Algorithms (@abstractalgorithms)