Skills vs LangChain, LangGraph, MCP, and Tools: A Practical Architecture Guide

LangChain/LangGraph run workflows, MCP exposes capabilities, tools do actions, and skills package outcomes.

Abstract Algorithms · 14 min read

AI-assisted content.

TLDR: These are not competing ideas. They are layers. Tools do one action. MCP standardizes access to actions and resources. LangChain and LangGraph orchestrate calls. Skills package business outcomes with contracts, guardrails, and evaluation. Most production confusion comes from mixing these layers.


📖 The Layer Cake: What Each Term Actually Means

A product team shipped a customer support agent that worked in every demo. In production, it returned inconsistent refund decisions, sometimes citing correct policy and sometimes hallucinating eligibility rules, because the "agent" was a single LangGraph workflow with no output contract and no retry guard. The problem was not the model. The problem was missing layers.

People often ask: "Are skills better than LangGraph?" That question is like asking whether APIs are better than databases. They solve different problems.

Use this mental model:

| Layer | Main question it answers | Typical artifact |
| --- | --- | --- |
| Tool | "What single action can I execute?" | Function or API adapter |
| MCP | "How do I discover and call capabilities across systems?" | Protocol server + typed schemas |
| LangChain | "How do I compose prompts, tools, and model calls quickly?" | Chains, agents, callbacks |
| LangGraph | "How do I run stateful multi-step workflows reliably?" | Graph nodes, edges, checkpoints |
| Skill | "How do I deliver a stable product outcome?" | Reusable capability contract |

A skill is usually built on top of the other layers, not instead of them.

Example:

  • Tool: fetch_customer_profile(customer_id)
  • Tool: check_subscription_status(customer_id)
  • Tool: create_support_ticket(payload)
  • MCP: exposes those tools from remote services with common schemas
  • LangGraph: coordinates retries and branching
  • Skill: AccountRecoverySkill returns a structured, policy-safe resolution

If you skip the skill layer, your app can still run. But behavior often becomes prompt-heavy and hard to govern.
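As a rough sketch of how those layers stack in code, the fragment below wires the three tools into a skill boundary. The function bodies, the policy check, and the fields of the returned resolution are illustrative placeholders, not a prescribed implementation.

# Illustrative sketch only: placeholder bodies standing in for real CRM, billing, and ticketing calls.

def fetch_customer_profile(customer_id: str) -> dict:      # Tool: one atomic read
    return {"customer_id": customer_id, "email": "user@example.com"}

def check_subscription_status(customer_id: str) -> dict:   # Tool: one atomic read
    return {"customer_id": customer_id, "status": "past_due"}

def create_support_ticket(payload: dict) -> dict:          # Tool: one atomic write
    return {"ticket_id": "TCK-0001", **payload}

def account_recovery_skill(customer_id: str) -> dict:
    """Skill boundary: orchestration happens inside, the return shape is the contract."""
    profile = fetch_customer_profile(customer_id)
    subscription = check_subscription_status(customer_id)
    ticket = create_support_ticket({"customer_id": customer_id, "issue": "account_recovery"})
    return {
        "resolution": "recovery_initiated",                     # stable field downstream code can rely on
        "policy_safe": subscription["status"] != "cancelled",   # placeholder policy check
        "ticket_id": ticket["ticket_id"],
        "contact_email": profile["email"],
    }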

📊 Framework Decision Tree

flowchart TD
    Goal[Define Agent Goal]
    Single{Single tool call or simple chain?}
    Stateful{Multi-step stateful workflow?}
    Contract{Need stable output contract?}
    Cross{Cross-system tool discovery?}

    LC["LangChain (prompt chains, tools)"]
    LG["LangGraph (stateful graph, retries)"]
    Skill["Skill Layer (reusable capability)"]
    MCP["MCP (cross-system protocol)"]
    Both[LangGraph + Skill Layer]

    Goal --> Single
    Single -->|Yes| LC
    Single -->|No| Stateful
    Stateful -->|Yes| Contract
    Contract -->|No| LG
    Contract -->|Yes| Both
    Cross -->|Yes| MCP
    Both --> Cross

This decision tree maps the agent design question, "what building block should I reach for?", to one of four framework layers: LangChain for simple prompt chains and tool calls, LangGraph for stateful multi-step workflows, the Skill Layer for capabilities requiring stable output contracts, and MCP when cross-system tool discovery is needed. Follow the tree from the root goal downward, answering each binary question in sequence until the appropriate layer emerges. Note that multiple layers often collaborate in a single production agent; this tree shows where each concern belongs, not a mutually exclusive choice.

📊 LangGraph Node Execution Sequence

sequenceDiagram
    participant R as Router
    participant NA as Node A
    participant E as Edge Condition
    participant NB as Node B
    participant NC as Node C
    participant Term as END

    R->>NA: Start graph execution
    NA->>NA: Process state
    NA->>E: Evaluate conditional edge
    E-->>NB: Condition = "path_b"
    NB->>NB: Process state
    NB->>Term: Transition to END
    Note over NC: Node C not executed. Conditional branch skipped.

This sequence illustrates LangGraph's conditional edge mechanism: the router starts Node A, which evaluates a condition and selects "path_b," causing execution to flow through Node B to END while Node C is never reached. The critical observation is that conditional branching in LangGraph is explicit and deterministic: the edge evaluation result, not the LLM's free-form output, controls which node executes next. This predictability is what makes LangGraph a better fit than prompt chaining when the workflow contains branching logic that must be auditable and reproducible.
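A minimal LangGraph sketch of the same branch makes that explicit. The node names, state fields, and the "path_b" routing key below are illustrative; the add_conditional_edges pattern is the point.

from typing import TypedDict
from langgraph.graph import StateGraph, END

class RouteState(TypedDict):
    path: str
    visited: list

def node_a(state: RouteState) -> RouteState:
    # Node A processes state and records which branch should run next.
    return {"path": "path_b", "visited": state["visited"] + ["A"]}

def node_b(state: RouteState) -> RouteState:
    return {**state, "visited": state["visited"] + ["B"]}

def node_c(state: RouteState) -> RouteState:
    return {**state, "visited": state["visited"] + ["C"]}

def route(state: RouteState) -> str:
    # Deterministic edge evaluation: this returned key, not free-form model text, selects the next node.
    return state["path"]

g = StateGraph(RouteState)
g.add_node("node_a", node_a)
g.add_node("node_b", node_b)
g.add_node("node_c", node_c)
g.set_entry_point("node_a")
g.add_conditional_edges("node_a", route, {"path_b": "node_b", "path_c": "node_c"})
g.add_edge("node_b", END)
g.add_edge("node_c", END)

print(g.compile().invoke({"path": "", "visited": []}))  # visited == ["A", "B"]; node_c never runs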


๐Ÿ” Where LangChain and LangGraph Fit, and Where They Do Not

LangChain and LangGraph are implementation frameworks. They help you execute reasoning and workflows. They do not automatically define product-level ownership, risk boundaries, or capability lifecycle.

| Concern | LangChain | LangGraph | Skill layer |
| --- | --- | --- | --- |
| Fast prototyping | Strong | Good | Medium |
| Stateful execution | Limited by design pattern | Strong | Depends on runtime |
| Retry orchestration | Basic | Strong | Policy-driven |
| Business contract (input/output guarantees) | Manual | Manual | First-class |
| Capability ownership/versioning | External process | External process | First-class |
| Governance and risk-tier mapping | External process | External process | First-class |

Why teams get confused:

  1. They build one graph and call it a "skill".
  2. They add one tool description and assume governance is done.
  3. They treat protocol access (MCP) as business capability modeling.

Good architecture separates these concerns.

  • Frameworks run computation.
  • Skills define outcome boundaries.

โš™๏ธ End-to-End Execution Path: How the Layers Collaborate

Let us trace one request: "Investigate payment failure spikes and open an incident if needed."

flowchart TD
    A[User request] --> B[Router chooses PaymentIncidentSkill]
    B --> C[Skill validates input and policy]
    C --> D[LangGraph executes workflow state]
    D --> E[Node calls tools via MCP]
    E --> F[Collects logs metrics and ticket status]
    F --> G[Skill output validation]
    G --> H[Final structured response plus trace]

This flow exposes the distinction clearly:

  • LangGraph is the runtime engine for state transitions.
  • MCP is the interoperability channel for tools/resources.
  • Tools are atomic actions.
  • Skill wraps the whole thing as a reusable product capability.

Mini dataset for one run:

| Step | Layer active | Input | Output |
| --- | --- | --- | --- |
| 1 | Skill | service=payments, window=15m | Validated request object |
| 2 | LangGraph | Skill state | Execution path |
| 3 | MCP + Tool | fetch_error_rate | error_rate=8.7% |
| 4 | MCP + Tool | create_incident_ticket | ticket_id=INC-9012 |
| 5 | Skill | Aggregated state | Stable JSON result |

The skill result is what downstream products depend on. That is why skills should own output contracts.
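One way to pin that down is an explicit result type for the step 5 output. The field names below are an assumed shape chosen to match the run above, not a required schema.

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PaymentIncidentResult:
    service: str                # e.g. "payments"
    window_minutes: int         # e.g. 15
    error_rate: float           # e.g. 0.087
    incident_required: bool
    ticket_id: Optional[str]    # e.g. "INC-9012" when an incident was opened

# Downstream products consume this shape; how the graph and tools produced it can change freely.
result = PaymentIncidentResult(
    service="payments",
    window_minutes=15,
    error_rate=0.087,
    incident_required=True,
    ticket_id="INC-9012",
)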


🧠 Deep Dive: Why Layer Confusion Breaks Production Systems

Internals: control plane vs execution plane

A useful split:

  • Control plane: registry, routing policy, risk gating, rollout rules
  • Execution plane: LangGraph graph run, MCP calls, tool invocation, retries

If everything is put in execution code, every team ships its own hidden policy logic. That creates drift.

| Plane | What changes often | What should stay stable |
| --- | --- | --- |
| Control plane | Routing thresholds, risk policy, capability ownership | Governance model |
| Execution plane | Node logic, model choice, retries, tool adapter details | Skill contract |

Skills sit at the boundary and stabilize product expectations while execution internals evolve.
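In practice, the control plane often lives in a registry entry that execution code reads but never hard-codes. The fields below are an assumed example, not a standard format.

# Hypothetical control-plane registry entry: owned, versioned, and changed independently of node code.
PAYMENT_INCIDENT_SKILL = {
    "name": "PaymentIncidentSkill",
    "version": "1.3.0",
    "owner": "payments-platform-team",
    "risk_tier": "medium",
    "routing": {"min_fit_score": 0.6, "max_latency_ms": 8000},
    "rollout": {"stage": "ga", "traffic_percent": 100},
}

def is_routable(entry: dict, fit_score: float, expected_latency_ms: int) -> bool:
    # Control-plane gate evaluated before any execution-plane work starts.
    return (
        entry["rollout"]["stage"] != "disabled"
        and fit_score >= entry["routing"]["min_fit_score"]
        and expected_latency_ms <= entry["routing"]["max_latency_ms"]
    )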

Mathematical model: route score vs policy eligibility

A practical routing pattern:

$$ Eligible(s, q) = PolicyAllow(s, q) \land PermissionAllow(s, q) $$

Then score only eligible skills:

$$ Score(s \mid q) = a \cdot Fit - b \cdot Latency - c \cdot Risk + d \cdot Reliability $$

And choose:

$$ s^* = \arg\max_{s \in S, Eligible(s,q)} Score(s \mid q) $$

This keeps policy decisions explicit and auditable.
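In code, the same pattern is a filter followed by a scored argmax. The weights and skill attributes below are illustrative; only the eligibility-before-scoring order matters.

from typing import Optional

def eligible(skill: dict, query: dict) -> bool:
    # PolicyAllow AND PermissionAllow: both must hold before a skill is even scored.
    return skill["policy_allows"](query) and skill["permission_allows"](query)

def score(skill: dict, query: dict, a=1.0, b=0.2, c=0.5, d=0.3) -> float:
    # Score(s | q) = a*Fit - b*Latency - c*Risk + d*Reliability
    return (
        a * skill["fit"](query)
        - b * skill["expected_latency_s"]
        - c * skill["risk"]
        + d * skill["reliability"]
    )

def select_skill(skills: list, query: dict) -> Optional[dict]:
    candidates = [s for s in skills if eligible(s, query)]
    return max(candidates, key=lambda s: score(s, query)) if candidates else None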

Performance analysis: where latency is spent

| Component | Typical latency share | Notes |
| --- | --- | --- |
| LLM reasoning calls | High | Prompt and model dependent |
| Tool/MCP network I/O | Medium to high | Dominant in API-heavy skills |
| Orchestration overhead | Low to medium | Usually acceptable trade for reliability |
| Validation and output shaping | Low | Worth it for contract safety |

A common mistake is optimizing graph overhead while ignoring remote tool latency. Measure the right bottleneck.
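A small amount of instrumentation is usually enough to see where time actually goes; the component labels below are illustrative.

import time
from collections import defaultdict
from contextlib import contextmanager

LATENCY_MS = defaultdict(float)

@contextmanager
def timed(component: str):
    # Accumulate wall-clock time per component so shares can be compared after a run.
    start = time.perf_counter()
    try:
        yield
    finally:
        LATENCY_MS[component] += (time.perf_counter() - start) * 1000

# Usage inside nodes or adapters (illustrative):
#   with timed("mcp_tool_io"):
#       metrics = fetch_payment_metrics("payments", 15)
#   with timed("output_validation"):
#       validate(result)
# print(dict(LATENCY_MS))  # compare shares before optimizing orchestration overhead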


🔬 Internals

LangGraph models agent logic as a directed graph of nodes (LLM calls, tool invocations) and edges (conditional routing, loops). State is a typed dict threaded through each node, enabling persistent checkpointing and resumable workflows. MCP (Model Context Protocol) standardizes tool interfaces between LLMs and external systems via a JSON-RPC-like protocol, decoupling tool implementation from agent framework.
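A minimal sketch of the checkpointing mechanic, assuming a recent langgraph release with the bundled in-memory checkpointer (the thread id and state shape are arbitrary):

from typing import TypedDict
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

class CounterState(TypedDict):
    count: int

def bump(state: CounterState) -> CounterState:
    return {"count": state["count"] + 1}

g = StateGraph(CounterState)
g.add_node("bump", bump)
g.set_entry_point("bump")
g.add_edge("bump", END)
app = g.compile(checkpointer=MemorySaver())              # state is persisted after every node

config = {"configurable": {"thread_id": "incident-42"}}  # identifies the resumable thread
app.invoke({"count": 0}, config)
print(app.get_state(config).values)                      # checkpointed state: {"count": 1}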

⚡ Performance Analysis

LangGraph adds ~10–20ms overhead per graph step versus raw LLM calls due to state serialization and edge evaluation. Multi-agent LangGraph workflows with 3 specialized agents complete complex tasks 2–3× faster than single-agent loops by parallelizing independent subtasks. MCP server round-trips add 5–50ms depending on transport (stdio vs. HTTP), making it suitable for all but the most latency-sensitive applications.

📊 Sequence View: Tool-Only Agent vs Skill-Centric Agent

sequenceDiagram
    participant U as User
    participant A as Agent
    participant G as LangGraph Runtime
    participant M as MCP Server
    participant T as Tool API
    participant S as Skill Contract

    U->>A: "Investigate billing failures"
    A->>S: select(BillingIncidentSkill)
    S->>G: execute(skill_state)
    G->>M: call(fetch_metrics)
    M->>T: invoke tool
    T-->>M: metrics
    M-->>G: typed result
    G->>M: call(open_incident)
    M->>T: invoke tool
    T-->>M: ticket id
    M-->>G: typed result
    G-->>S: final state
    S-->>A: contract-valid output
    A-->>U: summary + ticket link

A tool-only approach may skip S and return free-form text. That is fast to demo, but risky for integrations that expect strict output fields.


๐ŸŒ Real-World Applications: Real-World Patterns That Make This Practical

Pattern 1: Product support copilot

  • Tools: CRM lookup, order API, refund API.
  • MCP: centralizes access to those systems.
  • LangGraph: executes decision branches (refund eligible or not).
  • Skill: RefundResolutionSkill returns decision, reason, next_action.

Pattern 2: Security triage assistant

  • Tools: SIEM query, IOC enrichment, ticketing.
  • LangGraph: handles iterative enrichment loop.
  • Skill: AlertTriageSkill enforces policy that high-risk actions require human approval.

Pattern 3: Data analyst copilot

  • Tools: SQL execution, chart rendering, metadata lookup.
  • MCP: gives one protocol for multiple data backends.
  • Skill: KPIExplainerSkill guarantees output schema with query, metric, confidence, limitations.

| Use case | Why tool-only struggles | Why skill-centric works |
| --- | --- | --- |
| Support automation | Inconsistent output fields | Stable contract for downstream workflow |
| Security operations | Unsafe autonomous actions | Risk policy encoded at skill boundary |
| Analytics Q&A | Hallucinated field names | Validated query and structured explanation |
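For pattern 1, the skill contract can be as small as a typed result plus a validation step; a minimal sketch, assuming the three fields named above:

from dataclasses import dataclass

@dataclass(frozen=True)
class RefundResolution:
    decision: str     # e.g. "approve", "deny", or "escalate"
    reason: str       # policy clause or rule that justified the decision
    next_action: str  # e.g. "issue_refund", "request_documents", "route_to_human"

def validate_resolution(result: RefundResolution) -> RefundResolution:
    # Downstream workflow automation depends on these exact fields, not on free-form text.
    if result.decision not in {"approve", "deny", "escalate"}:
        raise ValueError(f"unexpected decision value: {result.decision}")
    return result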

โš–๏ธ Trade-offs & Failure Modes: Trade-offs and Failure Modes

You should not force everything into skills. Keep the cost-benefit clear.

| Choice | Benefit | Cost |
| --- | --- | --- |
| Tool-only for simple tasks | Fast implementation | Low reuse and weak governance |
| Full skill contracts for critical tasks | Reliability and observability | More design and lifecycle work |
| Heavy graph abstraction everywhere | Uniform runtime patterns | Overhead for trivial features |

Common failure modes:

  1. Skill inflation: too many overlapping skills with unclear ownership.
  2. Framework lock-in confusion: capability modeled in framework internals only.
  3. Policy leakage: risk rules hidden in prompts instead of explicit control plane.
  4. Protocol overconfidence: assuming MCP alone gives governance.

Mitigations:

  • maintain a capability taxonomy,
  • enforce input/output schemas,
  • version skills separately from graph internals,
  • keep policy checks outside prompt-only logic.

🧭 Decision Guide: What to Build at Your Current Maturity

| Your current stage | Recommended next step |
| --- | --- |
| 1 to 3 tools, single team prototype | Start with LangChain/LangGraph and basic telemetry |
| 5 to 15 tools, repeated user journeys | Introduce explicit skill contracts |
| Multi-team platform with compliance needs | Add skill registry, policy gates, and evaluation loops |
| High-risk automation (finance/security/health) | Skill-first design with human approval paths |

Quick rule set:

| Question | If yes | If no |
| --- | --- | --- |
| Is the task multi-step with branching? | Use LangGraph | Simple chain/tool call may be enough |
| Does output feed another system? | Define skill contract | Free-form output may be acceptable |
| Are there risk or compliance constraints? | Add policy-gated skill routing | Keep lighter execution model |
| Will this capability be reused by many teams? | Register as skill | Keep as local orchestration |

🧪 Practical Example: One Capability Across All Layers

Example 1: Tool and MCP-facing adapter

This example traces a single payment incident investigation capability across all four architectural layers (raw tool function, skill contract with policy gate, LangGraph-orchestrated graph execution, and MCP-compatible adapter) to show how the same business logic looks when implemented at each layer. The multi-layer presentation was chosen deliberately because the most common architectural mistake in agent systems is forcing one layer to do the work of all the others. When reading the code, focus on where the output contract appears: only the skill layer enforces a stable typed return value, which is what makes downstream integrations predictable regardless of how the internal execution changes.

# Tool function signature

def fetch_payment_metrics(service: str, window_minutes: int) -> dict:
    return {
        "service": service,
        "window_minutes": window_minutes,
        "error_rate": 0.087,
        "p95_latency_ms": 1230,
    }

# In practice this tool may be exposed through an MCP server with typed schemas.

Example 2: LangGraph workflow plus skill boundary

from dataclasses import dataclass

@dataclass
class PaymentIncidentInput:
    service: str
    window_minutes: int

def payment_incident_skill(payload: PaymentIncidentInput) -> dict:
    # 1) Validate boundary
    if payload.window_minutes <= 0:
        raise ValueError("window_minutes must be positive")

    # 2) Graph execution would happen here
    metrics = fetch_payment_metrics(payload.service, payload.window_minutes)

    # 3) Policy gate
    must_open_incident = metrics["error_rate"] >= 0.05

    # 4) Stable contract
    return {
        "service": payload.service,
        "error_rate": metrics["error_rate"],
        "incident_required": must_open_incident,
        "reason": "error_rate_threshold_breached" if must_open_incident else "within_limits",
    }

This code is simple, but the design principle is important: output contract remains stable even if runtime internals change.


๐Ÿ› ๏ธ LangChain, LangGraph, and MCP: Concrete Implementation of the Four Layers

LangChain provides the Runnable abstraction: chainable, composable steps with a uniform .invoke() / .stream() interface. LangGraph adds stateful graph execution with checkpointing and cyclic routing. MCP (Model Context Protocol) is an emerging open standard for exposing tools and data resources to LLMs over a typed protocol, enabling cross-framework capability sharing.

# pip install langchain langchain-core langgraph
from langchain_core.runnables import RunnableLambda
from langgraph.graph import StateGraph, END
from typing import TypedDict, Optional

# --- Layer 1: Tool - one atomic action ---
def fetch_payment_metrics(service: str, window: int = 15) -> dict:
    """Retrieve error rate and latency; replace with real observability API."""
    return {"service": service, "error_rate": 0.072, "p95_ms": 980}

# --- Layer 2: LangChain Runnable chain - lightweight sequential orchestration ---
enrich_chain = (
    RunnableLambda(lambda x: fetch_payment_metrics(x["service"]))
    | RunnableLambda(lambda m: {**m, "alert": m["error_rate"] >= 0.05})
)
# Use for simple, linear workflows: result = enrich_chain.invoke({"service": "pay-svc"})

# --- Layer 3: LangGraph workflow - stateful branching and retry ---
class PaymentState(TypedDict):
    service:     str
    metrics:     Optional[dict]
    incident_id: Optional[str]

def fetch_step(state: PaymentState) -> PaymentState:
    return {**state, "metrics": fetch_payment_metrics(state["service"])}

def ticket_step(state: PaymentState) -> PaymentState:
    iid = f"INC-{abs(hash(state['service'])) % 9999:04d}"
    return {**state, "incident_id": iid}

def should_escalate(state: PaymentState) -> str:
    return "ticket" if state.get("metrics", {}).get("error_rate", 0) >= 0.05 else END

graph = StateGraph(PaymentState)
graph.add_node("fetch",  fetch_step)
graph.add_node("ticket", ticket_step)
graph.set_entry_point("fetch")
graph.add_conditional_edges("fetch", should_escalate)
graph.add_edge("ticket", END)
workflow = graph.compile()

# --- Layer 4: Skill - stable product contract wrapping the graph ---
def payment_incident_skill(service: str) -> dict:
    """
    Public capability contract. Internal runtime can change without affecting callers.
    This is what the registry, router, and downstream systems depend on.
    """
    result = workflow.invoke({"service": service, "metrics": None, "incident_id": None})
    return {
        "service":     result["service"],
        "error_rate":  result.get("metrics", {}).get("error_rate"),
        "incident_id": result.get("incident_id") or "none",
    }

print(payment_incident_skill("payments-svc"))

This single code block illustrates the separation of concerns: the tool is an atomic function, the LangChain Runnable chain handles linear orchestration, LangGraph manages stateful branching, and the skill wrapper exposes a stable output contract. The MCP server layer sits below the tools: it exposes fetch_payment_metrics as a typed resource endpoint that any MCP-compatible agent framework can call without reimplementing the adapter.
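For readers who want to see that adapter layer, a minimal sketch using the FastMCP helper from the official Python SDK follows; the server name and tool docstring are illustrative, and the metric values remain placeholders.

# pip install mcp
from mcp.server.fastmcp import FastMCP

mcp_server = FastMCP("payments-observability")

@mcp_server.tool()
def fetch_payment_metrics(service: str, window_minutes: int = 15) -> dict:
    """Return error rate and p95 latency for a payments service over the given window."""
    # Replace with a real observability query; placeholder values keep the sketch self-contained.
    return {"service": service, "window_minutes": window_minutes, "error_rate": 0.072, "p95_ms": 980}

if __name__ == "__main__":
    mcp_server.run(transport="stdio")  # any MCP-compatible client can now discover and call this tool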

For a full deep-dive on LangGraph checkpointing, MCP server implementation, and multi-agent skill delegation, a dedicated follow-up post is planned.


📚 Lessons Learned from Teams Shipping Agents

  • Tools, protocols, frameworks, and skills are complementary layers.
  • Framework quality does not replace capability modeling discipline.
  • MCP improves interoperability, not product governance by itself.
  • Skills reduce prompt sprawl by encoding reusable outcome contracts.
  • Keep control plane concerns explicit: ownership, risk tier, version, and evaluation.
  • Design for debuggability: capture route decisions and contract validation failures.

📌 TLDR: Summary & Key Takeaways

  • Tool is an atomic action.
  • MCP is a standard way to expose and call capabilities.
  • LangChain and LangGraph orchestrate execution.
  • Skill is a product-level capability contract with policy and stable outputs.
  • Most production reliability gains come from adding skill boundaries, not from switching frameworks.
  • Build layers incrementally: execution first, then contract and governance as reuse and risk grow.

One-liner: LangGraph and MCP help you run workflows; skills help you ship dependable capabilities.

