Distributed Transactions: 2PC, Saga, and XA Explained
How to maintain data consistency across microservices when ACID transactions don't span databases.
Abstract Algorithms · Intermediate
For developers with some experience. Builds on fundamentals.
Estimated read time: 25 min
AI-assisted content. This post may have been written or enhanced with AI tools. Please verify critical information independently.
TLDR: Distributed transactions require you to choose a consistency model before choosing a protocol. 2PC and XA give atomic all-or-nothing commits but block all participants on coordinator failure. Saga gives eventual consistency with explicit compensation and survives coordinator crashes, at the cost of designing a compensating transaction for every step. Choose 2PC/XA when atomicity is non-negotiable and participants are co-managed; choose Saga when services are independent and eventual consistency is acceptable. Always pair Saga with the Outbox pattern and idempotency keys: they are what make Saga production-safe.
The Charge-Without-Order Incident: Why ACID Transactions Don't Cross Service Boundaries
In 2018, a major e-commerce platform discovered that customers had been charged for orders that were never created. The payment service completed its database write successfully. The order service received the request and started writing, but the process crashed before it committed. From the payment service's perspective, everything succeeded. From the customer's perspective, money was gone and no order existed.
In a monolith, this scenario cannot happen. You open a single database connection, write both rows inside the same BEGIN / COMMIT block, and the database's ACID guarantees enforce atomicity: either both rows land, or neither does. Rollback is automatic on failure.
In a microservices architecture, you cannot do this. The payment database and the order database are physically separate servers. They may be different vendors: PostgreSQL for orders, MySQL for payments, Cassandra for notifications. There is no single transaction coordinator that holds locks across all of them. When you make two separate service calls, you have two separate transactions. If the second call fails after the first has committed, you have a partial write, and no built-in mechanism to undo it.
This is the distributed transaction problem. Before choosing a solution, you must answer three questions that eliminate options:
| Design question | Why the answer eliminates options |
| --- | --- |
| Can all participants be online simultaneously? | 2PC requires every participant to respond synchronously; one timeout aborts the entire operation |
| Is the operation long-running (seconds to minutes)? | Holding database locks across network calls causes severe contention at scale |
| Can the business tolerate a short window of inconsistency? | If yes, Saga is viable; if no, 2PC or XA is the only choice |
Two-Phase Commit: How a Blocking Protocol Achieves Distributed Atomicity
Two-Phase Commit uses a coordinator (one service or middleware node) that orchestrates a global commit across multiple participants (individual databases or services). The protocol unfolds in two strict phases.
Phase 1 (Prepare): The coordinator sends a PREPARE message to every participant. Each participant writes its data to a durable write-ahead log, acquires all needed locks, and votes. A VOTE_COMMIT means "I can guarantee a successful commit if you tell me to." A VOTE_ABORT means "I cannot proceed."
Phase 2 (Commit or Abort): If every participant voted VOTE_COMMIT, the coordinator writes a commit record to its own durable log and broadcasts COMMIT. If any participant voted VOTE_ABORT, the coordinator broadcasts ABORT and all participants roll back.
The critical constraint: between voting VOTE_COMMIT in Phase 1 and receiving the coordinator's final decision in Phase 2, each participant holds all its acquired locks. If the coordinator crashes in that window, participants cannot proceed: they cannot commit unilaterally (another participant may have aborted) and cannot abort unilaterally (the coordinator may have written a commit record). This is the uncertainty window, the fundamental flaw of 2PC.
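The two phases can be sketched as a minimal in-memory simulation. This is illustrative Python only, not a production coordinator; the participant class and method names are invented for the example, and the Phase 2 broadcast is shown as reliable even though in reality it can be lost (which is exactly where the uncertainty window bites):

```python
# Minimal 2PC simulation (illustrative only).
# Phase 1: every participant votes. Phase 2: coordinator decides.

class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "IDLE"

    def prepare(self):
        # Write to WAL, acquire locks, then vote.
        if self.can_commit:
            self.state = "PREPARED"  # locks stay held until the Phase 2 decision
            return "VOTE_COMMIT"
        self.state = "ABORTED"
        return "VOTE_ABORT"

    def finish(self, decision):
        self.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"


def two_phase_commit(participants):
    # Phase 1: collect votes from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: COMMIT only if *all* voted commit; otherwise ABORT.
    decision = "COMMIT" if all(v == "VOTE_COMMIT" for v in votes) else "ABORT"
    for p in participants:
        p.finish(decision)  # in a real system this broadcast can be lost
    return decision


payment, order = Participant("payment-db"), Participant("order-db")
print(two_phase_commit([payment, order]))                    # COMMIT

payment2 = Participant("payment-db")
order2 = Participant("order-db", can_commit=False)
print(two_phase_commit([payment2, order2]))                  # ABORT
```

The single `all(...)` check is the heart of the protocol: one dissenting vote aborts everyone, which is why one slow or partitioned participant stalls the whole operation.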
| 2PC failure scenario | What happens to participants |
| --- | --- |
| Coordinator crashes before sending PREPARE | Transaction never starts; all participants are unaffected |
| Coordinator crashes after collecting all votes, before COMMIT | All prepared participants hold locks indefinitely (the blocking protocol) |
| One participant crashes before voting | Coordinator aborts after timeout; all others roll back cleanly |
| One participant crashes after voting COMMIT | Coordinator waits for ACK; participant recovers from its WAL and re-applies |
| Network partition between coordinator and one participant | That participant is stuck prepared; the coordinator's broadcast determines the others |
When 2PC is the right call:
- Two to three databases, same vendor, strong consistency is a non-negotiable requirement.
- Operation duration is milliseconds, never seconds.
- The coordinator is highly available (ZooKeeper, etcd, or database-native distributed coordinator).
- All participants speak the 2PC protocol (rules out most NoSQL stores).
XA Transactions: The Database Industry Standard for 2PC
XA is the X/Open distributed transaction standard that specifies how a transaction manager coordinates multiple resource managers (databases) using the 2PC protocol. In Java, the Java Transaction API (JTA) is the primary implementation. Most relational databases (PostgreSQL, MySQL, Oracle, IBM DB2) support XA.
In a JTA setup, the application server acts as the transaction manager, automatically detecting and enlisting each XADataSource connection opened within a UserTransaction.begin() / commit() block. Application code calls utx.begin(), performs normal JDBC operations against both the payment and order datasources, and calls utx.commit(). The JTA runtime translates this into a full two-phase commit: it calls xa.prepare() on every enlisted datasource in sequence; if all prepare calls succeed, it calls xa.commit() on each; if any prepare fails, it calls xa.rollback() on all enlisted resources. The application code never directly coordinates the two-phase protocol; JTA handles it transparently, but the coordinator and lock-duration constraints of 2PC apply in full.
This looks elegant at small scale, but XA carries critical limitations for microservices:
- Database vendor lock-in. All datasources must implement the XA protocol. Redis, DynamoDB, Cassandra, and most NoSQL stores do not.
- Not cross-process. XA only works when the transaction manager can directly enlist XADataSource connections. It cannot coordinate an HTTP call to a remote service or a Kafka publish.
- Performance at scale. XA prepare() is synchronous and holds row-level locks for the full duration of the global transaction. At high write throughput, this creates severe lock contention.
- Connection pool friction. HikariCP, the dominant Java connection pool, does not support XA connections in its default configuration. You must switch to a JTA-aware pool (Agroal, c3p0 XA), reintroducing complexity.
XA is a viable choice for Java EE monoliths with multiple datasources under a single DBA team's control. It is not a distributed transaction solution for independently-owned polyglot microservices.
Deep Dive: The Saga Pattern, Long-Running Transactions Without Global Locks
A Saga decomposes a distributed transaction into a sequence of local transactions, each fully committed by exactly one service. If a later step fails, the saga triggers compensating transactions in reverse order: new, forward-moving database writes that undo the business effect of prior steps.
A saga never holds locks across services. Each local transaction commits and releases its locks immediately. Consistency is eventual (the system may be momentarily inconsistent between steps), but well-designed compensation guarantees convergence to a coherent final state.
There are two coordination styles:
| Saga style | Coordinator | Failure detection | Best for |
| --- | --- | --- | --- |
| Orchestration | Central saga class sends commands | Orchestrator tracks step state explicitly | Complex conditional workflows, multi-branch logic, compliance auditability |
| Choreography | None; workflow is emergent from events | Each service reacts to its own failure events | High-throughput pipelines, independent team ownership, loose coupling |
The Saga Internals: Orchestration, Choreography, and Compensation Mechanics
In an Orchestration Saga, a central Saga Orchestrator holds the workflow definition. It sends a command to the first participant, waits for a success or failure event, and then decides the next command. On failure, it issues compensating commands in strict reverse order.
In a Choreography Saga, there is no central brain. Each service listens to domain events from a message broker and reacts: it performs its local transaction and emits the next event. A PaymentFailedEvent causes the Inventory Service to react with a ReleaseInventoryCommand, with no orchestrator needed. The workflow emerges from the event chain.
Compensation ordering is critical. Compensating transactions must reverse steps in the exact reverse sequence they were applied. If Payment fails after Inventory was reserved, the correct compensation order is:
- Step 3 (CreateShipment): did not execute; no compensation needed.
- Step 2 (AuthorizePayment): failed; issue ReleaseInventoryCommand first.
- Step 1 (ReserveInventory): was successful; issue CancelOrder only after inventory is released.
Reversing this order (cancelling the order before releasing inventory) leaves the system readable as "cancelled order with unreleased stock" during the compensation window, a different inconsistency introduced by the cleanup itself.
The compensability constraint. Every saga step must have a pre-designed compensating transaction. If a step's effect is physically irreversible (triggering a warehouse pick, sending an SMS), move it to the last position so it only fires after all prior steps confirm. Alternatively, design a semantic undo: "send cancellation confirmation" as the compensating action for "send order confirmation."
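Reverse-order compensation falls out naturally from a stack: each completed step pushes its compensator, and on failure the stack unwinds. A minimal sketch, with invented helper names and step names taken from the checkout example:

```python
# Saga executor sketch: run steps in order; on failure, run the
# compensators of already-completed steps in strict reverse order.

def run_saga(steps):
    """steps: list of (name, action, compensator). Returns an execution log."""
    log, completed = [], []
    for name, action, compensate in steps:
        if action():
            log.append(name)
            completed.append((name, compensate))  # push compensator
        else:
            # Unwind: compensate completed steps, last-completed first.
            for prior_name, prior_comp in reversed(completed):
                prior_comp()
                log.append(f"compensate:{prior_name}")
            break
    return log

ok = lambda: True      # stand-in for a local transaction that commits
fail = lambda: False   # stand-in for a declined payment, etc.
noop = lambda: None    # stand-in for a compensating write

trace = run_saga([
    ("ReserveInventory", ok, noop),    # succeeds, compensator pushed
    ("AuthorizePayment", fail, noop),  # fails -> unwind begins here
    ("CreateShipment", ok, noop),      # never runs
])
print(trace)  # ['ReserveInventory', 'compensate:ReserveInventory']
```

Note that CreateShipment contributes nothing to the log: steps after the failure point never execute, so they never need compensation, exactly as in the ordering rules above.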
Performance Analysis: Lock Duration, Throughput, and Latency Characteristics
The performance profile of each approach is determined by lock duration:
| Protocol | Lock scope | Lock duration | Impact at 1000 TPS |
| --- | --- | --- | --- |
| 2PC / XA | All participants simultaneously | Full duration of coordinator round-trip | Lock contention across services; throughput degrades non-linearly |
| Saga | One service at a time | Duration of that service's local transaction only | Each step's lock is released before the next step begins |
| Saga (failed path) | Compensating steps only | Duration of each compensating transaction | Compensation runs serially in reverse; adds latency proportional to steps |
At 1,000 writes per second across three services, 2PC holds locks on all three databases for the entire coordinator round-trip: at minimum two message rounds (PREPARE × 3 with votes, then COMMIT × 3 with acks). Saga holds locks on only one database at a time, each for the duration of a single local transaction. The throughput difference at scale is significant.
The trade-off: Saga's reduced locking comes at the cost of temporary inconsistency between steps. Design your system to surface intermediate saga states (e.g., PAYMENT_PENDING, INVENTORY_RESERVED) as meaningful business states rather than implementation artifacts.
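The lock-duration gap can be made concrete with a back-of-the-envelope model. The latency constants below are invented for illustration; the point is the shape of the formulas, not the numbers:

```python
# Rough model of concurrent lock-hold time per business transaction (ms).
# Assumed numbers for illustration: 2 ms local tx, 5 ms one-way network hop.

LOCAL_TX_MS = 2
HOP_MS = 5

def lock_time_2pc(n_participants):
    # Each participant holds locks from PREPARE until COMMIT arrives:
    # prepare, vote, commit, ack = four one-way hops, plus its local work.
    per_participant = LOCAL_TX_MS + 4 * HOP_MS
    # All participants hold their locks simultaneously.
    return n_participants * per_participant

def lock_time_saga(n_steps):
    # Each step holds only its own local-transaction locks, one service
    # at a time, so concurrent lock time never exceeds one local tx.
    return LOCAL_TX_MS

print(lock_time_2pc(3))   # 66: three DBs locked through the full round-trip
print(lock_time_saga(3))  # 2: one local transaction's locks at any instant
```

Under these assumptions the saga's concurrent lock footprint is independent of the number of steps, while 2PC's grows with both participant count and network latency, which is the non-linear contention the table above describes.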
Mathematical Model: The Theoretical Bound on Distributed Consistency
The FLP Impossibility result (Fischer, Lynch, Paterson, 1985) proves that in a fully asynchronous distributed system, no consensus algorithm can guarantee both safety (all participants agree on the same value) and liveness (the algorithm always terminates) in the presence of even a single process failure.
2PC violates liveness: when the coordinator fails in the uncertainty window, participants may wait indefinitely. This is not a bug in 2PC; it is provably unavoidable for any protocol that attempts strong consistency across distributed participants.
The CAP Theorem formalizes the consequence: you cannot simultaneously guarantee Consistency, Availability, and Partition Tolerance. 2PC and XA choose Consistency + Partition Tolerance, sacrificing availability on coordinator failure. Saga chooses Availability + Partition Tolerance, sacrificing immediate consistency in favor of eventual convergence through compensation.
In practice, the choice between 2PC and Saga is not an engineering preference; it is a direct consequence of your system's position on the CAP triangle, which in turn is a business decision about which failure mode is tolerable.
Visualizing Distributed Transaction Flows
The Two-Phase Commit Protocol in Sequence
sequenceDiagram
participant C as Coordinator
participant P1 as Payment DB
participant P2 as Order DB
C->>P1: PREPARE
C->>P2: PREPARE
P1-->>C: VOTE_COMMIT
P2-->>C: VOTE_COMMIT
Note over C: All votes = COMMIT. Write commit record to WAL
C->>P1: COMMIT
C->>P2: COMMIT
P1-->>C: ACK
P2-->>C: ACK
Note over P1,P2: Both participants commit and release locks
This sequence diagram shows the nominal path: both participants vote yes, the coordinator writes its commit decision to its own WAL (making the decision durable), then broadcasts COMMIT to all participants. The uncertainty window is the gap between the coordinator writing its commit record and each participant receiving the COMMIT message: if the coordinator crashes inside this window, participants are stuck with locks held and no safe unilateral action available.
The Saga Compensation Flow: Happy Path and Failure Recovery
flowchart TD
A["Order Placed"] --> B["Reserve Inventory (local tx commits immediately)"]
B -->|"InventoryReservedEvent"| C["Authorize Payment (local tx commits immediately)"]
C -->|"PaymentAuthorizedEvent"| D["Create Shipment (local tx commits immediately)"]
D --> E["Order Confirmed: Saga Complete"]
B -->|"OutOfStockEvent"| F["Cancel Order (no prior steps to compensate)"]
F --> I["Saga Ended: Insufficient Stock"]
C -->|"PaymentDeclinedEvent"| G["Release Inventory (compensating tx, reverses Step 1)"]
G --> H["Cancel Order (compensating tx, closes saga)"]
H --> J["Saga Compensated: Payment Declined"]
This flowchart shows both the happy path (all three local transactions succeed, saga ends cleanly) and the failure path (payment fails after inventory was already reserved). The compensation chain runs in strict reverse: ReleaseInventory fires before CancelOrder because the inventory reservation was the earlier step; releasing it first restores the earlier invariant before the order record is cancelled. The key insight is that compensation is not a rollback: it is forward-moving business logic that undoes a prior business effect.
Real-World Applications: How Major Systems Handle Distributed Transaction Consistency
Uber's Trip Lifecycle (Choreography Saga): Uber's core trip creation uses a choreography-based saga across Matching, Trip, Payment, and Driver Pay services. Each service emits domain events rather than calling upstream services directly. When a payment pre-authorization fails, the Trip service receives a PaymentFailedEvent and compensates by releasing the matched driver and cancelling the trip. There is no central orchestrator; the workflow emerges from event reactions. This approach allows Uber to update any individual service independently without modifying a central workflow definition.
Amazon's Order Fulfillment (Orchestration Saga): Amazon's order pipeline spans inventory reservation, payment authorization, warehouse pick instruction, and shipment label generation. Because the fulfillment workflow has conditional branching (multiple fulfillment centers, split shipments, pre-order holds), Amazon uses an orchestration model where a central Workflow Engine (similar to AWS Step Functions) tracks saga state and issues compensation commands explicitly. The branching logic and per-step retry policies live in one inspectable place.
Financial Services (2PC with XA): Core banking systems handling wire transfers between accounts in the same bank use XA transactions via JTA. Both the debit account and credit account are PostgreSQL datasources under the same DBA team's control. The amounts are small, the operations are milliseconds, and the strong consistency guarantee is a regulatory requirement. XA is appropriate here precisely because the constraints 2PC demands (same-vendor, co-managed, tight latency) are all satisfied.
The Booking.com Flight + Hotel Bundle (Two-Step Saga with hold/confirm): Booking.com's package booking uses a "hold-then-confirm" saga. Flight seats and hotel rooms are first held (reserved but not confirmed) via a local transaction in each vendor's system. Holds have a TTL. Only after both holds succeed does the saga send confirm commands. If a confirmation fails, the hold is allowed to expire: a time-bounded semantic compensation rather than an explicit undo call. This is a practical variant of compensation design: not all compensations require an active rollback call.
The Outbox Pattern: Atomically Publishing Events Alongside Database Writes
The Saga pattern immediately creates a dual-write problem: a service must both commit its local transaction to its own database and publish an event to a message broker (Kafka, RabbitMQ) so downstream saga participants can proceed. These are two separate writes. If the service crashes after the database commit but before the broker publish completes, the event is silently lost and the saga stalls permanently.
The Outbox pattern solves this by reducing the dual-write to a single atomic database transaction:
The outbox table lives in the same database as the service's business data. Each row captures everything the poller needs to forward the event: aggregate_type and aggregate_id identify which business entity changed, event_type names the domain event, and payload carries the event body as a JSON object. The published_at column acts as the pending/published flag: a null value means the event is waiting to be forwarded; a non-null timestamp means the poller has successfully delivered it to the broker. A CDC connector such as Debezium can tail the database write-ahead log to detect new outbox rows without polling, forwarding each event with low latency while keeping the publication step entirely outside the business transaction.
The service writes its business data (say, a stock decrement) and its outbox event record in the same database transaction: if the transaction commits, both writes land atomically; if it rolls back for any reason, neither survives. A separate outbox poller (or a Change Data Capture connector like Debezium reading the PostgreSQL WAL) reads rows with a null published_at timestamp, forwards each event to the broker, and marks them published.
Because the poller runs outside the business transaction and retries on failure, delivery is at-least-once: the same event may be published more than once if the poller crashes between forwarding and marking, or during a broker redelivery. This is expected and correct behavior, not a bug, but it means every downstream saga participant must handle duplicate messages idempotently.
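The single-transaction write and the poller can be demonstrated with SQLite standing in for the service database. The table and column names follow the description above; everything else (the broker callback, the stock table) is invented for the sketch:

```python
import json
import sqlite3

# In-memory DB standing in for the inventory service's own database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stock (sku TEXT PRIMARY KEY, qty INTEGER)")
db.execute("""CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    aggregate_type TEXT, aggregate_id TEXT,
    event_type TEXT, payload TEXT,
    published_at TEXT)""")
db.execute("INSERT INTO stock VALUES ('widget', 10)")
db.commit()

def reserve_inventory(order_id, sku, qty):
    # Business write and outbox write share ONE transaction: both commit
    # together or neither does -- there is no dual-write window.
    with db:
        db.execute("UPDATE stock SET qty = qty - ? WHERE sku = ?", (qty, sku))
        db.execute(
            "INSERT INTO outbox (aggregate_type, aggregate_id, event_type,"
            " payload, published_at) VALUES (?, ?, ?, ?, NULL)",
            ("Order", order_id, "InventoryReservedEvent",
             json.dumps({"orderId": order_id, "sku": sku, "qty": qty})))

def poll_outbox(publish):
    # Poller: forward unpublished rows, then mark them published.
    # A crash between publish() and the UPDATE re-sends the event on the
    # next poll -- at-least-once delivery, by design.
    rows = db.execute("SELECT id, event_type, payload FROM outbox"
                      " WHERE published_at IS NULL").fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, payload)  # a Kafka producer in production
        db.execute("UPDATE outbox SET published_at = datetime('now')"
                   " WHERE id = ?", (row_id,))
    db.commit()
    return len(rows)

reserve_inventory("X", "widget", 3)
sent = []
print(poll_outbox(lambda event_type, payload: sent.append(event_type)))  # 1
print(db.execute("SELECT qty FROM stock").fetchone()[0])                 # 7
```

The comment inside `poll_outbox` marks the exact crash point that produces a duplicate publish, which is why idempotent consumers (next section) are the mandatory companion to this pattern.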
Idempotency: Making Saga Retries and At-Least-Once Delivery Safe
Saga orchestrators retry failed steps. Message brokers redeliver messages on consumer failure. Outbox pollers may publish the same event twice during a polling window overlap. Without idempotency, a retried AuthorizePaymentCommand charges the customer twice, a real production failure at every company that skipped this step.
Every saga participant must be idempotent: processing the same request multiple times must produce exactly the same observable result as processing it once. The standard implementation is the idempotency key pattern:
Every saga participant checks an idempotency store before executing any business logic. The idempotency key (generated by the orchestrator and stable across all retries of that specific step) is looked up first; if a prior result exists, the participant returns it immediately without re-executing the operation. If no prior result exists, the participant executes the business logic and stores the result in the idempotency store within the same database transaction as the business write. This transactional coupling is critical: if the business write rolls back, the idempotency record must also roll back, ensuring the next retry correctly re-executes rather than finding a phantom "already completed" record from a failed attempt.
| Idempotency key design rule | Why it matters |
| --- | --- |
| Generated by the caller, not the recipient | The orchestrator controls the key; if recipients generated keys, two retries would produce two keys |
| Stable across all retries of the same step | Use a deterministic composite key: sagaId + ":" + stepName |
| TTL long enough to cover the retry window | After the window expires, the record can be garbage-collected |
| Stored in the same transaction as the business write | If the business write rolls back, the idempotency record must also roll back; no orphaned records |
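These rules reduce to a small amount of code. A sketch using SQLite, with invented table and function names; the key format follows the composite-key rule from the table:

```python
import sqlite3

# In-memory DB standing in for the payment service's database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE payments (order_id TEXT, amount REAL)")
db.execute("CREATE TABLE idempotency (key TEXT PRIMARY KEY, result TEXT)")
db.commit()

def authorize_payment(saga_id, order_id, amount):
    # Deterministic, caller-stable key: sagaId + ":" + stepName.
    key = f"{saga_id}:AuthorizePayment"
    prior = db.execute("SELECT result FROM idempotency WHERE key = ?",
                       (key,)).fetchone()
    if prior:
        return prior[0]  # duplicate delivery: replay the stored result
    # Business write and idempotency record in ONE transaction: if the
    # charge rolls back, the "already done" record rolls back with it.
    with db:
        db.execute("INSERT INTO payments VALUES (?, ?)", (order_id, amount))
        db.execute("INSERT INTO idempotency VALUES (?, ?)",
                   (key, "AUTHORIZED"))
    return "AUTHORIZED"

print(authorize_payment("saga-42", "X", 59.99))  # AUTHORIZED
print(authorize_payment("saga-42", "X", 59.99))  # AUTHORIZED (replayed)
print(db.execute("SELECT COUNT(*) FROM payments").fetchone()[0])  # 1
```

The second call performs no business write at all: the customer is charged exactly once no matter how many times the command is redelivered.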
Practical Walkthrough: Tracing a Failed Checkout Saga Step by Step
The following trace walks through an Orchestration Saga for a checkout flow using the Outbox pattern for event publishing. Payment is declined after inventory was reserved.
Step 1: Order Placed
- OrderService receives HTTP POST, writes Order(id=X, status=PENDING) to its DB.
- In the same transaction, writes OutboxEvent(InventoryReserveRequested, orderId=X) to its outbox table.
- Transaction commits. The outbox poller picks up the event and publishes to Kafka topic saga.commands.
Step 2: Inventory Service Receives Command
- InventoryService consumes ReserveInventoryCommand(orderId=X, quantity=3).
- Checks the idempotency store; not seen before.
- Writes Stock(-3) and OutboxEvent(InventoryReservedEvent, orderId=X) in one transaction. Commits.
- Poller publishes InventoryReservedEvent to saga.events.
Step 3: Orchestrator Advances Saga
- OrderFulfillmentSaga receives InventoryReservedEvent.
- Updates saga state to PAYMENT_PENDING in the saga store.
- Publishes AuthorizePaymentCommand(orderId=X, amount=59.99) to saga.commands.
Step 4: Payment Fails
- PaymentService consumes AuthorizePaymentCommand. Card declines.
- Writes OutboxEvent(PaymentDeclinedEvent, orderId=X) to its outbox. Commits.
- Poller publishes PaymentDeclinedEvent to saga.events.
Step 5: Orchestrator Triggers Compensation (reverse order)
- OrderFulfillmentSaga receives PaymentDeclinedEvent.
- Updates saga state to COMPENSATING.
- First compensation: sends ReleaseInventoryCommand(orderId=X); InventoryService restores stock.
- Second compensation: sends CancelOrderCommand(orderId=X, reason=PAYMENT_DECLINED); OrderService marks the order CANCELLED.
- Saga state moves to CANCELLED. Orchestrator marks the saga ended.
Why compensation order matters here: If CancelOrderCommand fired first, the Order record would enter CANCELLED state while the Inventory record still shows RESERVED. Any concurrent read of order state plus inventory state during that gap would see a cancelled order that still has stock reserved, the inverse inconsistency of the original failure. Releasing inventory first guarantees that the system moves from {order=PENDING, stock=RESERVED} → {order=PENDING, stock=AVAILABLE} → {order=CANCELLED, stock=AVAILABLE}, with each intermediate state being coherent.
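The orchestrator side of this trace can be replayed as a tiny state machine. The event, command, and state names come from the walkthrough; the dispatch logic is an invented simulation, not a framework API:

```python
# Orchestrator state machine for the checkout saga (simulation only).

class OrderFulfillmentSaga:
    def __init__(self, order_id):
        self.order_id = order_id
        self.state = "INVENTORY_PENDING"
        self.commands = []  # commands the orchestrator sends, in order

    def on_event(self, event):
        if event == "InventoryReservedEvent":
            self.state = "PAYMENT_PENDING"
            self.commands.append("AuthorizePaymentCommand")
        elif event == "PaymentAuthorizedEvent":
            self.state = "SHIPMENT_PENDING"
            self.commands.append("CreateShipmentCommand")
        elif event == "PaymentDeclinedEvent":
            self.state = "COMPENSATING"
            # Compensation in strict reverse order of completed steps:
            # release inventory (step 1's undo) before cancelling the order.
            self.commands.append("ReleaseInventoryCommand")
            self.commands.append("CancelOrderCommand")
            self.state = "CANCELLED"

saga = OrderFulfillmentSaga("X")
saga.on_event("InventoryReservedEvent")   # Step 3 of the trace
saga.on_event("PaymentDeclinedEvent")     # Steps 4-5 of the trace
print(saga.state)     # CANCELLED
print(saga.commands)
# ['AuthorizePaymentCommand', 'ReleaseInventoryCommand', 'CancelOrderCommand']
```

Replaying the two events produces exactly the command sequence of Steps 3 through 5, with the compensation pair in the order the walkthrough requires.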
Trade-offs and Failure Modes: 2PC, XA, Saga Orchestration, and Saga Choreography Compared
| Dimension | 2PC | XA (JTA) | Saga (Orchestration) | Saga (Choreography) |
| --- | --- | --- | --- | --- |
| Consistency model | Strong (atomic) | Strong (atomic) | Eventual | Eventual |
| Coordinator | External coordinator node | JTA transaction manager | Saga orchestrator class | None (emergent) |
| Availability on coordinator failure | Blocked (uncertainty window) | Blocked (JTA manager is SPOF) | Degraded but recoverable via saga store | Fully decoupled |
| Lock duration | Held across all participants for full 2PC duration | Held across all enlisted datasources | Per-step local tx only | Per-step local tx only |
| Participant requirements | Must implement 2PC protocol | Must use XA-capable datasource | Must handle command/event interface | Must align on shared event schema |
| Compensating transactions | Not needed (atomic rollback) | Not needed (JTA rollback) | Required (explicit per-step design) | Required (per-service reaction) |
| Polyglot datastores | No | No | Yes | Yes |
| Cross-HTTP service coordination | No | No | Yes | Yes |
| Observability | Transaction log | JTA transaction log | Centralized saga state store | Distributed tracing required |
| Debugging stuck transactions | Coordinator WAL | JTA recovery log | Saga state table (single queryable source) | Correlated trace IDs across all participants |
| Best use case | 2–3 same-vendor DBs, hard atomicity | Java EE multi-datasource, co-managed DBs | Complex conditional workflows, compliance audit | High-throughput event pipelines, independent teams |
Decision Guide: Choosing Your Distributed Transaction Strategy
| Situation | Recommendation |
| --- | --- |
| Financial reconciliation across 2–3 same-vendor databases, hard atomicity required | 2PC: use a highly-available coordinator; keep participant count minimal |
| Java EE / Spring app with multiple datasources under the same DBA team | XA (JTA): Atomikos or Bitronix; all datasources must be XA-capable |
| Long-running workflow (>100 ms), polyglot services, eventual consistency acceptable | Saga Orchestration: Axon Framework, Temporal, or AWS Step Functions |
| Services owned by independent teams, event-driven architecture already established | Saga Choreography: Kafka events; each service reacts to domain events |
| Service must publish events alongside DB writes without risk of message loss | Outbox Pattern: mandatory companion to any Saga that uses a message broker |
| Saga participant may receive duplicate commands on retry or broker redelivery | Idempotency Key: caller-generated, stored in same transaction as business write |
| Need to choose between orchestration and choreography | Orchestration if the workflow has conditional branches or compliance audit requirements; Choreography if teams are independent and distributed tracing tooling is mature |
When eventual consistency is acceptable: In most e-commerce, booking, and logistics workflows, a gap of seconds to minutes of cross-service inconsistency is a modeled, expected state, not data corruption. An order in PAYMENT_PENDING state while the payment service processes is a valid visible business state. Design intermediate saga states as first-class domain states and expose them in the UI meaningfully.
When eventual consistency is not acceptable: Real-time financial settlement with immediate debit/credit reconciliation, inventory reservation for strictly limited-quantity flash sales where oversell causes hard downstream consequences, and regulatory-mandated atomic audit trails. In these cases, accept 2PC or XA and the availability and performance trade-offs they impose.
Axon Framework: How It Implements Saga Orchestration in Java
Axon Framework is the leading Java framework for CQRS, event sourcing, and saga orchestration. An Axon Saga is a stateful event-handler class that tracks a long-running business process. Axon persists saga state between events, routes events to the correct saga instance, handles retry, and manages lifecycle; you focus on business logic and compensating commands.
An Axon Saga class is marked as a stateful orchestrator whose fields are automatically serialized and persisted to the saga store after every event handler runs. A @StartSaga annotation on one event handler tells Axon to create a new saga instance when that event type first arrives, typically the order-placed event that initiates the workflow. SagaLifecycle.associateWith() registers the correlation key (such as orderId) so Axon routes all subsequent events to the correct saga instance rather than creating new ones. Each step in the workflow maps to a separate saga event handler: on InventoryReservedEvent, the orchestrator issues the payment authorization command; on PaymentDeclinedEvent, it issues compensation commands in strict reverse order, releasing inventory before cancelling the order. Both the happy path and the compensation path terminate with an @EndSaga-annotated handler, at which point Axon serializes the final saga state and removes the instance from the active store. If the application restarts mid-saga, Axon deserializes the saved state and resumes from the last completed event; no custom checkpointing required.
| Annotation / API | What it does |
| --- | --- |
| @Saga | Marks the class as a saga; Axon persists its serialized state in the saga store between events |
| @StartSaga | Axon creates a new saga instance when this event type first arrives |
| SagaLifecycle.associateWith("key", value) | Registers an association so Axon routes future events to this specific instance |
| @SagaEventHandler(associationProperty = "orderId") | Handles an event for the saga instance whose association matches |
| @EndSaga | Axon marks the instance complete, serializes final state, and removes it from the active store |
Axon persists saga state after every handled event. If the application restarts mid-saga, the orchestrator resumes from the exact saved state; no custom state management required.
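The start/route/end lifecycle Axon manages can be approximated in a few lines to make the mechanics concrete. This is an illustrative re-implementation in plain Python, not Axon's API; all names below are invented:

```python
# Sketch of saga-instance routing: events carry an association key
# (here, an order id); the manager starts, resumes, and ends saga
# instances keyed by that association.

class SagaManager:
    def __init__(self, start_event, end_event):
        self.start_event = start_event  # plays the role of @StartSaga
        self.end_event = end_event      # plays the role of @EndSaga
        self.store = {}                 # association key -> saga state

    def handle(self, event_type, order_id):
        if event_type == self.start_event and order_id not in self.store:
            self.store[order_id] = {"events": []}  # new instance started
        saga = self.store.get(order_id)
        if saga is None:
            return False  # no instance for this association key
        saga["events"].append(event_type)  # state persisted between events
        if event_type == self.end_event:
            del self.store[order_id]       # instance ended and removed
        return True

mgr = SagaManager(start_event="OrderPlacedEvent",
                  end_event="OrderConfirmedEvent")
mgr.handle("OrderPlacedEvent", "X")
mgr.handle("InventoryReservedEvent", "X")
print("X" in mgr.store)                 # True: instance active, state kept
mgr.handle("OrderConfirmedEvent", "X")
print("X" in mgr.store)                 # False: instance ended
```

Because `store` is just a dict here, it survives only the process lifetime; Axon's value is that the equivalent state lives in a durable saga store, which is what makes mid-saga restarts recoverable.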
For a full deep-dive on Axon's CQRS and event sourcing architecture, see Saga Pattern: Coordinating Distributed Transactions with Compensation.
Lessons From Production Distributed Transaction Failures
Don't retrofit 2PC onto microservices. XA and 2PC were designed for a world of two or three co-located databases managed by the same team. Introducing them across independent services adds coordinator SPOF, prevents polyglot datastores, and makes each service's database upgrade a coordinated multi-team event. The operational cure is worse than the consistency problem it solves.
Compensating transactions must be total and pre-designed. You cannot write compensating transactions reactively when a failure first occurs in production. Design the compensation logic before the first line of saga code is written. For every step, answer: "If the next step fails, exactly what SQL statement undoes this step's business effect?" If you cannot answer it, the step is not compensable; redesign the saga boundary.
Idempotency is not optional, and it is not free. Every network call in a distributed system will eventually be retried. At-least-once delivery is the default contract for every durable message broker. Treating idempotency as an afterthought means discovering double-charges, duplicate shipments, or double-credited driver payments in production, after real money has moved.
The Outbox pattern is the only correct event publication mechanism. Any service that calls kafkaProducer.send() directly after a database commit has a silent data-loss bug on crash-between-commit-and-publish. This triggers on every application restart, deployment, or container eviction, not just during catastrophic failures.
Saga state must be observable. A saga spanning five services with no central state visibility means that any stuck or failed saga requires reading five separate service logs with correlated trace IDs to diagnose. Invest in saga state queryability from day one: Axon's saga store, a purpose-built saga state table, or Temporal's built-in workflow history. You will use it the first week a saga gets stuck in production.
Design intermediate states as first-class business states. The states visible during saga execution (INVENTORY_RESERVED, PAYMENT_PENDING, SHIPMENT_SCHEDULED) will be read by users, customer support, and downstream analytics. Model them explicitly in the domain. A customer who sees "Order Processing" for 30 seconds does not call support. A customer who sees a payment confirmation with no order reference calls within minutes.
TLDR: Summary & Key Takeaways
- 2PC delivers atomic all-or-nothing commits across participants but blocks indefinitely during the coordinator uncertainty window after all votes are collected. Use it only for small numbers of same-vendor databases with hard consistency requirements and a highly-available coordinator.
- XA (JTA) is the Java standard for 2PC across multiple datasources. Viable in Java EE monoliths with co-owned databases; too brittle and vendor-tied for polyglot independent microservices.
- Saga Orchestration replaces global locking with a sequence of local transactions coordinated by a central orchestrator. Choose this when workflows are conditional, multi-branch, or require compliance audit trails, and when your team owns all participants.
- Saga Choreography removes the central orchestrator entirely. Best for high-throughput event pipelines with services owned by independent teams, backed by mature distributed tracing.
- Outbox Pattern is mandatory for any Saga that publishes events to a broker. Writing to the outbox in the same database transaction as the business write is the only way to guarantee at-least-once event delivery without 2PC.
- Idempotency is required for every saga participant because at-least-once delivery and orchestrator retries guarantee duplicate commands in any production system. Design idempotency keys as caller-generated, deterministic, TTL-bounded, and stored transactionally.
- The core trade-off is not technical preference; it is a business consistency model decision driven by the CAP theorem. Make it explicitly, design compensating transactions before writing saga code, and make intermediate states visible from day one.
Written by
Abstract Algorithms
@abstractalgorithms