Lost Update Explained: When Two Writes Become One

How concurrent read-modify-write operations silently discard one transaction's changes — and four ways to prevent it from overselling your inventory.

Abstract Algorithms · 44 min read

TLDR: A lost update occurs when two concurrent read-modify-write transactions both read the same committed value, both compute a new value from it, and both write back — with the second write silently discarding the first. No error is raised. Both transactions report success. The corruption is visible only in the final state. Atomic SQL operations (UPDATE SET val = val + 1) eliminate the vulnerability at any isolation level by collapsing read and write into one indivisible step. SELECT FOR UPDATE serializes competing reads. Optimistic locking with a version column detects conflicts at write time. Serializable isolation catches all remaining cases. Critical database caveat: PostgreSQL auto-detects lost updates at Repeatable Read via first-committer-wins. MySQL at Repeatable Read does not — you must use explicit locking or atomic SQL regardless of isolation level.


📖 The Flash Sale That Sold an Item It No Longer Had

At 09:00:00 UTC on a high-traffic Friday, FlashMart's engineering team triggered a flash sale on the last remaining unit of the Gaming Headset Pro (product_id = 7842). The item had been promoted to 400,000 newsletter subscribers for 48 hours. By 09:00:01, two purchase requests arrived simultaneously from different data centers.

Server A (us-east-1) received Customer Alice's request at 09:00:01.003. It opened a database transaction, queried the products table, and read inventory_count = 1. It prepared to set inventory to 0 and create a new order.

Server B (us-west-2) received Customer Bob's request at 09:00:01.007 — just 4 milliseconds later. It opened its own independent transaction, ran the same query, and also read inventory_count = 1. It prepared the identical update.

Server A wrote inventory_count = 0 and committed at 09:00:01.041. Server B wrote inventory_count = 0 and committed at 09:00:01.049. Both commits succeeded without any error.

The warehouse received two dispatch orders for product_id = 7842. Zero units were on the shelf. Customer Alice received an "Order Confirmed — Shipment in 2 days" email at 09:00:01.200. Customer Bob received the exact same email at 09:00:01.208. FlashMart's support queue received 37 escalation tickets by 09:05:00. Both customers had paid $149.99. Only one would receive a headset.

No error appeared in the database log. Both transactions were syntactically correct, ran at the default Read Committed isolation level, and completed successfully. The inventory_count was decremented — once. But two purchases were processed. One decrement was silently lost.

This is the lost update anomaly.


🔍 The Offline Spreadsheet Problem: What a Lost Update Actually Is

Before defining the anomaly precisely, consider an analogy that makes the mechanism obvious.

Two colleagues — Alice and Bob — each receive a printed copy of the same spreadsheet at 9 AM. The spreadsheet shows the marketing budget remaining is $5,000. Alice is told to deduct $2,000 for a conference booking. Bob is told to deduct $1,500 for a trade show deposit. Both work on their own printouts. Alice's version shows $3,000 remaining. Bob's version shows $3,500. The finance team receives both emails and saves whichever arrived last. Alice's $2,000 deduction is gone. The correct balance of $1,500 is never recorded.

A lost update is exactly this. Each person worked from an identical starting snapshot, computed an independent modification, and submitted their result — with the later submission overwriting the earlier one as if it never existed.

Formal definition: A lost update occurs when Transaction T1 reads value V from a row, Transaction T2 reads the same value V from the same row, T1 writes a new value V₁ derived from V and commits, and T2 writes a new value V₂ also derived from V and commits — overwriting V₁. T1's update is permanently gone. The database holds V₂, which is only correct under the assumption that T1 never happened.

The most dangerous property of a lost update is that both transactions succeed from their own perspective. Neither T1 nor T2 sees a conflict. Neither rolls back. Neither raises an error. The corruption is invisible at the transaction boundary — it becomes apparent only when a downstream system (the warehouse) or a human observer notices that the final state is inconsistent with the operations that were reported as successful.


⚙️ The Read-Modify-Write Gap: Where the Anomaly Is Born

Lost updates emerge from a universal application pattern: the read-modify-write cycle. A transaction reads the current state of a row, computes a new value in application memory, and writes the result back. This three-step sequence contains a structural vulnerability: the gap between the read and the write.

During that gap, another transaction can perform its own read of the same row. Both transactions now hold an identical snapshot of the row's value in their respective application memories. When they each proceed to write their independently-computed results, the second write does not "add to" or "reconcile with" the first — it blindly overwrites. The database has no way to know that the new value was derived from the same starting point as the previous write.

Non-atomic cycle — vulnerable to lost update:

  1. T1: READ  inventory_count = 1   → computes: 1 - 1 = 0
  2. T2: READ  inventory_count = 1   → computes: 1 - 1 = 0
  3. T1: WRITE inventory_count = 0   → committed
  4. T2: WRITE inventory_count = 0   → overwrites T1, committed
  Final: inventory_count = 0  (two sales processed, one decrement lost)
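The vulnerable interleaving can be reproduced end to end in a few lines. The sketch below uses an in-memory SQLite database purely as a stand-in; the two "transactions" are simulated by interleaving their reads and writes in exactly the order shown in steps 1–4 (the schema mirrors the FlashMart example but is hypothetical):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, inventory_count INTEGER)")
db.execute("INSERT INTO products VALUES (7842, 1)")

# Step 1: T1 reads the committed value into application memory.
t1_read = db.execute("SELECT inventory_count FROM products WHERE id = 7842").fetchone()[0]
# Step 2: T2 reads the same committed value: an identical snapshot.
t2_read = db.execute("SELECT inventory_count FROM products WHERE id = 7842").fetchone()[0]

# Steps 3-4: each writes the value it computed in memory; T2 blindly overwrites T1.
db.execute("UPDATE products SET inventory_count = ? WHERE id = 7842", (t1_read - 1,))
db.execute("UPDATE products SET inventory_count = ? WHERE id = 7842", (t2_read - 1,))

final = db.execute("SELECT inventory_count FROM products WHERE id = 7842").fetchone()[0]
print(final)  # 0: two sales processed, but inventory decremented only once
```

Both simulated transactions computed the same target value from the same snapshot, so one decrement vanishes without any error being raised.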

An atomic operation eliminates the gap entirely. The modification is expressed as a single SQL statement that the database engine evaluates and applies in one indivisible step. No other transaction can interleave a read between the evaluation and the write:

Atomic operation — no lost update gap:

  T1: UPDATE products SET inventory_count = inventory_count - 1 WHERE id = 7842 AND inventory_count > 0
  T2: UPDATE products SET inventory_count = inventory_count - 1 WHERE id = 7842 AND inventory_count > 0
  The database evaluates each expression against the committed row value at write time.
  Final: inventory_count = 0. The second UPDATE's WHERE guard matches zero rows, so exactly one sale succeeds and no oversell occurs.

With atomic operations, the expression inventory_count - 1 is evaluated inside the database engine using the row's current committed value at the moment of the write — not the value the application read earlier. The read and write are collapsed into a single database operation, and the vulnerability window ceases to exist.
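The same scenario rewritten in the atomic form can be exercised directly. Again a minimal SQLite sketch with a hypothetical schema: each caller issues one guarded UPDATE, and the statement's row count tells it whether its sale succeeded.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, inventory_count INTEGER)")
db.execute("INSERT INTO products VALUES (7842, 1)")

def sell_one(conn):
    # Atomic read-modify-write: the engine evaluates inventory_count - 1
    # against the current committed value, and the WHERE guard refuses to
    # decrement past zero. rowcount == 1 means the sale succeeded.
    cur = conn.execute(
        "UPDATE products SET inventory_count = inventory_count - 1 "
        "WHERE id = 7842 AND inventory_count > 0"
    )
    return cur.rowcount == 1

first = sell_one(db)    # True:  the last unit is sold
second = sell_one(db)   # False: the guard matches zero rows, no oversell
final = db.execute("SELECT inventory_count FROM products WHERE id = 7842").fetchone()[0]
print(first, second, final)  # True False 0
```

Checking the row count is essential: a zero-row update is the "sold out" signal the application must surface to the customer.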

Not all updates can be expressed atomically. Complex business logic — applying a tiered loyalty discount, computing a shipping cost from multiple table lookups, or running a fraud score — cannot always be collapsed into a single SQL expression. For those cases, the other three prevention strategies apply.


📊 Lost Update in Slow Motion: A Transaction-Level Sequence

The following sequence diagram traces the exact database-level events during the FlashMart flash sale. Pay attention to the overlap window between T1's SELECT and T2's SELECT — that is where the anomaly is born.

sequenceDiagram
    participant T1 as Transaction T1 - Server A
    participant DB as Stock Database
    participant T2 as Transaction T2 - Server B

    T1->>DB: BEGIN
    T2->>DB: BEGIN
    T1->>DB: SELECT inventory_count FROM products WHERE id=7842
    DB-->>T1: inventory_count = 1
    T2->>DB: SELECT inventory_count FROM products WHERE id=7842
    DB-->>T2: inventory_count = 1
    Note over T1: Computes 1 - 1 = 0, ready to write
    Note over T2: Computes 1 - 1 = 0, ready to write
    T1->>DB: UPDATE products SET inventory_count = 0 WHERE id=7842
    DB-->>T1: 1 row updated
    T1->>DB: COMMIT
    DB-->>T1: T1 committed successfully
    T2->>DB: UPDATE products SET inventory_count = 0 WHERE id=7842
    DB-->>T2: 1 row updated
    T2->>DB: COMMIT
    DB-->>T2: T2 committed successfully
    Note over DB: Final state: inventory_count = 0. T1 update overwritten.

The diagram shows the four critical moments that produce the lost update: (1) both T1 and T2 read inventory_count = 1 before either has written — they share an identical snapshot; (2) each independently computes the same target value of 0; (3) T1 commits cleanly, setting inventory to 0; (4) T2 also commits cleanly, writing 0 again — which overwrites T1's write even though T1 already committed. The database ends at inventory_count = 0, which is correct for one sale but wrong for two. T1's decrement is gone with no trace.


🔍 Lost Update, Dirty Write, and Write Skew: Knowing the Difference Under Pressure

Lost updates are frequently confused with two related anomalies — dirty writes and write skew — in interviews and code reviews. All three are write conflicts involving concurrent transactions, but they differ in the data each transaction reads and the rows each transaction writes.

| Anomaly | Both transactions write the same row? | Reads dirty (uncommitted) data? | Involves different rows? | Root cause |
| --- | --- | --- | --- | --- |
| Lost Update | ✅ Yes | ❌ No — reads committed data | ❌ No | Second committed write overwrites the first; gap between read and write |
| Dirty Write | ✅ Yes | ✅ Yes — T2 reads T1's uncommitted write | ❌ No | T2 overwrites T1's uncommitted value; if T1 rolls back, state is incoherent |
| Write Skew | ❌ No — each writes a different row | ❌ No — reads committed data | ✅ Yes | Both check a shared constraint across multiple rows; both pass; combined result violates the constraint |

Lost update is the only one of the three that involves exclusively committed reads. T1 and T2 both see valid, committed data — the anomaly arises purely from the temporal gap between their respective reads and writes. This is precisely why it survives at Read Committed isolation: there are no dirty (uncommitted) values in play, so the isolation level's dirty-read protection provides no benefit.

Dirty write is prevented by virtually all mainstream databases. Most engines refuse to let T2 overwrite T1's uncommitted row because T1 holds a write lock that is not released until T1 commits or rolls back. You are unlikely to encounter a dirty write in PostgreSQL, MySQL InnoDB, or CockroachDB under normal operation.

Write skew requires a constraint that spans multiple rows — the canonical example is two doctors both going off-call after independently checking that at least one remains on-call. Neither transaction writes the row that the other reads, so snapshot isolation does not detect the conflict. Write skew requires either Serializable isolation or application-level SELECT FOR UPDATE that covers every row in the constraint check.

The practical interview rule: if two transactions overwrite the same row using committed reads, it is a lost update. If they write different rows and together violate a shared constraint, it is write skew.


🧠 How Databases Actually Detect (and Miss) Lost Updates

The most consequential practical distinction in production systems is not which isolation level is configured in theory — it is whether your specific database implementation detects lost updates at that level.

Internals: How MVCC and Write Tracking Create (and Close) the Lost-Update Window

Every major database that supports snapshot-based isolation uses Multi-Version Concurrency Control (MVCC). When T1 reads a row, it does not acquire a traditional read lock — it receives a pointer to the most recently committed version of that row as of the transaction's start time. The row itself can continue to receive new committed versions from other transactions. This is what makes snapshot isolation highly concurrent: readers never block writers, and writers never block readers.

The lost-update window exists because MVCC, in its base form, only tracks which version was read — not whether the row was modified since the read. When T2 attempts to write, a basic MVCC engine checks that T2 is writing a row that exists (not a phantom) but does not automatically verify that the row's value has not changed since T2's snapshot was taken. The second write proceeds unaware of T1's committed change.

Databases that prevent lost updates extend MVCC with write-conflict tracking: when a transaction commits a write to a row, the engine records a "last-writer" marker on that row's latest version. When a subsequent transaction attempts to write the same row, the engine compares the current committed version with the version the transaction read. If they differ, the engine knows a write-write conflict has occurred — and can choose to abort the later transaction (first-committer-wins) rather than silently overwrite.
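The detection rule fits in a few lines of Python. This is an illustrative toy model, not any engine's real implementation: each row carries a version counter, a transaction remembers the version it read, and commit aborts if the committed version has moved on since then.

```python
class SerializationError(Exception):
    """Raised when first-committer-wins detects a write-write conflict."""

class Row:
    def __init__(self, value):
        self.value = value
        self.version = 0  # bumped on every committed write

class Txn:
    def __init__(self, row):
        self.row = row
        self.read_version = None

    def read(self):
        self.read_version = self.row.version
        return self.row.value

    def commit_write(self, new_value):
        # First-committer-wins: if someone committed since our read, abort
        # instead of silently overwriting their change.
        if self.row.version != self.read_version:
            raise SerializationError("could not serialize access due to concurrent update")
        self.row.value = new_value
        self.row.version += 1

row = Row(1)
t1, t2 = Txn(row), Txn(row)
v1, v2 = t1.read(), t2.read()   # both read inventory = 1
t1.commit_write(v1 - 1)         # T1 commits first and wins
try:
    t2.commit_write(v2 - 1)     # T2 loses: conflict detected, not lost
except SerializationError as e:
    print("T2 aborted:", e)
```

The only extra state needed beyond base MVCC is the per-row version comparison at write time, which is exactly the "last-writer" marker described above.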

PostgreSQL implements this write-conflict check at Repeatable Read as part of its snapshot isolation machinery; its Serializable level layers SSI's predicate-dependency tracking on top of that. MySQL InnoDB does not implement the check at Repeatable Read — its MVCC records only the read snapshot, not write-conflict markers, leaving the gap open.

Performance Analysis: The Throughput Cost of Conflict Detection

Write-conflict tracking adds overhead in two dimensions. First, memory overhead: the engine must maintain additional metadata on modified rows — essentially a lightweight "who last wrote this, and when" record — across the transaction's lifetime. Under high concurrency, this metadata grows proportionally with the number of concurrent write transactions.

Second, abort overhead: when the engine detects a write-write conflict and aborts T2, T2's work is discarded. The application must retry from scratch — re-read, re-compute, re-write. Under high contention (many transactions competing for the same row), the effective commit rate can drop below what a pessimistic lock would achieve, because T2 gets further through its work before failing than it would have if it had simply blocked at the SELECT FOR UPDATE step.

The throughput ordering under high write contention for a single hot row is approximately: atomic SQL (no conflict possible) > SELECT FOR UPDATE (T2 blocks early, no wasted work) > optimistic locking / SSI (T2 reads and computes before being aborted, wasted work at high conflict rates). For read-heavy, low-conflict workloads the ordering inverts: optimistic locking and SSI impose near-zero overhead since conflicts rarely occur.

PostgreSQL: First-Committer-Wins at Repeatable Read

PostgreSQL's Repeatable Read isolation uses Snapshot Isolation (SI): each transaction operates on a point-in-time snapshot of the database taken at transaction start. All reads within the transaction see data as it existed at that moment, regardless of concurrent commits. This prevents non-repeatable reads.

PostgreSQL adds a critical extension: the first-committer-wins rule. When T2 attempts to write a row that T1 has already committed a write to since T2's snapshot was taken, PostgreSQL detects the write-write conflict and aborts T2 with:

ERROR:  could not serialize access due to concurrent update
DETAIL:  Process 12345 updated this row while this transaction was in progress.

The application must catch this error and retry the transaction from the beginning. This behavior means that PostgreSQL at Repeatable Read automatically prevents lost updates — no explicit locking is required for simple read-modify-write cycles.
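A retry wrapper for this pattern might look like the following sketch. The exception class and backoff policy here are placeholders: with a real driver such as psycopg you would catch its serialization-failure error (SQLSTATE 40001) instead of the stand-in class used below.

```python
import time

class SerializationFailure(Exception):
    """Stand-in for the driver's 40001 serialization error."""

def run_with_retry(txn_fn, max_attempts=3, backoff_s=0.0):
    # Re-runs the whole transaction from the top on a serialization failure.
    # That is the only correct response: the aborted attempt's reads are stale.
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)  # linear backoff between attempts

# Demo: a transaction that hits a conflict once, then succeeds on retry.
attempts = {"n": 0}
def flaky_decrement():
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise SerializationFailure("could not serialize access due to concurrent update")
    return "committed"

result = run_with_retry(flaky_decrement)
print(result)  # committed (on the second attempt)
```

Note that the retried function must re-read and re-compute everything; resuming from cached values would reintroduce the stale-snapshot problem the abort was protecting against.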

At the default Read Committed level, this protection does not apply. Each statement gets a fresh snapshot, and write-write conflicts are not detected. Lost updates are possible at Read Committed on PostgreSQL.

MySQL InnoDB: Repeatable Read Does Not Auto-Detect

MySQL InnoDB also implements MVCC with a transaction-start snapshot at Repeatable Read; the level shares PostgreSQL's name, but not its guarantees. The critical difference: MySQL does not apply the first-committer-wins rule. When T2 attempts to write a row that T1 already committed, MySQL allows T2's write to proceed without error. The lost update occurs silently.

This is one of the most consequential behavioral differences between the two most widely deployed open-source databases. Engineers migrating between them — or operating both — have discovered production data corruption bugs that were silently suppressed by one engine and silently allowed by the other.

The MySQL rule: Never rely on Repeatable Read alone to prevent lost updates in MySQL InnoDB. Always use explicit SELECT ... FOR UPDATE locking, atomic SQL operations, or application-level optimistic locking with a version column.

CockroachDB: Serializable by Default

CockroachDB enforces Serializable isolation (SSI) by default. Every transaction automatically detects write-write conflicts and aborts the losing transaction with a 40001 serialization_failure error. Lost updates are impossible without explicitly downgrading isolation — which CockroachDB strongly discourages. Applications must be written to retry on 40001, but they gain the guarantee that no lost update, write skew, or phantom read can occur.

Redis: Atomic at the Instance Level, Vulnerable in Cluster Mode

A single Redis instance processes commands serially — only one command executes at a time. Built-in commands like INCR and DECR are inherently atomic at the instance level, making them immune to lost updates. The read-modify-write gap does not exist when a single command encapsulates both operations.

In Redis Cluster, keys are distributed across shards. If a client reads from a replica and writes to a primary, or if related keys live on different shards, the same lost-update window re-emerges. Redis provides WATCH, MULTI, and EXEC as an optimistic concurrency mechanism at the client level: WATCH marks a key for monitoring, and the subsequent EXEC aborts the entire queued transaction block if the watched key changed since WATCH was issued — effectively implementing client-side optimistic locking.
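The WATCH/MULTI/EXEC cycle amounts to client-side compare-and-set. The toy store below (deliberately not the redis-py API) mimics the semantics: EXEC applies the queued write only if no other client has touched the watched key since WATCH.

```python
class MiniRedis:
    """Toy store mimicking Redis WATCH/EXEC optimistic-locking semantics."""
    def __init__(self):
        self.data = {}
        self.touch = {}  # per-key write counter, bumped on every set

    def set(self, key, value):
        self.data[key] = value
        self.touch[key] = self.touch.get(key, 0) + 1

    def get(self, key):
        return self.data.get(key)

    def watch(self, key):
        # Remember the key's write counter at WATCH time.
        return (key, self.touch.get(key, 0))

    def exec_txn(self, watched, key, value):
        # EXEC aborts (returns None) if the watched key changed since WATCH.
        wkey, wcount = watched
        if self.touch.get(wkey, 0) != wcount:
            return None
        self.set(key, value)
        return "OK"

r = MiniRedis()
r.set("stock:7842", 1)
w = r.watch("stock:7842")           # client A: WATCH, then read
new_val = r.get("stock:7842") - 1   # computes 0 in client memory
r.set("stock:7842", 0)              # another client writes first
result = r.exec_txn(w, "stock:7842", new_val)
print(result)  # None: EXEC aborted, client A must re-WATCH and retry
```

The aborted EXEC is the Redis analogue of a zero-row conditional UPDATE: the conflict is detected at write time and the client loops back to the read.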


📊 Isolation Level Prevention Matrix: Which Levels Actually Catch Lost Updates

Not all isolation levels prevent lost updates, and — as MySQL's behavior demonstrates — even the same named level can behave differently across databases. This table reflects both ANSI SQL standard semantics and the documented behavior of the three most common production databases.

| Isolation Level | ANSI SQL prevents lost updates? | PostgreSQL | MySQL InnoDB | CockroachDB |
| --- | --- | --- | --- | --- |
| Read Uncommitted | ❌ No | Promoted to Read Committed | ❌ No | Not supported |
| Read Committed | ❌ No | ❌ No | ❌ No | Not supported |
| Repeatable Read | ❌ No (ANSI does not require it) | ✅ Yes (first-committer-wins) | ❌ No (silent overwrite) | Not supported |
| Snapshot Isolation | ✅ Yes (where implemented) | ✅ Yes | ❌ MySQL RR ≠ true SI for writes | ✅ Yes |
| Serializable | ✅ Yes | ✅ Yes (SSI) | ✅ Yes (locking reads) | ✅ Yes (default) |

The practical takeaway from this table: never assume your isolation level prevents lost updates without verifying your database's specific implementation. In MySQL, treat Repeatable Read as equivalent to Read Committed for lost-update purposes. In PostgreSQL, Repeatable Read provides first-committer-wins protection — but your application still needs retry logic on could not serialize access errors.


📊 The Full Purchase Flow When a Lost Update Strikes the Architecture

The following architecture diagram shows how two simultaneous purchase requests travel through a real-world e-commerce system and produce an oversell event. The lost update is not a single-database problem — its consequences cascade from the Inventory Service through the Order Service to Warehouse Dispatch.

graph TD
    A[Customer A - Purchase Request] --> LB[Load Balancer]
    B[Customer B - Purchase Request] --> LB
    LB --> S1[Server A - Inventory Service]
    LB --> S2[Server B - Inventory Service]
    S1 -->|T1: SELECT inventory=1| DB[(Stock Database)]
    S2 -->|T2: SELECT inventory=1| DB
    S1 -->|T1: UPDATE SET inventory=0| DB
    S2 -->|T2: UPDATE SET inventory=0| DB
    DB -->|Both commits succeed| ORD[Order Service]
    ORD -->|Order-1001 confirmed| WH[Warehouse Dispatch]
    ORD -->|Order-1002 confirmed| WH
    WH --> OVER[Zero items, two dispatch orders created]

The diagram traces the complete failure path from customer request to warehouse error. Two requests enter through the load balancer, each is handled by a separate Inventory Service process with its own database connection. Both processes read inventory = 1 before either has written. Both write inventory = 0. The Order Service receives two successful signals from the Inventory Service — it has no visibility into the collision — and creates two confirmed orders. The Warehouse Dispatch system receives two dispatch instructions for a product with zero remaining stock. The anomaly originates in the read overlap window at the Stock Database, but its effects propagate without any error signal through every downstream component.


⚙️ Four Ways to Prevent Lost Updates Before They Reach Production

Strategy 1: Atomic SQL Operations — The Default for All Counter and Quantity Updates

Atomic operations eliminate the read-modify-write gap at the database level. The modification is expressed as a single self-referential SQL statement. The database evaluates and applies the update in one indivisible operation — no other transaction can interleave a read between the evaluation and the write.

This strategy works at every isolation level, including Read Committed. It requires no version columns, no retry logic, and no lock management overhead. It is the fastest, simplest, and most reliable approach for any modification that can be expressed as a function of the current value.

The limitation is expressiveness: complex multi-step conditional business logic (tier-based pricing, multi-table constraint checks) cannot always be reduced to a single SQL expression. For those cases, move to strategies 2–4.

Applicable to: Counter increments, quantity decrements, running totals, score updates, balance adjustments — any scalar modification described as a pure function of the current value.

Strategy 2: SELECT FOR UPDATE — Serialize the Read Under High Contention

SELECT ... FOR UPDATE acquires a row-level exclusive lock on the selected rows at the moment of the read. Any other transaction attempting to lock (FOR UPDATE) or modify those rows blocks until the locking transaction commits or rolls back. The read-modify-write cycle is serialized: T2 cannot even read the row's current value under its own lock until T1 is completely finished.

The lock is held for the duration of the transaction and released only at COMMIT or ROLLBACK. This eliminates the overlap window by making T2 wait for the gap to close before it can open one of its own.

Trade-off: Locks held for the full transaction duration can become throughput bottlenecks under high concurrency. A long-running transaction holding a FOR UPDATE lock on a hot row can cause other transactions to queue behind it. For low-to-medium contention workloads, optimistic locking (Strategy 3) is usually more throughput-efficient because reads proceed without blocking and conflicts are resolved only at write time.

Applicable to: High-contention rows with complex update logic — seat reservations, booking systems, financial transfers, any row touched by hundreds of concurrent transactions.

Strategy 3: Optimistic Locking with Version Columns — Low Overhead, Retry-Based Safety

Optimistic locking assumes conflicts are rare and checks for them at write time rather than preventing concurrent reads. Every row carries a version column (integer). The application reads the row including its current version, computes the new value, and writes back using a conditional UPDATE that includes WHERE version = :old_version in its predicate.

If another transaction committed a write between the read and the write, the version will have advanced. The conditional UPDATE matches zero rows. The application detects this (rows_affected = 0), treats it as a lost-update conflict, and retries the full read-compute-write cycle from the beginning.
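A minimal sketch of the version-column pattern, using SQLite and a hypothetical schema and helper names. The conditional UPDATE bumps the version and checks it in the same statement; a row count of zero is the conflict signal.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, inventory_count INTEGER, version INTEGER)"
)
db.execute("INSERT INTO products VALUES (7842, 1, 0)")

def read(conn):
    return conn.execute(
        "SELECT inventory_count, version FROM products WHERE id = 7842"
    ).fetchone()

def write(conn, new_count, expected_version):
    # Conditional UPDATE: matches zero rows if another writer committed first.
    cur = conn.execute(
        "UPDATE products SET inventory_count = ?, version = version + 1 "
        "WHERE id = 7842 AND version = ?",
        (new_count, expected_version),
    )
    return cur.rowcount == 1  # False => conflict; caller re-reads and retries

# T1 and T2 both read (count = 1, version = 0), then race to write.
c1, v1 = read(db)
c2, v2 = read(db)
ok1 = write(db, c1 - 1, v1)  # True:  T1 wins, version is now 1
ok2 = write(db, c2 - 1, v2)  # False: T2's version check fails, conflict detected
print(ok1, ok2)
```

The losing writer restarts the full read-compute-write cycle; simply re-issuing the same UPDATE with the old version would fail forever.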

Under low contention — where most transactions succeed on first attempt — optimistic locking delivers near-zero overhead. Reads are non-blocking. No lock infrastructure is required. The retry path is rarely taken.

Under high contention, the retry amplification effect can degrade throughput significantly. If 50 transactions simultaneously compete for the same version, 49 of them will fail on first attempt and retry — potentially failing again if the winning transaction is fast enough to repeat the cycle. At this contention level, SELECT FOR UPDATE (Strategy 2) serializes writes and avoids the retry storm.

Applicable to: Low-conflict record edits, user-facing forms that save data after a "think time" gap, API resources where clients infrequently edit the same record.

Strategy 4: Serializable Isolation — The Safety Net for Complex Workloads

Serializable isolation guarantees that the result of all concurrent transactions is equivalent to some valid sequential execution. Any transaction pattern that could produce a lost update — or write skew, or phantom read — causes one of the conflicting transactions to be aborted and retried automatically.

This is the highest-coverage approach. You do not need to identify specific hot rows, add version columns, or audit every code path for read-modify-write cycles. The database detects all anomaly patterns automatically.

The cost is throughput. PostgreSQL's Serializable Snapshot Isolation (SSI) is relatively efficient, but it requires the engine to track read and write dependencies across transactions. Under high conflict rates, more transactions abort and retry. For most OLTP systems, targeted application of Strategies 1–3 on the specific hot paths is more efficient than blanket Serializable isolation on every connection.

Applicable to: Systems where write skew is also a concern, compliance workloads requiring guaranteed correctness, or codebases where auditing every read-modify-write path is impractical.


🏗️ Lost Updates Across Microservices and Distributed Databases

The lost update problem becomes structurally harder in distributed systems because there is no single transaction boundary that spans multiple services or data stores.

Cross-service read-modify-write: In a microservices architecture, an Inventory Service and a Pricing Service may both read from the same Order record in a shared database and write back independently. If they use separate database connections, no shared transaction detects the write-write conflict. The solutions are: (a) designate a single service as the authoritative writer and route all Order record updates through it; (b) use an event-driven append-only pattern (the Outbox Pattern or Change Data Capture) where each service emits an event and a single downstream consumer applies state changes sequentially; or (c) embed version fields in the API contract — the PUT request body carries a version field, and the service refuses updates where version does not match the current database value.

DynamoDB Optimistic Concurrency Control: DynamoDB offers only bounded multi-item transactions (TransactWriteItems operates on a limited batch of items), but it provides conditional writes via ConditionExpression. Setting ConditionExpression: "version = :v" on a PutItem or UpdateItem call implements the exact same version-check pattern as SQL optimistic locking — the operation is rejected if the stored version does not match the expected value. The application retries on ConditionalCheckFailedException.

CRDTs for commutative operations: Conflict-Free Replicated Data Types (CRDTs) provide a mathematical framework for merging concurrent writes without coordination. A CRDT counter can accept increments from multiple nodes across a cluster and converge to the correct total without a coordinator — because addition is commutative. CRDTs are used in distributed caches and multi-leader replication systems (Riak, distributed Redis counters) where exact consistency is not required but monotonic progress is guaranteed. They do not solve lost updates for non-commutative writes (setting a value to a specific result of complex logic), which is the common application case.

Eventual consistency systems: Apache Cassandra and similar AP databases accept concurrent writes to the same cell by design. Last-Write-Wins (LWW) is the default conflict resolution — the write with the highest timestamp survives. This is a structural lost update by design. Applications built on Cassandra that need update safety must either use lightweight transactions (IF version = x), use a single-partition serial consistency level, or architect their writes as append-only events rather than read-modify-write updates.
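Last-Write-Wins can be modeled in two lines: store (value, timestamp) pairs and merge by taking the higher timestamp. The sketch below (illustrative, not Cassandra's internals) shows why LWW is a structural lost update — one of two concurrent decrements simply vanishes:

```python
def lww_merge(a, b):
    """Cassandra-style cell merge: the write with the higher timestamp wins."""
    return a if a[1] >= b[1] else b

# Both replicas start from inventory = 1 and each applies one local decrement.
write_a = (0, 1001)  # (value, timestamp): replica A's decrement at t=1001
write_b = (0, 1002)  # replica B's decrement at t=1002

merged = lww_merge(write_a, write_b)
print(merged)  # (0, 1002): B's write survives; A's decrement leaves no trace
```

No merge function over opaque final values can recover the two intents; that is why the safe alternatives on Cassandra are lightweight transactions or append-only event writes.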


🌍 Where Lost Updates Have Cost Real Companies Real Money

Inventory overselling in flash sales: The FlashMart scenario in the opening is representative of real incidents at Amazon, Shopify, and Ticketmaster during high-demand product launches and concert ticket drops. The common root cause is an inventory service operating at Read Committed with a read-modify-write cycle and no FOR UPDATE lock or atomic decrement. The business cost extends beyond the single oversold unit: customer service escalation, refund processing costs, reputational damage, and loss of repeat purchase intent from affected customers.

Loyalty points double-spending: A loyalty program that awards points on purchase and redeems them at checkout faces a double-spend risk when point balance updates are non-atomic. A customer redeems 500 points at a checkout terminal while a background job simultaneously awards 100 bonus points for a qualifying purchase. Both processes read the current balance of 600, both compute their modification, and the final balance reflects only the last write. The customer either receives phantom points or loses a legitimate redemption — neither outcome is acceptable, and neither raises a database error.

Comment counter and engagement metric corruption: High-traffic content platforms that track likes, share counts, or view counts via application-level read-modify-write operations experience persistent counter drift. A post receiving 10,000 simultaneous view events — each represented as a read-increment-write cycle — can drift its counter by thousands within minutes. The solution is well-known (UPDATE posts SET view_count = view_count + 1) but is frequently skipped in early implementations that scale unexpectedly beyond their original design assumptions.

Financial ledger discrepancies: In banking and fintech, a balance adjustment that reads the source account balance, subtracts a transfer amount, and writes back is a classic lost-update surface. Two simultaneous transfers from the same account can each read the full balance, each compute a valid deduction, and each commit — leaving the account with a balance that reflects only one of the two deductions. Banks mitigate this with row-level locking, debit/credit ledger entries (append-only writes that avoid read-modify-write entirely), and idempotency keys that prevent duplicate processing.
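The append-only ledger mitigation mentioned above sidesteps read-modify-write entirely: each transfer inserts a new debit row, inserts commute, and the balance is always derived by summation. A minimal SQLite sketch with a hypothetical schema (a real ledger would still need an overdraft guard, e.g. a conditional insert):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ledger (account TEXT, amount INTEGER)")  # +credit, -debit
db.execute("INSERT INTO ledger VALUES ('alice', 500)")  # opening credit

def balance(conn, account):
    # The balance is derived, never stored: there is nothing to overwrite.
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM ledger WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]

# Two "simultaneous" transfers each append a debit; neither overwrites the other.
db.execute("INSERT INTO ledger VALUES ('alice', -200)")
db.execute("INSERT INTO ledger VALUES ('alice', -150)")

print(balance(db, "alice"))  # 150: both deductions survive
```

Compare this with the spreadsheet analogy earlier: because each colleague appends a deduction instead of submitting a replacement total, the finance team can never lose one of the two edits.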


⚖️ Trade-offs: What Each Prevention Strategy Costs in Throughput and Complexity

| Prevention Strategy | Isolation level required | Throughput impact under contention | Application complexity | Detects conflict automatically? |
| --- | --- | --- | --- | --- |
| Atomic SQL | Any (works at Read Committed) | Lowest — no locks, no retries needed | Lowest — change the SQL expression only | N/A — gap is eliminated |
| SELECT FOR UPDATE | Read Committed or higher | Medium — row locks cause queuing behind hot rows | Low — wrap reads in FOR UPDATE | Blocks T2 at read; no app retry needed |
| Optimistic locking | Read Committed or higher | Low under low contention; degrades under high contention | Medium — version column schema change, retry loop required | App retries on 0 rows-affected |
| Serializable isolation | Serializable | Highest — conflict tracking overhead, more aborts at high concurrency | Medium — catch serialization errors, implement retry | Database aborts and signals automatically |

The two most common failure patterns in production: (a) under-protection — using Read Committed with a read-modify-write cycle and no locking, assuming "the database handles it" — it does not; and (b) over-protection — running every transaction at Serializable when only a handful of critical hot paths actually need it, degrading overall throughput unnecessarily for uncontested reads.


🧭 Choosing the Right Prevention Strategy: A Decision Guide

The following flowchart captures the practical decision logic for selecting a lost-update prevention strategy. Start from the cheapest option and move toward heavier mechanisms only when the update pattern genuinely requires it.

flowchart TD
    A[Lost update risk identified on a row] --> B{Can the update be expressed as a single SQL expression?}
    B -->|Yes - e.g. val = val plus delta| C[Atomic SQL - works at any isolation level]
    B -->|No - complex multi-step logic required| D{Expected contention level on this row?}
    D -->|High - many concurrent writes| E[SELECT FOR UPDATE - pessimistic lock]
    D -->|Low to medium - conflicts are infrequent| F[Optimistic locking with version column]
    E --> G{Does the update span multiple rows or tables?}
    G -->|Yes - shared constraint across row set| H[Serializable isolation or SSI]
    G -->|No - single hot row| I[FOR UPDATE on that row is sufficient]
    F --> J{Retry rate acceptable in production?}
    J -->|Yes - few retries observed| K[Optimistic locking confirmed]
    J -->|No - too many conflicts causing retry storms| E

The flowchart guides the selection: atomic SQL eliminates the problem with no infrastructure cost and is the right default for any increment or decrement. Optimistic locking is appropriate for complex logic on low-contention rows. SELECT FOR UPDATE belongs on high-contention hot rows. Serializable isolation serves as the safety net when write skew is also present, or when auditing every write path is impractical.

| Situation | Recommended strategy |
| --- | --- |
| Counter increment, quantity decrement, running total | Atomic SQL: UPDATE SET col = col + delta WHERE col > 0 |
| Complex business rule on a single hot row (booking, seat reservation) | SELECT ... FOR UPDATE |
| Low-conflict record edits (user profile, document save after think time) | Optimistic locking with version column |
| Write-skew risk across multiple rows (on-call scheduling, overlapping bookings) | Serializable isolation |
| MySQL InnoDB at any isolation level with complex update logic | Always use FOR UPDATE or atomic SQL — Repeatable Read does not auto-protect |
| Redis distributed cluster with shared counters | WATCH / MULTI / EXEC optimistic concurrency |
| Microservices with separate database connections updating shared state | Conditional writes in API contract or single authoritative writer |
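The selection logic above can be sketched as a small helper function. This is purely illustrative: the branch order is a simplification of the flowchart (it routes any multi-row constraint straight to Serializable), and all function and parameter names are invented:

```python
def choose_strategy(single_expression: bool,
                    high_contention: bool,
                    multi_row_constraint: bool) -> str:
    """Pick the cheapest lost-update prevention that fits the update pattern."""
    if single_expression:
        return "atomic SQL"            # cheapest: eliminates the gap outright
    if multi_row_constraint:
        return "serializable isolation"  # write skew risk spans rows/tables
    if high_contention:
        return "SELECT FOR UPDATE"     # pessimistic lock on the hot row
    return "optimistic locking"        # version column + retry loop

# Counter increment -> the cheapest option wins
assert choose_strategy(True, True, False) == "atomic SQL"
# Complex logic on a single hot row -> pessimistic lock
assert choose_strategy(False, True, False) == "SELECT FOR UPDATE"
# Low-contention profile edit -> version column
assert choose_strategy(False, False, False) == "optimistic locking"
```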

🧪 Practical Walkthrough: Tracing a Lost Update Through Concrete Numbers

This section walks through two concrete scenarios — one showing the failure mode clearly, one showing each prevention strategy applied to the same scenario — so the mechanism and the fix are directly comparable without abstraction.

Scenario A — The Vulnerable Read-Modify-Write Cycle

A sessions table tracks active login count per user. User 42 has active_sessions = 2. Two requests arrive simultaneously: the web service opens a new session for a new browser tab, and the mobile app also opens a new session.

| Step | Web Service (T1) | Mobile Service (T2) | sessions.active_sessions |
| --- | --- | --- | --- |
| 1 | SELECT active_sessions WHERE user_id=42 → 2 | | 2 (committed) |
| 2 | | SELECT active_sessions WHERE user_id=42 → 2 | 2 (committed) |
| 3 | Computes 2 + 1 = 3, issues UPDATE | | 2 |
| 4 | UPDATE SET active_sessions = 3 → committed | | 3 |
| 5 | | Computes 2 + 1 = 3, issues UPDATE | 3 |
| 6 | | UPDATE SET active_sessions = 3 → committed | 3 |

Final value: active_sessions = 3. Correct value: active_sessions = 4. One login was lost: the user has four active clients but only three tracked sessions, and the counter is now permanently off.
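The interleaving in the table above can be replayed as a tiny in-memory simulation (plain Python variables standing in for the committed row and each transaction's private read):

```python
# Step-by-step replay of Scenario A: both services read the same committed
# value before either writes, so one increment is silently lost.
active_sessions = 2                # committed value for user 42

t1_read = active_sessions          # steps 1: T1 reads 2
t2_read = active_sessions          # step 2: T2 reads 2 (same stale value)
active_sessions = t1_read + 1      # steps 3-4: T1 computes 3 and commits
active_sessions = t2_read + 1      # steps 5-6: T2 computes 3 and overwrites

print(active_sessions)  # 3 -- the correct value is 4; T1's increment is gone
```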

Scenario B — Fixed with an Atomic Operation

Replace the read-modify-write cycle with a single atomic increment. No reads are fetched into application memory. No intermediate value is computed outside the database.

| Step | Web Service (T1) | Mobile Service (T2) | sessions.active_sessions |
| --- | --- | --- | --- |
| 1 | UPDATE SET active_sessions = active_sessions + 1 WHERE user_id=42 | | 2 → 3 committed |
| 2 | | UPDATE SET active_sessions = active_sessions + 1 WHERE user_id=42 | 3 → 4 committed |

Final value: active_sessions = 4. Correct. The active_sessions + 1 expression is evaluated by the database using the row's current committed value at each write's execution moment. There is no gap for T2 to read a stale value — T2 reads and increments in a single operation after T1's write is committed.
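The same two writes can be replayed against a real database engine. The SQLite sketch below (table layout mirrors the scenario; SQLite serializes writers, as any single-node engine does for a single row) shows both atomic statements landing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions "
             "(user_id INTEGER PRIMARY KEY, active_sessions INTEGER)")
conn.execute("INSERT INTO sessions VALUES (42, 2)")

# Both services issue the same atomic statement; the database evaluates
# active_sessions + 1 against the row's current committed value each time.
conn.execute("UPDATE sessions SET active_sessions = active_sessions + 1 "
             "WHERE user_id = 42")
conn.execute("UPDATE sessions SET active_sessions = active_sessions + 1 "
             "WHERE user_id = 42")
conn.commit()

final = conn.execute("SELECT active_sessions FROM sessions "
                     "WHERE user_id = 42").fetchone()[0]
print(final)  # 4 -- both increments survive
```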

Scenario C — Fixed with Optimistic Locking

When the update logic is too complex for a single expression (e.g., a tiered cap that prevents active_sessions exceeding 5 for a free-tier user), the application reads the current value and version, applies the tier-check logic, and writes back conditionally. If another transaction committed a change in the meantime, the version will have advanced and the UPDATE will affect 0 rows — signalling the application to retry.

| Step | Web Service (T1) | Mobile Service (T2) | sessions row |
| --- | --- | --- | --- |
| 1 | SELECT active_sessions=2, version=19 WHERE user_id=42 | | version=19 |
| 2 | | SELECT active_sessions=2, version=19 WHERE user_id=42 | version=19 |
| 3 | Applies tier logic → 3; UPDATE SET active_sessions=3, version=20 WHERE user_id=42 AND version=19 → 1 row | | version=20 |
| 4 | | Applies tier logic → 3; UPDATE SET active_sessions=3, version=20 WHERE user_id=42 AND version=19 → 0 rows | version=20 |
| 5 | | 0 rows → conflict detected, retry: re-reads version=20, re-applies logic → 4; UPDATE with version=20 → 1 row | version=21 |

Final value: active_sessions = 4, version = 21. Correct. The version column acts as a conflict sentinel: it proves whether the row was modified between the application's read and its write, enabling safe retry without ever requiring a lock.
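The full sequence, including the detected conflict and the retry, can be traced in SQLite (the schema mirrors the scenario; the tier logic is elided and only the version mechanics are shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions "
             "(user_id INTEGER PRIMARY KEY, active_sessions INTEGER, version INTEGER)")
conn.execute("INSERT INTO sessions VALUES (42, 2, 19)")

# Both services read the row at version 19 and compute 2 + 1 = 3.

# T1 writes first: the version predicate matches, so 1 row is updated.
t1 = conn.execute("UPDATE sessions SET active_sessions = 3, version = 20 "
                  "WHERE user_id = 42 AND version = 19")
assert t1.rowcount == 1

# T2's write still carries the stale version 19: 0 rows match.
t2 = conn.execute("UPDATE sessions SET active_sessions = 3, version = 20 "
                  "WHERE user_id = 42 AND version = 19")
assert t2.rowcount == 0   # conflict detected -- T2 must re-read and retry

# Retry: T2 re-reads (3, version 20), re-applies its logic, and writes 4.
count, version = conn.execute(
    "SELECT active_sessions, version FROM sessions WHERE user_id = 42").fetchone()
retry = conn.execute("UPDATE sessions SET active_sessions = ?, version = ? "
                     "WHERE user_id = 42 AND version = ?",
                     (count + 1, version + 1, version))
assert retry.rowcount == 1
conn.commit()

final_row = conn.execute("SELECT active_sessions, version FROM sessions "
                         "WHERE user_id = 42").fetchone()
print(final_row)  # (4, 21)
```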


🛠️ PostgreSQL, MySQL, and Redis: Enforcing Lost-Update Prevention in Practice

This section shows short DDL and config-level SQL patterns — not application code — that implement each prevention strategy in the three most commonly deployed data stores. Every snippet represents the minimum database-layer change needed to enforce safety.

PostgreSQL: SELECT FOR UPDATE Serializes the Read

Wrapping the SELECT in FOR UPDATE acquires a row-level exclusive lock that persists until the transaction ends. T2's SELECT FOR UPDATE on the same row blocks at the read step until T1 commits or rolls back — the overlap window is eliminated entirely.

-- PostgreSQL: pessimistic locking pattern for inventory decrement
BEGIN;

SELECT inventory_count
  FROM products
  WHERE id = 7842
  FOR UPDATE;
-- T2 blocks here until this transaction commits

UPDATE products
  SET inventory_count = inventory_count - 1
  WHERE id = 7842
    AND inventory_count > 0;

COMMIT;

The FOR UPDATE clause is the only required change relative to the vulnerable pattern. Without it, the SELECT reads from a shared MVCC snapshot and takes no lock at all. With it, T2's competing SELECT ... FOR UPDATE on the same row blocks until T1 commits — serializing the read-modify-write cycle.

PostgreSQL: Optimistic Locking with a Version Column

The products table is extended with a version integer column. The UPDATE includes a WHERE version = :old_version predicate. If the version changed between the application's read and this write, the update matches zero rows — the application detects the conflict and retries.

-- Schema: add version column (one-time migration)
ALTER TABLE products
  ADD COLUMN version INTEGER NOT NULL DEFAULT 1;

-- Read step: capture current version
SELECT inventory_count, version
  FROM products
  WHERE id = 7842;
-- Returns: inventory_count = 1, version = 7

-- Write step: conditional on version matching
UPDATE products
  SET inventory_count = 0,
      version         = version + 1
  WHERE id    = 7842
    AND version = 7;
-- rows_affected = 0  →  version changed, conflict detected, application retries
-- rows_affected = 1  →  committed successfully, version is now 8

The version column pattern is database-agnostic — the same WHERE version = :v condition works identically in MySQL InnoDB, SQL Server, Oracle, and CockroachDB.

MySQL InnoDB: Explicit FOR UPDATE Is Mandatory

MySQL at Repeatable Read does not auto-detect lost updates. SELECT ... FOR UPDATE is required for every read-modify-write cycle that must be safe. The AND inventory_count > 0 guard in the UPDATE adds a defense-in-depth check against negative inventory even if locking is momentarily bypassed.

-- MySQL: explicit locking required regardless of isolation level
START TRANSACTION;

SELECT inventory_count
  FROM products
  WHERE id = 7842
  FOR UPDATE;

UPDATE products
  SET inventory_count = inventory_count - 1
  WHERE id              = 7842
    AND inventory_count > 0;

COMMIT;

For the atomic SQL equivalent in MySQL — which needs no lock at all — the single statement UPDATE products SET inventory_count = inventory_count - 1 WHERE id = 7842 AND inventory_count > 0 provides complete protection at any isolation level.

Redis: WATCH / MULTI / EXEC for Optimistic Concurrency

The standard Redis MULTI / EXEC block batches commands but does not prevent another client from modifying a key between a GET and the queued SET. Adding WATCH turns the block into an optimistic transaction: if the watched key changes between WATCH and EXEC, the EXEC returns null (the block is discarded) and the client retries from the WATCH.

# Redis command sequence (as entered at redis-cli level, not application code)
# Optimistic concurrency for inventory key

WATCH inventory:7842
  # Read current value outside MULTI block
  GET inventory:7842            # returns "1"
MULTI
  SET inventory:7842 0
EXEC
# Returns null if inventory:7842 changed since WATCH → client retries
# Returns array of results if committed successfully

The watch is released automatically on EXEC (whether it aborted or succeeded), on DISCARD or UNWATCH, or when the connection closes. For high-throughput counters where retries are unacceptable, prefer the atomic DECR command on a single Redis instance — it has no gap.

For a deep-dive on isolation level semantics and how each level maps to specific anomaly prevention, see Isolation Levels in Databases. For the full taxonomy of distributed write anomalies beyond lost updates, see Data Anomalies in Distributed Systems.


📚 Lessons Learned: What Production Engineers Know About Lost Updates

  1. Silence is the most dangerous symptom. Lost updates produce no errors, no warnings, and no log entries at the database or application layer. Both transactions report success. The only discovery path is noticing that the final state is inconsistent with the operations that were reported as successful — often via a customer complaint, a manual audit, or a downstream system (the warehouse) rejecting an impossible request.

  2. Read Committed is not a safety net for read-modify-write. Read Committed — the default isolation level on PostgreSQL and Oracle — allows lost updates, and MySQL's default of Repeatable Read behaves no better for this anomaly. If your application reads a value, modifies it in application memory, and writes it back, you are vulnerable regardless of database vendor.

  3. MySQL Repeatable Read ≠ PostgreSQL Repeatable Read for lost updates. PostgreSQL's first-committer-wins rule auto-detects write-write conflicts at Repeatable Read. MySQL's does not. Never assume one database's isolation semantics map to another's just because they share a level name. Test lost-update scenarios explicitly when migrating between vendors.

  4. Prefer atomic SQL for all simple updates. UPDATE SET val = val + delta is always safer than a read-modify-write cycle for counters and quantities. It requires zero additional infrastructure, works at any isolation level, and has lower overhead than any locking or retry mechanism. Default to it unless the update genuinely cannot be expressed as a single SQL expression.

  5. Optimistic locking is a contract that every write path must honor. Adding a version column means nothing unless every UPDATE on that table checks it. A single code path that performs an unconditional UPDATE without WHERE version = :v silently breaks the entire optimistic locking invariant for that table. Audit all write paths when introducing optimistic locking.

  6. High contention surfaces deserve pessimistic locks. Optimistic locking's retry amplification effect — many transactions failing and retrying under high conflict — can make SELECT FOR UPDATE the more efficient choice for rows that are written frequently. Profile retry rates in staging with realistic concurrency before choosing between the two.

  7. Distributed systems require an explicit strategy at every boundary. Across microservices or NoSQL stores, there is no shared transaction coordinator. Conditional writes, append-only event logs, single-authoritative-writer patterns, and CRDT-compatible operations are the idiomatic solutions. Assume no lost-update protection exists until you have explicitly implemented one.
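Lesson 5 can also be enforced mechanically at the database layer. The sketch below uses a SQLite trigger (trigger name and error message are invented; PostgreSQL and MySQL offer equivalent trigger mechanisms) to abort any UPDATE that fails to advance the version, so a rogue unconditional write fails loudly instead of silently breaking the invariant:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products "
             "(id INTEGER PRIMARY KEY, price REAL, version INTEGER NOT NULL)")
conn.execute("INSERT INTO products VALUES (200, 59.99, 15)")

# Guard trigger: reject any UPDATE that does not increment the version.
# It catches the one code path that forgets the optimistic-lock contract.
conn.execute("""
    CREATE TRIGGER products_version_guard
    BEFORE UPDATE ON products
    WHEN NEW.version <= OLD.version
    BEGIN
        SELECT RAISE(ABORT, 'update must increment version');
    END
""")

# A disciplined write path (version bump + version predicate) succeeds:
conn.execute("UPDATE products SET price = 49.99, version = version + 1 "
             "WHERE id = 200 AND version = 15")

# An unconditional UPDATE that skips the version bump is rejected:
try:
    conn.execute("UPDATE products SET price = 10.00 WHERE id = 200")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True

print(blocked, conn.execute("SELECT price, version FROM products").fetchone())
# True (49.99, 16)
```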


📌 Summary & Key Takeaways: Lost Updates in Seven Bullets

  • A lost update occurs when two read-modify-write transactions overwrite each other — one write silently discards the other, with no error raised by either the database or the application.
  • The root cause is the read-modify-write gap: T2 reads the same committed value T1 already read, and T2's write lands after T1's — overwriting T1's result using a stale starting point.
  • Atomic SQL (UPDATE SET col = col + delta) eliminates the gap entirely and is the correct default for all counter, quantity, and balance operations at any isolation level.
  • SELECT FOR UPDATE serializes competing reads at the row level, blocking T2 at the read step until T1 commits. Best for high-contention rows with complex update logic.
  • Optimistic locking (version column + WHERE version = :v) detects conflicts at write time and retries. Best for low-to-medium contention with complex multi-step logic.
  • Serializable isolation prevents all lost updates (and write skew) database-wide, with automatic abort on conflict. Best as a safety net or when auditing every write path is impractical.
  • PostgreSQL auto-detects lost updates at Repeatable Read. MySQL does not — treat MySQL Repeatable Read as equivalent to Read Committed for lost-update safety purposes.

📝 Practice Quiz

  1. A bank balance starts at $1,000. T1 reads $1,000 and plans to write $700 (subtracting a $300 transfer). T2 reads $1,000 and plans to write $900 (subtracting a $100 fee). T1 commits, then T2 commits. What is the final balance, and what should it be?

    • A) Final: $700. Should be: $600. T1's write was lost.
    • B) Final: $900. Should be: $600. T2's write overwrote T1's; T1's deduction was lost.
    • C) Final: $600. Both deductions applied correctly.
    • D) Final: $1,000. Both transactions were rolled back due to conflict.

    Correct Answer: B
  2. Which of the following SQL patterns completely prevents a lost update at Read Committed isolation in both PostgreSQL and MySQL?

    • A) BEGIN; SELECT balance FROM accounts WHERE id=1; UPDATE accounts SET balance = 900 WHERE id=1; COMMIT;
    • B) UPDATE accounts SET balance = balance - 100 WHERE id=1 AND balance >= 100;
    • C) BEGIN; SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; SELECT balance FROM accounts WHERE id=1; COMMIT;
    • D) SELECT balance FROM accounts WHERE id=1 FOR SHARE;

    Correct Answer: B
  3. PostgreSQL is running at Repeatable Read isolation. T1 reads a row and then T2 reads the same row. T1 commits a write. T2 then tries to commit its own write to the same row. What does PostgreSQL do?

    • A) PostgreSQL allows T2's write to proceed silently, producing a lost update.
    • B) PostgreSQL aborts T2 with a serialization failure and requires the application to retry.
    • C) PostgreSQL merges T1's and T2's writes using a last-write-wins policy.
    • D) PostgreSQL blocks T2's commit until T1's transaction ages out of the MVCC snapshot.

    Correct Answer: B
  4. A products table has a version INTEGER column. The application reads a row (version = 15) and issues UPDATE products SET price = 49.99, version = version + 1 WHERE id = 200 AND version = 15. The update returns 0 rows affected. What is the correct interpretation?

    • A) The row with id=200 does not exist. The application should insert it.
    • B) Another transaction updated the row between the read and this write, advancing the version past 15. The application should re-read the row and retry.
    • C) The version column does not allow arithmetic expressions in a WHERE clause.
    • D) The update was silently committed with no change to the version column.

    Correct Answer: B
  5. MySQL InnoDB is configured at Repeatable Read isolation. Two transactions execute a read-modify-write cycle on the same inventory row simultaneously. Neither uses SELECT FOR UPDATE or atomic SQL. What happens?

    • A) MySQL aborts the second transaction with ERROR 1213 (Deadlock found).
    • B) MySQL applies a first-committer-wins rule and aborts the slower transaction.
    • C) MySQL allows the second write to overwrite the first without any error — the lost update occurs silently.
    • D) MySQL queues the second transaction's write until the first transaction's write lock is released.

    Correct Answer: C
  6. A booking system allows users 30 seconds to review seat details before confirming. The application reads seat availability at page load, then writes the reservation when the user clicks Confirm. What is the best prevention strategy for this 30-second gap?

    • A) Hold a SELECT FOR UPDATE lock for the entire 30-second review window.
    • B) Use optimistic locking: detect the conflict at write time with a version check and return a "seat no longer available" response to the user.
    • C) Run all booking transactions at Serializable isolation.
    • D) Use an atomic decrement — UPDATE SET available = available - 1 WHERE id = ? AND available > 0.

    Correct Answer: B
  7. How does a lost update differ structurally from write skew?

    • A) A lost update requires uncommitted reads; write skew requires only committed reads.
    • B) A lost update involves two transactions writing the same row; write skew involves transactions writing different rows that together violate a shared constraint.
    • C) Write skew always requires three or more concurrent transactions; a lost update requires exactly two.
    • D) A lost update is prevented at Repeatable Read in all databases; write skew is not.

    Correct Answer: B
  8. Redis WATCH / MULTI / EXEC implements which class of lost-update prevention?

    • A) Pessimistic locking — it blocks other clients from reading the watched key until EXEC completes.
    • B) Atomic SQL — it collapses the GET and SET into a single Redis command.
    • C) Optimistic locking — it aborts the MULTI block if the watched key changed since WATCH was issued, requiring a client-side retry.
    • D) Serializable isolation — all Redis clients see a total order of operations.

    Correct Answer: C
  9. A microservices system has an Inventory Service and an Order Service that both read and update an Order record using separate database connections. There is no distributed transaction coordinator. Which approach most reliably prevents a lost update between them?

    • A) Set both services to use Serializable isolation on their own connections.
    • B) Use SELECT FOR UPDATE in both services independently.
    • C) Route all Order record writes through a single authoritative service, or use conditional writes with a version field in the API contract.
    • D) Use Read Uncommitted to reduce lock contention between services.

    Correct Answer: C
  10. (Open-ended) You are reviewing a fintech application that manages loyalty point balances. The code reads the current points balance for a user (which involves a multi-table calculation including tier status and recent transaction history), applies a redemption deduction using business rules that span 200 milliseconds of computation, and writes the new balance back. Describe: (a) which specific anomaly this pattern is vulnerable to and why the 200-millisecond computation window matters; (b) why a simple atomic SQL statement would not fully solve the problem here; and (c) which two prevention strategies you would evaluate first, and what observable production metric would determine your final choice between them.


Written by Abstract Algorithms (@abstractalgorithms)