Dirty Read Explained: How Uncommitted Data Corrupts Transactions
How a single uncommitted write can corrupt downstream reads, cause phantom transactions, and what isolation levels prevent it.
TL;DR: A dirty read occurs when Transaction B reads data written by Transaction A before A has committed. If A rolls back, B has made decisions on data that — from the database's perspective — never existed. Read Committed isolation (the default in PostgreSQL, Oracle, and SQL Server) eliminates dirty reads entirely. Read Uncommitted is the only SQL isolation level that permits them, and it belongs only in approximate analytics workloads where a slightly stale or wrong number is acceptable.
📖 The $240 Phantom Refund: A Real Production Failure
It is 14:03:22 UTC on a Tuesday. A payment platform processes thousands of transfers per second. The balance-service receives a GET request: "Does user acc-9981 have enough funds to process a $240 withdrawal?"
Here is what happens inside the database at that exact moment:
- 14:03:21.800 UTC — The `transfer-service` starts Transaction A. It debits `acc-9981` by $240, writing the new balance of $60 to the `accounts` table. The transaction is not yet committed.
- 14:03:21.950 UTC — The `balance-service` opens Transaction B and reads the `accounts` row for `acc-9981`. It sees $60. It decides the account does not have enough funds for a second withdrawal and records this in its audit cache.
- 14:03:22.100 UTC — Transaction A hits a downstream error (the recipient's bank API timed out). It rolls back. The balance of `acc-9981` is restored to $300.
- 14:03:22.200 UTC — Transaction B, still running, has already logged "insufficient funds" to the audit system and declined a legitimate transaction.
The account had $300 the entire time. The declined transaction was valid. The audit log now contains a false record. That is a dirty read.
No hardware failure. No network partition. Just a transaction reading data that was never actually committed — and then acting on it.
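To make the failure concrete, here is a minimal two-session sketch of the same anomaly. It assumes MySQL InnoDB with the reader deliberately set to READ UNCOMMITTED (PostgreSQL would silently upgrade that level and return the committed balance instead); the `accounts` table and values come from the scenario above.

```sql
-- Session 1: transfer-service (Transaction A)
START TRANSACTION;
UPDATE accounts SET balance = balance - 240 WHERE id = 9981;
-- balance is now 60 in the buffer, but NOT committed

-- Session 2: balance-service (Transaction B), misconfigured isolation
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 9981;  -- returns 60: a dirty read
COMMIT;  -- B has now acted on a value that was never committed

-- Session 1: the downstream bank API times out
ROLLBACK;  -- balance restored to 300; the 60 that B read never existed
```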
This post dissects exactly how this anomaly occurs at the storage engine level, walks through the isolation level spectrum that prevents it, and explains what your database is doing (or not doing) to protect you by default.
🔍 Why Uncommitted Data Is Called "Dirty"
In database theory, committed data is data that has been durably written and is visible to all future transactions regardless of what happens next. Uncommitted data is in-flight — it exists in memory (and sometimes in undo/redo logs), but the transaction that created it can still be reversed entirely as if it never happened.
Reading uncommitted data is called a dirty read because that data is "tainted" — it might disappear. The transaction that wrote it has not yet declared it permanent. If the writer rolls back for any reason (application error, deadlock resolution, network timeout, explicit ROLLBACK), the reader has been handed phantom data.
Two storage-engine perspectives on why this happens:
Lock-based engines (traditional): In a locking model, Transaction A acquires a write lock on the row it modifies. For dirty reads to be prevented, Transaction B must wait for A's write lock to be released — which only happens after A commits or rolls back. If the database is running at READ UNCOMMITTED isolation, it deliberately skips acquiring the read lock, allowing B to bypass the wait and read A's in-flight write directly from the page buffer.
MVCC engines (PostgreSQL, Oracle, MySQL InnoDB): In Multi-Version Concurrency Control, each write creates a new row version tagged with the writing transaction's ID. Readers receive a snapshot at a point in time and only ever see versions whose creating transaction has a committed transaction ID. Dirty reads are structurally prevented even at READ UNCOMMITTED in PostgreSQL because the MVCC snapshot never includes uncommitted row versions. This is a key architectural distinction covered in detail in the Database-Specific Behavior section.
The key mental model: a dirty read is not about data being wrong in the committed state — it is about reading a hypothetical future that may be cancelled.
⚙️ The Mechanics of a Dirty Read: A Step-by-Step Transaction Timeline
The sequence below traces a dirty read through its full lifecycle. Transaction T1 is the writer; Transaction T2 is the reader that gets burned.
The four phases are: T1 writes (uncommitted), T2 reads the uncommitted value, T1 rolls back, and T2 is left holding phantom data.
```mermaid
sequenceDiagram
    participant T1 as Transaction T1 (Transfer Service)
    participant DB as Database Engine
    participant T2 as Transaction T2 (Balance Service)
    T1->>DB: BEGIN
    T1->>DB: UPDATE accounts SET balance = 60 WHERE id = 9981
    Note over DB: Row written to buffer, not committed
    T2->>DB: BEGIN
    T2->>DB: SELECT balance FROM accounts WHERE id = 9981
    DB-->>T2: Returns 60 (DIRTY READ occurs here)
    Note over T2: T2 sees $60 — decides "insufficient funds"
    T1->>DB: ROLLBACK (downstream API timeout)
    Note over DB: Row restored to 300 in buffer
    T2->>DB: COMMIT
    Note over T2: T2 committed a decision based on data that no longer exists
```
Notice that T2's SELECT returns 60 at the exact moment T1's write is in the buffer but uncommitted. When T1 rolls back, the database restores the row to 300 — but T2 has already read, decided, and committed based on the phantom value. T2's commit is perfectly valid from the database's perspective; the damage is in the application-level decision it already made.
In a locking context: T2 is allowed to read without waiting for T1's row-level write lock because the isolation level is READ UNCOMMITTED. The lock is there — T2 just does not request it.
In an MVCC context: At READ UNCOMMITTED, a non-MVCC engine would hand T2 the in-buffer row directly. An MVCC engine like PostgreSQL would still give T2 the last committed version (balance = 300), because T1's new row version has an uncommitted transaction ID and MVCC snapshots exclude uncommitted versions by design.
🧠 Deep Dive: How the Database Engine Permits or Blocks Dirty Reads
Understanding why dirty reads happen requires looking inside the storage engine at two distinct mechanisms: the row-version visibility rules (Internals) and the overhead profile of the locking and snapshotting that prevents them (Performance Analysis).
The Internals: Row Visibility in Lock-Based vs. MVCC Engines
Lock-based engines manage read/write isolation through a two-phase locking (2PL) protocol. A writer acquires an exclusive lock on the rows it modifies and holds that lock until it commits or rolls back. A reader at READ COMMITTED or above must acquire a shared lock before reading — and a shared lock cannot be granted while an exclusive lock is held. This forces the reader to wait until the writer either commits (making the data permanently visible) or rolls back (making the old data visible again). At READ UNCOMMITTED, the database deliberately skips the shared lock acquisition, allowing the reader to access the page directly from the buffer pool without lock coordination. The dirty read is a consequence of the engine bypassing its own lock protocol.
MVCC engines (PostgreSQL, Oracle, MySQL InnoDB) solve the read/write contention problem differently. Every UPDATE or DELETE does not overwrite the existing row in place — it creates a new row version stamped with the writing transaction's ID, while the old row version is retained in either the heap (PostgreSQL) or the undo log (MySQL, Oracle). Each transaction receives a snapshot — a point-in-time view defined by the set of transaction IDs that were committed at snapshot creation time. When a reader evaluates a row version, it checks: "Was the transaction that created this version in my committed-set?" If the creating transaction ID is not in the committed-set (because it is still in-flight), that row version is invisible to the reader. The reader transparently falls back to the previous committed version. This is why PostgreSQL can never show a dirty read — the in-flight transaction's row version is structurally invisible to any snapshot.
The undo log's role: In MySQL InnoDB and Oracle, when a reader needs to "see through" a new row version to find the last committed version, it follows a chain of undo log entries backward to reconstruct the row state as it existed at the reader's snapshot time. This is sometimes called a "consistent read" or "point-in-time read". The undo log is the MVCC engine's time machine — it makes dirty reads impossible by design without requiring readers to wait for writers.
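The MVCC behavior described above can be observed directly. The sketch below assumes a PostgreSQL `accounts` table matching the earlier scenario; the reader neither blocks nor sees the uncommitted version:

```sql
-- Session 1: writer (any isolation level)
BEGIN;
UPDATE accounts SET balance = 60 WHERE id = 9981;
-- A new row version now exists, stamped with Session 1's transaction ID

-- Session 2: reader at the default READ COMMITTED
BEGIN;
SELECT xmin, balance FROM accounts WHERE id = 9981;
-- Returns the last committed version (balance = 300); the xmin system
-- column shows the committed transaction that created the visible version
COMMIT;
```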
Performance Analysis: The Cost of Preventing Dirty Reads
At READ UNCOMMITTED (lock-based engines): Zero additional overhead for readers — they bypass all locking. Write performance is unaffected. This is why the isolation level exists at all: for read-only analytics on non-critical data, skipping shared lock acquisition eliminates contention on heavily-written tables.
At READ COMMITTED (lock-based engines): Each reader must acquire and release a shared lock per row read. Under low concurrency, this overhead is negligible. Under high write contention (many concurrent UPDATE transactions on the same rows), readers experience lock wait latency proportional to the write transaction duration. This is the primary performance motivation for MVCC: replacing shared lock acquisition with a snapshot scan eliminates reader/writer blocking entirely.
At READ COMMITTED (MVCC engines, e.g., PostgreSQL): Readers hold no locks. The cost of preventing dirty reads is snapshot establishment — capturing the set of in-progress transaction IDs, a view exposed to SQL by `pg_current_snapshot()`. This cost is paid once per statement (once per transaction at Repeatable Read and above), not per row read. MVCC dirty read prevention is effectively free in terms of reader latency.
Undo log overhead: The cost of MVCC's dirty read prevention moves to write amplification (every write creates at least two row versions: the new version and the undo record) and undo log management (the database must retain undo records for all active readers, which can grow unbounded for long-running transactions). This is why monitoring transaction age matters: a 10-minute-old read transaction forces the database to retain all undo log entries created in the last 10 minutes for every table it might touch.
📊 How Dirty Reads Propagate Through a Multi-Service Architecture
A dirty read inside a single database transaction is bad enough. In a microservices architecture, the damage amplifies because each downstream service trusts the read it receives and may persist or act on it before the upstream rollback propagates.
The diagram below shows an order processing pipeline where the inventory service exposes a read that the order service acts on — and both are unaware that the originating stock-reservation transaction is still in flight.
```mermaid
graph TD
    A[Warehouse Service] -->|BEGIN: reserve 1 unit SKU-4421| B[DB Write Buffer - uncommitted]
    B -->|READ UNCOMMITTED query| C[Inventory Service]
    C -->|Stock check returns 0 available| D[Order Service]
    D -->|Creates shipment record - item out of stock| E[Shipping Queue]
    A -->|ROLLBACK - reservation cancelled| F[DB Row restored to 1 unit]
    E -->|Shipment dispatched| G[Customer notified - order cancelled]
    F -.->|Too late: shipment already queued| E
```
This diagram shows how a single uncommitted write — a warehouse reservation that gets rolled back — cascades into an incorrect stock count, a false "out of stock" response to the order service, a premature order cancellation in the shipping queue, and finally a customer notification for a cancellation that should never have happened. The original inventory data was never actually zero; that was a dirty read at the inventory service layer.
The lesson from this architecture diagram: dirty reads are not an isolated database problem. They are a data contract violation that propagates at the speed of your internal service calls.
🧠 The Isolation Level Spectrum: Which Levels Draw the Line on Dirty Reads
SQL defines four standard isolation levels. Each level is defined by the anomalies it permits. The levels form a strict hierarchy — each higher level prevents all the anomalies of the level below it, plus at least one additional class of anomaly.
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Write Skew |
|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Possible |
| Read Committed | Prevented | Possible | Possible | Possible |
| Repeatable Read | Prevented | Prevented | Possible* | Possible* |
| Serializable | Prevented | Prevented | Prevented | Prevented |
*MySQL InnoDB's Repeatable Read also prevents phantom reads via gap locks. Standard SQL specification allows phantoms at this level.
The critical takeaway: Read Committed is the lowest isolation level that eliminates dirty reads. It is also the default in PostgreSQL, Oracle, and SQL Server. This means most production databases are already protected against dirty reads by default — but READ UNCOMMITTED can be set explicitly per session, per transaction, or (dangerously) at the server level.
Why not always use Serializable? Serializable prevents every known anomaly, but it does so by detecting read-write conflicts and aborting transactions that would violate serializability. Under contention, this increases retry rates and latency. For most OLTP workloads, Read Committed is the correct default — providing dirty read protection without the overhead of full serializability.
🏗️ How PostgreSQL, MySQL, Oracle, and SQL Server Each Handle Dirty Reads
The SQL standard permits dirty reads at READ UNCOMMITTED, but not every database implements it the same way. Engine architecture matters enormously here.
PostgreSQL
PostgreSQL uses MVCC throughout. Every write creates a new row version; readers always consult the version visible in their snapshot, which only includes row versions from committed transactions. When you set SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED in PostgreSQL, the engine silently upgrades it to READ COMMITTED. PostgreSQL cannot produce dirty reads under any isolation level setting. This is a design choice, not a limitation.
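You can see the silent upgrade in any psql session — the statement is accepted without error, but the behavioral guarantees remain those of Read Committed:

```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;  -- accepted, no warning
-- Every SELECT in this transaction still sees only committed row versions;
-- PostgreSQL treats READ UNCOMMITTED as READ COMMITTED.
COMMIT;
```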
MySQL InnoDB
InnoDB also uses MVCC, but it genuinely supports READ UNCOMMITTED as a distinct mode. At READ UNCOMMITTED, InnoDB reads the latest row version from the buffer regardless of whether the writing transaction has committed. This is a true dirty read. InnoDB's default isolation level is REPEATABLE READ, which prevents dirty reads via snapshot reads.
Oracle Database
Oracle uses MVCC (called "consistent reads") and does not implement READ UNCOMMITTED at all. The lowest Oracle isolation level is READ COMMITTED, which already prevents dirty reads. Like PostgreSQL, Oracle is architecturally incapable of producing them.
SQL Server
SQL Server supports READ UNCOMMITTED and it genuinely returns uncommitted data at that level. SQL Server calls this explicitly using NOLOCK hints or the READ UNCOMMITTED isolation level. The NOLOCK hint is commonly misused by developers who want "fast reads" without understanding that they are opting into dirty reads, non-repeatable reads, and phantom reads simultaneously.
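The two spellings are equivalent in effect — both shown here against the hypothetical `accounts` table from the earlier scenario:

```sql
-- Table-level hint: this one query reads uncommitted data
SELECT balance FROM accounts WITH (NOLOCK) WHERE id = 9981;

-- Session-level setting: every subsequent query reads uncommitted data
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT balance FROM accounts WHERE id = 9981;
```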
Summary of database defaults:
| Database | Default Isolation | Can Produce Dirty Reads |
|---|---|---|
| PostgreSQL | Read Committed | No (MVCC prevents it at all levels) |
| MySQL InnoDB | Repeatable Read | Yes (at READ UNCOMMITTED only) |
| Oracle | Read Committed | No (minimum level is Read Committed) |
| SQL Server | Read Committed | Yes (at READ UNCOMMITTED / NOLOCK) |
🌍 When Read Uncommitted Is Used Intentionally — and Safely
Dirty reads are not always catastrophic. There is a narrow class of use cases where reading slightly stale or possibly uncommitted data is explicitly acceptable because the cost of correctness is higher than the cost of approximation.
Approximate aggregate counts: An analytics dashboard showing "~14,200 active users online" does not need transactional precision. If the exact count is 14,197 because three sessions are in mid-flight transactions, the dashboard consumer will never notice or care. Using READ UNCOMMITTED here avoids lock contention on a heavily-updated counter table.
Monitoring dashboards and alerting systems: A monitoring query that scans order_events to count orders processed in the last five minutes can tolerate a few phantom or missing rows. The trend matters; the exact count does not. Using READ UNCOMMITTED (or NOLOCK in SQL Server) keeps the monitoring queries from competing with transactional workloads.
Large batch reporting on write-heavy tables: A nightly ETL that aggregates millions of rows from a continuously-updated table may deadlock repeatedly at higher isolation levels. If the report is labelled "approximate" and stakeholders accept that, READ UNCOMMITTED is a pragmatic choice to keep ETL jobs from blocking production writes.
The critical discipline: If you choose READ UNCOMMITTED, make this explicit in the system design. Label the API endpoint, dashboard, or report as "approximate". Document the isolation level in the query comment. Never let a READ UNCOMMITTED result feed a financial calculation, legal audit, or inventory commitment — those are exactly the contexts that produce the failures described in this post.
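For instance, an intentionally approximate monitoring query might look like the following sketch — MySQL syntax, with the hypothetical `sessions` table and `last_seen` column standing in for your schema:

```sql
-- APPROXIMATE by design: READ UNCOMMITTED accepted for this dashboard query.
-- Do NOT feed this result into billing, inventory, or audit paths.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT COUNT(*) AS approx_active_users
FROM sessions
WHERE last_seen > NOW() - INTERVAL 5 MINUTE;
```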
🔬 Finding Dirty Read Exposure in a Running Production Database
When a production incident points to phantom data, the first question is: "Was there a dirty read, or is this something else?" These tools and queries help you reconstruct what happened.
pg_stat_activity for transaction age analysis (PostgreSQL):
Every in-flight transaction on PostgreSQL appears in pg_stat_activity. Long-running transactions with a state of idle in transaction are the primary source of dirty read opportunity — they hold write locks or uncommitted row versions that other sessions may be querying against.
```sql
SELECT pid, usename, state, xact_start,
       now() - xact_start AS xact_age,
       query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL          -- only sessions inside a transaction
ORDER BY xact_age DESC;
```
Transactions that have been idle in transaction for more than a few seconds in an OLTP system are a red flag.
Lock wait analysis:
When Transaction B is waiting for Transaction A's lock to release, the contention is visible in pg_locks joined to pg_stat_activity. This is the locking-model equivalent of dirty read risk — the question is whether B is waiting (good: it will get a consistent view after A commits or rolls back) or bypassing the wait (bad: dirty read).
```sql
SELECT blocking.pid   AS blocking_pid,
       blocked.pid    AS blocked_pid,
       blocked.query  AS blocked_query,
       blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid AND NOT bl.granted
JOIN pg_locks kl ON kl.transactionid = bl.transactionid AND kl.granted
JOIN pg_stat_activity blocking ON blocking.pid = kl.pid;
```
Slow query logs and timeline reconstruction:
If an incident has already occurred, the reconstruction approach is: pull slow query logs for the affected table in the incident window, extract transaction IDs, and trace which transactions overlapped. If a SELECT completed between a BEGIN and ROLLBACK of a concurrent writer on the same row, you have found the dirty read window.
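Reconstruction only works if the logs carry timestamps and transaction IDs in the first place. A postgresql.conf sketch that makes the correlation possible (the thresholds are illustrative, not recommendations):

```conf
# postgresql.conf — log enough context to correlate overlapping transactions
log_min_duration_statement = 250   # log any statement slower than 250 ms
log_line_prefix = '%m [%p] %x '    # timestamp, backend pid, transaction id
log_lock_waits = on                # log waits exceeding deadlock_timeout
```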
Application-level detection: Instrument your application to log the transaction isolation level alongside every query. If any service is making balance, inventory, or audit queries without explicit isolation level enforcement, you cannot be certain what they read.
⚖️ Architectural Defenses Against Dirty Read Corruption
Prevention is always cheaper than incident response. The following strategies, ordered from lowest to highest overhead, cover the range of options.
1. Rely on the database default (for PostgreSQL and Oracle users):
If you are running PostgreSQL or Oracle and you have not explicitly set READ UNCOMMITTED anywhere in your codebase, you are already protected. Audit your connection pool configurations, ORM settings, and legacy query scripts for any explicit NOLOCK, READ UNCOMMITTED, or driver-level isolation overrides. Remove them.
2. Enforce Read Committed or higher at the connection pool level:
For MySQL and SQL Server deployments, set the isolation level at the connection pool or data source configuration rather than relying on developers to set it per query. This makes isolation level a deployment concern, not a per-developer concern.
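As one illustration — assuming a JVM service pooled through HikariCP (the property names are HikariCP's; the data source class is an example):

```properties
# Every connection handed out by the pool starts at READ COMMITTED
dataSourceClassName=com.mysql.cj.jdbc.MysqlDataSource
transactionIsolation=TRANSACTION_READ_COMMITTED
maximumPoolSize=20
```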
3. Use MVCC snapshots explicitly for long read workloads:
For reporting queries that run for seconds or minutes against a live OLTP table, use SET TRANSACTION ISOLATION LEVEL REPEATABLE READ (or SERIALIZABLE in PostgreSQL) before the reporting transaction begins. This gives the query a stable snapshot of the entire database at the moment the transaction opened, regardless of concurrent writes. The result is consistent even if dozens of write transactions commit during the query's execution.
4. Optimistic locking for application-level consistency:
When you cannot raise the isolation level (for example, in a shared multi-tenant database with strict latency SLAs), use optimistic locking: add a version column to sensitive tables, read the version alongside the data, and include WHERE version = :read_version in your update statement. If another transaction modified the row between your read and your write, the update affects zero rows — your application detects this and retries. This prevents the effects of dirty reads even if the read itself returns a stale value.
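A minimal sketch of the pattern, assuming a hypothetical `version` column added to the `accounts` table from the earlier scenario:

```sql
-- Step 1: read the data and its version together
SELECT balance, version
FROM accounts
WHERE id = 9981;          -- suppose this returns balance = 300, version = 7

-- Step 2: the write succeeds only if nobody changed the row in between
UPDATE accounts
SET balance = 60,
    version = version + 1
WHERE id = 9981
  AND version = 7;        -- 0 rows affected => conflict: re-read and retry
```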
5. Event sourcing and append-only ledgers:
Financial systems that cannot tolerate any form of read anomaly sometimes move to an append-only event ledger model: balance is never stored as a mutable row but reconstructed from an immutable sequence of credit/debit events. Dirty reads on an append-only ledger cause no harm because an uncommitted event that gets rolled back simply disappears from the sequence — the materialized balance is only ever calculated from committed events.
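A sketch of the ledger shape (PostgreSQL syntax; table and column names are illustrative):

```sql
-- Append-only ledger: rows are inserted once and never updated
CREATE TABLE ledger_events (
    event_id   BIGSERIAL PRIMARY KEY,
    account_id TEXT        NOT NULL,
    amount     NUMERIC     NOT NULL,  -- positive = credit, negative = debit
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- The balance is derived, never stored: only committed events are summed
SELECT COALESCE(SUM(amount), 0) AS balance
FROM ledger_events
WHERE account_id = 'acc-9981';
```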
🔥 What Dirty Reads Actually Break Downstream: Failure Modes in Production
Understanding the specific failure modes that dirty reads produce helps prioritize where to apply defenses first.
Double charges and phantom refunds: A payment service reads an in-flight debit as committed, determines a refund is owed, issues the refund — and then the original debit rolls back. The account receives both a restored balance (from the rollback) and a refund. Payment engineering teams have documented incidents where NOLOCK-equivalent reads in legacy reporting flows caused exactly this class of double-credit.
Inventory overselling: An e-commerce platform's checkout service reads an inventory count of 1 for a limited-edition item. Two concurrent transactions are simultaneously decrementing that count, both uncommitted. Both checkout sessions see a count of 1, both proceed, and two orders are confirmed for a single physical item. The warehouse discovers the problem when the second shipment order cannot be fulfilled.
Corrupt audit logs and compliance failures: Financial regulations (SOX, PCI-DSS) require that audit records reflect only committed state. An audit logger that reads dirty data will record phantom transactions — debits that were rolled back, account states that never existed. During a regulatory audit, these records are inexplicable and can trigger compliance investigations.
Cascading cache poisoning: If a caching layer (Redis, Memcached, Varnish) is populated from a dirty read, the incorrect value gets cached with a TTL. Even after the source transaction rolls back and the database returns to the correct state, the cache continues serving the phantom value to every downstream reader until the TTL expires. This extends the window of incorrectness far beyond the original transaction boundary.
Misleading monitoring and alert fatigue: A monitoring system that reads dirty data will trigger false alerts — "disk usage at 97%" based on a mid-flight write that gets rolled back to 40%. Teams that see frequent false alerts start disabling them, which means real alerts get missed.
🧭 Choosing the Right Isolation Level: When Dirty Read Protection Is Worth the Cost
This decision guide applies to the specific question: "What isolation level should I set for this database, service, or query?"
| Situation | Recommendation |
|---|---|
| OLTP application with financial, inventory, or audit data | Use Read Committed (default in PostgreSQL/Oracle) or higher. Never drop to Read Uncommitted. This is the baseline for correctness in any business-critical system. |
| Analytics query on a live OLTP table, result labelled "approximate" | Read Uncommitted is acceptable on MySQL/SQL Server if the report consumer explicitly understands and accepts approximate figures. Must be documented. |
| Long-running reporting query that causes lock contention in production | Use Repeatable Read or Serializable on an MVCC engine (PostgreSQL). The stable snapshot eliminates both dirty reads and lock contention. On SQL Server, enable RCSI instead of Read Uncommitted. |
| Multi-statement transaction that must see its own writes but not concurrent writes | Read Committed (with per-statement snapshot refresh) or Repeatable Read (with transaction-level snapshot). The choice depends on whether non-repeatable reads in the same transaction are acceptable. |
| Avoid when | Avoid Read Uncommitted for: payment processing, order fulfillment, compliance logging, session management, or any query result that is cached and served to other services. |
| Alternative to Read Uncommitted for performance | On SQL Server: enable RCSI (Read Committed Snapshot Isolation). On MySQL: set transaction_isolation = READ-COMMITTED plus connection pool tuning. On PostgreSQL: Read Committed already provides non-blocking reads via MVCC — no trade-off needed. |
| Edge case: long-running analytics on PostgreSQL | Even Serializable on PostgreSQL does not block readers (MVCC). The snapshot is stable for the transaction's lifetime. Use SET TRANSACTION ISOLATION LEVEL SERIALIZABLE for the analytics session with no performance penalty for readers. |
🧪 Tracing a Dirty Read Through a Real E-Commerce Checkout Scenario
This walkthrough shows the exact sequence of events in an e-commerce checkout system where a dirty read leads to overselling a limited-edition item. It demonstrates the problem concretely before showing how isolation level selection resolves it.
Setup: The products table has a row for SKU-7731 (a limited-edition sneaker) with stock_count = 1. Two customers check out simultaneously: Customer A and Customer B.
Phase 1 — The Race Condition (Read Uncommitted, bad configuration):
- T=0ms: Transaction A (Customer A checkout) begins. It reads `stock_count = 1`. It decrements: `UPDATE products SET stock_count = 0 WHERE sku = 'SKU-7731'`. Transaction A is NOT yet committed.
- T=5ms: Transaction B (Customer B checkout) begins. It reads `stock_count = 0` — the uncommitted value written by Transaction A. Transaction B sees zero stock, returns "Out of Stock" to Customer B, and records a "stock_unavailable" event in the audit log.
- T=10ms: Transaction A fails (payment gateway timeout). It rolls back. `stock_count` is restored to 1.
- T=15ms: The sneaker is now available again, but Customer B was already turned away. No oversell — but a valid sale was lost, and the audit log contains a false "stock_unavailable" record.
The dirty read cost a sale and corrupted the audit log. No double-spend occurred here, but only by luck — if Transaction B had proceeded to place an order instead of rejecting, both customers would have gotten a confirmation for the same unit.
Phase 2 — The Same Scenario at Read Committed (correct configuration):
- T=0ms: Transaction A begins. Writes `stock_count = 0`. Uncommitted.
- T=5ms: Transaction B begins. Attempts to read `stock_count`. The database (at Read Committed) returns the last committed value: `stock_count = 1`. Transaction B sees 1 unit available.
- T=8ms: Transaction B proceeds to place its order. Its own UPDATE on the row must wait for Transaction A's row lock to clear — readers never block, but writers to the same row do.
- T=10ms: Transaction A rolls back. Transaction B's UPDATE now proceeds, decrementing the committed value of 1 down to 0, and Transaction B commits.
- Result: One valid sale completed (Customer B). The rollback of Transaction A is irrelevant to B's committed transaction. No dirty read. No lost sale.
The key difference: at Read Committed, Transaction B always reads the last committed value of stock_count, ignoring Transaction A's in-flight write entirely. This is dirty read prevention working exactly as designed.
🛠️ PostgreSQL and MySQL: Configuring Isolation Levels at the Server and Session Level
In keeping with the System Design Concept series' convention of configuration-level examples only, the following snippets show how isolation levels are set at the database server, session, and transaction scope without any application code.
PostgreSQL — postgresql.conf and per-session overrides
The default transaction isolation level in PostgreSQL is read committed. It is configured in postgresql.conf:
```conf
# postgresql.conf
# Default transaction isolation level for all connections
# Valid values: 'read uncommitted', 'read committed', 'repeatable read', 'serializable'
# PostgreSQL silently upgrades 'read uncommitted' to 'read committed'
default_transaction_isolation = 'read committed'
```
Per-session isolation level (set in the session before any transaction begins):
```sql
-- Elevate a reporting session to a stable snapshot
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ;

-- Or set for the current transaction only
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... queries ...
COMMIT;
```
MySQL InnoDB — per-session and global isolation level
```ini
# my.cnf / my.ini
[mysqld]
# InnoDB default is REPEATABLE-READ
# Setting to READ-COMMITTED prevents dirty reads and non-repeatable reads
transaction_isolation = READ-COMMITTED
```
```sql
-- Per-session override in MySQL
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Per-transaction override: must be issued BEFORE the transaction starts,
-- and applies only to the next transaction
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
-- ... queries ...
COMMIT;
```
SQL Server — removing NOLOCK hints (the most impactful change)
The most common source of dirty reads in SQL Server is not READ UNCOMMITTED at the session level — it is the WITH (NOLOCK) query hint applied to individual SELECT statements by developers who were told it "speeds up queries". The correct remediation is to remove NOLOCK and, if read performance is the concern, enable Read Committed Snapshot Isolation (RCSI) at the database level:
```sql
-- Enable RCSI on SQL Server (one-time, per-database)
ALTER DATABASE YourDatabase
SET READ_COMMITTED_SNAPSHOT ON;
```
With RCSI enabled, READ COMMITTED readers get a versioned snapshot rather than acquiring shared locks, providing both dirty read prevention and non-blocking reads. This removes the motivation for using NOLOCK in the first place.
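You can confirm the setting took effect with a catalog query (the database name is the same placeholder as above):

```sql
-- is_read_committed_snapshot_on returns 1 when RCSI is enabled
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'YourDatabase';
```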
For a full deep-dive on transaction isolation level internals and MVCC implementation, see Isolation Levels in Databases.
📚 Lessons Learned from Dirty Read Incidents
- **The default is usually your friend — audit overrides first.** PostgreSQL and Oracle are architecturally incapable of dirty reads. If you are debugging a phantom data incident on these engines, dirty read is not the root cause. On MySQL and SQL Server, audit every explicit `READ UNCOMMITTED` and `NOLOCK` in the codebase before assuming a higher-level race condition.
- **NOLOCK is not a performance optimization — it is a correctness trade-off.** SQL Server's `WITH (NOLOCK)` hint was popularized as a way to avoid blocking reads. It works by enabling dirty reads. Teams that apply it universally to "speed up" queries are unknowingly opting into phantom data on every query that hits an in-flight transaction. Enabling RCSI achieves the same performance benefit without the correctness compromise.
- **Caches amplify dirty reads beyond the transaction boundary.** A dirty read that lasts 50 milliseconds becomes a 5-minute outage if the result is written to a cache with a 5-minute TTL. When designing caching strategies for financial or inventory data, ensure the cache is populated only from committed reads — or implement cache invalidation triggered by commit events via CDC.
- **Isolation level is a deployment configuration, not an application detail.** The isolation level your queries run at should be set at the connection pool, ORM data source, or `postgresql.conf` level — not scattered in individual query files. Centralizing isolation level configuration makes it auditable and prevents any single developer from inadvertently dropping to `READ UNCOMMITTED` in a production code path.
- **Dirty reads and long-running transactions are co-conspirators.** The longer Transaction A holds an uncommitted write open, the wider the window for Transaction B to read dirty data. Enforce transaction timeouts at the database level (`lock_timeout` and `statement_timeout` in PostgreSQL; `innodb_lock_wait_timeout` in MySQL) to limit the blast radius of slow or stuck transactions.
- **Test isolation level behavior explicitly in your integration tests.** Most integration test suites run each test in a transaction that gets rolled back at the end. This inadvertently exercises rollback scenarios — but only if your test isolation level matches production. Mismatched test isolation levels hide the exact class of concurrency bugs that produce dirty reads in production.
- **Append-only audit tables are far more resistant to dirty read corruption.** If regulatory compliance depends on the integrity of your audit log, model it as an append-only event table where each record is inserted once and never updated. A dirty read on an append-only insert is still a risk (you might see a phantom insert that gets rolled back), but it is far easier to detect and compensate for than a phantom update to a mutable balance column.
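The anomaly at the heart of these lessons can be reproduced locally. The sketch below uses SQLite's shared-cache mode, which offers a READ UNCOMMITTED-style pragma (assuming a SQLite build with shared-cache support, which is standard in CPython's bundled module); the table and account names mirror this post's running example and are illustrative only.

```python
import sqlite3

# Two connections sharing one in-memory database via shared-cache mode.
# autocommit is managed manually (isolation_level=None) so the writer's
# transaction boundaries are explicit.
writer = sqlite3.connect("file:demo?mode=memory&cache=shared",
                         uri=True, isolation_level=None)
reader = sqlite3.connect("file:demo?mode=memory&cache=shared",
                         uri=True, isolation_level=None)

writer.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO accounts VALUES ('acc-9981', 300)")

# The reader explicitly opts into dirty reads (off by default).
reader.execute("PRAGMA read_uncommitted = 1")

# Transaction A: debit $240, but do NOT commit yet.
writer.execute("BEGIN")
writer.execute("UPDATE accounts SET balance = 60 WHERE id = 'acc-9981'")

# Transaction B: the dirty read. It sees the uncommitted $60.
dirty = reader.execute(
    "SELECT balance FROM accounts WHERE id = 'acc-9981'").fetchone()[0]

# Transaction A rolls back: the $60 never officially existed.
writer.execute("ROLLBACK")

committed = reader.execute(
    "SELECT balance FROM accounts WHERE id = 'acc-9981'").fetchone()[0]

print(dirty, committed)
```

Without the `PRAGMA read_uncommitted = 1` line, the reader's SELECT would instead fail with a table-locked error while Transaction A is in flight, which is exactly the blocking behavior that tempts teams toward `NOLOCK`.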
📌 Summary & Key Takeaways
- A dirty read occurs when Transaction B reads data written by Transaction A before A commits. If A rolls back, B has acted on data that never officially existed.
- The term "dirty" means tainted by uncertainty — the data is in-flight and can be erased at any moment by the writing transaction's rollback.
- Read Committed is the minimum isolation level that prevents dirty reads. It is the default in PostgreSQL, Oracle, and SQL Server.
- PostgreSQL and Oracle cannot produce dirty reads regardless of the isolation level setting, because their MVCC implementations exclude uncommitted row versions from all snapshots. MySQL InnoDB and SQL Server can produce dirty reads at `READ UNCOMMITTED`.
- `WITH (NOLOCK)` in SQL Server is the most common accidental source of dirty reads in production. The safe alternative is enabling Read Committed Snapshot Isolation (RCSI) at the database level.
- Downstream systems that cache, queue, or act on dirty read results extend the window of incorrectness far beyond the original transaction's lifetime.
- Read Uncommitted has legitimate uses in approximate analytics, monitoring dashboards, and non-critical reporting — but must be explicitly documented and never used in financial, inventory, or audit-sensitive code paths.
- The single most impactful defensive action: audit your codebase and connection pool configurations for any explicit `READ UNCOMMITTED`, `NOLOCK`, or `isolation_level=0` settings, and remove or justify each one.
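To make the cache takeaway concrete, here is a minimal sketch of the amplification failure mode. `TTLCache` is a hypothetical illustration, not a real library; the account id and dollar amounts mirror the $240 incident from the introduction.

```python
import time

# A tiny TTL cache: each entry carries an expiry timestamp.
class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self.store[key]
            return None
        return value

cache = TTLCache(ttl_seconds=300)  # 5-minute TTL

# t+0ms: a READ UNCOMMITTED query observes the in-flight balance of $60
# and writes it into the cache.
cache.set("acc-9981:balance", 60)

# t+200ms: the writing transaction rolls back. The database once again
# holds the committed value of $300, but the cache has no idea.
committed_balance = 300

# For up to 5 minutes, every downstream reader is served the phantom value.
assert cache.get("acc-9981:balance") == 60
```

A 200ms anomaly has become a 5-minute one, visible to every consumer of the cache. This is why committed-read-only cache population (or commit-triggered invalidation via CDC) matters more than the raw duration of the dirty read itself.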
📝 Practice Quiz
What is a dirty read?
- A) Reading a row that has been deleted by a committed transaction
- B) Reading uncommitted data written by another in-flight transaction
- C) Reading the same row twice and getting different results
- D) Reading a row that was inserted by a phantom transaction

Correct Answer: B
Which SQL isolation level is the lowest level that prevents dirty reads?
- A) Read Uncommitted
- B) Read Committed
- C) Repeatable Read
- D) Serializable

Correct Answer: B
At which isolation level does MySQL InnoDB actually return uncommitted data?
- A) Repeatable Read
- B) Read Committed
- C) Read Uncommitted
- D) Serializable

Correct Answer: C
(Open-ended challenge — no single correct answer) PostgreSQL is configured with `default_transaction_isolation = 'read_uncommitted'`. A transaction attempts to read a row currently being modified by an uncommitted concurrent transaction. Describe exactly what PostgreSQL returns and why — including the storage engine mechanism that makes this outcome structurally different from MySQL InnoDB's behavior at the same isolation level setting.

A developer applies `WITH (NOLOCK)` to a SQL Server SELECT on the `accounts` table to "speed up the query". What risk has been introduced?
- A) The query will return stale cached data from a read replica
- B) The query may return uncommitted row versions that could be rolled back
- C) The query will deadlock if another transaction holds a shared lock
- D) The query will skip rows that were inserted after the transaction began

Correct Answer: B
A caching layer is populated from a query running at `READ UNCOMMITTED`. The source transaction rolls back 200ms later. The cache TTL is 10 minutes. What is the blast radius of this dirty read?
- A) 200ms — limited to the transaction's duration
- B) The dirty value is served from cache for up to 10 minutes to all downstream readers
- C) The cache is automatically invalidated when the transaction rolls back
- D) Only the session that populated the cache sees the dirty value

Correct Answer: B
An e-commerce system uses MySQL InnoDB with `REPEATABLE READ` isolation. An inventory count query reads `stock = 1` for SKU-9900. Two concurrent checkout transactions are both mid-flight, each decrementing that count. Can a dirty read occur?
- A) Yes — REPEATABLE READ allows dirty reads on MySQL InnoDB
- B) No — REPEATABLE READ prevents dirty reads; both checkouts see the last committed stock value
- C) Only if the checkout transactions use `SELECT ... FOR UPDATE`
- D) Yes, but only if the transactions both run longer than `innodb_lock_wait_timeout`

Correct Answer: B
Which of the following workloads is the MOST appropriate candidate for `READ UNCOMMITTED` isolation?
- A) A payment confirmation query that checks whether a debit has cleared
- B) A regulatory compliance audit log reader
- C) A monitoring dashboard showing approximate request counts over the last 5 minutes
- D) An inventory reservation system for a flash sale

Correct Answer: C
A financial platform migrates from SQL Server (with widespread NOLOCK hints) to PostgreSQL. After migration, developers notice that the "fast reads" they relied on are now slower under high concurrency. What is the correct architectural response?
- A) Re-add NOLOCK hints as PostgreSQL hint syntax
- B) Set `default_transaction_isolation = 'read_uncommitted'` in `postgresql.conf`
- C) Accept the performance impact — PostgreSQL MVCC prevents dirty reads, and the throughput difference is negligible under typical OLTP load
- D) Route all reads to a read replica with no isolation guarantees

Correct Answer: C
A distributed financial system uses a microservices architecture where the `balance-service` reads from the primary OLTP database at `READ COMMITTED`, but the `audit-service` reads from a read replica with a replication lag of up to 800ms. A customer disputes a charge, claiming the audit log shows a balance that was never their actual balance. No dirty reads are possible at `READ COMMITTED`. What class of read anomaly — distinct from a dirty read — could explain the discrepancy, and what architectural patterns would you apply to guarantee that the audit log reflects only committed, causally-consistent state?
🔗 Related Posts
- Key Terms in Distributed Systems
- Isolation Levels in Databases: Read Committed, Repeatable Read, Snapshot, and Serializable Explained
- Database Anomalies: How SQL and NoSQL Handle Dirty Reads, Phantom Reads, and Write Skew
- Data Anomalies in Distributed Systems: Split Brain, Clock Skew, Stale Reads, and More
- ACID Transactions: DynamoDB, CosmosDB, and Spanner
- Understanding Consistency Patterns: An In-Depth Analysis

Written by
Abstract Algorithms
@abstractalgorithms