Start here
Distributed Systems
Move through replication, consensus, quorum, leader election, transactions, and failure recovery as one connected system.
Replication, Consensus, Quorum, Leader Election, Distributed Transactions, Kafka
Begin with
Quorum gives you the cleanest entry point before branching into constraints, failures, and related systems.
12 Articles
10 Concepts
Relationships
Follow the shape of the system
Move through prerequisites, dependencies, tradeoffs, and adjacent concepts without losing the thread.
Guidance
Replication
Continues from what you have already explored.
System behavior
Kafka Topic Replication Flow
Partition leader writes and followers replicate before committed consumption.
Step 1 / 2: Normal flow
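The normal flow described above (the leader appends, followers copy, and only then do consumers see the record) can be sketched as a toy model. This is a simplified illustration, not Kafka's implementation: the `Replica` and `Partition` classes and their methods are hypothetical names, every follower is assumed to be an in-sync replica, and the committed boundary (the high watermark) is modeled as the smallest log end offset across all replicas.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica holding an append-only log."""
    log: list = field(default_factory=list)

    @property
    def log_end_offset(self) -> int:
        # Offset at which the next record would be written.
        return len(self.log)


@dataclass
class Partition:
    """Toy partition: one leader plus followers, all assumed in sync."""
    leader: Replica = field(default_factory=Replica)
    followers: list = field(default_factory=list)

    def produce(self, record: str) -> int:
        # Writes land on the leader first; return the record's offset.
        self.leader.log.append(record)
        return self.leader.log_end_offset - 1

    def replicate(self, follower: Replica) -> None:
        # A follower copies whatever it is missing from the leader's log.
        follower.log.extend(self.leader.log[follower.log_end_offset:])

    @property
    def high_watermark(self) -> int:
        # Records below this offset exist on every replica (committed).
        replicas = [self.leader, *self.followers]
        return min(r.log_end_offset for r in replicas)

    def consume(self) -> list:
        # Consumers only ever see committed records.
        return self.leader.log[: self.high_watermark]


partition = Partition(followers=[Replica(), Replica()])
partition.produce("order-created")

before = partition.consume()   # followers have not copied the record yet
for follower in partition.followers:
    partition.replicate(follower)
after = partition.consume()    # every replica now holds the record

print(before, after)
```

In this sketch the record is invisible to consumers until both followers have copied it, which is the "replicate before committed consumption" ordering the flow names.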
Read in sequence
1. Split Brain Explained: When Two Nodes Both Think They Are Leader (22 min)
   TLDR: Split brain happens when a network partition causes two nodes to simultaneously believe they are the leader, each accepting writes the other never sees. Prevent it with quorum consensus (at lea...

2. Data Anomalies in Distributed Systems: Split Brain, Clock Skew, Stale Reads, and More (13 min)
   TLDR: Distributed systems produce anomalies not because the code is buggy, but because physics makes perfect consistency impossible across network boundaries. Split brain, stale reads, clock skew, ca...

3. A Guide to Raft, Paxos, and Consensus Algorithms (13 min)
   TLDR: Consensus algorithms allow a cluster of computers to agree on a single value (e.g., "Who is the leader?"). Paxos is the academic standard: correct but notoriously hard to understand. Raft...

4. Stale Reads and Cascading Failures in Distributed Systems (25 min)
   TLDR: Stale reads return superseded data from replicas that haven't yet applied the latest write. Cascading failures turn one overloaded node into a cluster-wide collapse through retry storms and redi...

5. The Dual Write Problem: Why Two Writes Always Fail Eventually, and How to Fix It (23 min)
   TLDR: Any service that writes to a database and publishes a message in the same logical operation has a dual write problem. try/catch retries don't fix it; they turn failures into duplicates. The Tra...

6. How Kafka Works: The Log That Never Forgets (13 min)
   TLDR: Kafka is a distributed event store. Unlike a traditional queue (RabbitMQ) where messages disappear after reading, Kafka stores them in a persistent Log. This allows multiple consumers to read th...

7. Key Terms in Distributed Systems: The Definitive Glossary (51 min)
   TLDR: Distributed systems vocabulary is precise for a reason. Mixing up read skew and write skew costs you an interview. Confusing Snapshot Isolation with Serializable costs you a production outage. T...

8. The 8 Fallacies of Distributed Systems (13 min)
   TLDR: In 1994, L. Peter Deutsch at Sun Microsystems listed 8 assumptions that developers make about distributed systems, all of which are false. Believing them leads to hard-to-reproduce bugs, ...

9. Dirty Write Explained: When Uncommitted Data Gets Overwritten (28 min)
   TLDR: A dirty write occurs when Transaction B overwrites data that Transaction A has written but not yet committed. The result is not a rollback or an error; it is silently inconsistent committed dat...

10. Read Skew Explained: Inconsistent Snapshots Across Multiple Objects (34 min)
    TLDR: Read skew occurs when a transaction reads two logically related objects at different points in time, one before and one after a concurrent transaction commits, producing a view that never exis...

11. Lost Update Explained: When Two Writes Become One (38 min)
    TLDR: A lost update occurs when two concurrent read-modify-write transactions both read the same committed value, both compute a new value from it, and both write back, with the second write silently...

12. Phantom Read Explained: When New Rows Appear Mid-Transaction (32 min)
    TLDR: A phantom read occurs when a transaction runs the same range query twice and gets a different set of rows, because a concurrent transaction inserted or deleted matching rows and committed in be...