Topic
databases
27 articles across 14 sub-topics
Sub-topic
9 articles

Read Skew Explained: Inconsistent Snapshots Across Multiple Objects
TLDR: Read skew occurs when a transaction reads two logically related objects at different points in time — one before and one after a concurrent transaction commits — producing a view that never existed as a committed whole. Read Committed isolation...

Phantom Read Explained: When New Rows Appear Mid-Transaction
TLDR: A phantom read occurs when a transaction runs the same range query twice and gets a different set of rows — because a concurrent transaction inserted or deleted matching rows and committed in between. Row locks cannot stop this because the phan...

Write Skew Explained: The Anomaly That Requires Serializable Isolation
TLDR: Write skew is the hardest concurrency anomaly to reason about: two concurrent transactions each read a shared condition, decide they can safely proceed, and then write to different rows. No individual operation is wrong. No row was overwritten....
Non-Repeatable Read Explained: When the Same Query Returns Different Results
TLDR: A non-repeatable read happens when the same SELECT returns different results within a single transaction because a concurrent transaction committed an update between the two reads. Read Committed isolation — the default in PostgreSQL, MySQL, an...

Sharding Approaches in SQL and NoSQL: Range, Hash, and Directory-Based Strategies Compared
TLDR: Sharding splits your database across multiple physical nodes so no single machine carries all the data or absorbs all the writes. The strategy you choose — range, hash, consistent hashing, or directory — determines whether range queries stay ch...

Key Terms in Distributed Systems: The Definitive Glossary
TLDR: Distributed systems vocabulary is precise for a reason. Mixing up read skew and write skew costs you an interview. Confusing Snapshot Isolation with Serializable costs you a production outage. This glossary organises every critical term into co...
Sub-topic
3 articles
Change Data Capture Pattern: Log-Based Data Movement Without Full Reloads
TLDR: Change data capture moves committed database changes into downstream systems without full reloads. It is most useful when freshness matters, replay matters, and the source database must remain the system of record. CDC becomes production-...
Understanding Consistency Patterns: An In-Depth Analysis
TLDR: Consistency is about whether all nodes in a distributed system show the same data at the same time. Strong consistency gives correctness but costs latency. Eventual consistency gives speed but requires tolerance for briefly stale reads. C...
Data Warehouse vs Data Lake vs Data Lakehouse: Which One to Choose?
TLDR: Warehouse = structured, clean data for BI and SQL dashboards (Snowflake, BigQuery). Lake = raw, messy data for ML and data science (S3, HDFS). Lakehouse = open table formats (Delta Lake, Iceberg) that bring SQL performance to raw storage — the ...
Sub-topic
2 articles

ACID Properties Explained: How SQL Databases Guarantee Atomicity, Consistency, Isolation, and Durability
TLDR: ACID is four orthogonal guarantees that every SQL transaction must provide. Atomicity says all-or-nothing: PostgreSQL implements it via WAL rollback; MySQL InnoDB via undo logs. Consistency says constraints always hold: the database rejects any...

Isolation Levels in Databases: Read Committed, Repeatable Read, Snapshot, and Serializable Explained
TLDR: Isolation levels control which concurrency anomalies a transaction can see. Read Committed (PostgreSQL and Oracle's default) prevents dirty reads but still silently allows non-repeatable reads, write skew, and lost updates. Repeatable Read adds...
Sub-topic
2 articles

Dirty Write Explained: When Uncommitted Data Gets Overwritten
TLDR: A dirty write occurs when Transaction B overwrites data that Transaction A has written but not yet committed. The result is not a rollback or an error — it is silently inconsistent committed data: one table reflects Transaction B's intent, anot...

Lost Update Explained: When Two Writes Become One
TLDR: A lost update occurs when two concurrent read-modify-write transactions both read the same committed value, both compute a new value from it, and both write back — with the second write silently discarding the first. No error is raised. Both tr...
Sub-topic
2 articles

Choosing the Right Database: CAP Theorem and Practical Use Cases
TLDR: Database selection is a trade-off between consistency, availability, and scalability. By using the CAP Theorem as a compass and matching your data access patterns to the right storage engine (Relational, Document, KV, or Wide-Column), you can b...
BASE Theorem Explained: How it Stands Against ACID
TLDR: ACID (Atomicity, Consistency, Isolation, Durability) is the gold standard for banking. BASE (Basically Available, Soft state, Eventual consistency) is the standard for social media. BASE intentionally sacrifices instant accuracy in exchan...
Sub-topic
1 article
NoSQL Partitioning: How Cassandra, DynamoDB, and MongoDB Split Data
TLDR: Every NoSQL database hides a partitioning engine behind a deceptively simple API. Cassandra uses a consistent hashing ring where a Murmur3 hash of your partition key selects a node — virtual nodes (vnodes) make rebalancing smooth. DynamoDB mana...
Sub-topic
1 article
SQL Partitioning: Range, Hash, List, and Composite Strategies Explained
TLDR: SQL partitioning divides one logical table into smaller physical child tables, all accessed through the parent table name. The query optimizer skips irrelevant child tables entirely — a process called partition pruning — turning a 30-second ful...
Sub-topic
1 article

Compare-and-Swap and Optimistic Locking: How Every Database Implements It
TLDR: Compare-and-Swap (CAS) is the CPU-level atomic instruction that makes lock-free concurrency possible. Optimistic locking builds on it at the database layer: read freely, compute locally, write only if the record has not changed. Every major dat...
Sub-topic
1 article

Partitioning Approaches in SQL and NoSQL: Horizontal, Vertical, Range, Hash, and List Partitioning
TLDR: Partitioning splits one logical table into smaller physical pieces called partitions. The database planner skips irrelevant partitions entirely — turning a 30-second full-table scan into a 200ms single-partition read. Range partitioning is best...
Sub-topic
1 article
Dirty Read Explained: How Uncommitted Data Corrupts Transactions
TLDR: A dirty read occurs when Transaction B reads data written by Transaction A before A has committed. If A rolls back, B has made decisions on data that — from the database's perspective — never existed. Read Committed isolation (the default in Po...
Sub-topic
1 article
Database Anomalies: How SQL and NoSQL Handle Dirty Reads, Phantom Reads, and Write Skew
TLDR: Database anomalies are the predictable side-effects of concurrent transactions — dirty reads, phantom reads, write skew, and lost updates. SQL databases use MVCC and isolation levels to prevent them; PostgreSQL's Serializable Snapshot Isolation...
Sub-topic
1 article
Probabilistic Data Structures: A Practical Guide to Bloom Filters, HyperLogLog, and Count-Min Sketch
TLDR: Probabilistic data structures trade a small, bounded probability of being wrong for orders-of-magnitude better memory efficiency and O(1) speed. Bloom Filters answer "definitely not in this set" in constant time with zero false negatives. Hyper...
Sub-topic
1 article
How CDC Works Across Databases: PostgreSQL, MySQL, MongoDB, and Beyond
A data engineering team at a fintech company built what they believed was a robust Change Data Capture pipeline: three source databases (PostgreSQL, MongoDB, and Cassandra), Debezium connectors wired to Kafka, and a downstream data warehouse receivin...
Sub-topic
1 article
System Design Data Modeling and Schema Evolution: Query-Driven Storage That Survives Change
TLDR: In system design interviews, data modeling is where architecture meets reality. A good model starts from query patterns, chooses clear entity boundaries, defines indexes deliberately, and includes a schema evolution path so the system can chang...
