Abstract Algorithms: An AI Powered Learning Platform

Topic

Databases

27 articles across 14 sub-topics

Sub-topic

Distributed Systems

9 articles

Read Skew Explained: Inconsistent Snapshots Across Multiple Objects

TLDR: Read skew occurs when a transaction reads two logically related objects at different points in time — one before and one after a concurrent transaction commits — producing a view that never existed as a committed whole. Read Committed isolation...

Apr 11, 2026•31 min read

Phantom Read Explained: When New Rows Appear Mid-Transaction

TLDR: A phantom read occurs when a transaction runs the same range query twice and gets a different set of rows — because a concurrent transaction inserted or deleted matching rows and committed in between. Row locks cannot stop this because the phan...

Apr 11, 2026•29 min read

Write Skew Explained: The Anomaly That Requires Serializable Isolation

TLDR: Write skew is the hardest concurrency anomaly to reason about: two concurrent transactions each read a shared condition, decide they can safely proceed, and then write to different rows. No individual operation is wrong. No row was overwritten....

Apr 11, 2026•22 min read

Non-Repeatable Read Explained: When the Same Query Returns Different Results

TLDR: A non-repeatable read happens when the same SELECT returns different results within a single transaction because a concurrent transaction committed an update between the two reads. Read Committed isolation (the default in PostgreSQL, but not in MySQL InnoDB, which defaults to Repeatable Read)...

Apr 11, 2026•24 min read

Sharding Approaches in SQL and NoSQL: Range, Hash, and Directory-Based Strategies Compared

TLDR: Sharding splits your database across multiple physical nodes so no single machine carries all the data or absorbs all the writes. The strategy you choose — range, hash, consistent hashing, or directory — determines whether range queries stay ch...

Apr 5, 2026•27 min read

Key Terms in Distributed Systems: The Definitive Glossary

TLDR: Distributed systems vocabulary is precise for a reason. Mixing up read skew and write skew costs you an interview. Confusing Snapshot Isolation with Serializable costs you a production outage. This glossary organises every critical term into co...

Apr 5, 2026•45 min read

Sub-topic

Architecture

3 articles

Change Data Capture Pattern: Log-Based Data Movement Without Full Reloads

TLDR: Change data capture moves committed database changes into downstream systems without full reloads. It is most useful when freshness matters, replay matters, and the source database must remain the system of record. CDC becomes production-...

Mar 13, 2026•15 min read

Understanding Consistency Patterns: An In-Depth Analysis

TLDR: Consistency is about whether all nodes in a distributed system show the same data at the same time. Strong consistency gives correctness but costs latency. Eventual consistency gives speed but requires tolerance for briefly stale reads. C...

Mar 9, 2026•13 min read

Data Warehouse vs Data Lake vs Data Lakehouse: Which One to Choose?

TLDR: Warehouse = structured, clean data for BI and SQL dashboards (Snowflake, BigQuery). Lake = raw, messy data for ML and data science (S3, HDFS). Lakehouse = open table formats (Delta Lake, Iceberg) that bring SQL performance to raw storage — the ...

Mar 9, 2026•14 min read

Sub-topic

ACID

2 articles

ACID Properties Explained: How SQL Databases Guarantee Atomicity, Consistency, Isolation, and Durability

TLDR: ACID is four orthogonal guarantees that every SQL transaction must provide. Atomicity says all-or-nothing: PostgreSQL implements it via WAL rollback; MySQL InnoDB via undo logs. Consistency says constraints always hold: the database rejects any...

Apr 17, 2026•36 min read

Isolation Levels in Databases: Read Committed, Repeatable Read, Snapshot, and Serializable Explained

TLDR: Isolation levels control which concurrency anomalies a transaction can see. Read Committed (PostgreSQL and Oracle's default) prevents dirty reads but still silently allows non-repeatable reads, write skew, and lost updates. Repeatable Read adds...

Apr 5, 2026•25 min read

Sub-topic

Concurrency

2 articles

Dirty Write Explained: When Uncommitted Data Gets Overwritten

TLDR: A dirty write occurs when Transaction B overwrites data that Transaction A has written but not yet committed. The result is not a rollback or an error — it is silently inconsistent committed data: one table reflects Transaction B's intent, anot...

Apr 11, 2026•26 min read

Lost Update Explained: When Two Writes Become One

TLDR: A lost update occurs when two concurrent read-modify-write transactions both read the same committed value, both compute a new value from it, and both write back — with the second write silently discarding the first. No error is raised. Both tr...

Apr 11, 2026•35 min read

Sub-topic

CAP Theorem

2 articles

Choosing the Right Database: CAP Theorem and Practical Use Cases

TLDR: Database selection is a trade-off between consistency, availability, and scalability. By using the CAP Theorem as a compass and matching your data access patterns to the right storage engine (Relational, Document, KV, or Wide-Column), you can b...

Apr 5, 2026•8 min read

BASE Theorem Explained: How it Stands Against ACID

TLDR: ACID (Atomicity, Consistency, Isolation, Durability) is the gold standard for banking. BASE (Basically Available, Soft state, Eventual consistency) is the standard for social media. BASE intentionally sacrifices instant accuracy in exchan...

Mar 9, 2026•14 min read

Sub-topic

Cassandra

1 article

NoSQL Partitioning: How Cassandra, DynamoDB, and MongoDB Split Data

TLDR: Every NoSQL database hides a partitioning engine behind a deceptively simple API. Cassandra uses a consistent hashing ring where a Murmur3 hash of your partition key selects a node — virtual nodes (vnodes) make rebalancing smooth. DynamoDB mana...

May 3, 2026•22 min read

Sub-topic

Partitioning

1 article

SQL Partitioning: Range, Hash, List, and Composite Strategies Explained

TLDR: SQL partitioning divides one logical table into smaller physical child tables, all accessed through the parent table name. The query optimizer skips irrelevant child tables entirely — a process called partition pruning — turning a 30-second ful...

May 3, 2026•23 min read

Sub-topic

Compare And Swap

1 article

Compare-and-Swap and Optimistic Locking: How Every Database Implements It

TLDR: Compare-and-Swap (CAS) is the CPU-level atomic instruction that makes lock-free concurrency possible. Optimistic locking builds on it at the database layer: read freely, compute locally, write only if the record has not changed. Every major dat...

Apr 17, 2026•31 min read

Sub-topic

NoSQL

1 article

Partitioning Approaches in SQL and NoSQL: Horizontal, Vertical, Range, Hash, and List Partitioning

TLDR: Partitioning splits one logical table into smaller physical pieces called partitions. The database planner skips irrelevant partitions entirely — turning a 30-second full-table scan into a 200ms single-partition read. Range partitioning is best...

Apr 12, 2026•37 min read

Sub-topic

Dirty Read

1 article

Dirty Read Explained: How Uncommitted Data Corrupts Transactions

TLDR: A dirty read occurs when Transaction B reads data written by Transaction A before A has committed. If A rolls back, B has made decisions on data that — from the database's perspective — never existed. Read Committed isolation (the default in Po...

Apr 11, 2026•28 min read

Sub-topic

Consistency

1 article

Database Anomalies: How SQL and NoSQL Handle Dirty Reads, Phantom Reads, and Write Skew

TLDR: Database anomalies are the predictable side-effects of concurrent transactions — dirty reads, phantom reads, write skew, and lost updates. SQL databases use MVCC and isolation levels to prevent them; PostgreSQL's Serializable Snapshot Isolation...

Apr 5, 2026•29 min read

Sub-topic

Algorithms

1 article

Probabilistic Data Structures: A Practical Guide to Bloom Filters, HyperLogLog, and Count-Min Sketch

TLDR: Probabilistic data structures trade a small, bounded probability of being wrong for orders-of-magnitude better memory efficiency and O(1) speed. Bloom Filters answer "definitely not in this set" in constant time with zero false negatives. Hyper...

Apr 5, 2026•13 min read

Sub-topic

CDC

1 article

How CDC Works Across Databases: PostgreSQL, MySQL, MongoDB, and Beyond

A data engineering team at a fintech company built what they believed was a robust Change Data Capture pipeline: three source databases (PostgreSQL, MongoDB, and Cassandra), Debezium connectors wired to Kafka, and a downstream data warehouse receivin...

Apr 5, 2026•33 min read

Sub-topic

Data Modeling

1 article

System Design Data Modeling and Schema Evolution: Query-Driven Storage That Survives Change

TLDR: In system design interviews, data modeling is where architecture meets reality. A good model starts from query patterns, chooses clear entity boundaries, defines indexes deliberately, and includes a schema evolution path so the system can chang...

Mar 12, 2026•13 min read