SQL Partitioning: Range, Hash, List, and Composite Strategies Explained
How PostgreSQL and MySQL use range, hash, list, and composite partitioning to eliminate unnecessary I/O — and when NOT to partition at all
Abstract Algorithms · Intermediate
TLDR: SQL partitioning divides one logical table into smaller physical child tables, all accessed through the parent table name. The query optimizer skips irrelevant child tables entirely — a process called partition pruning — turning a 30-second full-table scan into a 200ms single-partition read. Range partitioning is the right choice for time-series data with rolling archival. Hash partitioning prevents write hotspots. List partitioning enforces geo-compliance boundaries. The wrong partition key eliminates all of these gains. Automate partition lifecycle with pg_partman or you will have a midnight incident.
📖 The 2-Billion-Row Problem: Why Indexes Stop Scaling Without Partitioning
An index on a 2-billion-row table is still a 2-billion-entry B-Tree. The tree itself stays shallow — four or five levels — but every lookup lands on leaf pages scattered across the entire tablespace. Even with a selective index, large tables exhaust buffer pool capacity, force expensive disk reads, and make VACUUM operations that span the full table increasingly slow and disruptive.
Partitioning changes the problem fundamentally. Instead of one 400 GB table, you have 48 separate child tables — one per month across four years. A query for last month's orders reads only 8 GB. The other 392 GB are never opened because the database engine evaluates each child table's partition boundary against the query predicate and eliminates non-matching children before reading a single data page. This is partition pruning, and it is the entire performance mechanism that makes partitioning worth the operational overhead.
Partitioning vs. sharding — the distinction that matters in every system design interview:
| Dimension | Partitioning | Sharding |
| --- | --- | --- |
| Data location | Same node, or coordinator-managed distribution | Independent database nodes, each a fully separate instance |
| Application connection | Single endpoint — application sees one table | Shard-aware routing required — app knows about multiple databases |
| Cross-partition queries | Supported — planner merges results transparently | Expensive scatter-gather or not supported |
| Transaction guarantees | Full ACID within the node | Cross-shard: 2-Phase Commit or eventual consistency only |
| Scaling target | Query performance and data manageability | Write throughput beyond what a single node can serve |
Rule of thumb: If your application connects to one connection string, you are partitioning. Partitioning and sharding are sequential steps on the scaling ladder — not competing alternatives. Start with one database. Add partitioning when queries get slow on large tables — it is a DDL change, not a topology change. Move to sharding when the dataset or write throughput genuinely exceeds a single node.
🔍 The Four SQL Partitioning Strategies at a Glance
Before going deep on each strategy, here is the orientation table. Every major SQL database — PostgreSQL, MySQL, CockroachDB, Spanner — supports some subset of these. PostgreSQL supports range and list natively as of version 10, hash as of version 11, and composite via nested sub-partitioning.
| Strategy | Splits By | Best For | Key Trade-Off |
| --- | --- | --- | --- |
| Range | Ordered value ranges — dates, IDs | Time-series, rolling archival, rolling retention windows | Write hotspot on the latest partition |
| List | Explicit enumerated values per partition | Geographic regions, category-based isolation, GDPR compliance | All values must be known at design time |
| Hash | hash(key) mod N — uniform distribution | High-volume inserts, write hotspot prevention | No range query pruning — all partitions scanned for range predicates |
| Composite | Two levels: range-then-hash or list-then-hash | Very large tables needing both date pruning and write balance | Higher DDL complexity; more total partitions to manage |
Each strategy makes the same fundamental trade-off in a different direction: range partitioning is optimal for queries that filter by a time window but concentrates writes on the current period; hash partitioning is optimal for write balance but cannot prune for range predicates; list partitioning is optimal for categorical isolation but requires foreknowledge of all valid values.
⚙️ Range Partitioning: Month-by-Month Child Tables for Time-Series Data
Range partitioning assigns each row to a child table based on whether the partition key value falls within a defined range boundary. The classic application is a time-series table partitioned by month — each month's data living in its own child table, with older months archived or dropped via a single DDL statement.
Pseudo-schema — range-partitioned orders table:
| Column | Type | Notes |
| --- | --- | --- |
| id | bigint | Auto-increment surrogate key |
| customer_id | bigint | Required — used for per-customer queries |
| total_cents | bigint | Stored in cents to avoid floating-point rounding |
| status | text | pending, paid, shipped, cancelled |
| created_at | timestamptz | Partition key — range boundary applied per month |
Partition layout:
| Child Table | Range Boundary | Contains |
| --- | --- | --- |
| orders_2024_01 | [2024-01-01, 2024-02-01) | January 2024 rows |
| orders_2024_02 | [2024-02-01, 2024-03-01) | February 2024 rows |
| orders_2024_03 | [2024-03-01, 2024-04-01) | March 2024 rows |
| orders_default | DEFAULT (MAXVALUE in MySQL) | Catch-all for dates beyond the last explicit partition |
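As a concrete sketch, the layout above maps to PostgreSQL declarative partitioning roughly as follows. Table and partition names are illustrative, and note that PostgreSQL uses a DEFAULT partition where MySQL range partitioning would use MAXVALUE:

```sql
-- Parent table: rows are routed to children by created_at.
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,
    total_cents bigint NOT NULL,
    status      text   NOT NULL,
    created_at  timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- One child table per month; bounds are half-open: [from, to).
CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Catch-all for rows beyond the last explicit boundary.
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
```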
What partition pruning delivers here: A query filtering on created_at BETWEEN 2024-01-10 AND 2024-01-20 causes the planner to evaluate each child table's range boundary, find that only orders_2024_01 can possibly match, and skip all other child tables before opening a single data file. In a 48-partition table spanning four years, this prunes 47 of 48 child tables — roughly a 48× reduction in I/O before even consulting a local index on customer_id.
Instant archival is the operational superpower of range partitioning. Detaching the January 2022 child table from the parent removes it from the live table in milliseconds — no multi-hour DELETE with vacuuming. Dropping it outright is equally fast. This is why range partitioning is universally used for compliance retention windows where data must be deleted after a fixed period.
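Under that scheme, archival is a metadata operation. A sketch, assuming monthly child tables named as above:

```sql
-- Detach January 2022 from the parent: the data survives as a
-- standalone table that can be dumped to cold storage.
ALTER TABLE orders DETACH PARTITION orders_2022_01;

-- Or delete the whole month outright — equally fast.
DROP TABLE orders_2022_01;
```

One operational caveat: plain DETACH PARTITION briefly takes a strong lock on the parent table; PostgreSQL 14+ supports DETACH PARTITION ... CONCURRENTLY to reduce blocking on busy tables.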
Write hotspot — range partitioning's structural weakness: All current-period inserts land on the most recent child table. January 2024 receives every new order until February 1st, after which February becomes the hot partition. For write-heavy tables, this concentrates I/O on a single child table. The mitigation is composite partitioning — range by month, then hash within each month by customer ID — which distributes writes across multiple sub-partitions while preserving date-range pruning for month-scoped queries.
⚙️ List Partitioning: Categorical Splits for Geo-Compliance Boundaries
List partitioning assigns rows to child tables based on membership in an explicit set of values. When those values are geographic identifiers, the partition boundary becomes a compliance boundary — EU data physically lives in the EU partition, which can be storage-pinned to EU-region infrastructure.
Pseudo-schema — list-partitioned orders table:
| Column | Type | Notes |
| --- | --- | --- |
| id | bigint | Auto-increment surrogate key |
| customer_id | bigint | Required |
| region | text | Partition key — explicit region code such as US, DE, JP |
| total_cents | bigint | Required |
| created_at | timestamptz | Required |
Partition layout:
| Child Table | Partition Values | Purpose |
| --- | --- | --- |
| orders_us | ('US', 'CA', 'MX') | North American region |
| orders_eu | ('DE', 'FR', 'GB', 'NL', 'ES', 'IT') | EU region — GDPR compliance boundary |
| orders_apac | ('JP', 'SG', 'AU', 'IN', 'KR') | APAC region |
| orders_default | DEFAULT | Catch-all for unrecognized region codes |
A DEFAULT partition is effectively mandatory. Without it, an insert with a region code not listed in any explicit partition fails with a constraint error. At midnight during a rollout for a new market, that error means dropped writes in production.
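A minimal PostgreSQL sketch of this list layout, DEFAULT partition included (names illustrative):

```sql
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,
    region      text   NOT NULL,   -- partition key
    total_cents bigint NOT NULL,
    created_at  timestamptz NOT NULL
) PARTITION BY LIST (region);

CREATE TABLE orders_us   PARTITION OF orders FOR VALUES IN ('US', 'CA', 'MX');
CREATE TABLE orders_eu   PARTITION OF orders FOR VALUES IN ('DE', 'FR', 'GB', 'NL', 'ES', 'IT');
CREATE TABLE orders_apac PARTITION OF orders FOR VALUES IN ('JP', 'SG', 'AU', 'IN', 'KR');

-- The safety net: unrecognized region codes land here instead of failing.
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
```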
Pruning behaviour: A query with WHERE region = 'DE' prunes to orders_eu — the planner evaluates set membership and eliminates every other child table. A query with WHERE region IN ('DE', 'JP') prunes to orders_eu and orders_apac — two tables instead of four.
Geo-compliance at the database layer: By assigning tablespace or storage constraints to the EU child table (via CockroachDB zone configs or PostgreSQL tablespaces), EU data can be physically constrained to EU-region infrastructure. The partition boundary is the compliance boundary — more reliable than application-layer enforcement because it does not depend on every code path correctly routing writes.
⚙️ Hash Partitioning: Even Write Distribution Across N Buckets
Hash partitioning applies hash(partition_key) mod N to assign each row to one of N child tables. The distribution is deterministic but appears random — which is exactly what prevents write hotspots.
Pseudo-schema — hash-partitioned events table:
| Column | Type | Notes |
| --- | --- | --- |
| id | bigint | Auto-increment surrogate key |
| customer_id | bigint | Partition key — hashed to determine child table |
| event_type | text | Click, purchase, view, etc. |
| payload | jsonb | Event data |
| created_at | timestamptz | Required |
Partition layout — 4 partitions:
| Child Table | Assignment Rule | Receives Rows Where |
| --- | --- | --- |
| events_p0 | hash(customer_id) mod 4 = 0 | ~25% of all customer IDs |
| events_p1 | hash(customer_id) mod 4 = 1 | ~25% of all customer IDs |
| events_p2 | hash(customer_id) mod 4 = 2 | ~25% of all customer IDs |
| events_p3 | hash(customer_id) mod 4 = 3 | ~25% of all customer IDs |
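In PostgreSQL (11+), this four-bucket layout is declared with a modulus and remainder per child. A sketch with illustrative names:

```sql
CREATE TABLE events (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,   -- partition key, hashed
    event_type  text   NOT NULL,
    payload     jsonb,
    created_at  timestamptz NOT NULL
) PARTITION BY HASH (customer_id);

-- Four buckets: hash(customer_id) mod 4 selects the child table.
CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```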
Pruning for point lookups vs. range queries:
| Query Pattern | Pruning Result |
| --- | --- |
| WHERE customer_id = 42 | Prunes to exactly 1 child table — planner computes hash(42) mod 4 |
| WHERE created_at > '2024-01-01' | No pruning — hash order says nothing about created_at, so every partition may contain matching rows |
| WHERE customer_id = 42 AND created_at > '2024-01-01' | Prunes to exactly 1 child table — customer_id is the partition key |
Hash partitioning optimizes for point-lookup performance and write balance. It is the wrong strategy when your most frequent queries filter by an ordered range on a non-hash-key column.
Write distribution is hash partitioning's defining advantage. Inserts for any customer ID land on one of four child tables based on the hash, not on insert time. A high-volume event stream that would saturate a single range partition under date-based partitioning distributes load evenly across four partitions under hash partitioning.
⚙️ Composite Partitioning: Two-Level Strategies for Tables That Outgrow a Single Approach
Composite sub-partitioning applies two levels. The most common pattern: range by month at the first level, then hash by customer ID within each monthly partition. This delivers both date-range pruning for reporting queries and write balance within each month.
Composite partition structure:
| Level | Strategy | Key | Purpose |
| --- | --- | --- | --- |
| Level 1 | Range | created_at — monthly | Prune by date — reporting and instant archival |
| Level 2 | Hash | customer_id — 4 buckets | Balance writes within each monthly range |
The resulting partition count: 48 months × 4 hash buckets = 192 child tables for a 4-year dataset, each holding roughly total_rows / 192 rows if the hash distributes evenly.
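In PostgreSQL this nests one PARTITION BY inside another. A sketch of a single month of the scheme (the other 47 months follow the same pattern; names are illustrative):

```sql
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,
    total_cents bigint NOT NULL,
    created_at  timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- Level 1: the monthly range partition is itself partitioned.
CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
    PARTITION BY HASH (customer_id);

-- Level 2: four hash buckets inside January.
CREATE TABLE orders_2024_01_h0 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_2024_01_h1 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_2024_01_h2 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_2024_01_h3 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```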
Pruning for composite partitions:
| Query Predicate | Partitions Scanned |
| --- | --- |
| WHERE created_at within January 2024 | 4 — all hash buckets for the single matching month |
| WHERE created_at within January 2024 AND customer_id = 42 | 1 — exact hash bucket within January |
| WHERE customer_id = 42 (no date filter) | 192 — no level-1 pruning without the range key |
The PostgreSQL planning ceiling: Composite partitioning increases total child table count. PostgreSQL adds per-partition overhead at planning time — negligible below 500 partitions, measurable at 1,000, significant at 5,000. For composite schemes producing thousands of sub-partitions, measure query planning latency at realistic partition counts before committing the design to production.
📊 How the Query Optimizer Decides Which Partitions to Skip
The diagram below traces the full pruning decision path from query arrival through the PostgreSQL planner to partition elimination. Follow it from top to bottom to see exactly where the pruning decision is made and what determines which child tables are eliminated.
```mermaid
flowchart TD
Q["Query received: WHERE created_at BETWEEN Jan and Feb 2024"] --> OPT["Query optimizer enters partition pruning phase"]
OPT --> CHECK{"Partition key present in WHERE clause?"}
CHECK -->|"Yes — pruning active"| BOUNDS["Evaluate each child table boundary against predicate"]
CHECK -->|"No — no pruning possible"| FULL["Scan all child tables (avoid this pattern)"]
BOUNDS --> SKIP1["Skip 2021 partitions — bounds outside predicate range"]
BOUNDS --> SKIP2["Skip 2022 partitions — bounds outside predicate range"]
BOUNDS --> SKIP3["Skip 2023 partitions — bounds outside predicate range"]
BOUNDS --> SCAN["Scan Jan 2024 child table — bounds match predicate"]
SKIP1 --> MERGE["Merge results and return to client"]
SKIP2 --> MERGE
SKIP3 --> MERGE
SCAN --> MERGE
FULL --> MERGE
```
The critical fork is the diamond node: is the partition key in the WHERE clause? When it is, the planner eliminates dozens of child tables before opening a single data file. When it is not, every child table is opened — making the partitioned table perform worse than an equivalent unpartitioned table because of the overhead of N table opens, N index scans, and a merge step.
This diagram captures the single most important rule in partition design: the partition key must appear in your most frequent query predicates. Verify it with EXPLAIN (ANALYZE, VERBOSE): statically pruned child tables simply never appear in the plan, and run-time pruning is reported as Subplans Removed: N. A 48-partition table with a month-scoped query should show a scan of exactly one child table; if all 48 appear, your query pattern and partition key are misaligned.
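To check this on a concrete query — assuming the monthly orders layout sketched earlier:

```sql
EXPLAIN (ANALYZE, VERBOSE)
SELECT * FROM orders
WHERE created_at >= '2024-01-10' AND created_at < '2024-01-21';
-- With static pruning, only orders_2024_01 appears in the plan output;
-- when pruning happens at run time (bind parameters, PG 12+),
-- the plan instead reports "Subplans Removed: N".
```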
🧠 Deep Dive: Static vs. Dynamic Partition Pruning in PostgreSQL
PostgreSQL's pruning mechanism has two modes. Understanding the distinction matters for parameterized queries — the most common form in production application code.
Partition Pruning Internals: How PostgreSQL Plans Around Boundaries
Static pruning happens at plan time, when the partition key predicate contains a literal value. The planner knows at query compilation exactly which child tables can possibly match. The plan is fixed — partition metadata is consulted once, and eliminated partitions never appear in the execution plan.
Dynamic pruning happens at execution time when the predicate value comes from a runtime parameter: a bind variable in a prepared statement, or a value supplied via a nested loop join. PostgreSQL v12+ supports dynamic pruning for parameterized queries — the executor re-evaluates the partition set at runtime using the bound parameter value. MySQL supports static pruning for RANGE and LIST partitions but has limited dynamic pruning support, which can force full-partition scans for parameterized queries even when a literal equivalent would prune correctly.
Static vs. dynamic pruning support matrix:
| Database | Static Pruning | Dynamic Pruning | Notes |
| --- | --- | --- | --- |
| PostgreSQL 10–11 | ✅ RANGE, LIST (HASH from 11) | ❌ Not supported | Bind parameters force a scan of all partitions |
| PostgreSQL 12+ | ✅ All partition types | ✅ Parameterized queries and nested loops | Recommended version for production partitioning |
| MySQL 8.0 | ✅ RANGE, LIST | ⚠️ Limited | Static only for most query forms |
| CockroachDB | ✅ All partition types | ✅ | Full pruning support |
Performance Analysis: Measuring Pruning Effectiveness
The most direct way to measure pruning impact is EXPLAIN (ANALYZE, BUFFERS). Count the child tables that appear in the plan. With correct pruning, a query against a 48-partition table that touches only January data scans a single partition; for run-time pruning, look for Subplans Removed: 47. Seeing all 48 partitions scanned on a query with a literal date predicate means the partition key is not in the WHERE clause, or it is wrapped in a function.
| Query Pattern | Partitions Scanned | Relative Cost |
| --- | --- | --- |
| WHERE created_at >= '2024-01-01' (range literal) | 1 | Baseline — optimal |
| WHERE DATE(created_at) = '2024-01-15' (function wrap) | 48 | ~48× the rows read |
| WHERE created_at = $1 (bind param, PG 12+) | 1 | Same as literal |
| WHERE created_at = $1 (bind param, PG 11) | 48 | No dynamic pruning |
The function-wrapping trap: Wrapping the partition key column in a function call defeats pruning. WHERE DATE(created_at) = '2024-01-15' forces the planner to materialize DATE(created_at) for every row before evaluating the predicate — it cannot apply partition bounds because the comparison is on the derived value, not the raw column. The correct form is a range predicate on the raw column: WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'. A routine refactor that wraps the partition key in DATE(), TRUNC(), or TO_CHAR() silently disables pruning for a query that was previously fast. Add EXPLAIN (ANALYZE) assertions for critical queries in CI pipelines to catch this before it reaches production.
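Side by side, assuming the monthly orders layout from earlier:

```sql
-- Defeats pruning: the predicate compares a value derived from the
-- partition key, so partition bounds on created_at cannot be applied.
SELECT count(*) FROM orders
WHERE DATE(created_at) = '2024-01-15';

-- Prunes correctly: a half-open range on the raw partition key,
-- semantically equivalent for a timestamptz column.
SELECT count(*) FROM orders
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16';
```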
⚖️ Hot-Partition Anti-Patterns and When NOT to Partition
Five anti-patterns that destroy partition performance:
| Anti-Pattern | What Happens | Fix |
| --- | --- | --- |
| Partition key missing from critical queries | Zero pruning — every child table is scanned — performance is worse than unpartitioned | Align the partition key with the most frequent WHERE predicate; redesign if mismatched |
| Over-partitioning — more than 500 partitions | Planning overhead adds hundreds of milliseconds to every query even after pruning | Use composite partitioning to reduce the top-level partition count |
| Low-cardinality partition key | A boolean column produces 2 partitions — near-zero benefit with DDL overhead | Choose keys with enough distinct values to justify the intended partition count |
| Skewed range sizes — 80% of data in one partition | That partition handles 80% of all I/O — the partition that receives the most writes is always the bottleneck | Use monthly ranges; add hash sub-partitioning for the current high-write period |
| Missing DEFAULT partition on list-partitioned tables | An unrecognized region code causes an insert failure at midnight during a rollout | Always define a DEFAULT partition before deploying the parent table |
When NOT to partition:
| Condition | Reason to Skip Partitioning |
| --- | --- |
| Table has fewer than 50 million rows | Existing B-Tree indexes are sufficient — partitioning adds DDL overhead without benefit |
| No natural partition key in the access pattern | Without a key that appears in frequent WHERE clauses, there is zero pruning benefit |
| Primary bottleneck is write throughput beyond a single node | Partitioning does not help — horizontal sharding via Vitess or Citus is the correct solution |
| Frequent cross-partition joins on non-partition-key columns | Multi-partition merge adds overhead that may be worse than the equivalent on an unpartitioned table |
| Partition count would exceed 1,000 | PostgreSQL planning overhead at very high partition counts degrades OLTP query latency even after pruning |
The system design interview answer: partition when tables are large and queries are slow on a single node. Shard when data volume or write throughput genuinely exceeds what a single node can serve.
🧭 Choosing the Right Partition Strategy: Decision Guide
| Situation | Recommended Strategy | Reason |
| --- | --- | --- |
| Time-series data with rolling archival — logs, payments, events | Range by date — monthly or quarterly | Date-range pruning plus instant archival via DETACH PARTITION |
| High-volume inserts where even distribution is required | Hash by primary access key — user ID, order ID | Prevents write hotspot; point lookups prune to exactly one partition |
| Geo-compliance or categorical access patterns | List by region or country code | Partition boundary equals compliance boundary — always add DEFAULT |
| Table so large that monthly range partitions are still too big | Composite: range by month plus hash by user ID within each month | Combines date pruning with write balance |
| Queries are almost entirely point lookups by ID | Hash partitioning | Optimal for point lookup plus write balance — accept range scans on other columns |
| Queries filter by multiple unrelated columns | Choose the column in the most frequent WHERE clause as partition key | Only one partition key per table — target the highest-value pruning predicate |
🌍 Real-World Applications of SQL Partitioning
Financial ledger at a major payments processor. A payments company running on PostgreSQL held 11 billion transaction rows in a single transactions table. Month-end reporting queries scanned the entire table. After migrating to monthly range partitioning on settled_at, the same reports ran against two or three child partitions. Query times dropped from 42 seconds to 900 milliseconds. Old partitions were detached and moved to cheaper cold storage using ALTER TABLE ... DETACH PARTITION, with no application code change. The parent table name never changed — the migration was entirely invisible to the application layer.
E-commerce order history at global scale. A global retailer used list partitioning by region code (EMEA, APAC, AMER, LATAM) to comply with data residency regulations that require customer order data to remain within specific geographic boundaries. Each partition was hosted on a tablespace located in the correct data center. Queries scoped to a single region skipped all other partitions entirely, and compliance auditors could verify data locality at the partition level without touching application logic.
Audit logging at a SaaS platform. A multi-tenant SaaS platform wrote 200 million audit log rows per month. The team used composite partitioning: range by month at the top level, then hash by tenant ID within each month. This gave two benefits simultaneously: old months were detached on a rolling 13-month retention schedule with pg_partman, and hot write traffic was distributed evenly across 16 hash sub-partitions per month rather than concentrating on a single child table.
Key lesson from all three cases: the partition strategy that performed best was always the one whose key matched the dominant query predicate — not the one that seemed intuitively clean on a whiteboard.
🧪 Practical Examples: Applying Partition Strategy to Real Query Patterns
Scenario 1: Analytics on a slow time-series table.
A page_views table with 4 billion rows is slow on all date-range queries. The dominant query pattern is WHERE viewed_at BETWEEN '2024-01-01' AND '2024-01-31'. The correct strategy is range partitioning by month on viewed_at. After partitioning into 48 monthly child tables, the January query reads only one child table — roughly 80 million rows instead of 4 billion. EXPLAIN (ANALYZE) confirms it by listing a scan of a single child table.
Common mistake: partitioning by user_id instead, which makes the January query scan all 48 partitions because viewed_at is not the partition key.
Scenario 2: Preventing a write hotspot.
An events table receives 50,000 inserts per second, all with current timestamps. Range partitioning by date concentrates every insert on the current month's partition — effectively making it a single write target. Switching to hash partitioning on event_id distributes inserts uniformly across 32 partitions, eliminating the hotspot. The trade-off: date-range queries now scan all 32 partitions instead of one.
Decision rule: choose hash when write throughput is the bottleneck; choose range when read selectivity is the bottleneck. You cannot optimise for both simultaneously with a single partition key.
Scenario 3: Testing partition pruning with EXPLAIN.
Run EXPLAIN (ANALYZE) SELECT * FROM orders WHERE created_at >= '2024-06-01' AND created_at < '2024-07-01' and confirm that only the June child table appears in the plan. If all 48 child tables are scanned, the predicate is not hitting the partition key — check for function wrapping or a missing column in the WHERE clause.
🛠️ pg_partman: Automated Partition Lifecycle Management for PostgreSQL
Manual partition pre-creation works for small setups, but production systems need automation. If the cron job that creates next month's partition fails on the 20th, inserts fail at midnight on the 1st — a production incident caused by missing infrastructure, not bugs in application code. pg_partman is the open-source PostgreSQL extension that prevents this.
pg_partman automatically pre-creates future partitions, enforces retention policies, and handles archival. Its background worker (pg_partman_bgw) runs maintenance on a configurable schedule inside the PostgreSQL process — no external cron dependency required.
Key pg_partman configuration parameters:
| Parameter | Purpose | Recommended Value |
| --- | --- | --- |
| p_premake | Number of future partitions to pre-create | At least 2 — covers deploy delays and off-hours maintenance windows |
| p_interval | Partition interval | 'daily', 'weekly', 'monthly', 'yearly' |
| retention | How long to retain partitions before action | '24 months' for rolling 2-year retention |
| retention_keep_table | true detaches instead of dropping — archive pattern; false drops outright | false for OLTP rolling windows; true for data warehouse archival |
| infinite_time_partitions | true creates partitions beyond the premake window as data arrives | true for safety — prevents insert failures on unexpected future data |
Operational workflow with pg_partman:
- Create the parent table with the partition key and interval defined.
- Register the parent with pg_partman, specifying interval and premake count.
- Configure retention — how long to keep partitions and whether to detach or drop.
- pg_partman's background worker runs maintenance automatically on the configured schedule.
- Monitor via pg_partman's built-in views: partman.part_config shows all managed parents; partman.check_subpart_sums validates child table consistency.
Pre-creation safety margin: Set p_premake to at least 2. This creates two months of future partitions immediately. If maintenance is delayed — a holiday, a failed deploy — there is a buffer before inserts would hit a missing partition. For daily-partitioned high-traffic tables, set p_premake to 7 or higher.
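A registration sketch tying these parameters together. The exact create_parent signature differs between pg_partman 4.x and 5.x; this follows the 5.x style and is illustrative rather than copy-paste-ready:

```sql
CREATE EXTENSION pg_partman;

-- Register the parent: monthly interval, two future partitions pre-created.
SELECT partman.create_parent(
    p_parent_table => 'public.orders',
    p_control      => 'created_at',
    p_interval     => '1 month',
    p_premake      => 2
);

-- Retention policy: drop partitions older than 24 months outright.
UPDATE partman.part_config
SET retention            = '24 months',
    retention_keep_table = false
WHERE parent_table = 'public.orders';
```

With this in place, the pg_partman background worker handles creation and retention on its schedule, with no external cron job to forget.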
For a full deep-dive on pg_partman including native triggers, sub-partitioning hierarchies, and pg_partman_bgw tuning, see the pg_partman GitHub repository.
📚 Lessons Learned: What Production Teaches You About SQL Partitioning
The partition key is a permanent architectural decision. Re-partitioning a 500 GB live table requires building a new partitioned table schema, migrating data with minimal downtime via logical replication or bulk INSERT/ATTACH, and cutting over at a maintenance window. Plan the partition key with the same weight you give to API contracts. Changing it later is a multi-week engineering project under production load.
Monitor partition skew from day one. Set up a weekly query that reports row counts and physical size per child table. A partition growing 10× larger than its siblings signals skewed distribution — a write hotspot forming or an unbalanced hash function. Catching it at 20 GB is a configuration change. Catching it at 200 GB is a live migration under load with risk of data loss.
Always verify pruning with EXPLAIN before shipping to production. Add EXPLAIN (ANALYZE) assertions for critical queries in your CI pipeline, and assert that only the expected child tables appear in the plan. A routine refactor that wraps the partition key column in a function call silently disables pruning for a query that was previously fast.
Range partitioning write hotspots appear gradually and then cause sudden spikes. The first month after launching a range-partitioned table, the write hotspot is a single small partition. After a year, the current month's partition receives the same write rate — but the hot partition is one of 12. At two years, one of 24. The hotspot is structural and permanent. Monitor I/O per partition; if the current month consistently shows 5–10× more I/O than any other, add hash sub-partitioning to spread the current period's load.
Partition maintenance automation is not optional in production. Every production system running range-partitioned tables should use pg_partman or equivalent automation. The failure mode — no partition for tomorrow's date — is silent until midnight when inserts start failing. This is not a theoretical concern; it is a production incident that happens in real systems where partition creation was left to a manually-maintained cron job without alerting.
📌 TLDR: Five Rules for SQL Partitioning
Range partitioning for time-series data. Date-range pruning plus instant archival via DETACH PARTITION. Accept the write hotspot on the current period and mitigate with composite partitioning if write volume is high.
Hash partitioning for write balance. Uniform write distribution plus point-lookup pruning. Accept that range predicates on non-key columns scan all partitions.
List partitioning for compliance boundaries. The partition boundary is the compliance boundary for geo-constrained data. Always add a DEFAULT partition.
The partition key must appear in your most frequent WHERE clause. Verify with EXPLAIN ANALYZE before shipping. Zero excluded partitions means zero benefit — and worse performance than an unpartitioned table due to merge overhead.
Automate partition lifecycle with pg_partman from day one. Manual partition creation is a production incident waiting to happen. Pre-create at least 2 future partitions; set retention policies before the first partition ages out.
🔗 Related Posts
- Partitioning vs. Sharding: SQL and NoSQL Strategies Overview
- System Design Sharding Strategy: Choosing Keys, Avoiding Hot Spots, and Resharding Safely
- SQL vs. NoSQL: Choosing the Right Database for Scale
- System Design Replication and Failover

Written by
Abstract Algorithms
@abstractalgorithms
