SQL Partitioning: Range, Hash, List, and Composite Strategies Explained
How PostgreSQL and MySQL use range, hash, list, and composite partitioning to eliminate unnecessary I/O — and when NOT to partition at all
Abstract Algorithms · Intermediate
TLDR: SQL partitioning divides one logical table into smaller physical child tables, all accessed through the parent table name. The query optimizer skips irrelevant child tables entirely — a process called partition pruning — turning a 30-second full-table scan into a 200ms single-partition read. Range partitioning is the right choice for time-series data with rolling archival. Hash partitioning prevents write hotspots. List partitioning enforces geo-compliance boundaries. The wrong partition key eliminates all of these gains. Automate partition lifecycle with pg_partman or you will have a midnight incident.
📖 The 2-Billion-Row Problem: Why Indexes Stop Scaling Without Partitioning
An index on a 2-billion-row table is still a 2-billion-entry B-Tree. The tree itself stays shallow — four or five levels — but every lookup lands on leaf pages scattered across the entire tablespace. Even with a selective index, large tables exhaust buffer pool capacity, force expensive disk reads, and make VACUUM operations that span the full table increasingly slow and disruptive.
Partitioning changes the problem fundamentally. Instead of one 400 GB table, you have 48 separate child tables — one per month across four years. A query for last month's orders reads only 8 GB. The other 392 GB are never opened because the database engine evaluates each child table's partition boundary against the query predicate and eliminates non-matching children before reading a single data page. This is partition pruning, and it is the entire performance mechanism that makes partitioning worth the operational overhead.
Partitioning vs. sharding — the distinction that matters in every system design interview:
| Dimension | Partitioning | Sharding |
| --- | --- | --- |
| Data location | Same node, or coordinator-managed distribution | Independent database nodes, each a fully separate instance |
| Application connection | Single endpoint — application sees one table | Shard-aware routing required — app knows about multiple databases |
| Cross-partition queries | Supported — planner merges results transparently | Expensive scatter-gather or not supported |
| Transaction guarantees | Full ACID within the node | Cross-shard: 2-Phase Commit or eventual consistency only |
| Scaling target | Query performance and data manageability | Write throughput beyond what a single node can serve |
Rule of thumb: If your application connects to one connection string, you are partitioning. Partitioning and sharding are sequential steps on the scaling ladder — not competing alternatives. Start with one database. Add partitioning when queries get slow on large tables — it is a DDL change, not a topology change. Move to sharding when the dataset or write throughput genuinely exceeds a single node.
🔍 The Four SQL Partitioning Strategies at a Glance
Before going deep on each strategy, here is the orientation table. Every major SQL database — PostgreSQL, MySQL, CockroachDB, Spanner — supports some subset of these. PostgreSQL supports range and list natively as of version 10, hash as of version 11, and composite via nested sub-partitioning.
| Strategy | Splits By | Best For | Key Trade-Off |
| --- | --- | --- | --- |
| Range | Ordered value ranges — dates, IDs | Time-series, rolling archival, rolling retention windows | Write hotspot on the latest partition |
| List | Explicit enumerated values per partition | Geographic regions, category-based isolation, GDPR compliance | All values must be known at design time |
| Hash | hash(key) mod N — uniform distribution | High-volume inserts, write hotspot prevention | No range query pruning — all partitions scanned for range predicates |
| Composite | Two levels: range-then-hash or list-then-hash | Very large tables needing both date pruning and write balance | Higher DDL complexity; more total partitions to manage |
Each strategy makes the same fundamental trade-off in a different direction: range partitioning is optimal for queries that filter by a time window but concentrates writes on the current period; hash partitioning is optimal for write balance but cannot prune for range predicates; list partitioning is optimal for categorical isolation but requires foreknowledge of all valid values.
⚙️ Range Partitioning: Month-by-Month Child Tables for Time-Series Data
Range partitioning assigns each row to a child table based on whether the partition key value falls within a defined range boundary. The classic application is a time-series table partitioned by month — each month's data living in its own child table, with older months archived or dropped via a single DDL statement.
Pseudo-schema — range-partitioned orders table:
| Column | Type | Notes |
| --- | --- | --- |
| id | bigint | Auto-increment surrogate key |
| customer_id | bigint | Required — used for per-customer queries |
| total_cents | bigint | Stored in cents to avoid floating-point rounding |
| status | text | pending, paid, shipped, cancelled |
| created_at | timestamptz | Partition key — range boundary applied per month |
Partition layout:
| Child Table | Range Boundary | Contains |
| --- | --- | --- |
| orders_2024_01 | [2024-01-01, 2024-02-01) | January 2024 rows |
| orders_2024_02 | [2024-02-01, 2024-03-01) | February 2024 rows |
| orders_2024_03 | [2024-03-01, 2024-04-01) | March 2024 rows |
| orders_default | DEFAULT (MAXVALUE in MySQL) | Catch-all for dates beyond the last explicit partition |
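As a concrete sketch, the layout above maps to PostgreSQL declarative partitioning roughly as follows. Table and partition names are illustrative, and note that PostgreSQL uses a DEFAULT partition where MySQL range partitioning would use MAXVALUE:

```sql
-- Parent table: rows are routed to children by created_at.
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,
    total_cents bigint NOT NULL,
    status      text   NOT NULL,
    created_at  timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- One child table per month; bounds are half-open: [from, to).
CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Catch-all for rows beyond the last explicit boundary.
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
```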
What partition pruning delivers here: A query filtering on created_at BETWEEN 2024-01-10 AND 2024-01-20 causes the planner to evaluate each child table's range boundary, find that only orders_2024_01 can possibly match, and skip all other child tables before opening a single data file. In a 48-partition table spanning four years, this prunes 47 of 48 child tables — roughly a 48× reduction in I/O before even consulting a local index on customer_id.
Instant archival is the operational superpower of range partitioning. Detaching the January 2022 child table from the parent removes it from the live table in milliseconds — no multi-hour DELETE with vacuuming. Dropping it outright is equally fast. This is why range partitioning is universally used for compliance retention windows where data must be deleted after a fixed period.
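Under that scheme, archival is a metadata operation. A sketch, assuming monthly child tables named as above:

```sql
-- Detach January 2022 from the parent: the data survives as a
-- standalone table that can be dumped to cold storage.
ALTER TABLE orders DETACH PARTITION orders_2022_01;

-- Or delete the whole month outright — equally fast.
DROP TABLE orders_2022_01;
```

One operational caveat: plain DETACH PARTITION briefly takes a strong lock on the parent table; PostgreSQL 14+ supports DETACH PARTITION ... CONCURRENTLY to reduce blocking on busy tables.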
Write hotspot — range partitioning's structural weakness: All current-period inserts land on the most recent child table. January 2024 receives every new order until February 1st, after which February becomes the hot partition. For write-heavy tables, this concentrates I/O on a single child table. The mitigation is composite partitioning — range by month, then hash within each month by customer ID — which distributes writes across multiple sub-partitions while preserving date-range pruning for month-scoped queries.
⚙️ List Partitioning: Categorical Splits for Geo-Compliance Boundaries
List partitioning assigns rows to child tables based on membership in an explicit set of values. When those values are geographic identifiers, the partition boundary becomes a compliance boundary — EU data physically lives in the EU partition, which can be storage-pinned to EU-region infrastructure.
Pseudo-schema — list-partitioned orders table:
| Column | Type | Notes |
| --- | --- | --- |
| id | bigint | Auto-increment surrogate key |
| customer_id | bigint | Required |
| region | text | Partition key — explicit region code such as US, DE, JP |
| total_cents | bigint | Required |
| created_at | timestamptz | Required |
Partition layout:
| Child Table | Partition Values | Purpose |
| --- | --- | --- |
| orders_us | ('US', 'CA', 'MX') | North American region |
| orders_eu | ('DE', 'FR', 'GB', 'NL', 'ES', 'IT') | EU region — GDPR compliance boundary |
| orders_apac | ('JP', 'SG', 'AU', 'IN', 'KR') | APAC region |
| orders_default | DEFAULT | Catch-all for unrecognized region codes |
A DEFAULT partition is effectively mandatory. Without it, an insert with a region code not listed in any explicit partition fails with a constraint error. At midnight during a rollout for a new market, that error means dropped writes in production.
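A minimal PostgreSQL sketch of this list layout, DEFAULT partition included (names illustrative):

```sql
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,
    region      text   NOT NULL,   -- partition key
    total_cents bigint NOT NULL,
    created_at  timestamptz NOT NULL
) PARTITION BY LIST (region);

CREATE TABLE orders_us   PARTITION OF orders FOR VALUES IN ('US', 'CA', 'MX');
CREATE TABLE orders_eu   PARTITION OF orders FOR VALUES IN ('DE', 'FR', 'GB', 'NL', 'ES', 'IT');
CREATE TABLE orders_apac PARTITION OF orders FOR VALUES IN ('JP', 'SG', 'AU', 'IN', 'KR');

-- The safety net: unrecognized region codes land here instead of failing.
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
```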
Pruning behaviour: A query with WHERE region = 'DE' prunes to orders_eu — the planner evaluates set membership and eliminates every other child table. A query with WHERE region IN ('DE', 'JP') prunes to orders_eu and orders_apac — two tables instead of four.
Geo-compliance at the database layer: By assigning tablespace or storage constraints to the EU child table (via CockroachDB zone configs or PostgreSQL tablespaces), EU data can be physically constrained to EU-region infrastructure. The partition boundary is the compliance boundary — more reliable than application-layer enforcement because it does not depend on every code path correctly routing writes.
⚙️ Hash Partitioning: Even Write Distribution Across N Buckets
Hash partitioning applies hash(partition_key) mod N to assign each row to one of N child tables. The distribution is deterministic but appears random — which is exactly what prevents write hotspots.
Pseudo-schema — hash-partitioned events table:
| Column | Type | Notes |
| --- | --- | --- |
| id | bigint | Auto-increment surrogate key |
| customer_id | bigint | Partition key — hashed to determine child table |
| event_type | text | Click, purchase, view, etc. |
| payload | jsonb | Event data |
| created_at | timestamptz | Required |
Partition layout — 4 partitions:
| Child Table | Assignment Rule | Receives Rows Where |
| --- | --- | --- |
| events_p0 | hash(customer_id) mod 4 = 0 | ~25% of all customer IDs |
| events_p1 | hash(customer_id) mod 4 = 1 | ~25% of all customer IDs |
| events_p2 | hash(customer_id) mod 4 = 2 | ~25% of all customer IDs |
| events_p3 | hash(customer_id) mod 4 = 3 | ~25% of all customer IDs |
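In PostgreSQL (11+), this four-bucket layout is declared with a modulus and remainder per child. A sketch with illustrative names:

```sql
CREATE TABLE events (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,   -- partition key, hashed
    event_type  text   NOT NULL,
    payload     jsonb,
    created_at  timestamptz NOT NULL
) PARTITION BY HASH (customer_id);

-- Four buckets: hash(customer_id) mod 4 selects the child table.
CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```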
Pruning for point lookups vs. range queries:
| Query Pattern | Pruning Result |
| --- | --- |
| WHERE customer_id = 42 | Prunes to exactly 1 child table — planner computes hash(42) mod 4 |
| WHERE created_at > '2024-01-01' | No pruning — hash order says nothing about created_at, so every partition may contain matching rows |
| WHERE customer_id = 42 AND created_at > '2024-01-01' | Prunes to exactly 1 child table — customer_id is the partition key |
Hash partitioning optimizes for point-lookup performance and write balance. It is the wrong strategy when your most frequent queries filter by an ordered range on a non-hash-key column.
Write distribution is hash partitioning's defining advantage. Inserts for any customer ID land on one of four child tables based on the hash, not on insert time. A high-volume event stream that would saturate a single range partition under date-based partitioning distributes load evenly across four partitions under hash partitioning.
⚙️ Composite Partitioning: Two-Level Strategies for Tables That Outgrow a Single Approach
Composite sub-partitioning applies two levels. The most common pattern: range by month at the first level, then hash by customer ID within each monthly partition. This delivers both date-range pruning for reporting queries and write balance within each month.
Composite partition structure:
| Level | Strategy | Key | Purpose |
| --- | --- | --- | --- |
| Level 1 | Range | created_at — monthly | Prune by date — reporting and instant archival |
| Level 2 | Hash | customer_id — 4 buckets | Balance writes within each monthly range |
The resulting partition count: 48 months × 4 hash buckets = 192 child tables for a 4-year dataset, each holding roughly total_rows / 192 rows if the hash distributes evenly.
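In PostgreSQL this nests one PARTITION BY inside another. A sketch of a single month of the scheme (the other 47 months follow the same pattern; names are illustrative):

```sql
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    customer_id bigint NOT NULL,
    total_cents bigint NOT NULL,
    created_at  timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- Level 1: the monthly range partition is itself partitioned.
CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
    PARTITION BY HASH (customer_id);

-- Level 2: four hash buckets inside January.
CREATE TABLE orders_2024_01_h0 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_2024_01_h1 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_2024_01_h2 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_2024_01_h3 PARTITION OF orders_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```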
Pruning for composite partitions:
| Query Predicate | Partitions Scanned |
| --- | --- |
| WHERE created_at within January 2024 | 4 — all hash buckets for the single matching month |
| WHERE created_at within January 2024 AND customer_id = 42 | 1 — exact hash bucket within January |
| WHERE customer_id = 42 (no date filter) | 192 — no level-1 pruning without the range key |
The PostgreSQL planning ceiling: Composite partitioning increases total child table count. PostgreSQL adds per-partition overhead at planning time — negligible below 500 partitions, measurable at 1,000, significant at 5,000. For composite schemes producing thousands of sub-partitions, measure query planning latency at realistic partition counts before committing the design to production.
📊 How the Query Optimizer Decides Which Partitions to Skip
The diagram below traces the full pruning decision path from query arrival through the PostgreSQL planner to partition elimination. Follow it from top to bottom to see exactly where the pruning decision is made and what determines which child tables are eliminated.
```mermaid
flowchart TD
Q["Query received: WHERE created_at BETWEEN Jan and Feb 2024"] --> OPT["Query optimizer enters partition pruning phase"]
OPT --> CHECK{"Partition key present in WHERE clause?"}
CHECK -->|"Yes — pruning active"| BOUNDS["Evaluate each child table boundary against predicate"]
CHECK -->|"No — no pruning possible"| FULL["Scan all child tables (avoid this pattern)"]
BOUNDS --> SKIP1["Skip 2021 partitions — bounds outside predicate range"]
BOUNDS --> SKIP2["Skip 2022 partitions — bounds outside predicate range"]
BOUNDS --> SKIP3["Skip 2023 partitions — bounds outside predicate range"]
BOUNDS --> SCAN["Scan Jan 2024 child table — bounds match predicate"]
SKIP1 --> MERGE["Merge results and return to client"]
SKIP2 --> MERGE
SKIP3 --> MERGE
SCAN --> MERGE
FULL --> MERGE
```
The critical fork is the diamond node: is the partition key in the WHERE clause? When it is, the planner eliminates dozens of child tables before opening a single data file. When it is not, every child table is opened — making the partitioned table perform worse than an equivalent unpartitioned table because of the overhead of N table opens, N index scans, and a merge step.
This diagram captures the single most important rule in partition design: the partition key must appear in your most frequent query predicates. Verify it with EXPLAIN (ANALYZE, VERBOSE): statically pruned child tables simply never appear in the plan, and run-time pruning is reported as Subplans Removed: N. A 48-partition table with a month-scoped query should show a scan of exactly one child table; if all 48 appear, your query pattern and partition key are misaligned.
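To check this on a concrete query — assuming the monthly orders layout sketched earlier:

```sql
EXPLAIN (ANALYZE, VERBOSE)
SELECT * FROM orders
WHERE created_at >= '2024-01-10' AND created_at < '2024-01-21';
-- With static pruning, only orders_2024_01 appears in the plan output;
-- when pruning happens at run time (bind parameters, PG 12+),
-- the plan instead reports "Subplans Removed: N".
```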
🧠 Deep Dive: Static vs. Dynamic Partition Pruning in PostgreSQL
PostgreSQL's pruning mechanism has two modes. Understanding the distinction matters for parameterized queries — the most common form in production application code.
Partition Pruning Internals: How PostgreSQL Plans Around Boundaries
Static pruning happens at plan time, when the partition key predicate contains a literal value. The planner knows at query compilation exactly which child tables can possibly match. The plan is fixed — partition metadata is consulted once, and eliminated partitions never appear in the execution plan.
Dynamic pruning happens at execution time when the predicate value comes from a runtime parameter: a bind variable in a prepared statement, or a value supplied via a nested loop join. PostgreSQL v12+ supports dynamic pruning for parameterized queries — the executor re-evaluates the partition set at runtime using the bound parameter value. MySQL supports static pruning for RANGE and LIST partitions but has limited dynamic pruning support, which can force full-partition scans for parameterized queries even when a literal equivalent would prune correctly.
Static vs. dynamic pruning support matrix:
| Database | Static Pruning | Dynamic Pruning | Notes |
| --- | --- | --- | --- |
| PostgreSQL 10–11 | ✅ RANGE, LIST (HASH from 11) | ❌ Not supported | Bind parameters force a scan of all partitions |
| PostgreSQL 12+ | ✅ All partition types | ✅ Parameterized queries and nested loops | Recommended version for production partitioning |
| MySQL 8.0 | ✅ RANGE, LIST | ⚠️ Limited | Static only for most query forms |
| CockroachDB | ✅ All partition types | ✅ | Full pruning support |
Performance Analysis: Measuring Pruning Effectiveness
The most direct way to measure pruning impact is EXPLAIN (ANALYZE, BUFFERS). Count the child tables that appear in the plan. With correct pruning, a query against a 48-partition table that touches only January data scans a single partition; for run-time pruning, look for Subplans Removed: 47. Seeing all 48 partitions scanned on a query with a literal date predicate means the partition key is not in the WHERE clause, or it is wrapped in a function.
| Query Pattern | Partitions Scanned | Relative Cost |
| --- | --- | --- |
| WHERE created_at >= '2024-01-01' (range literal) | 1 | Baseline — optimal |
| WHERE DATE(created_at) = '2024-01-15' (function wrap) | 48 | ~48× the rows read |
| WHERE created_at = $1 (bind param, PG 12+) | 1 | Same as literal |
| WHERE created_at = $1 (bind param, PG 11) | 48 | No dynamic pruning |
The function-wrapping trap: Wrapping the partition key column in a function call defeats pruning. WHERE DATE(created_at) = '2024-01-15' forces the planner to materialize DATE(created_at) for every row before evaluating the predicate — it cannot apply partition bounds because the comparison is on the derived value, not the raw column. The correct form is a range predicate on the raw column: WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'. A routine refactor that wraps the partition key in DATE(), TRUNC(), or TO_CHAR() silently disables pruning for a query that was previously fast. Add EXPLAIN (ANALYZE) assertions for critical queries in CI pipelines to catch this before it reaches production.
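Side by side, assuming the monthly orders layout from earlier:

```sql
-- Defeats pruning: the predicate compares a value derived from the
-- partition key, so partition bounds on created_at cannot be applied.
SELECT count(*) FROM orders
WHERE DATE(created_at) = '2024-01-15';

-- Prunes correctly: a half-open range on the raw partition key,
-- semantically equivalent for a timestamptz column.
SELECT count(*) FROM orders
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16';
```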
⚖️ Hot-Partition Anti-Patterns and When NOT to Partition
Five anti-patterns that destroy partition performance:
| Anti-Pattern | What Happens | Fix |
| --- | --- | --- |
| Partition key missing from critical queries | Zero pruning — every child table is scanned — performance is worse than unpartitioned | Align the partition key with the most frequent WHERE predicate; redesign if mismatched |
| Over-partitioning — more than 500 partitions | Planning overhead adds hundreds of milliseconds to every query even after pruning | Use composite partitioning to reduce the top-level partition count |
| Low-cardinality partition key | A boolean column produces 2 partitions — near-zero benefit with DDL overhead | Choose keys with enough distinct values to justify the intended partition count |
| Skewed range sizes — 80% of data in one partition | That partition handles 80% of all I/O — the partition that receives the most writes is always the bottleneck | Use monthly ranges; add hash sub-partitioning for the current high-write period |
| Missing DEFAULT partition on list-partitioned tables | An unrecognized region code causes an insert failure at midnight during a rollout | Always define a DEFAULT partition before deploying the parent table |
When NOT to partition:
| Condition | Reason to Skip Partitioning |
| --- | --- |
| Table has fewer than 50 million rows | Existing B-Tree indexes are sufficient — partitioning adds DDL overhead without benefit |
| No natural partition key in the access pattern | Without a key that appears in frequent WHERE clauses, there is zero pruning benefit |
| Primary bottleneck is write throughput beyond a single node | Partitioning does not help — horizontal sharding via Vitess or Citus is the correct solution |
| Frequent cross-partition joins on non-partition-key columns | Multi-partition merge adds overhead that may be worse than the equivalent on an unpartitioned table |
| Partition count would exceed 1,000 | PostgreSQL planning overhead at very high partition counts degrades OLTP query latency even after pruning |
The system design interview answer: partition when tables are large and queries are slow on a single node. Shard when data volume or write throughput genuinely exceeds what a single node can serve.
🧭 Choosing the Right Partition Strategy: Decision Guide
| Situation | Recommended Strategy | Reason |
| --- | --- | --- |
| Time-series data with rolling archival — logs, payments, events | Range by date — monthly or quarterly | Date-range pruning plus instant archival via DETACH PARTITION |
| High-volume inserts where even distribution is required | Hash by primary access key — user ID, order ID | Prevents write hotspot; point lookups prune to exactly one partition |
| Geo-compliance or categorical access patterns | List by region or country code | Partition boundary equals compliance boundary — always add DEFAULT |
| Table so large that monthly range partitions are still too big | Composite: range by month plus hash by user ID within each month | Combines date pruning with write balance |
| Queries are almost entirely point lookups by ID | Hash partitioning | Optimal for point lookup plus write balance — accept range scans on other columns |
| Queries filter by multiple unrelated columns | Choose the column in the most frequent WHERE clause as partition key | Only one partition key per table — target the highest-value pruning predicate |
🌍 Real-World Applications of SQL Partitioning
Financial ledger at a major payments processor. A payments company running on PostgreSQL held 11 billion transaction rows in a single transactions table. Month-end reporting queries scanned the entire table. After migrating to monthly range partitioning on settled_at, the same reports ran against two or three child partitions. Query times dropped from 42 seconds to 900 milliseconds. Old partitions were detached and moved to cheaper cold storage using ALTER TABLE ... DETACH PARTITION, with no application code change. The parent table name never changed — the migration was entirely invisible to the application layer.
E-commerce order history at global scale. A global retailer used list partitioning by region code (EMEA, APAC, AMER, LATAM) to comply with data residency regulations that require customer order data to remain within specific geographic boundaries. Each partition was hosted on a tablespace located in the correct data center. Queries scoped to a single region skipped all other partitions entirely, and compliance auditors could verify data locality at the partition level without touching application logic.
Audit logging at a SaaS platform. A multi-tenant SaaS platform wrote 200 million audit log rows per month. The team used composite partitioning: range by month at the top level, then hash by tenant ID within each month. This gave two benefits simultaneously: old months were detached on a rolling 13-month retention schedule with pg_partman, and hot write traffic was distributed evenly across 16 hash sub-partitions per month rather than concentrating on a single child table.
Key lesson from all three cases: the partition strategy that performed best was always the one whose key matched the dominant query predicate — not the one that seemed intuitively clean on a whiteboard.
🧪 Practical Examples: Applying Partition Strategy to Real Query Patterns
Scenario 1: Analytics on a slow time-series table.
A page_views table with 4 billion rows is slow on all date-range queries. The dominant query pattern is WHERE viewed_at BETWEEN '2024-01-01' AND '2024-01-31'. The correct strategy is range partitioning by month on viewed_at. After partitioning into 48 monthly child tables, the January query reads only one child table — roughly 80 million rows instead of 4 billion. EXPLAIN (ANALYZE) confirms it by listing a scan of a single child table.
Common mistake: partitioning by user_id instead, which makes the January query scan all 48 partitions because viewed_at is not the partition key.
Scenario 2: Preventing a write hotspot.
An events table receives 50,000 inserts per second, all with current timestamps. Range partitioning by date concentrates every insert on the current month's partition — effectively making it a single write target. Switching to hash partitioning on event_id distributes inserts uniformly across 32 partitions, eliminating the hotspot. The trade-off: date-range queries now scan all 32 partitions instead of one.
Decision rule: choose hash when write throughput is the bottleneck; choose range when read selectivity is the bottleneck. You cannot optimise for both simultaneously with a single partition key.
Scenario 3: Testing partition pruning with EXPLAIN.
Run EXPLAIN (ANALYZE) SELECT * FROM orders WHERE created_at >= '2024-06-01' AND created_at < '2024-07-01' and confirm that only the June child table appears in the plan. If all 48 child tables are scanned, the predicate is not hitting the partition key — check for function wrapping or a missing column in the WHERE clause.
🛠️ pg_partman: Automated Partition Lifecycle Management for PostgreSQL
Manual partition pre-creation works for small setups, but production systems need automation. If the cron job that creates next month's partition fails on the 20th, inserts fail at midnight on the 1st — a production incident caused by missing infrastructure, not bugs in application code. pg_partman is the open-source PostgreSQL extension that prevents this.
pg_partman automatically pre-creates future partitions, enforces retention policies, and handles archival. Its background worker (pg_partman_bgw) runs maintenance on a configurable schedule inside the PostgreSQL process — no external cron dependency required.
Key pg_partman configuration parameters:
| Parameter | Purpose | Recommended Value |
| --- | --- | --- |
| p_premake | Number of future partitions to pre-create | At least 2 — covers deploy delays and off-hours maintenance windows |
| p_interval | Partition interval | 'daily', 'weekly', 'monthly', 'yearly' |
| retention | How long to retain partitions before action | '24 months' for rolling 2-year retention |
| retention_keep_table | true detaches instead of dropping — archive pattern; false drops outright | false for OLTP rolling windows; true for data warehouse archival |
| infinite_time_partitions | true creates partitions beyond the premake window as data arrives | true for safety — prevents insert failures on unexpected future data |
Operational workflow with pg_partman:
- Create the parent table with the partition key and interval defined.
- Register the parent with pg_partman, specifying interval and premake count.
- Configure retention — how long to keep partitions and whether to detach or drop.
- pg_partman's background worker runs maintenance automatically on the configured schedule.
- Monitor via pg_partman's built-in views: partman.part_config shows all managed parents; partman.check_subpart_sums validates child table consistency.
Pre-creation safety margin: Set p_premake to at least 2. This creates two months of future partitions immediately. If maintenance is delayed — a holiday, a failed deploy — there is a buffer before inserts would hit a missing partition. For daily-partitioned high-traffic tables, set p_premake to 7 or higher.
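A registration sketch tying these parameters together. The exact create_parent signature differs between pg_partman 4.x and 5.x; this follows the 5.x style and is illustrative rather than copy-paste-ready:

```sql
CREATE EXTENSION pg_partman;

-- Register the parent: monthly interval, two future partitions pre-created.
SELECT partman.create_parent(
    p_parent_table => 'public.orders',
    p_control      => 'created_at',
    p_interval     => '1 month',
    p_premake      => 2
);

-- Retention policy: drop partitions older than 24 months outright.
UPDATE partman.part_config
SET retention            = '24 months',
    retention_keep_table = false
WHERE parent_table = 'public.orders';
```

With this in place, the pg_partman background worker handles creation and retention on its schedule, with no external cron job to forget.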
For a full deep-dive on pg_partman including native triggers, sub-partitioning hierarchies, and pg_partman_bgw tuning, see the pg_partman GitHub repository.
📚 Lessons Learned: What Production Teaches You About SQL Partitioning
The partition key is a permanent architectural decision. Re-partitioning a 500 GB live table requires building a new partitioned table schema, migrating data with minimal downtime via logical replication or bulk INSERT/ATTACH, and cutting over at a maintenance window. Plan the partition key with the same weight you give to API contracts. Changing it later is a multi-week engineering project under production load.
Monitor partition skew from day one. Set up a weekly query that reports row counts and physical size per child table. A partition growing 10× larger than its siblings signals skewed distribution — a write hotspot forming or an unbalanced hash function. Catching it at 20 GB is a configuration change. Catching it at 200 GB is a live migration under load with risk of data loss.
Always verify pruning with EXPLAIN before shipping to production. Add EXPLAIN (ANALYZE) assertions for critical queries in your CI pipeline, and assert that only the expected child tables appear in the plan. A routine refactor that wraps the partition key column in a function call silently disables pruning for a query that was previously fast.
Range partitioning write hotspots appear gradually and then cause sudden spikes. The first month after launching a range-partitioned table, the write hotspot is a single small partition. After a year, the current month's partition receives the same write rate — but the hot partition is one of 12. At two years, one of 24. The hotspot is structural and permanent. Monitor I/O per partition; if the current month consistently shows 5–10× more I/O than any other, add hash sub-partitioning to spread the current period's load.
Partition maintenance automation is not optional in production. Every production system running range-partitioned tables should use pg_partman or equivalent automation. The failure mode — no partition for tomorrow's date — is silent until midnight when inserts start failing. This is not a theoretical concern; it is a production incident that happens in real systems where partition creation was left to a manually-maintained cron job without alerting.
📌 TLDR: Five Rules for SQL Partitioning
Range partitioning for time-series data. Date-range pruning plus instant archival via DETACH PARTITION. Accept the write hotspot on the current period and mitigate with composite partitioning if write volume is high.
Hash partitioning for write balance. Uniform write distribution plus point-lookup pruning. Accept that range predicates on non-key columns scan all partitions.
List partitioning for compliance boundaries. The partition boundary is the compliance boundary for geo-constrained data. Always add a DEFAULT partition.
The partition key must appear in your most frequent WHERE clause. Verify with EXPLAIN ANALYZE before shipping. Zero excluded partitions means zero benefit — and worse performance than an unpartitioned table due to merge overhead.
Automate partition lifecycle with pg_partman from day one. Manual partition creation is a production incident waiting to happen. Pre-create at least 2 future partitions; set retention policies before the first partition ages out.
🔗 Related Posts
- Partitioning vs. Sharding: SQL and NoSQL Strategies Overview
- System Design Sharding Strategy: Choosing Keys, Avoiding Hot Spots, and Resharding Safely
- SQL vs. NoSQL: Choosing the Right Database for Scale
- System Design Replication and Failover

Written by
Abstract Algorithms
@abstractalgorithms
