SQL Partitioning: Range, Hash, List, and Composite Strategies Explained

How PostgreSQL and MySQL use range, hash, list, and composite partitioning to eliminate unnecessary I/O — and when NOT to partition at all

Abstract Algorithms · 23 min read

TLDR: SQL partitioning divides one logical table into smaller physical child tables, all accessed through the parent table name. The query optimizer skips irrelevant child tables entirely — a process called partition pruning — turning a 30-second full-table scan into a 200ms single-partition read. Range partitioning is the right choice for time-series data with rolling archival. Hash partitioning prevents write hotspots. List partitioning enforces geo-compliance boundaries. The wrong partition key eliminates all of these gains. Automate partition lifecycle with pg_partman or you will have a midnight incident.

📖 The 2-Billion-Row Problem: Why Indexes Stop Scaling Without Partitioning

An index on a 2-billion-row table is still a 2-billion-entry B-Tree. Every lookup walks that tree — only a handful of levels deep, but with leaf pages scattered across the entire tablespace. Even with a selective index, large tables exhaust buffer pool capacity, force expensive disk reads, and make VACUUM operations that span the full table increasingly slow and disruptive.

Partitioning changes the problem fundamentally. Instead of one 400 GB table, you have 48 separate child tables — one per month across four years. A query for last month's orders reads only 8 GB. The other 392 GB are never opened because the database engine evaluates each child table's partition boundary against the query predicate and eliminates non-matching children before reading a single data page. This is partition pruning, and it is the entire performance mechanism that makes partitioning worth the operational overhead.

Partitioning vs. sharding — the distinction that matters in every system design interview:

Dimension | Partitioning | Sharding
Data location | Same node, or coordinator-managed distribution | Independent database nodes, each a fully separate instance
Application connection | Single endpoint — application sees one table | Shard-aware routing required — app knows about multiple databases
Cross-partition queries | Supported — planner merges results transparently | Expensive scatter-gather, or not supported
Transaction guarantees | Full ACID within the node | Cross-shard: two-phase commit or eventual consistency only
Scaling target | Query performance and data manageability | Write throughput beyond what a single node can serve

Rule of thumb: If your application connects to one connection string, you are partitioning. Partitioning and sharding are sequential steps on the scaling ladder — not competing alternatives. Start with one database. Add partitioning when queries get slow on large tables — it is a DDL change, not a topology change. Move to sharding when the dataset or write throughput genuinely exceeds a single node.


🔍 The Four SQL Partitioning Strategies at a Glance

Before going deep on each strategy, here is the orientation table. Every major SQL database — PostgreSQL, MySQL, CockroachDB, Spanner — supports some subset of these. PostgreSQL supports all four natively: range and list since version 10, hash since version 11, and composite by sub-partitioning any child.

Strategy | Splits By | Best For | Key Trade-Off
Range | Ordered value ranges — dates, IDs | Time-series data, rolling archival and retention windows | Write hotspot on the latest partition
List | Explicit enumerated values per partition | Geographic regions, category-based isolation, GDPR compliance | All values must be known at design time
Hash | hash(key) mod N — uniform distribution | High-volume inserts, write hotspot prevention | No range-query pruning — all partitions scanned for range predicates
Composite | Two levels: range-then-hash or list-then-hash | Very large tables needing both date pruning and write balance | Higher DDL complexity; more total partitions to manage

Each strategy makes the same fundamental trade-off in a different direction: range partitioning is optimal for queries that filter by a time window but concentrates writes on the current period; hash partitioning is optimal for write balance but cannot prune for range predicates; list partitioning is optimal for categorical isolation but requires foreknowledge of all valid values.


⚙️ Range Partitioning: Month-by-Month Child Tables for Time-Series Data

Range partitioning assigns each row to a child table based on whether the partition key value falls within a defined range boundary. The classic application is a time-series table partitioned by month — each month's data living in its own child table, with older months archived or dropped via a single DDL statement.

Pseudo-schema — range-partitioned orders table:

Column | Type | Notes
id | bigint | Auto-increment surrogate key
customer_id | bigint | Required — used for per-customer queries
total_cents | bigint | Stored in cents to avoid floating-point rounding
status | text | pending, paid, shipped, cancelled
created_at | timestamptz | Partition key — range boundary applied per month

Partition layout:

Child Table | Range Boundary | Contains
orders_2024_01 | [2024-01-01, 2024-02-01) | January 2024 rows
orders_2024_02 | [2024-02-01, 2024-03-01) | February 2024 rows
orders_2024_03 | [2024-03-01, 2024-04-01) | March 2024 rows
orders_default | DEFAULT (MAXVALUE in MySQL) | Catch-all for dates beyond the explicit partitions
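
A minimal PostgreSQL sketch of this layout — names follow the tables above, while key generation and indexes are omitted for brevity:

CREATE TABLE orders (
    id           bigint      NOT NULL,   -- surrogate key; sequence/identity omitted
    customer_id  bigint      NOT NULL,
    total_cents  bigint      NOT NULL,
    status       text        NOT NULL,
    created_at   timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- ...one child table per month...
CREATE TABLE orders_default PARTITION OF orders DEFAULT;

Queries still target the parent name (SELECT ... FROM orders); routing to the correct child table is entirely the planner's job.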

What partition pruning delivers here: A query filtering on created_at BETWEEN 2024-01-10 AND 2024-01-20 causes the planner to evaluate each child table's range boundary, find that only orders_2024_01 can possibly match, and skip all other child tables before opening a single data file. In a 48-partition table spanning four years, this prunes 47 of 48 child tables — cutting the data read by a factor of roughly 48 before even consulting a local index on customer_id.

Instant archival is the operational superpower of range partitioning. Detaching the January 2022 child table from the parent removes it from the live table in milliseconds — no multi-hour DELETE with vacuuming. Dropping it outright is equally fast. This is why range partitioning is universally used for compliance retention windows where data must be deleted after a fixed period.
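
The archival operations themselves are single DDL statements — assuming a January 2022 child named orders_2022_01 in the layout above:

-- Detach: the child leaves the live table but survives as a standalone table
-- (move it to cold storage, dump it, or re-attach it later).
ALTER TABLE orders DETACH PARTITION orders_2022_01;

-- Or drop the period outright once the retention window allows it.
DROP TABLE orders_2022_01;

On PostgreSQL 14 and later, DETACH PARTITION ... CONCURRENTLY avoids holding a blocking lock on the parent while the detach completes.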

Write hotspot — range partitioning's structural weakness: All current-period inserts land on the most recent child table. January 2024 receives every new order until February 1st, after which February becomes the hot partition. For write-heavy tables, this concentrates I/O on a single child table. The mitigation is composite partitioning — range by month, then hash within each month by customer ID — which distributes writes across multiple sub-partitions while preserving date-range pruning for month-scoped queries.


⚙️ List Partitioning: Categorical Splits for Geo-Compliance Boundaries

List partitioning assigns rows to child tables based on membership in an explicit set of values. When those values are geographic identifiers, the partition boundary becomes a compliance boundary — EU data physically lives in the EU partition, which can be storage-pinned to EU-region infrastructure.

Pseudo-schema — list-partitioned orders table:

Column | Type | Notes
id | bigint | Auto-increment surrogate key
customer_id | bigint | Required
region | text | Partition key — explicit region code such as US, DE, JP
total_cents | bigint | Required
created_at | timestamptz | Required

Partition layout:

Child Table | Partition Values | Purpose
orders_us | ('US', 'CA', 'MX') | North American region
orders_eu | ('DE', 'FR', 'GB', 'NL', 'ES', 'IT') | EU region — GDPR compliance boundary
orders_apac | ('JP', 'SG', 'AU', 'IN', 'KR') | APAC region
orders_default | DEFAULT | Catch-all for unrecognized region codes
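
A PostgreSQL sketch of this layout. The TABLESPACE clause shows where the storage pinning discussed below attaches; eu_tablespace is a hypothetical tablespace created on EU-region storage:

CREATE TABLE orders (
    id           bigint      NOT NULL,
    customer_id  bigint      NOT NULL,
    region       text        NOT NULL,   -- partition key
    total_cents  bigint      NOT NULL,
    created_at   timestamptz NOT NULL
) PARTITION BY LIST (region);

CREATE TABLE orders_us   PARTITION OF orders FOR VALUES IN ('US', 'CA', 'MX');
CREATE TABLE orders_eu   PARTITION OF orders FOR VALUES IN ('DE', 'FR', 'GB', 'NL', 'ES', 'IT')
    TABLESPACE eu_tablespace;            -- hypothetical EU-pinned storage
CREATE TABLE orders_apac PARTITION OF orders FOR VALUES IN ('JP', 'SG', 'AU', 'IN', 'KR');
CREATE TABLE orders_default PARTITION OF orders DEFAULT;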

A DEFAULT partition is effectively mandatory. Without it, an insert for a region code not listed in any explicit partition fails with a constraint error. At midnight during a rollout for a new market, that error means lost writes in production.

Pruning behaviour: A query with WHERE region = 'DE' prunes to orders_eu — the planner evaluates set membership and eliminates every other child table. A query with WHERE region IN ('DE', 'JP') prunes to orders_eu and orders_apac — two tables instead of four.

Geo-compliance at the database layer: By assigning tablespace or storage constraints to the EU child table (via CockroachDB zone configs or PostgreSQL tablespaces), EU data can be physically constrained to EU-region infrastructure. The partition boundary is the compliance boundary — more reliable than application-layer enforcement because it does not depend on every code path correctly routing writes.


⚙️ Hash Partitioning: Even Write Distribution Across N Buckets

Hash partitioning applies hash(partition_key) mod N to assign each row to one of N child tables. The distribution is deterministic but appears random — which is exactly what prevents write hotspots.

Pseudo-schema — hash-partitioned events table:

Column | Type | Notes
id | bigint | Auto-increment surrogate key
customer_id | bigint | Partition key — hashed to determine child table
event_type | text | Click, purchase, view, etc.
payload | jsonb | Event data
created_at | timestamptz | Required

Partition layout — 4 partitions:

Child Table | Assignment Rule | Expected Share
events_p0 | hash(customer_id) mod 4 = 0 | ~25% of all customer IDs
events_p1 | hash(customer_id) mod 4 = 1 | ~25% of all customer IDs
events_p2 | hash(customer_id) mod 4 = 2 | ~25% of all customer IDs
events_p3 | hash(customer_id) mod 4 = 3 | ~25% of all customer IDs
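
A PostgreSQL sketch of the four-bucket layout (hash partitioning requires PostgreSQL 11 or later):

CREATE TABLE events (
    id           bigint      NOT NULL,
    customer_id  bigint      NOT NULL,   -- partition key
    event_type   text        NOT NULL,
    payload      jsonb,
    created_at   timestamptz NOT NULL
) PARTITION BY HASH (customer_id);

CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);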

Pruning for point lookups vs. range queries:

Query Pattern | Pruning Result
WHERE customer_id = 42 | Prunes to exactly 1 child table — the planner computes hash(42) mod 4
WHERE created_at > '2024-01-01' | No pruning — hash buckets carry no ordering information about created_at
WHERE customer_id = 42 AND created_at > '2024-01-01' | Prunes to exactly 1 child table — customer_id is the partition key

Hash partitioning optimizes for point-lookup performance and write balance. It is the wrong strategy when your most frequent queries filter by an ordered range on a non-hash-key column.

Write distribution is hash partitioning's defining advantage. Inserts for any customer ID land on one of four child tables based on the hash, not on insert time. A high-volume event stream that would saturate a single range partition under date-based partitioning distributes load evenly across four partitions under hash partitioning.


⚙️ Composite Partitioning: Two-Level Strategies for Tables That Outgrow a Single Approach

Composite sub-partitioning applies two levels. The most common pattern: range by month at the first level, then hash by customer ID within each monthly partition. This delivers both date-range pruning for reporting queries and write balance within each month.

Composite partition structure:

Level | Strategy | Key | Purpose
Level 1 | Range | created_at — monthly | Prune by date — reporting and instant archival
Level 2 | Hash | customer_id — 4 buckets | Balance writes within each monthly range

The resulting partition count: 48 months × 4 hash buckets = 192 child tables for a 4-year dataset. Each holds roughly total_rows / 192 rows.
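
A PostgreSQL sketch of one month of this two-level layout — the remaining 47 months follow the same pattern:

CREATE TABLE orders (
    id           bigint      NOT NULL,
    customer_id  bigint      NOT NULL,
    total_cents  bigint      NOT NULL,
    created_at   timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- Level 1: one range partition per month, itself partitioned by hash.
CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
    PARTITION BY HASH (customer_id);

-- Level 2: four hash buckets inside January 2024.
CREATE TABLE orders_2024_01_p0 PARTITION OF orders_2024_01 FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_2024_01_p1 PARTITION OF orders_2024_01 FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_2024_01_p2 PARTITION OF orders_2024_01 FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_2024_01_p3 PARTITION OF orders_2024_01 FOR VALUES WITH (MODULUS 4, REMAINDER 3);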

Pruning for composite partitions:

Query Predicate | Partitions Scanned
WHERE created_at falls within January 2024 | 4 — all hash buckets for January
WHERE created_at falls within January 2024 AND customer_id = 42 | 1 — the exact hash bucket within January
WHERE customer_id = 42 (no date filter) | 48 — one hash bucket per month; without a date predicate there is no range-level pruning

The PostgreSQL planning ceiling: Composite partitioning increases total child table count. PostgreSQL adds per-partition overhead at planning time — negligible below 500 partitions, measurable at 1,000, significant at 5,000. For composite schemes producing thousands of sub-partitions, measure query planning latency at realistic partition counts before committing the design to production.


📊 How the Query Optimizer Decides Which Partitions to Skip

The diagram below traces the full pruning decision path from query arrival through the PostgreSQL planner to partition elimination. Follow it from top to bottom to see exactly where the pruning decision is made and what determines which child tables are eliminated.

flowchart TD
    Q["Query received: WHERE created_at BETWEEN Jan and Feb 2024"] --> OPT["Query optimizer enters partition pruning phase"]
    OPT --> CHECK{"Partition key present in WHERE clause?"}
    CHECK -->|"Yes — pruning active"| BOUNDS["Evaluate each child table boundary against predicate"]
    CHECK -->|"No — no pruning possible"| FULL["Scan all child tables (avoid this pattern)"]
    BOUNDS --> SKIP1["Skip 2021 partitions — bounds outside predicate range"]
    BOUNDS --> SKIP2["Skip 2022 partitions — bounds outside predicate range"]
    BOUNDS --> SKIP3["Skip 2023 partitions — bounds outside predicate range"]
    BOUNDS --> SCAN["Scan Jan 2024 child table — bounds match predicate"]
    SKIP1 --> MERGE["Merge results and return to client"]
    SKIP2 --> MERGE
    SKIP3 --> MERGE
    SCAN --> MERGE
    FULL --> MERGE

The critical fork is the diamond node: is the partition key in the WHERE clause? When it is, the planner eliminates dozens of child tables before opening a single data file. When it is not, every child table is opened — making the partitioned table perform worse than an equivalent unpartitioned table because of the overhead of N table opens, N index scans, and a merge step.

This diagram captures the single most important rule in partition design: the partition key must appear in your most frequent query predicates. Verify it with EXPLAIN (ANALYZE, VERBOSE): pruned child tables simply do not appear in the plan, and runtime pruning reports Subplans Removed: N under the Append node. A 48-partition table with a month-scoped query should show exactly one child table in the plan. If all 48 appear, your query pattern and partition key are misaligned.
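
As a concrete check, here is the shape of that verification against the monthly orders layout from earlier — exact plan output varies by PostgreSQL version, so treat the comments as a guide to what to look for:

-- Month-scoped query against the range-partitioned orders table.
EXPLAIN (ANALYZE, VERBOSE)
SELECT count(*) FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';

-- With pruning working, only orders_2024_01 appears in the plan; the other
-- 47 child tables are absent. If the date arrives as a bind parameter instead,
-- look for "Subplans Removed: 47" under the Append node (runtime pruning).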


🧠 Deep Dive: Static vs. Dynamic Partition Pruning in PostgreSQL

PostgreSQL's pruning mechanism has two modes. Understanding the distinction matters for parameterized queries — the most common form in production application code.

Partition Pruning Internals: How PostgreSQL Plans Around Boundaries

Static pruning happens at parse time when the partition key predicate contains a literal value. The planner knows at query compilation time exactly which child tables can possibly match. The plan is fixed — partition metadata is consulted once and eliminated partitions never appear in the execution plan.

Dynamic pruning happens at execution time when the predicate value comes from a runtime parameter: a bind variable in a prepared statement, or a value supplied via a nested loop join. PostgreSQL 11 and later support dynamic pruning for parameterized queries — the executor re-evaluates the partition set at runtime using the bound parameter value — with broader coverage and faster planning arriving in version 12. MySQL supports static pruning for RANGE and LIST partitions but has limited dynamic pruning support, which can force full-partition scans for parameterized queries even when a literal equivalent would prune correctly.
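
A quick way to observe the dynamic path is a prepared statement against the range-partitioned orders table from earlier. One caveat, stated as an assumption: for the first few executions PostgreSQL may choose a custom plan that folds the parameter in and prunes at plan time; setting plan_cache_mode = force_generic_plan (available from version 12) makes the runtime behaviour visible:

PREPARE recent_orders (timestamptz) AS
    SELECT count(*) FROM orders WHERE created_at >= $1;

-- With a generic plan on PostgreSQL 11+, the Append node in this plan reports
-- "Subplans Removed: N" — partitions eliminated at execution time using $1.
EXPLAIN (ANALYZE) EXECUTE recent_orders('2024-06-01');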

Static vs. dynamic pruning support matrix:

Database | Static Pruning | Dynamic Pruning | Notes
PostgreSQL 10 | ✅ RANGE, LIST | ❌ Not supported | Bind parameters force all-partition scans; hash partitioning not yet available
PostgreSQL 11+ | ✅ All partition types | ✅ Parameterized queries and nested loops | PostgreSQL 12+ recommended for production partitioning
MySQL 8.0 | ✅ RANGE, LIST | ⚠️ Limited | Static only for most query forms
CockroachDB | ✅ All partition types | ✅ | Full pruning support

Performance Analysis: Measuring Pruning Effectiveness

The most direct way to measure pruning impact is EXPLAIN (ANALYZE, BUFFERS). Count the child tables that appear in the plan — plan-time pruning removes excluded partitions from the plan entirely, and runtime pruning reports Subplans Removed: N under the Append node. With correct pruning, a query against a 48-partition table that touches only January data scans a single child table. Seeing all 48 children in the plan for a query with a literal date predicate means the partition key is not in the WHERE clause, or it is wrapped in a function.

Query Pattern | Partitions Scanned | Relative Cost
WHERE created_at >= '2024-01-01' (range literal) | 1 | Baseline — optimal
WHERE DATE(created_at) = '2024-01-15' (function wrap) | 48 | 48× the row reads
WHERE created_at = $1 (bind param, PG 11+) | 1 | Same as literal
WHERE created_at = $1 (bind param, PG 10) | 48 | No dynamic pruning

The function-wrapping trap:

Wrapping the partition key column in a function call defeats pruning. WHERE DATE(created_at) = '2024-01-15' forces the planner to materialize DATE(created_at) for every row before evaluating the predicate — it cannot apply partition bounds because the comparison is on the derived value, not the raw column. The correct form is a range predicate on the raw column: WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'. A routine refactor that wraps the partition key in DATE(), TRUNC(), or TO_CHAR() silently disables pruning for a query that was previously fast. Add EXPLAIN (ANALYZE) assertions for critical queries in CI pipelines to catch this before it reaches production.
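
A before-and-after sketch of the trap against the orders table, using the ::date cast as the PostgreSQL spelling of the DATE() wrapper above:

-- Defeats pruning: the partition key is wrapped in an expression, so the
-- planner cannot compare partition bounds against the predicate.
SELECT count(*) FROM orders WHERE created_at::date = '2024-01-15';

-- Prunes correctly: a half-open range on the raw partition key column.
SELECT count(*) FROM orders
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16';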


⚖️ Hot-Partition Anti-Patterns and When NOT to Partition

Five anti-patterns that destroy partition performance:

Anti-Pattern | What Happens | Fix
Partition key missing from critical queries | Zero pruning — every child table is scanned — performance is worse than unpartitioned | Align the partition key with the most frequent WHERE predicate; redesign if mismatched
Over-partitioning — more than 500 partitions | Planning overhead adds hundreds of milliseconds to every query even after pruning | Use coarser intervals — quarterly instead of daily — to keep the total partition count in the hundreds
Low-cardinality partition key | A boolean column produces 2 partitions — near-zero benefit with DDL overhead | Choose keys with enough distinct values to justify the intended partition count
Skewed range sizes — 80% of data in one partition | That partition handles 80% of all I/O — the partition receiving the most writes is always the bottleneck | Use monthly ranges; add hash sub-partitioning for the current high-write period
Missing DEFAULT partition on list-partitioned tables | An unrecognized region code causes an insert failure at midnight during a rollout | Always define a DEFAULT partition before deploying the parent table

When NOT to partition:

Condition | Reason to Skip Partitioning
Table has fewer than 50 million rows | Existing B-Tree indexes are sufficient — partitioning adds DDL overhead without benefit
No natural partition key in the access pattern | Without a key that appears in frequent WHERE clauses, there is zero pruning benefit
Primary bottleneck is write throughput beyond a single node | Partitioning does not help — horizontal sharding via Vitess or Citus is the correct solution
Frequent cross-partition joins on non-partition-key columns | Multi-partition merge adds overhead that may be worse than the equivalent on an unpartitioned table
Partition count would exceed 1,000 | PostgreSQL planning overhead at very high partition counts degrades OLTP query latency even after pruning

The system design interview answer: partition when tables are large and queries are slow on a single node. Shard when data volume or write throughput genuinely exceeds what a single node can serve.


🧭 Choosing the Right Partition Strategy: Decision Guide

Situation | Recommended Strategy | Reason
Time-series data with rolling archival — logs, payments, events | Range by date — monthly or quarterly | Date-range pruning plus instant archival via DETACH PARTITION
High-volume inserts where even distribution is required | Hash by primary access key — user ID, order ID | Prevents write hotspots; point lookups prune to exactly one partition
Geo-compliance or categorical access patterns | List by region or country code | Partition boundary equals compliance boundary — always add DEFAULT
Table so large that monthly range partitions are still too big | Composite: range by month plus hash by user ID within each month | Combines date pruning with write balance
Queries are almost entirely point lookups by ID | Hash partitioning | Optimal for point lookup plus write balance — accept range scans on other columns
Queries filter by multiple unrelated columns | Choose the column in the most frequent WHERE clause as partition key | Only one partition key per table — target the highest-value pruning predicate

🌍 Real-World Applications of SQL Partitioning

Financial ledger at a major payments processor. A payments company running on PostgreSQL held 11 billion transaction rows in a single transactions table. Month-end reporting queries scanned the entire table. After migrating to monthly range partitioning on settled_at, the same reports ran against two or three child partitions. Query times dropped from 42 seconds to 900 milliseconds. Old partitions were detached and moved to cheaper cold storage using ALTER TABLE ... DETACH PARTITION, with no application code change. The parent table name never changed — the migration was entirely invisible to the application layer.

E-commerce order history at global scale. A global retailer used list partitioning by region code (EMEA, APAC, AMER, LATAM) to comply with data residency regulations that require customer order data to remain within specific geographic boundaries. Each partition was hosted on a tablespace located in the correct data center. Queries scoped to a single region skipped all other partitions entirely, and compliance auditors could verify data locality at the partition level without touching application logic.

Audit logging at a SaaS platform. A multi-tenant SaaS platform wrote 200 million audit log rows per month. The team used composite partitioning: range by month at the top level, then hash by tenant ID within each month. This gave two benefits simultaneously: old months were detached on a rolling 13-month retention schedule with pg_partman, and hot write traffic was distributed evenly across 16 hash sub-partitions per month rather than concentrating on a single child table.

Key lesson from all three cases: the partition strategy that performed best was always the one whose key matched the dominant query predicate — not the one that seemed intuitively clean on a whiteboard.


🧪 Practical Examples: Applying Partition Strategy to Real Query Patterns

Scenario 1: Analytics on a slow time-series table. A page_views table with 4 billion rows is slow on all date-range queries. The dominant query pattern is WHERE viewed_at BETWEEN '2024-01-01' AND '2024-01-31'. The correct strategy is range partitioning by month on viewed_at. After partitioning into 48 monthly child tables, the January query reads only one child table — roughly 80 million rows instead of 4 billion — and EXPLAIN (ANALYZE) shows only that single child table in the plan.

Common mistake: partitioning by user_id instead, which makes the January query scan all 48 partitions because viewed_at is not the partition key.

Scenario 2: Preventing a write hotspot. An events table receives 50,000 inserts per second, all with current timestamps. Range partitioning by date concentrates every insert on the current month's partition — effectively making it a single write target. Switching to hash partitioning on event_id distributes inserts uniformly across 32 partitions, eliminating the hotspot. The trade-off: date-range queries now scan all 32 partitions instead of one.

Decision rule: choose hash when write throughput is the bottleneck; choose range when read selectivity is the bottleneck. You cannot optimise for both simultaneously with a single partition key.

Scenario 3: Testing partition pruning with EXPLAIN. Run EXPLAIN (ANALYZE) SELECT * FROM orders WHERE created_at >= '2024-06-01' AND created_at < '2024-07-01' and confirm that only one child table appears in the plan. If all 48 child tables are scanned, the predicate is not hitting the partition key — check for function wrapping or a missing column in the WHERE clause.


🛠️ pg_partman: Automated Partition Lifecycle Management for PostgreSQL

Manual partition pre-creation works for small setups, but production systems need automation. If the cron job that creates next month's partition fails on the 20th, inserts fail at midnight on the 1st — a production incident caused by missing infrastructure, not bugs in application code. pg_partman is the open-source PostgreSQL extension that prevents this.

pg_partman automatically pre-creates future partitions, enforces retention policies, and handles archival. Its background worker (pg_partman_bgw) runs maintenance on a configurable schedule inside the PostgreSQL process — no external cron dependency required.

Key pg_partman configuration parameters:

Parameter | Purpose | Recommended Value
p_premake | Number of future partitions to pre-create | At least 2 — covers deploy delays and off-hours maintenance windows
p_interval | Partition interval | 'daily', 'weekly', 'monthly', 'yearly'
retention | How long to retain partitions before action is taken | '24 months' for a rolling 2-year retention window
retention_keep_table | true detaches instead of dropping — archive pattern; false drops outright | false for OLTP rolling windows; true for data-warehouse archival
infinite_time_partitions | true creates partitions beyond the premake window as data arrives | true for safety — prevents insert failures on unexpected future data
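
A minimal registration sketch showing where these parameters live. The exact create_parent() arguments and interval spellings differ across pg_partman versions (4.x also requires a p_type argument that 5.x removed), so treat this as illustrative rather than copy-paste:

-- Register the partitioned parent with pg_partman (monthly range partitions).
SELECT partman.create_parent(
    p_parent_table := 'public.orders',
    p_control      := 'created_at',
    p_interval     := 'monthly',
    p_premake      := 2
);

-- Retention and archival behaviour live as columns on partman.part_config.
UPDATE partman.part_config
SET    retention                = '24 months',
       retention_keep_table     = false,
       infinite_time_partitions = true
WHERE  parent_table = 'public.orders';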

Operational workflow with pg_partman:

  1. Create the parent table with the partition key and interval defined.
  2. Register the parent with pg_partman, specifying interval and premake count.
  3. Configure retention — how long to keep partitions and whether to detach or drop.
  4. pg_partman's background worker runs maintenance automatically on the configured schedule.
  5. Monitor via pg_partman's built-in objects: partman.part_config lists every managed parent and its retention settings, and partman.check_default() reports rows that have landed in the default partition instead of a proper child.

Pre-creation safety margin: Set p_premake to at least 2. This creates two months of future partitions immediately. If maintenance is delayed — a holiday, a failed deploy — there is a buffer before inserts would hit a missing partition. For daily-partitioned high-traffic tables, set p_premake to 7 or higher.

For a full deep-dive on pg_partman including native triggers, sub-partitioning hierarchies, and pg_partman_bgw tuning, see the pg_partman GitHub repository.


📚 Lessons Learned: What Production Teaches You About SQL Partitioning

The partition key is a permanent architectural decision. Re-partitioning a 500 GB live table requires building a new partitioned table schema, migrating data with minimal downtime via logical replication or bulk INSERT/ATTACH, and cutting over at a maintenance window. Plan the partition key with the same weight you give to API contracts. Changing it later is a multi-week engineering project under production load.

Monitor partition skew from day one. Set up a weekly query that reports row counts and physical size per child table. A partition growing 10× larger than its siblings signals skewed distribution — a write hotspot forming or an unbalanced hash function. Catching it at 20 GB is a configuration change. Catching it at 200 GB is a live migration under load with risk of data loss.
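
One way to implement that weekly report — public.orders here is a placeholder parent table; substitute your own:

-- Per-partition physical size and approximate row count for one partitioned parent.
SELECT c.relname                                     AS partition_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size,
       c.reltuples::bigint                           AS approx_rows
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'public.orders'::regclass
ORDER BY pg_total_relation_size(c.oid) DESC;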

Always verify pruning with EXPLAIN before shipping to production. Add EXPLAIN (ANALYZE) assertions for critical queries in your CI pipeline. Assert that only the expected child tables appear in the plan — or, for parameterized queries, that the Append node reports a non-zero Subplans Removed count. A routine refactor that wraps the partition key column in a function call silently disables pruning for a query that was previously fast.

Range partitioning write hotspots appear gradually and then cause sudden spikes. The first month after launching a range-partitioned table, the write hotspot is a single small partition. After a year, the current month's partition receives the same write rate — but the hot partition is one of 12. At two years, one of 24. The hotspot is structural and permanent. Monitor I/O per partition; if the current month consistently shows 5–10× more I/O than any other, add hash sub-partitioning to spread the current period's load.

Partition maintenance automation is not optional in production. Every production system running range-partitioned tables should use pg_partman or equivalent automation. The failure mode — no partition for tomorrow's date — is silent until midnight when inserts start failing. This is not a theoretical concern; it is a production incident that happens in real systems where partition creation was left to a manually-maintained cron job without alerting.


📌 TLDR: Five Rules for SQL Partitioning

  1. Range partitioning for time-series data. Date-range pruning plus instant archival via DETACH PARTITION. Accept the write hotspot on the current period and mitigate with composite partitioning if write volume is high.

  2. Hash partitioning for write balance. Uniform write distribution plus point-lookup pruning. Accept that range predicates on non-key columns scan all partitions.

  3. List partitioning for compliance boundaries. The partition boundary is the compliance boundary for geo-constrained data. Always add a DEFAULT partition.

  4. The partition key must appear in your most frequent WHERE clause. Verify with EXPLAIN ANALYZE before shipping. Zero excluded partitions means zero benefit — and worse performance than an unpartitioned table due to merge overhead.

  5. Automate partition lifecycle with pg_partman from day one. Manual partition creation is a production incident waiting to happen. Pre-create at least 2 future partitions; set retention policies before the first partition ages out.


@abstractalgorithms