System Design Core Concepts: Scalability, CAP, and Consistency
The building blocks of distributed systems. Learn about Vertical vs Horizontal scaling, the CAP Theorem, and ACID vs BASE.
Abstract Algorithms
TLDR: Scalability, the CAP Theorem, and consistency models are the three concepts that determine whether a distributed system can grow, stay reliable, and deliver correct results. Get these three right and you can reason about any system design question.
Understanding System Design Core Concepts
In 2011, Netflix engineers deployed Chaos Monkey, a tool that randomly terminated production servers to test system resilience. Their infrastructure stayed up. Netflix had deliberately designed for AP (available + partition-tolerant): they accepted that some users might see slightly stale recommendations in exchange for the system never going down. A competitor designed for CP (consistent + partition-tolerant) would have returned errors during the same failures to protect data correctness.
That's the CAP theorem in action, and it explains why your architecture choices have real, lasting engineering consequences.
System design revolves around three foundational questions:
- Can the system handle more load? (Scalability)
- What happens when a network partition occurs? (CAP Theorem)
- When multiple clients write, what does each reader see? (Consistency models)
These aren't independent choices. A bank can't be "highly available during network partitions and always consistent"; the CAP Theorem makes that impossible.
Scalability: Vertical vs. Horizontal
Vertical scaling (scale up) means adding more power to a single machine: more CPU cores, more RAM, faster storage.
Horizontal scaling (scale out) means adding more machines and distributing the load across them.
| Dimension | Vertical Scaling | Horizontal Scaling |
| --- | --- | --- |
| Ceiling | Hard hardware limit | Theoretically unlimited |
| Cost curve | Superlinear (big machines cost disproportionately more) | Linear (add commodity boxes) |
| Complexity | Low (no distribution) | High (coordination, consistency, partitioning) |
| Failure model | Single point of failure | Redundant (one node fails, others continue) |
| Best for | Databases under heavy write lock contention | Stateless services, read-heavy workloads |
In practice: most production systems use both. The database gets a large primary (vertical) plus read replicas (horizontal). The application tier is purely horizontal behind a load balancer.
```mermaid
graph LR
  Client --> LB[Load Balancer]
  LB --> A[App Server 1]
  LB --> B[App Server 2]
  LB --> C[App Server 3]
  A --> Primary[(Primary DB)]
  B --> Replica1[(Read Replica)]
  C --> Replica2[(Read Replica)]
  Primary --> Replica1
  Primary --> Replica2
```
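The load balancer in the diagram needs some policy for spreading requests across the horizontal app tier. As a minimal sketch (hypothetical class names, not tied to any real load balancer or framework), round-robin selection can be expressed in a few lines of Java:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: round-robin selection over a fixed server list.
// Real load balancers add health checks, weights, and connection draining.
public class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobinBalancer(List<String> servers) {
        this.servers = List.copyOf(servers); // immutable snapshot
    }

    // Each call returns the next server, wrapping around the list.
    public String pick() {
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb =
            new RoundRobinBalancer(List.of("app-1", "app-2", "app-3"));
        for (int i = 0; i < 6; i++) {
            System.out.println(lb.pick()); // app-1, app-2, app-3, app-1, ...
        }
    }
}
```

Because the app servers are stateless, any of them can serve any request, which is exactly what makes this tier "purely horizontal".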
How the CAP Theorem Works: Every Distributed System Makes a Choice
The CAP Theorem states that a distributed system can guarantee at most two of three properties simultaneously:
- Consistency (C): every read returns the most recent write (or an error)
- Availability (A): every request receives a response (possibly stale)
- Partition Tolerance (P): the system continues operating despite network partitions
Because network partitions will happen in any real distributed system, P is non-optional. The real choice is: when a partition occurs, do you sacrifice C or A?
| System type | CAP choice | Example systems | Why |
| --- | --- | --- | --- |
| CP | Consistency + Partition tolerance | Zookeeper, etcd, HBase | Correct data is more important than availability |
| AP | Availability + Partition tolerance | Cassandra, CouchDB, DynamoDB | Uptime is more important than perfect consistency |
Practical implication:
- Banking transaction? Choose CP: you can't have a split-brain account balance.
- Social media feed? Choose AP: slightly stale posts are acceptable; being down is not.
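The two behaviors can be made concrete with a toy model (hypothetical classes, not any real client library): a replica that is cut off from its primary either refuses CP reads to protect correctness, or serves possibly stale AP reads to stay available.

```java
import java.util.Optional;

// Toy model for intuition only: a replica that may be cut off from the
// primary. A CP read returns empty (an error) when freshness cannot be
// confirmed; an AP read always answers, possibly with a stale value.
public class PartitionDemo {
    static class Replica {
        int lastKnownBalance;
        boolean partitioned; // true = cannot reach the primary

        Replica(int balance, boolean partitioned) {
            this.lastKnownBalance = balance;
            this.partitioned = partitioned;
        }

        // CP behavior: refuse rather than risk returning stale data.
        Optional<Integer> cpRead() {
            return partitioned ? Optional.empty()
                               : Optional.of(lastKnownBalance);
        }

        // AP behavior: always respond, even if the value may be stale.
        int apRead() {
            return lastKnownBalance;
        }
    }

    public static void main(String[] args) {
        Replica r = new Replica(400, true); // partitioned, holds a stale balance
        System.out.println("CP read: " + r.cpRead()); // Optional.empty
        System.out.println("AP read: " + r.apRead()); // 400 (possibly stale)
    }
}
```

The banking service wants the `cpRead` behavior; the social feed wants `apRead`.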
Deep Dive: Consistency Models Unpacked
The CAP theorem frames a binary choice, but in practice the important dimension is how you trade off consistency against availability during normal operation, not just during failures.
| Model | Guarantee | Example system |
| --- | --- | --- |
| Strong consistency | Every read sees the latest write | Single-node RDBMS, ZooKeeper |
| Eventual consistency | Reads converge to the latest write, eventually | DynamoDB, Cassandra |
| Read-your-writes | You always see your own latest write | Session-scoped DB routing |
| Causal consistency | Causally related ops are ordered; others may not be | MongoDB sessions, CockroachDB |
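The "session-scoped DB routing" behind read-your-writes can be sketched in a few lines. This is a hypothetical illustration, with an assumed replication-lag bound as the pinning window: after a user writes, route that user's reads to the primary until replicas have had time to catch up, and let everyone else read from replicas.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of read-your-writes via session pinning. The 5 s
// window is an assumed upper bound on replication lag, not a real default.
public class ReadYourWritesRouter {
    private static final long PIN_WINDOW_MS = 5_000;
    private final Map<String, Long> lastWriteAt = new ConcurrentHashMap<>();

    public void recordWrite(String userId, long nowMs) {
        lastWriteAt.put(userId, nowMs);
    }

    // Returns which tier should serve this user's read.
    public String routeRead(String userId, long nowMs) {
        Long t = lastWriteAt.get(userId);
        boolean recentlyWrote = t != null && nowMs - t < PIN_WINDOW_MS;
        return recentlyWrote ? "primary" : "replica";
    }
}
```

Note how narrow the guarantee is: the writing user sees their own update immediately, while other users may still read stale replicas, which is exactly why this model is cheaper than strong consistency.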
Eventual Consistency Propagation
```mermaid
sequenceDiagram
  participant C1 as Client 1 (Writer)
  participant P as Primary Node
  participant R1 as Replica 1
  participant R2 as Replica 2
  participant C2 as Client 2 (Reader)
  C1->>P: Write: balance = $500
  P-->>C1: ACK (write confirmed)
  Note over R1,R2: Async replication in progress
  C2->>R2: Read: balance?
  R2-->>C2: balance = $400 (stale read)
  P->>R1: Replicate: balance = $500
  P->>R2: Replicate: balance = $500
  Note over R1,R2: Replicas converged
  C2->>R2: Read: balance?
  R2-->>C2: balance = $500 (consistent)
```
This diagram demonstrates eventual consistency in action. Client 1 writes a new balance to the primary node, which immediately acknowledges the write; meanwhile Client 2 reads from Replica 2 and receives a stale value ($400) because asynchronous replication has not yet propagated the update. After both replicas receive the replication message from the primary, all subsequent reads return the correct value ($500). The key takeaway: in an AP system there is a window after a write, potentially seconds or longer, during which different readers may see different values, and application logic must be designed to tolerate this.
Consistency Levels: State Transitions
```mermaid
stateDiagram-v2
  [*] --> Strong : CP system (ZooKeeper)
  [*] --> Eventual : AP system (Cassandra)
  [*] --> Weak : Cache only (Redis TTL)
  Strong --> Eventual : Relax for throughput
  Eventual --> Strong : Require linearizability
  Eventual --> ReadYourWrites : Scope to session
  ReadYourWrites --> Causal : Add causal ordering
  Weak --> Eventual : Add replication sync
```
This state diagram maps the consistency model spectrum and the architectural decisions that transition between levels. A CP system starts in Strong consistency while an AP system starts in Eventual; the labeled arrows show how relaxing or tightening guarantees moves you between states. For example, scoping reads to a session turns Eventual into Read-Your-Writes, and adding causal ordering upgrades that to Causal. The key takeaway is that consistency is not a binary on/off choice: you can tune the level per operation or per session rather than applying one model uniformly across your entire system.
System Design Decision Flow: Choosing the Right Architecture
When designing a distributed system, the sequence of decisions follows a consistent pattern. Use this flow to reason through any system design problem systematically.
```mermaid
graph TD
  A[Define consistency requirement] --> B{Strong consistency required?}
  B -->|Yes| C[Choose CP system: etcd, Zookeeper, PostgreSQL]
  B -->|No| D{High availability critical?}
  D -->|Yes| E[Choose AP system: Cassandra, DynamoDB]
  D -->|No| F[Choose simple single-node or CA architecture]
  C --> G[Add replication + failover]
  E --> H[Add conflict resolution + eventual sync]
  F --> I[Scale vertically first, then revisit]
```
This decision framework applies to every data store, cache, and message queue in your architecture. The same three-question pattern (consistency need, availability requirement, partition tolerance) governs each component choice independently.
ACID vs. BASE: Two Ways to Think About Data Correctness
ACID (traditional relational databases):
- Atomic: all operations in a transaction succeed or all fail
- Consistent: the database moves from one valid state to another valid state
- Isolated: concurrent transactions don't interfere with each other
- Durable: committed data survives crashes
BASE (most NoSQL / distributed systems):
- Basically Available: availability is prioritized; partial failures are acceptable
- Soft state: data can be stale; state may change without input (replication catching up)
- Eventual consistency: given no new writes, all replicas eventually converge to the same value
| Property | ACID | BASE |
| --- | --- | --- |
| Consistency model | Strong / serializable | Eventual / weak |
| Availability during partition | May refuse | Always responds |
| Use case | Financial transactions, inventory | User preferences, cache, social data |
| Typical tech | PostgreSQL, MySQL | Cassandra, DynamoDB, Redis |
Real-World Application: E-Commerce Checkout
An e-commerce checkout flow actually uses multiple consistency models simultaneously:
| Component | Consistency need | Choice | Why |
| --- | --- | --- | --- |
| Inventory decrement | Strong (no overselling) | ACID + CP | Two checkouts can't both buy the last item |
| Product catalog | Eventual (stale by seconds is fine) | BASE + AP | Faster reads; slightly stale prices are acceptable |
| Session cart | Eventual (user-scoped) | AP | Cart doesn't need global consistency |
| Payment processing | Strong (exact, auditable) | ACID + CP | Regulatory and financial correctness |
This is the key insight: you don't choose one consistency model for your whole system. You partition your data by its consistency requirements.
Trade-offs and Failure Modes: Partition Scenarios
A network partition means some nodes can't communicate with others. What happens next depends on your CAP choice:
CP system during partition: Nodes that can't confirm they hold the latest data refuse to serve requests. Clients see errors or timeouts. Data is always correct when available.
AP system during partition: All nodes continue serving. Clients on different sides of the partition may see different values. When the partition heals, the system runs a reconciliation protocol (last-write-wins, vector clocks, CRDTs) to converge.
Common failure modes:
- Split-brain: two nodes both think they're the primary and accept conflicting writes (CP systems prevent this; AP systems must handle reconciliation)
- Stale reads: a read reaches a replica that hasn't received the latest replication update (expected in AP systems; needs careful application-level handling)
- Replication lag: the primary accepts a write but it hasn't propagated to all replicas before the primary crashes (mitigated by synchronous replication or quorum writes)
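Of the reconciliation protocols mentioned above, last-write-wins is the simplest to sketch. The following is a hypothetical illustration (not any particular database's implementation): each divergent copy carries a timestamp, and the merge after the partition heals keeps the newest one.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of last-write-wins (LWW) reconciliation: when a
// partition heals, each divergent copy carries a timestamp and the merge
// keeps the newest. Note that LWW silently discards the losing write,
// which is why vector clocks or CRDTs are used when losing data matters.
public class LwwMerge {
    record Versioned(String value, long timestampMs) {}

    // Pick the copy with the highest timestamp.
    static Versioned merge(List<Versioned> copies) {
        return copies.stream()
                     .max(Comparator.comparingLong(Versioned::timestampMs))
                     .orElseThrow();
    }

    public static void main(String[] args) {
        Versioned sideA = new Versioned("balance=400", 1_000); // older write
        Versioned sideB = new Versioned("balance=500", 2_000); // newer write
        System.out.println(merge(List.of(sideA, sideB)).value()); // balance=500
    }
}
```

The silent-drop behavior is the trade-off: LWW guarantees convergence, not that every accepted write survives.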
Decision Guide: Choosing the Right Consistency Level for Your Use Case
| Situation | CAP preference | Consistency model | Reasoning |
| --- | --- | --- | --- |
| Financial transaction | CP | Strong (serializable) | Correctness is non-negotiable |
| Real-time leaderboard | AP | Eventual | Slightly stale rankings are fine |
| Collaborative document editing | CP with CRDT | Causal | Ordering matters; CRDTs enable merge-friendly writes |
| User profile reads | AP | Read-your-writes | User should see their own updates; global consistency unnecessary |
| Inventory system | CP | Linearizable | Prevent overselling with quorum writes |
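The "quorum writes" in the last row follow a simple arithmetic rule: with N replicas, a write acknowledged by W nodes and a read that consults R nodes are guaranteed to overlap on at least one node (and therefore see the latest write) whenever R + W > N. A minimal check:

```java
// Quorum overlap rule: with N replicas, a write acknowledged by W nodes
// and a read consulting R nodes must share at least one node when
// R + W > N, so the read is guaranteed to observe the latest write.
public class QuorumCheck {
    static boolean readSeesLatestWrite(int n, int r, int w) {
        return r + w > n;
    }

    public static void main(String[] args) {
        // N=3: R=2, W=2 overlaps (strong reads); R=1, W=1 does not (eventual).
        System.out.println(readSeesLatestWrite(3, 2, 2)); // true
        System.out.println(readSeesLatestWrite(3, 1, 1)); // false
    }
}
```

Tuning R and W per operation is how systems like Cassandra and DynamoDB offer both eventual and strongly consistent reads on the same data.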
Little's Law: Back-of-Envelope Capacity Math
This section applies Little's Law to back-of-envelope capacity estimation, one of the most practical calculations in both system design interviews and real production capacity planning. It provides a quantitative bridge between the scalability concepts in this post and a concrete number: given your target request rate and latency budget, how much concurrency must your system support? As you work through the example, notice how the three variables (concurrency $L$, arrival rate $\lambda$, latency $W$) constrain each other; improving any one of them directly affects the resources you need to provision.
$$L = \lambda \cdot W$$
Where:
- $L$ = average number of requests in the system (concurrency)
- $\lambda$ = arrival rate (requests per second)
- $W$ = average time spent in the system (latency in seconds)
Example: if your service handles 500 req/s and average latency is 200 ms:
$$L = 500 \times 0.2 = 100 \text{ concurrent requests}$$
This tells you how many concurrent connections your server needs to handle. Multiply by average memory per request to estimate RAM needs.
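The calculation above is small enough to express directly in code. The 1 MB-per-request figure below is an illustrative assumption, not a measured value:

```java
// Little's Law as code: L = lambda * W.
public class LittlesLaw {
    static double concurrency(double arrivalRatePerSec, double latencySec) {
        return arrivalRatePerSec * latencySec;
    }

    public static void main(String[] args) {
        double l = concurrency(500, 0.2); // 500 req/s at 200 ms latency
        System.out.println(l + " concurrent requests"); // 100.0

        double mbPerRequest = 1.0; // assumed working memory per in-flight request
        System.out.println(l * mbPerRequest + " MB estimated RAM");
    }
}
```

The same identity works in reverse: if you can only afford 100 concurrent connections and traffic is 500 req/s, your latency budget is 100 / 500 = 0.2 s.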
Spring Boot + Spring Data Redis: CAP-Aware Caching in Java
Spring Boot is the dominant Java framework for building production microservices; its @Cacheable, spring-data-redis, and spring-data-jpa integrations let you implement the AP/CP architectural patterns from this post without writing low-level connection management code.
The e-commerce example in this post (strong consistency for payments, eventual consistency for the product catalog) maps directly to using Spring Data JPA (for PostgreSQL ACID transactions) alongside Spring Data Redis (for the AP cache layer):

```java
// build.gradle dependencies:
// implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
// implementation 'org.springframework.boot:spring-boot-starter-data-redis'
// implementation 'org.springframework.boot:spring-boot-starter-cache'

import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// --- CP path: ACID inventory decrement (PostgreSQL via JPA) ---
@Service
public class InventoryService {

    private final InventoryRepository repo;

    public InventoryService(InventoryRepository repo) {
        this.repo = repo; // constructor injection
    }

    @Transactional // ACID: rolls back if any step fails
    public void decrementStock(Long productId, int qty) {
        Inventory inv = repo.findByIdForUpdate(productId) // SELECT ... FOR UPDATE (row lock)
            .orElseThrow(() -> new IllegalArgumentException("Product not found"));
        if (inv.getStock() < qty) {
            throw new IllegalStateException("Insufficient stock");
        }
        inv.setStock(inv.getStock() - qty);
        repo.save(inv);
    }
}

// --- AP path: eventually consistent catalog cache (Redis) ---
@Service
public class ProductCatalogService {

    private final ProductRepository productRepo;

    public ProductCatalogService(ProductRepository productRepo) {
        this.productRepo = productRepo;
    }

    @Cacheable(value = "catalog", key = "#productId",
               unless = "#result == null") // AP: stale by up to the TTL
    public ProductDto getProduct(Long productId) {
        return productRepo.findById(productId)
            .map(ProductDto::from)
            .orElse(null);
    }

    @CacheEvict(value = "catalog", key = "#productId") // invalidate on update
    public void updateProduct(Long productId, ProductDto dto) {
        // update persisted to PostgreSQL, cache entry evicted
    }
}
```
Configure Redis TTL in application.yml:
```yaml
spring:
  cache:
    redis:
      time-to-live: 30s   # AP: catalog allowed to be stale up to 30 seconds
  data:
    redis:
      host: localhost
      port: 6379
```
@Transactional on the inventory path enforces ACID semantics; @Cacheable with a TTL on the catalog path provides the BASE/AP behaviour, exactly the per-component consistency partitioning described in the e-commerce example above.
For a full deep-dive on Spring Data Redis and CAP-aware Spring architectures, a dedicated follow-up post is planned.
Key Lessons for System Designers
These lessons come from systems that succeeded in production, and from the many that didn't.
Consistency is not free. Every guarantee of consistency requires coordination among nodes, which adds latency. Before choosing strong consistency, verify that the use case actually requires it. Many systems choose strong consistency out of habit and pay an unnecessary availability tax.
Design for failure, not for success. The CAP Theorem forces you to explicitly decide what your system does when things go wrong, not just when everything works. Define your system's behavior under partition before you write the first line of code.
Start with known tradeoffs. When building a new service, pick an existing well-understood data store (PostgreSQL for ACID, Cassandra for AP, Redis for low-latency cache) and document why you made that choice. Unknown tradeoffs from custom solutions accumulate as technical debt.
Monitor tail latency, not averages. Average latency hides the worst-case experience. p99 latency, the experience of the slowest 1% of requests, is what users actually notice. Instrument your system for percentile-based metrics from day one.
TLDR: Summary & Key Takeaways
- Vertical scaling has a hard ceiling; horizontal scaling requires coordination but is theoretically unlimited.
- CAP Theorem forces a binary choice during partitions: consistency or availability. Partition tolerance is mandatory.
- ACID databases guarantee correctness; BASE systems trade consistency for availability and throughput.
- Real systems use both: CP for critical paths (payments, inventory), AP for non-critical paths (catalog, preferences).
- Tail latency (p95/p99), not average latency, is the metric that actually determines whether your system feels fast to users.
Practice Quiz
A banking transaction service experiences a network partition. Which CAP choice should it make, and why?
- A) AP: banking needs to stay available even if data might be stale
- B) CP: money amounts must be correct; it's better to reject requests than return wrong balances
- C) CA: banks don't experience partitions so this doesn't apply
Correct Answer: B. Financial systems must never return or commit incorrect balances. Under a partition, the correct behavior is to reject the request rather than risk split-brain writes to account balances.
Which consistency model does DynamoDB default to, and what is the tradeoff?
- A) Strong consistency: all reads reflect the latest write, at higher latency and cost
- B) Eventual consistency: reads may be slightly stale, but throughput is higher and cost is lower
- C) Serializable isolation: all operations appear to execute sequentially
Correct Answer: B. DynamoDB defaults to eventually consistent reads; strongly consistent reads are available at higher cost. Choosing between them is a deliberate tradeoff of staleness tolerance vs. performance and cost.
Your product catalog service needs to serve 10M reads/sec. Staleness of up to 2 seconds is acceptable. What is the best choice?
- A) Single-node PostgreSQL with synchronous replication
- B) Multi-master NoSQL cluster with eventual consistency and geographically distributed replicas
- C) Single Redis instance with write-through cache
Correct Answer: B. At 10M reads/sec, a single-node solution cannot handle the load. A geographically distributed AP cluster scales horizontally, and the 2-second staleness tolerance aligns with eventual consistency semantics.
Related Posts
- System Design: Databases, SQL vs. NoSQL, and Scaling
- System Design: Networking, DNS, CDNs, and Load Balancers
- Consistent Hashing Explained

Written by
Abstract Algorithms
@abstractalgorithms