
System Design Networking: DNS, CDNs, and Load Balancers

The internet's traffic control system. We explain how DNS resolves names, CDNs cache content, and Load Balancers distribute traffic.

Abstract Algorithms · 16 min read

TLDR: When you hit a URL, DNS translates the name to an IP, CDNs serve static assets from the edge nearest to you, and Load Balancers spread traffic across many servers so no single machine becomes a bottleneck. These three layers are the traffic control system of the modern internet.


๐ŸŒ Three Layers That Stand Between a User and Your Server

In October 2021, Facebook went dark for nearly six hours. The cause was a misconfigured BGP update that withdrew the routes to Facebook's DNS servers, meaning no request could resolve facebook.com to an IP address at all, let alone reach an application server. The outage was not a code bug or a database crash; it was a failure in the infrastructure layers that execute before your application code ever runs.

Understanding DNS, CDNs, and load balancers, and critically, how they fail, is what separates engineers who can debug production outages from those who just restart services and hope.

Here is what a DNS failure looks like from a client's perspective:

$ dig facebook.com
;; connection timed out; no servers could be reached

That single timeout means every user on every device sees a blank page, not because the application servers were down, but because the phone book pointing to those servers was unreachable. No DNS resolution, no request ever gets off the ground.

Every web request travels through at least three invisible layers before it reaches your application code:

  1. DNS (Domain Name System): translates a human-readable hostname like github.com into a machine-readable IP address. Think of it as the internet's phone book: without it, nothing is addressable by name.
  2. CDN (Content Delivery Network): serves cached assets (images, JS, CSS) from a server geographically close to the user, cutting the round-trip distance to your origin.
  3. Load Balancer: distributes live requests across a pool of backend servers, removing any single point of failure.

Remove any one of these, and your system either breaks under load, slows to a crawl for distant users, or collapses on a single node.

| Layer | Role | What breaks without it |
|---|---|---|
| DNS | Name → IP resolution | Nothing is reachable by hostname |
| CDN | Edge caching of static content | All requests hit origin; high latency for distant users |
| Load Balancer | Traffic distribution | Single-server bottleneck; no fault tolerance |

๐Ÿ” The Building Blocks: What Each Component Does

Every web request to your application routes through three core network layers before reaching your code.

DNS (Domain Name System) is the phone book of the internet: it translates human-readable hostnames like api.example.com into machine-readable IP addresses. Without DNS, users would need to memorize IP addresses directly to reach your service.

CDN (Content Delivery Network) is a distributed cache of static assets. By placing copies of your images, CSS, and JavaScript at dozens of edge locations worldwide, a CDN serves those files from the server physically closest to each user, dramatically reducing round-trip latency.

Load Balancer is the traffic manager. It accepts all incoming connections and distributes them across a pool of application servers. If one server fails a health check, the load balancer stops sending it traffic automatically, providing fault tolerance without operator intervention.

Together, DNS resolves where to send traffic, the CDN handles static content at the edge, and the load balancer distributes dynamic requests across healthy backend instances. These three layers form the foundation of any horizontally scalable web architecture.


📖 DNS: The Internet's Phone Book

DNS maps example.com → 192.0.2.1. The resolution chain has four hops:

  1. Recursive resolver (usually your ISP or 8.8.8.8) receives the query.
  2. Root server directs to the TLD server.
  3. TLD server (.com) directs to the authoritative name server.
  4. Authoritative NS returns the A record (IPv4) or AAAA record (IPv6).

TTL (Time To Live) controls how long resolvers cache the result. Lower TTL = faster failover; higher TTL = lower resolver load.
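The caching behaviour above can be sketched in a few lines of Java. This is an illustrative toy, not a real resolver API; all class and method names are invented:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a resolver-side DNS cache honouring TTL.
public class DnsCacheSketch {
    // a cached answer: the IP plus its absolute expiry time
    static final class Entry {
        final String ip;
        final long expiresAtMillis;
        Entry(String ip, long expiresAtMillis) {
            this.ip = ip;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();

    // store an A record with the TTL (seconds) returned by the authoritative NS
    public void put(String hostname, String ip, long ttlSeconds, long nowMillis) {
        cache.put(hostname, new Entry(ip, nowMillis + ttlSeconds * 1000));
    }

    // returns the cached IP, or null on a miss / expired entry, which in a
    // real resolver would trigger the root -> TLD -> authoritative chain
    public String lookup(String hostname, long nowMillis) {
        Entry e = cache.get(hostname);
        if (e == null || nowMillis >= e.expiresAtMillis) {
            cache.remove(hostname);
            return null;
        }
        return e.ip;
    }
}
```

The key behaviour to notice: within the TTL window every lookup is answered locally, and only after expiry does the expensive multi-hop resolution repeat.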

| DNS record type | Purpose |
|---|---|
| A | Hostname → IPv4 |
| AAAA | Hostname → IPv6 |
| CNAME | Alias one hostname to another |
| MX | Mail server routing |
| NS | Delegated authoritative name server |
| TXT | Arbitrary text (SPF, DKIM, verification) |

DNSSEC adds cryptographic signatures to prevent cache poisoning; always enable it on public zones.

📊 DNS Resolution: Full Lookup Chain

sequenceDiagram
    participant B as Browser
    participant OS as OS Cache
    participant RR as Recursive Resolver
    participant Root as Root NS (.)
    participant TLD as TLD NS (.com)
    participant Auth as Authoritative NS

    B->>OS: Resolve api.example.com
    OS-->>B: Cache miss
    B->>RR: Query api.example.com
    RR->>Root: Where is .com?
    Root-->>RR: TLD NS address for .com
    RR->>TLD: Where is example.com?
    TLD-->>RR: Auth NS for example.com
    RR->>Auth: A record for api.example.com?
    Auth-->>RR: 203.0.113.10 (TTL 300s)
    RR-->>B: 203.0.113.10 (cached per TTL)

This diagram traces a full DNS resolution from the browser through five hops to the authoritative nameserver. The browser first checks the OS cache; on a miss, the recursive resolver fans out through root, TLD, and authoritative nameservers before returning the final IP address with its TTL. Notice that once the result is cached at the recursive resolver, all subsequent lookups skip these hops entirely; this is why a high TTL reduces resolver load but delays failover propagation when you change an IP address.


📦 CDNs: Bringing Content Closer to Users

A CDN is a globally distributed cache. When a user requests /static/logo.png, the CDN serves it from an edge PoP (Point of Presence) in their city rather than your origin server in a distant datacenter.

CDN hit ratio is the fraction of requests served from cache:

$$\text{Avg Latency} = r \cdot L_\text{edge} + (1 - r) \cdot L_\text{origin}$$

where $r$ is the hit ratio. Raising $r$ from 0.80 to 0.95 cuts origin-bound requests from 20% to 5%, a fourfold reduction, and average latency drops almost proportionally when $L_\text{origin} \gg L_\text{edge}$.
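To make the formula concrete, here is a quick Java sketch assuming 5 ms to the edge and 120 ms to origin. The latency figures are illustrative assumptions, not benchmarks:

```java
// Effective latency as a function of CDN hit ratio r:
// avgLatency = r * L_edge + (1 - r) * L_origin
public class CdnLatencySketch {
    public static double avgLatencyMs(double hitRatio, double edgeMs, double originMs) {
        return hitRatio * edgeMs + (1 - hitRatio) * originMs;
    }

    public static void main(String[] args) {
        // assumed round trips: 5 ms to the edge PoP, 120 ms to origin
        System.out.printf("r=0.80 -> %.1f ms%n", avgLatencyMs(0.80, 5, 120)); // 28.0 ms
        System.out.printf("r=0.95 -> %.1f ms%n", avgLatencyMs(0.95, 5, 120)); // 10.8 ms
    }
}
```

Under these assumptions, the 15-point improvement in hit ratio nearly triples the effective speed seen by users.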

Cache invalidation strategies:

  • Versioned filenames (app.v3.js): cached indefinitely; invalidate by renaming.
  • Cache-Control headers: max-age=86400 for a 24-hour TTL.
  • CDN purge API: force-invalidate specific paths (use sparingly; frequent purges defeat caching).
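The versioned-filename strategy is typically automated at build time by deriving the version from a content hash, so the name changes exactly when the bytes change. A minimal Java sketch; the naming scheme and class are hypothetical:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Build-time sketch: derive a cache-busting filename from the asset's content.
// Any change to the bytes yields a new name, so edges and browsers can cache
// the old name forever (max-age=31536000) without ever serving stale content.
public class AssetVersioner {
    public static String versionedName(String baseName, byte[] content) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest(content);
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) { // 8 hex chars is plenty for cache busting
                hex.append(String.format("%02x", digest[i]));
            }
            int dot = baseName.lastIndexOf('.');
            return baseName.substring(0, dot) + "." + hex + baseName.substring(dot);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A bundler would emit, say, app.3f2a9c1b.js and rewrite the HTML reference, making stale-bundle bugs structurally impossible.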

Edge compute (Cloudflare Workers, Lambda@Edge) allows running lightweight logic at the edge (A/B tests, request rewriting, or authentication) without round-tripping to origin.


โš™๏ธ Load Balancers: Distributing Traffic Intelligently

A load balancer sits in front of your server pool and routes incoming requests:

flowchart TD
    A[User Browser] -->|DNS lookup returns LB IP| B[Load Balancer L7]
    B -->|Route /api| C[App Server 1]
    B -->|Route /api| D[App Server 2]
    B -->|Route /static| E[CDN Edge Node]
    C --> F[Origin DB]
    D --> F

Layer 4 vs Layer 7:

| | Layer 4 (TCP/UDP) | Layer 7 (HTTP) |
|---|---|---|
| Inspects | IP + port | Full HTTP headers, URL, cookies |
| Routing | By connection | By path, host, method |
| Use case | Raw throughput, TCP forwarding | Application-aware routing, TLS termination |

Routing algorithms:

| Algorithm | How it works | Best for |
|---|---|---|
| Round-Robin | Rotate through servers sequentially | Homogeneous servers |
| Least Connections | Route to server with fewest active connections | Variable request duration |
| IP Hash | Hash client IP → server | Session affinity (sticky sessions) |
| Weighted | Assign traffic % per server | Heterogeneous server capacities |
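The first three algorithms in the table can be sketched in a few lines of Java. This is a toy illustration of the selection logic only, not a working balancer:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Sketches of three routing strategies: round-robin, least connections, IP hash.
public class RoutingSketch {
    // Round-robin: rotate through the pool regardless of load
    private final AtomicInteger rrCounter = new AtomicInteger();
    public String roundRobin(List<String> servers) {
        return servers.get(Math.floorMod(rrCounter.getAndIncrement(), servers.size()));
    }

    // Least connections: pick the backend with the fewest active connections
    public String leastConnections(Map<String, Integer> activeConns) {
        return activeConns.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .orElseThrow()
                .getKey();
    }

    // IP hash: the same client IP always maps to the same backend (sticky sessions)
    public String ipHash(String clientIp, List<String> servers) {
        return servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
    }
}
```

Note the trade-off visible even in this sketch: round-robin needs no backend state, least-connections needs live connection counts, and IP-hash sacrifices balance for affinity.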

Health checks probe each backend (HTTP /health, TCP ping) on a configurable interval. Unhealthy servers are removed from rotation without human intervention.

Load imbalance metric:

$$\text{Imbalance} = \frac{\max(\text{server QPS})}{\text{avg}(\text{server QPS})}$$

A ratio near 1.0 means excellent distribution; above 2.0 is a warning sign.
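A quick worked example of the imbalance metric, with invented QPS figures:

```java
import java.util.Arrays;

// Imbalance = max(server QPS) / avg(server QPS); near 1.0 is well balanced.
public class ImbalanceSketch {
    public static double imbalance(double[] qpsPerServer) {
        double max = Arrays.stream(qpsPerServer).max().orElseThrow();
        double avg = Arrays.stream(qpsPerServer).average().orElseThrow();
        return max / avg;
    }

    public static void main(String[] args) {
        // balanced pool: every server near the mean of 100 QPS
        System.out.println(imbalance(new double[]{100, 105, 95, 100})); // 1.05
        // skewed pool: one hot server; a ratio above 2.0 is a warning sign
        System.out.println(imbalance(new double[]{240, 40, 60, 60}));   // 2.4
    }
}
```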

📊 CDN Cache Miss and Hit Flow

sequenceDiagram
    participant C as Client
    participant E as CDN Edge PoP
    participant O as Origin Server

    C->>E: GET /static/logo.png
    Note over E: Cache MISS (first request)
    E->>O: Forward to origin
    O-->>E: 200 OK + logo.png payload
    E-->>C: 200 OK (served + cached at edge)
    Note over E: Cached for max-age=86400s

    C->>E: GET /static/logo.png
    Note over E: Cache HIT
    E-->>C: 200 OK (from edge, ~5ms)
    Note over O: Origin not contacted on hit

This sequence diagram contrasts a CDN cache miss on the first request with a cache hit on the second request for the same static asset. On the first request, the edge PoP has no cached copy and must forward the request to origin, incurring full round-trip latency; on the second request the edge serves directly from its cache in roughly 5 ms without contacting origin at all. The key takeaway: maximising the cache hit ratio by setting appropriate Cache-Control headers and using versioned filenames is the most direct lever for reducing both user-facing latency and origin infrastructure load.


🧠 Deep Dive: Inside a DNS Resolution and Cache Lifecycle

When you type a URL, the OS checks its local DNS cache first. On a miss, the query travels to a recursive resolver, which fans out to root → TLD → authoritative nameserver. Each hop adds a few milliseconds, but the result is cached per TTL, so repeat visits skip all hops. This is why TTL tuning matters: a 24-hour TTL means stale IPs persist for up to 24 hours after a server change.

| Cache stage | Managed by | TTL impact |
|---|---|---|
| OS / browser | Local machine | Very short; typically 30-60 s |
| Recursive resolver | ISP or public DNS | Set by authoritative NS record |
| Authoritative NS | You | Lower = faster failover; higher = less resolver load |

📊 The Full Request Journey: DNS → CDN → Load Balancer → App

sequenceDiagram
    participant U as User Browser
    participant R as Recursive Resolver
    participant A as Authoritative NS
    participant LB as Load Balancer
    participant CDN as CDN Edge
    participant App as App Server

    U->>R: Resolve example.com
    R->>A: Query A record
    A-->>R: Returns LB IP
    R-->>U: LB IP (cached per TTL)
    U->>LB: GET /api/data
    LB->>App: Forward request
    App-->>LB: JSON response
    LB-->>U: JSON response
    U->>CDN: GET /static/logo.png
    CDN-->>U: Serve from edge cache

This diagram shows how a single user interaction splits into two parallel paths: the API request travels through DNS resolution to the load balancer and then to an app server, while the static asset request is intercepted and served directly by the nearest CDN edge node. Notice that the CDN request never touches the load balancer or app server; this is the core value proposition of a CDN, removing an entire class of requests from the origin. The combined effect of DNS caching, edge caching, and load-balanced app servers is what allows a modest backend cluster to serve millions of users globally.


โš–๏ธ Trade-offs & Failure Modes: Trade-offs and Failure Modes

| Layer | Common failure | Mitigation |
|---|---|---|
| DNS | Stale cached IP after failover | Set TTL ≤ 60s before planned change |
| DNS | Cache poisoning | Enable DNSSEC |
| CDN | Stale content after deploy | Versioned asset filenames + purge on deploy |
| CDN | Cache miss storm on cold start | Warm cache before traffic shift |
| Load Balancer | Health-check lag → routing to dead server | Aggressive health checks + circuit breaker |
| Load Balancer | Session breakage | Sticky sessions or stateless session design |

๐ŸŒ Real-World Applications: Real-World Deployments: DNS, CDN, and Load Balancer in Action

Major web platforms rely on all three layers working in concert.

E-commerce (Amazon, Shopify): A CDN caches product images and CSS globally, reducing page load times for users worldwide. A Layer-7 load balancer routes checkout API requests to dedicated payment-processing servers. GeoDNS routes users to the nearest regional datacenter, keeping latency below 50 ms for 99% of requests.

Streaming platforms (Netflix, YouTube): DNS Anycast routes users to the nearest Point of Presence. The CDN edge stores cached video segments. Adaptive bitrate algorithms request different quality segments based on available bandwidth, all served from edge nodes, not the origin.

SaaS platforms (Slack, Notion): Load balancers distribute WebSocket connections across stateful nodes with sticky sessions, ensuring each user remains connected to the same backend throughout their session. CDNs cache static app bundles so browser reloads are instant.

Startup MVP: A single load balancer in front of two application servers, backed by a CDN like Cloudflare's free tier, handles most early-stage traffic without dedicated infrastructure investment. Start simple and add DNS-based geo-routing as your user base grows internationally.


🧭 Decision Guide: When to Use What

| Situation | Recommendation | Why |
|---|---|---|
| Serving static assets | Deploy a CDN (Cloudflare, Fastly, CloudFront) | Edge caching cuts latency dramatically |
| Horizontal scaling of API servers | Layer 7 Load Balancer | Smart routing, health checks, TLS termination |
| Global user base | GeoDNS + Anycast + regional LB | Routes users to nearest edge, minimizing RTT |
| Session affinity needed | IP-Hash or Cookie-Based LB | Guarantees subsequent requests hit same backend |
| Rapid failover | TTL ≤ 60s + DNS health-monitored failover | Reduces stale records during outages |
| Dynamic content caching | CDN edge compute (ESI, Cloudflare Workers) | Caches fragments while personalizing at edge |

🧪 Practical Setup: CDN and Load Balancer in 4 Steps

This example walks through the four-step sequence for wiring DNS, a CDN, and a load balancer into an existing web application: the configuration that connects all three layers discussed in this post. The order mirrors how engineers roll these changes out in production: cache layer first, then traffic distribution, then DNS last, because pointing DNS at a load balancer before it is healthy is a common cause of self-inflicted outages during infrastructure changes. Each step builds on the previous one, and the health-check endpoint configured in Step 2 is the signal that makes Step 4's failover test meaningful.

Step 1 โ€” Configure the CDN: point your CDN provider (Cloudflare, CloudFront, Fastly) at your origin server. Set Cache-Control: max-age=31536000 on static assets and use versioned filenames like app.v4.js so browsers never serve stale bundles without intentional invalidation.

Step 2 โ€” Deploy a load balancer: provision a managed load balancer (AWS ALB, GCP Load Balancing, NGINX). Add at least two backend servers for redundancy. Enable HTTPS termination at the load balancer and configure a /health endpoint on each backend that returns 200 only when the server is genuinely ready.
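As one concrete shape for Step 2, an open-source NGINX configuration might look like the sketch below. All addresses, ports, and certificate paths are placeholders; note that plain NGINX does passive health checking via max_fails/fail_timeout, while active /health probes require NGINX Plus or a managed load balancer:

```nginx
# Hypothetical pool of two backends behind one TLS-terminating listener.
upstream app_pool {
    least_conn;                      # route to the backend with fewest connections
    server 10.0.1.10:8080 max_fails=3 fail_timeout=10s;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=10s;
}

server {
    listen 443 ssl;
    server_name api.example.com;
    ssl_certificate     /etc/nginx/tls/fullchain.pem;   # TLS terminated at the LB
    ssl_certificate_key /etc/nginx/tls/privkey.pem;

    location / {
        proxy_pass http://app_pool;
        # retry the next backend if one returns a gateway error or times out
        proxy_next_upstream error timeout http_502 http_503;
    }
}
```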

Step 3 โ€” Update DNS: point your domain's A record to the load balancer IP. Set TTL to 300 seconds initially; lower it to 60 seconds before planned failover events to reduce stale-record impact.

Step 4 โ€” Test failover: take one backend server out of rotation manually and verify the load balancer detects the failure within one health-check interval and stops routing traffic to it before restoring the server.


🎯 What to Learn Next


๐Ÿ› ๏ธ Spring Boot + Spring Cloud Gateway: Health Checks and Dynamic Routing in Java

Spring Boot is the standard Java framework for building services behind load balancers, and Spring Cloud Gateway is its companion API gateway that implements Layer-7 routing, health checks, and circuit-breaker patterns, making the load balancer concepts from this post directly configurable in Java code.

The /health endpoint that load balancers probe is provided automatically by Spring Boot Actuator; Spring Cloud Gateway handles path-based routing and can integrate with a discovery service for dynamic backend registration:

// dependencies: spring-boot-starter-actuator, spring-cloud-starter-gateway,
//               spring-cloud-starter-circuitbreaker-resilience4j

// application.yml: Spring Cloud Gateway routing config (NGINX equivalent in Java)
/*
spring:
  cloud:
    gateway:
      routes:
        - id: api-route
          uri: lb://backend-service       # lb:// = Spring Cloud LoadBalancer
          predicates:
            - Path=/api/**
          filters:
            - StripPrefix=1
            - name: CircuitBreaker
              args:
                name: backendCB
                fallbackUri: forward:/fallback
        - id: static-route
          uri: https://cdn.example.com
          predicates:
            - Path=/static/**
*/

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Custom health indicator: the LB probes GET /actuator/health
// Returns 200 only when DB connection is healthy (not just process liveness)
@Component("database")
public class DatabaseHealthIndicator implements HealthIndicator {

    private final javax.sql.DataSource dataSource;

    public DatabaseHealthIndicator(javax.sql.DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        try (var conn = dataSource.getConnection()) {
            conn.createStatement().execute("SELECT 1");
            return Health.up().withDetail("db", "reachable").build();
        } catch (Exception e) {
            // LB removes server from rotation when this returns DOWN
            return Health.down().withDetail("error", e.getMessage()).build();
        }
    }
}

// Fallback controller called by circuit breaker when backend is unhealthy
@RestController
public class FallbackController {
    @GetMapping("/fallback")
    public String fallback() {
        return "{\"error\": \"Service temporarily unavailable. Please retry.\"}";
    }
}

Expose the health endpoint to the load balancer in application.yml:

management:
  endpoints:
    web:
      exposure:
        include: health,metrics,info
  endpoint:
    health:
      show-details: always     # exposes per-component status to the LB probe

The DatabaseHealthIndicator makes /actuator/health return HTTP 503 when the database is unreachable, which is the correct signal for a load balancer to remove the server from rotation, implementing the "health checks must reflect true readiness" lesson from the Lessons section.

For a full deep-dive on Spring Cloud Gateway, a dedicated follow-up post is planned.


📚 Lessons from Operating These Layers

Teams that run DNS, CDNs, and load balancers in production converge on the same operational lessons.

DNS TTL discipline matters: set TTL low (60-300 s) before planned changes; let it drift back up during normal operation to reduce resolver load. Many outages last longer than necessary because an engineer forgot to lower a high TTL before a migration, so stale records lingered for hours.

Cache invalidation is harder than it looks: versioned filenames are the most reliable cache-busting strategy. Relying on CDN purge APIs for time-sensitive invalidations introduces propagation delay of seconds to minutes across distributed edge nodes.

Health checks must reflect true readiness: a /health endpoint that returns 200 even when the database is down will route traffic to a broken server. Health checks should probe critical dependencies, not just process liveness.

Monitor the cache hit ratio daily: a ratio below 0.80 indicates cache misconfiguration or too many unique URLs bypassing edge caching. Tuning Cache-Control headers is usually the first and highest-impact fix.


📌 TLDR: Summary & Key Takeaways

  • DNS = global phone book; Anycast + low TTL = fast, resilient name resolution.
  • CDN = edge cache; versioned filenames and cache-control headers keep content fresh and fast.
  • Load Balancer = traffic manager; choose algorithm (Round-Robin, Least-Conn, IP-Hash) based on session needs.
  • Combine all three: DNS → CDN → Load Balancer → Origin app servers → backend services.
  • Monitor latency at each layer; cache-hit ratio and health-check lag are the most actionable signals.

๐Ÿ“ Practice Quiz

  1. Q1: Which DNS record type aliases one hostname to another?

    • A) A record
    • B) CNAME record
    • C) MX record

    Correct Answer: B

  2. Q2: Why does a CDN reduce latency for users far from the origin?

    • A) It compresses responses at the origin
    • B) It serves cached content from an edge server geographically close to the user
    • C) It upgrades the user's network connection

    Correct Answer: B

  3. Q3: When is IP-Hash load balancing the right choice?

    • A) When all servers have identical capacity
    • B) When you need session affinity so users consistently hit the same backend
    • C) When you want to minimize active connections per server

    Correct Answer: B



Written by Abstract Algorithms (@abstractalgorithms)