System Design Networking: DNS, CDNs, and Load Balancers
The internet's traffic control system. We explain how DNS resolves names, CDNs cache content, and Load Balancers distribute traffic.
Abstract Algorithms
TLDR: When you hit a URL, DNS translates the name to an IP, CDNs serve static assets from the edge nearest to you, and Load Balancers spread traffic across many servers so no single machine becomes a bottleneck. These three layers are the traffic control system of the modern internet.
Three Layers That Stand Between a User and Your Server
In 2021, GitHub went dark for nearly two hours. The cause was a misconfigured BGP route that blackholed traffic to GitHub's DNS servers, meaning no request could resolve github.com to an IP address at all, let alone reach an application server. The outage was not a code bug or a database crash; it was a failure in the three infrastructure layers that execute before your application code ever runs.
Understanding DNS, CDNs, and load balancers, and critically how they fail, is what separates engineers who can debug production outages from those who just restart services and hope.
Here is what a DNS failure looks like from a client's perspective:
$ dig github.com
;; connection timed out; no servers could be reached
That single timeout means every user on every device sees a blank page: not because GitHub's servers were down, but because the phone book pointing to those servers was unreachable. No DNS resolution, no request ever gets off the ground.
Every web request travels through at least three invisible layers before it reaches your application code:
- DNS (Domain Name System): translates a human-readable hostname like github.com into a machine-readable IP address. Think of it as the internet's phone book: without it, nothing is addressable by name.
- CDN (Content Delivery Network): serves cached assets (images, JS, CSS) from a server geographically close to the user, cutting the round-trip distance to your origin.
- Load Balancer: distributes live requests across a pool of backend servers, removing any single point of failure.
Remove any one of these, and your system either breaks under load, slows to a crawl for distant users, or collapses on a single node.
| Layer | Role | What breaks without it |
| --- | --- | --- |
| DNS | Name → IP resolution | Nothing is reachable by hostname |
| CDN | Edge caching of static content | All requests hit origin; high latency for distant users |
| Load Balancer | Traffic distribution | Single-server bottleneck; no fault tolerance |
The Building Blocks: What Each Component Does
Every web request to your application routes through three core network layers before reaching your code.
DNS (Domain Name System) is the phone book of the internet: it translates human-readable hostnames like api.example.com into machine-readable IP addresses. Without DNS, users would need to memorize IP addresses directly to reach your service.
CDN (Content Delivery Network) is a distributed cache of static assets. By placing copies of your images, CSS, and JavaScript at dozens of edge locations worldwide, a CDN serves those files from the server physically closest to each user, dramatically reducing round-trip latency.
Load Balancer is the traffic manager. It accepts all incoming connections and distributes them across a pool of application servers. If one server fails a health check, the load balancer stops sending it traffic automatically, providing fault tolerance without operator intervention.
Together, DNS resolves where to send traffic, the CDN handles static content at the edge, and the load balancer distributes dynamic requests across healthy backend instances. These three layers form the foundation of any horizontally scalable web architecture.
DNS: The Internet's Phone Book
DNS maps example.com → 192.0.2.1. The resolution chain has four hops:
- Recursive resolver (usually your ISP or 8.8.8.8) receives the query.
- Root server directs to the TLD server.
- TLD server (.com) directs to the authoritative name server.
- Authoritative NS returns the A record (IPv4) or AAAA record (IPv6).
TTL (Time To Live) controls how long resolvers cache the result. Lower TTL = faster failover; higher TTL = lower resolver load.
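The caching rule is simple enough to sketch. Below is a minimal, illustrative model of how a resolver honors a record's TTL; the `TtlCache` class and its method names are invented for this post, not a real resolver API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a resolver's TTL-bound record cache (illustrative only).
public class TtlCache {
    private record Entry(String ip, long expiresAtMillis) {}
    private final Map<String, Entry> cache = new HashMap<>();

    // Store a record along with the moment it must be evicted.
    public void put(String host, String ip, long ttlSeconds, long nowMillis) {
        cache.put(host, new Entry(ip, nowMillis + ttlSeconds * 1000));
    }

    // Return the cached IP, or null if absent or expired (forcing a fresh lookup).
    public String get(String host, long nowMillis) {
        Entry e = cache.get(host);
        if (e == null || nowMillis >= e.expiresAtMillis) return null;
        return e.ip;
    }

    public static void main(String[] args) {
        TtlCache cache = new TtlCache();
        cache.put("api.example.com", "203.0.113.10", 300, 0);      // TTL 300 s
        System.out.println(cache.get("api.example.com", 200_000)); // within TTL: hit
        System.out.println(cache.get("api.example.com", 301_000)); // past TTL: miss (null)
    }
}
```

The expiry check is exactly why a 24-hour TTL keeps serving a stale IP for up to 24 hours after you repoint a record.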
| DNS record type | Purpose |
| --- | --- |
| A | Hostname → IPv4 |
| AAAA | Hostname → IPv6 |
| CNAME | Alias one hostname to another |
| MX | Mail server routing |
| NS | Delegated authoritative name server |
| TXT | Arbitrary text (SPF, DKIM, verification) |
DNSSEC adds cryptographic signatures to prevent cache poisoning; always enable it on public zones.
DNS Resolution: Full Lookup Chain
sequenceDiagram
participant B as Browser
participant OS as OS Cache
participant RR as Recursive Resolver
participant Root as Root NS (.)
participant TLD as TLD NS (.com)
participant Auth as Authoritative NS
B->>OS: Resolve api.example.com
OS-->>B: Cache miss
B->>RR: Query api.example.com
RR->>Root: Where is .com?
Root-->>RR: TLD NS address for .com
RR->>TLD: Where is example.com?
TLD-->>RR: Auth NS for example.com
RR->>Auth: A record for api.example.com?
Auth-->>RR: 203.0.113.10 (TTL 300s)
RR-->>B: 203.0.113.10 (cached per TTL)
This diagram traces a full DNS resolution from the browser through five hops to the authoritative nameserver. The browser first checks the OS cache; on a miss, the recursive resolver fans out through root, TLD, and authoritative nameservers before returning the final IP address with its TTL. Notice that once the result is cached at the recursive resolver, all subsequent lookups skip these hops entirely; this is why a high TTL reduces resolver load but delays failover propagation when you change an IP address.
CDNs: Bringing Content Closer to Users
A CDN is a globally distributed cache. When a user requests /static/logo.png, the CDN serves it from an edge PoP (Point of Presence) in their city rather than your origin server in a distant datacenter.
CDN hit ratio is the fraction of requests served from cache:
$$\text{Avg Latency} = r \cdot L_\text{edge} + (1 - r) \cdot L_\text{origin}$$
where $r$ is the hit ratio. Even improving $r$ from 0.80 to 0.95 dramatically reduces effective latency.
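Plugging numbers into the formula makes the leverage obvious. A small sketch, assuming an edge latency of 5 ms and an origin latency of 200 ms (illustrative values, not measurements):

```java
// Effective latency as a function of CDN hit ratio r:
// avg = r * L_edge + (1 - r) * L_origin
public class CdnLatency {
    static double avgLatency(double hitRatio, double edgeMs, double originMs) {
        return hitRatio * edgeMs + (1 - hitRatio) * originMs;
    }

    public static void main(String[] args) {
        double edge = 5.0, origin = 200.0; // assumed latencies in ms
        System.out.printf("r=0.80 -> %.1f ms%n", avgLatency(0.80, edge, origin)); // 44.0 ms
        System.out.printf("r=0.95 -> %.1f ms%n", avgLatency(0.95, edge, origin)); // 14.8 ms
    }
}
```

Raising the hit ratio from 0.80 to 0.95 cuts effective latency by roughly two thirds in this scenario, because every avoided miss skips the expensive origin round trip.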
Cache invalidation strategies:
- Versioned filenames (app.v3.js): cached indefinitely; invalidate by renaming.
- Cache-Control headers: max-age=86400 for a 24-hour TTL.
- CDN purge API: force-invalidate specific paths (use sparingly; defeats caching).
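Versioned filenames are usually generated at build time by hashing the file's content, so any change produces a new URL and old cached copies can never be served stale. A minimal sketch; the `AssetVersioner` class is hypothetical, and real bundlers (Webpack, Vite) do this for you:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Derive an asset's filename from a hash of its content (content-addressed naming).
public class AssetVersioner {
    static String versionedName(String baseName, String ext, byte[] content) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < 4; i++) hex.append(String.format("%02x", digest[i])); // 8-char tag
        return baseName + "." + hex + "." + ext;
    }

    public static void main(String[] args) throws Exception {
        byte[] v1 = "console.log('v1')".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "console.log('v2')".getBytes(StandardCharsets.UTF_8);
        System.out.println(versionedName("app", "js", v1)); // app.<8 hex chars>.js
        System.out.println(versionedName("app", "js", v2)); // different content -> new URL
    }
}
```

Because the URL itself changes on every deploy, these assets can be served with a very long max-age and never need a purge.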
Edge compute (Cloudflare Workers, Lambda@Edge) allows running lightweight logic at the edge (A/B tests, request rewriting, or authentication) without round-tripping to origin.
Load Balancers: Distributing Traffic Intelligently
A load balancer sits in front of your server pool and routes incoming requests:
flowchart TD
A[User Browser] -->|DNS lookup returns LB IP| B[Load Balancer L7]
B -->|Route /api| C[App Server 1]
B -->|Route /api| D[App Server 2]
B -->|Route /static| E[CDN Edge Node]
C --> F[Origin DB]
D --> F
Layer 4 vs Layer 7:
| | Layer 4 (TCP/UDP) | Layer 7 (HTTP) |
| --- | --- | --- |
| Inspects | IP + port | Full HTTP headers, URL, cookies |
| Routing | By connection | By path, host, method |
| Use case | Raw throughput, TCP forwarding | Application-aware routing, TLS termination |
Routing algorithms:
| Algorithm | How it works | Best for |
| --- | --- | --- |
| Round-Robin | Rotate through servers sequentially | Homogeneous servers |
| Least Connections | Route to server with fewest active connections | Variable request duration |
| IP Hash | Hash client IP → server | Session affinity (sticky sessions) |
| Weighted | Assign traffic % per server | Heterogeneous server capacities |
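Round-Robin and IP-Hash are simple enough to sketch in a few lines. The `Routing` class below is illustrative, not a production balancer:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketches of two routing algorithms from the table above.
public class Routing {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    Routing(List<String> servers) { this.servers = servers; }

    // Round-robin: rotate through the pool sequentially.
    String roundRobin() {
        return servers.get(Math.floorMod(next.getAndIncrement(), servers.size()));
    }

    // IP hash: the same client IP always maps to the same backend (sticky).
    String ipHash(String clientIp) {
        return servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
    }

    public static void main(String[] args) {
        Routing r = new Routing(List.of("app1", "app2", "app3"));
        System.out.println(r.roundRobin()); // app1
        System.out.println(r.roundRobin()); // app2
        System.out.println(r.ipHash("198.51.100.7").equals(r.ipHash("198.51.100.7"))); // true
    }
}
```

Note the trade-off visible even in the sketch: round-robin spreads load evenly but gives no affinity, while IP-hash gives affinity but can skew load if a few IPs dominate traffic.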
Health checks probe each backend (HTTP /health, TCP ping) on a configurable interval. Unhealthy servers are removed from rotation without human intervention.
Load imbalance metric:
$$\text{Imbalance} = \frac{\max(\text{server QPS})}{\text{avg}(\text{server QPS})}$$
A ratio near 1.0 means excellent distribution; above 2.0 is a warning sign.
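As a quick worked check of the metric (the `Imbalance` helper is illustrative):

```java
import java.util.Arrays;

// Imbalance = max(server QPS) / avg(server QPS); ~1.0 is ideal, >2.0 is a warning.
public class Imbalance {
    static double imbalance(double[] qps) {
        double max = Arrays.stream(qps).max().orElseThrow();
        double avg = Arrays.stream(qps).average().orElseThrow();
        return max / avg;
    }

    public static void main(String[] args) {
        System.out.println(imbalance(new double[]{100, 100, 100})); // 1.0: perfectly balanced
        System.out.println(imbalance(new double[]{300, 50, 50}));   // ~2.25: investigate
    }
}
```

A ratio of ~2.25 in the second case means one server is handling more than double its fair share, which is exactly the hot-spot pattern IP-hash balancing can produce under skewed client traffic.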
CDN Cache Miss and Hit Flow
sequenceDiagram
participant C as Client
participant E as CDN Edge PoP
participant O as Origin Server
C->>E: GET /static/logo.png
Note over E: Cache MISS (first request)
E->>O: Forward to origin
O-->>E: 200 OK + logo.png payload
E-->>C: 200 OK (served + cached at edge)
Note over E: Cached for max-age=86400s
C->>E: GET /static/logo.png
Note over E: Cache HIT
E-->>C: 200 OK (from edge, ~5ms)
Note over O: Origin not contacted on hit
This sequence diagram contrasts a CDN cache miss on the first request with a cache hit on the second request for the same static asset. On the first request, the edge PoP has no cached copy and must forward the request to origin, incurring full round-trip latency; on the second request the edge serves directly from its cache in roughly 5 ms without contacting origin at all. The key takeaway: maximizing the cache hit ratio by setting appropriate Cache-Control headers and using versioned filenames is the most direct lever for reducing both user-facing latency and origin infrastructure load.
Deep Dive: Inside a DNS Resolution and Cache Lifecycle
When you type a URL, the OS checks its local DNS cache first. On a miss, the query travels to a recursive resolver, which fans out to root → TLD → authoritative nameserver. Each hop adds a few milliseconds, but the result is cached per TTL, so repeat visits skip all hops. This is why TTL tuning matters: a 24-hour TTL means stale IPs persist for 24 hours after a server change.
| Cache stage | Managed by | TTL impact |
| --- | --- | --- |
| OS / browser | Local machine | Very short; typically 30-60 s |
| Recursive resolver | ISP or public DNS | Set by authoritative NS record |
| Authoritative NS | You | Lower = faster failover; higher = less resolver load |
The Full Request Journey: DNS → CDN → Load Balancer → App
sequenceDiagram
participant U as User Browser
participant R as Recursive Resolver
participant A as Authoritative NS
participant LB as Load Balancer
participant CDN as CDN Edge
participant App as App Server
U->>R: Resolve example.com
R->>A: Query A record
A-->>R: Returns LB IP
R-->>U: LB IP (cached per TTL)
U->>LB: GET /api/data
LB->>App: Forward request
App-->>LB: JSON response
LB-->>U: JSON response
U->>CDN: GET /static/logo.png
CDN-->>U: Serve from edge cache
This diagram shows how a single user interaction splits into two parallel paths: the API request travels through DNS resolution to the load balancer and then to an app server, while the static asset request is intercepted and served directly by the nearest CDN edge node. Notice that the CDN request never touches the load balancer or app server: this is the core value proposition of a CDN, removing an entire class of requests from the origin. The combined effect of DNS caching, edge caching, and load-balanced app servers is what allows a modest backend cluster to serve millions of users globally.
Trade-offs and Failure Modes
| Layer | Common failure | Mitigation |
| --- | --- | --- |
| DNS | Stale cached IP after failover | Set TTL ≤ 60s before planned change |
| DNS | Cache poisoning | Enable DNSSEC |
| CDN | Stale content after deploy | Versioned asset filenames + purge on deploy |
| CDN | Cache miss storm on cold start | Warm cache before traffic shift |
| Load Balancer | Health-check lag → routing to dead server | Aggressive health checks + circuit breaker |
| Load Balancer | Session breakage | Sticky sessions or stateless session design |
Real-World Deployments: DNS, CDN, and Load Balancer in Action
Major web platforms rely on all three layers working in concert.
E-commerce (Amazon, Shopify): A CDN caches product images and CSS globally, reducing page load times for users worldwide. A Layer-7 load balancer routes checkout API requests to dedicated payment-processing servers. GeoDNS routes users to the nearest regional datacenter, keeping latency below 50 ms for 99% of requests.
Streaming platforms (Netflix, YouTube): DNS Anycast routes users to the nearest Point of Presence. The CDN edge stores cached video segments. Adaptive bitrate algorithms request different quality segments based on available bandwidth, all served from edge nodes, not the origin.
SaaS platforms (Slack, Notion): Load balancers distribute WebSocket connections across stateful nodes with sticky sessions, ensuring each user remains connected to the same backend throughout their session. CDNs cache static app bundles so browser reloads are instant.
Startup MVP: A single load balancer in front of two application servers, backed by a CDN like Cloudflare's free tier, handles most early-stage traffic without dedicated infrastructure investment. Start simple and add DNS-based geo-routing as your user base grows internationally.
Decision Guide: When to Use What
| Situation | Recommendation | Why |
| --- | --- | --- |
| Serving static assets | Deploy a CDN (Cloudflare, Fastly, CloudFront) | Edge caching cuts latency dramatically |
| Horizontal scaling of API servers | Layer 7 Load Balancer | Smart routing, health checks, TLS termination |
| Global user base | GeoDNS + Anycast + regional LB | Routes users to nearest edge, minimizing RTT |
| Session affinity needed | IP-Hash or Cookie-Based LB | Guarantees subsequent requests hit same backend |
| Rapid failover | TTL ≤ 60s + DNS health-monitored failover | Reduces stale records during outages |
| Dynamic content caching | CDN edge compute (ESI, Cloudflare Workers) | Caches fragments while personalizing at edge |
Practical Setup: CDN and Load Balancer in 4 Steps
This example walks through the four-step sequence for wiring DNS, a CDN, and a load balancer into an existing web application: the exact configuration that connects all three layers discussed throughout this post. It was designed around the real-world order of operations engineers follow in production: start at the cache layer, then distribute traffic, then tune DNS last, because updating DNS before the load balancer is healthy is the most common cause of self-inflicted outages during infrastructure changes. Focus on the dependency between steps: each one builds on the previous, and the health-check endpoint you configure in Step 2 is the exact signal that makes Step 4's failover test meaningful.
Step 1 - Configure the CDN: point your CDN provider (Cloudflare, CloudFront, Fastly) at your origin server. Set Cache-Control: max-age=31536000 on static assets and use versioned filenames like app.v4.js so browsers never serve stale bundles without intentional invalidation.
Step 2 - Deploy a load balancer: provision a managed load balancer (AWS ALB, GCP Load Balancing, NGINX). Add at least two backend servers for redundancy. Enable HTTPS termination at the load balancer and configure a /health endpoint on each backend that returns 200 only when the server is genuinely ready.
Step 3 - Update DNS: point your domain's A record to the load balancer IP. Set TTL to 300 seconds initially; lower it to 60 seconds before planned failover events to reduce stale-record impact.
Step 4 - Test failover: take one backend server out of rotation manually and verify the load balancer detects the failure within one health-check interval and stops routing traffic to it before restoring the server.
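The failover logic being exercised in Step 4 boils down to a consecutive-failure counter per backend. A toy model of that behavior; the `HealthChecker` class is illustrative, and managed load balancers expose the same idea as healthy/unhealthy threshold settings:

```java
// Step 4 in miniature: remove a backend from rotation after
// `failThreshold` consecutive failed probes; restore it when a probe passes.
public class HealthChecker {
    private int consecutiveFailures = 0;
    private boolean inRotation = true;
    private final int failThreshold;

    HealthChecker(int failThreshold) { this.failThreshold = failThreshold; }

    // Called once per health-check interval; probeOk is the result of GET /health.
    void probe(boolean probeOk) {
        if (probeOk) {
            consecutiveFailures = 0;
            inRotation = true; // restore once the backend passes again
        } else if (++consecutiveFailures >= failThreshold) {
            inRotation = false; // stop routing traffic to it
        }
    }

    boolean inRotation() { return inRotation; }

    public static void main(String[] args) {
        HealthChecker hc = new HealthChecker(2);
        hc.probe(false);
        System.out.println(hc.inRotation()); // true  (1 failure, below threshold)
        hc.probe(false);
        System.out.println(hc.inRotation()); // false (2 consecutive failures)
        hc.probe(true);
        System.out.println(hc.inRotation()); // true  (recovered)
    }
}
```

The threshold explains the detection lag mentioned in the trade-offs table: with a 10 s interval and a threshold of 2, a dead server can receive traffic for up to ~20 s before removal.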
What to Learn Next
- System Design Core Concepts: Scalability, CAP, and Consistency
- System Design Databases: SQL vs. NoSQL and Scaling
- The Ultimate Guide to Acing the System Design Interview
Spring Boot + Spring Cloud Gateway: Health Checks and Dynamic Routing in Java
Spring Boot is the standard Java framework for building services behind load balancers, and Spring Cloud Gateway is its companion API gateway that implements Layer-7 routing, health-check probes, and circuit-breaker patterns, making the load balancer concepts from this post directly configurable in Java code.
The /health endpoint that load balancers probe is provided automatically by Spring Boot Actuator; Spring Cloud Gateway handles path-based routing and can integrate with a discovery service for dynamic backend registration:
# dependencies: spring-boot-starter-actuator, spring-cloud-starter-gateway,
#               spring-cloud-starter-circuitbreaker-resilience4j
# application.yml - Spring Cloud Gateway routing config (an NGINX-style L7 router in Java)
spring:
  cloud:
    gateway:
      routes:
        - id: api-route
          uri: lb://backend-service   # lb:// = Spring Cloud LoadBalancer
          predicates:
            - Path=/api/**
          filters:
            - StripPrefix=1
            - name: CircuitBreaker
              args:
                name: backendCB
                fallbackUri: forward:/fallback
        - id: static-route
          uri: https://cdn.example.com
          predicates:
            - Path=/static/**
import javax.sql.DataSource;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Custom health indicator: the LB probes GET /actuator/health.
// Reports UP only when the DB connection is healthy (not just process liveness).
@Component("database")
public class DatabaseHealthIndicator implements HealthIndicator {

    private final DataSource dataSource;

    public DatabaseHealthIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        try (var conn = dataSource.getConnection()) {
            conn.createStatement().execute("SELECT 1");
            return Health.up().withDetail("db", "reachable").build();
        } catch (Exception e) {
            // LB removes this server from rotation when the probe reports DOWN
            return Health.down().withDetail("error", e.getMessage()).build();
        }
    }
}

// Fallback controller called by the circuit breaker when the backend is unhealthy
@RestController
public class FallbackController {

    @GetMapping("/fallback")
    public String fallback() {
        return "{\"error\": \"Service temporarily unavailable. Please retry.\"}";
    }
}
Expose the health endpoint (with per-component detail) in application.yml:
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,info
  endpoint:
    health:
      show-details: always  # exposes per-component status to the LB probe
When the database is unreachable, DatabaseHealthIndicator reports DOWN and the /actuator/health endpoint returns HTTP 503, the correct signal for a load balancer to remove the server from rotation, implementing the "health checks must reflect true readiness" lesson from the Lessons section.
For a full deep-dive on Spring Cloud Gateway, a dedicated follow-up post is planned.
Lessons from Operating These Layers
Teams that run DNS, CDNs, and load balancers in production converge on the same operational lessons.
DNS TTL discipline matters: set TTL low (60-300 s) before planned changes; let it drift back up during normal operation to reduce resolver load. Many outages last longer than necessary because an engineer raised TTL before a migration and forgot to lower it first.
Cache invalidation is harder than it looks: versioned filenames are the most reliable cache-busting strategy. Relying on CDN purge APIs for time-sensitive invalidations introduces propagation delay of seconds to minutes across distributed edge nodes.
Health checks must reflect true readiness: a /health endpoint that returns 200 even when the database is down will route traffic to a broken server. Health checks should probe critical dependencies, not just process liveness.
Monitor the cache hit ratio daily: a ratio below 0.80 indicates cache misconfiguration or too many unique URLs bypassing edge caching. Tuning Cache-Control headers is usually the first and highest-impact fix.
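The daily check itself is trivial to automate. A sketch using the 0.80 threshold from the text; the class name is invented:

```java
// Daily hit-ratio check: flag when hits / (hits + misses) drops below 0.80.
public class HitRatioCheck {
    static double hitRatio(long hits, long misses) {
        return (double) hits / (hits + misses);
    }

    static boolean needsAttention(long hits, long misses) {
        return hitRatio(hits, misses) < 0.80;
    }

    public static void main(String[] args) {
        System.out.println(needsAttention(950_000, 50_000));  // ratio 0.95 -> false
        System.out.println(needsAttention(700_000, 300_000)); // ratio 0.70 -> true
    }
}
```

Most CDN providers expose the hit and miss counters this needs via their analytics APIs, so the check can run as a scheduled job that pages when the ratio sags.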
TLDR: Summary & Key Takeaways
- DNS = global phone book; Anycast + low TTL = fast, resilient name resolution.
- CDN = edge cache; versioned filenames and cache-control headers keep content fresh and fast.
- Load Balancer = traffic manager; choose algorithm (Round-Robin, Least-Conn, IP-Hash) based on session needs.
- Combine all three: DNS resolves the name, the CDN absorbs static requests at the edge, and the load balancer spreads dynamic requests across healthy backend servers.
- Monitor latency at each layer; cache-hit ratio and health-check lag are the most actionable signals.
Practice Quiz
Q1: Which DNS record type aliases one hostname to another?
- A) A record
- B) CNAME record
- C) MX record
Correct Answer: B
Q2: Why does a CDN reduce latency for users far from the origin?
- A) It compresses responses at the origin
- B) It serves cached content from an edge server geographically close to the user
- C) It upgrades the user's network connection
Correct Answer: B
Q3: When is IP-Hash load balancing the right choice?
- A) When all servers have identical capacity
- B) When you need session affinity so users consistently hit the same backend
- C) When you want to minimize active connections per server
Correct Answer: B
Related Posts
- System Design Core Concepts: Scalability, CAP, and Consistency
- System Design Databases: SQL vs. NoSQL and Scaling
- API Gateway vs. Load Balancer vs. Reverse Proxy

Written by
Abstract Algorithms
@abstractalgorithms