
API Gateway vs. Load Balancer vs. Reverse Proxy: What's the Difference?

They all sit in front of your servers. But do you need Nginx, HAProxy, or Kong? We clarify the roles of each component.

Abstract Algorithms · 14 min read

TLDR: A Reverse Proxy hides your servers and handles caching/SSL. A Load Balancer spreads traffic across server instances. An API Gateway manages API concerns: auth, rate limiting, routing, and protocol translation. Modern tools (Nginx, AWS ALB, Kong) often combine all three, but understanding what each layer does independently is essential for system design.


๐Ÿ” The Basics: What Each Layer Does

The terms reverse proxy, load balancer, and API gateway describe components that all sit in front of your backend, yet serve distinct purposes. Confusing them leads to misconfigurations that are expensive to debug in production.

The one-sentence role of each:

| Component | Core job |
| --- | --- |
| Reverse Proxy | Hides backend servers; handles SSL, caching, and compression at the edge |
| Load Balancer | Distributes connections across multiple identical server instances |
| API Gateway | Enforces API-level policy: auth, rate limits, routing, and protocol translation |

The confusion arises because modern tools collapse all three. Nginx can act as a reverse proxy, load balancer, and primitive gateway simultaneously. AWS ALB is an L7 load balancer with some gateway features. Kong is a full API gateway with load balancing built in.

Understanding the layers independently makes it possible to diagnose issues correctly: is the problem at the SSL termination layer, the traffic distribution layer, or the auth enforcement layer?


📖 Three Guards, One Door: The Traffic Handling Hierarchy

When a user's request leaves their browser, it goes through multiple intermediary layers before hitting your application. These layers have overlapping but distinct responsibilities:

Client
  │
  ▼
Reverse Proxy   ← "Who asked? Cache it if possible. SSL here."
  │
  ▼
Load Balancer   ← "Which server instance gets this?"
  │
  ▼
API Gateway     ← "Is this user authorized? Rate-limited? Which microservice?"
  │
  ▼
Application Servers

All three components sit in front of your servers. The confusion arises because modern tools blur the boundaries: Nginx can be all three simultaneously.


🔒 Reverse Proxy: Hiding Your Servers and Doing the Boring Work

A Reverse Proxy intercepts incoming requests and forwards them to backend servers on behalf of the client. The client never knows the backend server's address.

What reverse proxies do:

| Responsibility | Why it matters |
| --- | --- |
| SSL/TLS termination | Offloads encryption from app servers |
| Static content caching | Reduces backend load for repeated requests |
| Compression (gzip, brotli) | Reduces response size to client |
| IP masking | Hides backend topology from clients |
| DDoS absorption | First line of defense against volumetric attacks |

Example (Nginx as reverse proxy):

server {
    listen 443 ssl;
    server_name api.example.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://backend_servers;  # forwards to backend
    }
}

Every request from the internet hits Nginx first. Backend servers only see Nginx's IP, never the real client's address.
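Since the backend never sees the client's address, reverse proxies conventionally forward it in the X-Forwarded-For header. A minimal sketch of how a backend might recover it (the helper function and addresses here are illustrative, not from any specific framework):

```python
def client_ip(headers: dict, peer_ip: str) -> str:
    """Recover the original client IP behind a reverse proxy.

    X-Forwarded-For holds a comma-separated chain of addresses; the
    left-most entry is the original client. Trust it only when the
    request actually arrived from your own proxy.
    """
    xff = headers.get("X-Forwarded-For")
    if xff:
        return xff.split(",")[0].strip()
    return peer_ip  # no proxy involved; the TCP peer is the client

# The backend's TCP peer is the proxy (10.0.0.2), but the header
# reveals the real client.
print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}, "10.0.0.2"))
# 203.0.113.7
```

Note the trust boundary: clients can spoof this header, so the proxy should overwrite (not append to) any X-Forwarded-For it receives from the outside.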


โš–๏ธ Trade-offs & Failure Modes: Load Balancer: Distributing Traffic So No Server Burns

A Load Balancer distributes incoming connections across a pool of identical server instances. The goal is to prevent any single instance from being overwhelmed.

Distribution algorithms:

| Algorithm | How it works | Best for |
| --- | --- | --- |
| Round Robin | Each server in sequence | Uniform, stateless requests |
| Least Connections | Route to server with fewest active connections | Variable request duration |
| IP Hash | Hash client IP → sticky server | Sessions requiring same-server affinity |
| Weighted Round Robin | Assign proportionally more traffic to stronger servers | Mixed-capacity fleets |
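The first three algorithms can each be sketched in a few lines; a toy illustration with hypothetical server names and connection counts:

```python
import itertools
import hashlib

servers = ["s1", "s2", "s3"]

# Round Robin: hand out servers in a fixed cycle.
rr = itertools.cycle(servers)
def round_robin() -> str:
    return next(rr)

# Least Connections: pick the server with the fewest active connections
# (counts here are made up; a real LB tracks them per backend).
active = {"s1": 4, "s2": 1, "s3": 2}
def least_connections() -> str:
    return min(active, key=active.get)

# IP Hash: the same client IP always maps to the same server (sticky).
def ip_hash(client_ip: str) -> str:
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

print([round_robin() for _ in range(4)])  # ['s1', 's2', 's3', 's1']
print(least_connections())                # s2
```

Weighted Round Robin is the same cycle idea with each server repeated in proportion to its weight.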

Health checks: Load balancers continuously probe backends. If /health returns non-200 or times out, the server is removed from rotation automatically, without manual intervention.
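A health-check loop, reduced to its essentials (the backend addresses and interval are made up; real load balancers also apply healthy/unhealthy thresholds before flipping a server's state):

```python
import time
import urllib.request

POOL = ["http://10.0.1.10:8080", "http://10.0.1.11:8080"]  # hypothetical backends
healthy = set(POOL)  # servers currently in rotation

def probe(base_url: str, timeout: float = 2.0) -> bool:
    """One probe: 200 from /health within the timeout means healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except Exception:   # timeout, refused connection, non-2xx, DNS failure...
        return False

def health_check_loop(interval: float = 5.0) -> None:
    while True:
        for server in POOL:
            if probe(server):
                healthy.add(server)      # recovered: back into rotation
            else:
                healthy.discard(server)  # failed: removed, no human involved
        time.sleep(interval)
```

New connections are only ever assigned to members of `healthy`, which is exactly the "removed from rotation" behavior described above.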

graph LR
    C[Client] --> LB[Load Balancer]
    LB --> S1[Server 1]
    LB --> S2[Server 2]
    LB --> S3[Server 3]
    S1 -->|health 200| LB
    S2 -->|health 503| LB
    LB -.-> S2
    style S2 stroke-dasharray: 5 5

L4 vs. L7 load balancers:

  • L4 (transport layer): Routes based on IP and TCP port. Extremely fast. Cannot see HTTP headers. Example: AWS NLB.
  • L7 (application layer): Can route based on URL path, HTTP headers, cookies. Slightly slower but far more flexible. Example: AWS ALB.

๐ŸŒ API Gateway: The Smart Layer That Knows About Your APIs

An API Gateway goes beyond traffic routing: it enforces API-level concerns that are too application-specific for a reverse proxy.

Core API Gateway responsibilities:

| Feature | What it means |
| --- | --- |
| Authentication/Authorization | Validate JWT, OAuth2, API key before reaching the backend |
| Rate Limiting | Reject requests over the configured threshold (e.g., 100 req/min per user) |
| Request Transformation | Convert REST to gRPC, add/remove headers, rename fields |
| Routing | Route /v1/users to user-service, /v1/orders to order-service |
| Analytics & Logging | Track per-endpoint latency, error rates, usage by API key |
| Circuit Breaking | Stop forwarding to unhealthy downstream services |

Example (Kong route + rate-limit plugin):

services:
  - name: user-service
    url: http://users:8080
    routes:
      - name: users-route
        paths: ["/v1/users"]
    plugins:
      - name: rate-limiting
        config:
          minute: 100
          policy: local
      - name: jwt

No app code changes needed: auth and rate limiting are enforced at the gateway.

📊 API Gateway Request Flow

sequenceDiagram
    participant C as Client
    participant GW as API Gateway
    participant AU as Auth Service
    participant MS as Microservice
    C->>GW: HTTP Request
    GW->>AU: Authenticate token
    AU-->>GW: Token valid
    GW->>GW: Rate limit check
    GW->>MS: Route to service
    MS-->>C: Response

โš™๏ธ Core Mechanics: What Each Layer Actually Does to Requests

Understanding the mechanics helps you choose the right tool and configure it correctly.

Reverse proxy mechanics: The proxy receives the client's request and opens a new connection to the backend on the client's behalf. The backend only ever sees the proxy's IP. SSL is terminated at the proxy; the backend connection can be plain HTTP internally.

Load balancer mechanics: The load balancer maintains a pool of backend instances and a health check loop. Every incoming connection is assigned to one instance based on the configured algorithm. If a health check fails, the instance is removed from rotation automatically.

API gateway mechanics: The gateway applies a policy pipeline to every request: authenticate → authorize → rate-limit check → request transformation → route to backend → response transformation. Each step can be configured independently per route. The gateway can talk to multiple downstream services within a single client request (request fan-out).
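That pipeline can be sketched as plain code. Everything below (the token table, quota counter, and route map) is a toy stand-in for real JWT validation, Redis counters, and service discovery:

```python
ROUTES = {"/v1/users": "http://users:8080", "/v1/orders": "http://orders:8081"}
TOKENS = {"tok-alice": "alice"}  # token -> user; stands in for JWT validation
QUOTA = {"alice": 3}             # remaining requests; stands in for a rate limiter

def authenticate(request: dict):
    return TOKENS.get(request["headers"].get("Authorization"))

def check_rate_limit(user: str) -> bool:
    left = QUOTA.get(user, 0)
    QUOTA[user] = left - 1
    return left > 0

def forward(upstream: str, request: dict) -> dict:
    # A real gateway opens a connection to `upstream`; here we just echo it.
    return {"status": 200, "upstream": upstream}

def handle(request: dict) -> dict:
    user = authenticate(request)             # 1. authenticate
    if user is None:
        return {"status": 401}
    if not check_rate_limit(user):           # 2. rate-limit check
        return {"status": 429}
    request["headers"]["X-User-Id"] = user   # 3. request transformation
    for prefix, upstream in ROUTES.items():  # 4. route by path prefix
        if request["path"].startswith(prefix):
            return forward(upstream, request)
    return {"status": 404}

req = {"path": "/v1/orders/42", "headers": {"Authorization": "tok-alice"}}
print(handle(req))  # {'status': 200, 'upstream': 'http://orders:8081'}
```

Each step short-circuits: an invalid token never touches the rate limiter, and a rate-limited request never reaches a backend.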

The key mechanical difference:

| Mechanism | Operates at | Aware of HTTP content? | Per-user state? |
| --- | --- | --- | --- |
| Reverse Proxy | L4/L7 | Partially (headers, URL) | No |
| Load Balancer (L4) | L4 (TCP/IP) | No | No |
| Load Balancer (L7) | L7 (HTTP) | Yes (headers, paths) | Via sticky sessions |
| API Gateway | L7 | Fully | Yes (per API key / per user) |

This table explains why you cannot enforce per-user rate limiting at the load balancer: it does not maintain per-user state by default.

📊 L4 vs L7 vs Proxy Routing

flowchart LR
    TR[Incoming Traffic] --> LB[Load Balancer L4]
    TR --> RP[Reverse Proxy L7]
    TR --> GW[API Gateway L7+]
    LB --> SV[Backend Servers]
    RP --> SV
    GW --> AU[Auth + Rate Limit]
    AU --> MS[Microservices]

🧠 Deep Dive: Why Rate Limiting Belongs at the Gateway, Not the Load Balancer

Reverse proxies and L4 load balancers are stateless: they route packets without knowing anything about the user making the request. An API Gateway is stateful per consumer: it tracks request counts against rate-limit quotas, validates JWT claims, and maps API keys to consumer identities. This per-user state is what makes rate limiting possible at the gateway layer but not at the load balancer, which sees only connections, not identities.
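The per-consumer state usually takes the form of a counter or token bucket keyed by API key or user ID. A minimal token-bucket sketch (the rates and names are illustrative; production gateways keep these counters in Redis so all gateway instances share them):

```python
import time

class TokenBucket:
    """Per-consumer token bucket: `rate` tokens/sec refill, `burst` max size."""

    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond 429 Too Many Requests

buckets: dict[str, TokenBucket] = {}  # keyed by API key / user ID

def allow_request(consumer: str) -> bool:
    # e.g. 100 requests/minute with a burst allowance of 10
    bucket = buckets.setdefault(consumer, TokenBucket(rate=100 / 60, burst=10))
    return bucket.allow()
```

This is exactly the state a load balancer lacks: the `buckets` map ties request counts to identities, not connections.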


📊 How a Request Travels Through All Three Layers

flowchart TD
    Client[Browser / Mobile Client] --> RP[Reverse Proxy\nNginx - SSL, cache, compress]
    RP -->|Cache miss or dynamic| LB[Load Balancer\nAWS ALB - distribute traffic]
    LB --> GW[API Gateway\nKong - auth, rate limit, route]
    GW --> US[User Service : 8080]
    GW --> OS[Order Service : 8081]
    GW --> PS[Payment Service : 8082]
    US -->|health 200| LB
    OS -->|health 200| LB
    PS -->|health 503| LB
    LB -.->|removed from pool| PS

The three layers handle distinctly different concerns as the request descends:

  1. Reverse Proxy handles transport: SSL decryption, static caching, DDoS absorption.
  2. Load Balancer handles distribution: which instance handles this request, and is that instance healthy?
  3. API Gateway handles application policy: who is this user, are they allowed, which service should respond?

๐ŸŒ Real-World Applications: Real-World Deployment Patterns

Most production stacks do not cleanly separate these three layers into distinct products. They use tools that collapse two or three roles, but the logical separation still matters for debugging and ownership:

| Tool | Acts as |
| --- | --- |
| Nginx | Reverse proxy + basic load balancer |
| AWS ALB | L7 load balancer + some gateway features |
| AWS API Gateway | Full gateway + rate limiting + auth |
| Kong | API gateway + rate limit + plugin ecosystem |
| Envoy | Sidecar proxy + LB + service mesh component |

Netflix uses Nginx at the edge for SSL termination and static asset caching, AWS Global Accelerator for regional traffic routing, and Zuul (their internal gateway) for per-service auth and rate limiting. Each layer is independently operable: an Nginx config change doesn't touch auth policy, and a Zuul plugin update doesn't affect load-balancing weights.

Stripe runs a similar topology: Nginx terminates SSL and absorbs volumetric floods, HAProxy distributes to API pods, and their internal gateway enforces the API key validation and per-customer rate limits that power their public API offering. The clean separation means their platform team can own rate-limiting policy without touching the Nginx configurations owned by the infra team.


🧪 Practical Configuration Guide

Scenario: Adding rate limiting to a new endpoint without touching application code.

# Kong declarative config - add rate limiting to /v1/orders
services:
  - name: order-service
    url: http://orders:8081
    routes:
      - name: orders-route
        paths: ["/v1/orders"]
    plugins:
      - name: rate-limiting
        config:
          minute: 60        # 60 requests per minute per consumer
          policy: redis     # shared state across gateway instances
      - name: jwt           # auth before rate limiting

No code change to the Order Service. The gateway enforces both auth and rate limiting declaratively.

Scenario: Checking if a server is actually removed from the load balancer pool.

# AWS ALB - list target health for a target group
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:...

# Output shows which instances are "healthy", "unhealthy", or "draining"

Scenario: Verifying SSL terminates at the reverse proxy, not the app.

# Check the certificate on the public-facing endpoint
openssl s_client -connect api.example.com:443 -brief

# Check the internal connection (should be plain HTTP)
curl -v http://internal-backend:8080/health

🧭 Decision Guide: Choosing the Right Layer

| You need... | Use |
| --- | --- |
| Hide backend server IPs, SSL offload, edge caching | Reverse Proxy (Nginx, Cloudflare) |
| Spread traffic across identical server instances | Load Balancer (ALB, HAProxy) |
| Auth, rate limiting, per-route routing | API Gateway (Kong, AWS API GW) |
| All three in a single tool | Nginx with plugins, or a cloud-managed API Gateway |
| Low-latency L4 routing for TCP services | L4 Load Balancer (NLB) |
| Path/header-based routing | L7 Load Balancer (ALB) |

🎯 What to Learn Next


๐Ÿ› ๏ธ Spring Cloud Gateway: Routing, Rate Limiting, and Auth Filters in Java

Spring Cloud Gateway (SCG) is a reactive, non-blocking API gateway built on Spring WebFlux that implements the API Gateway layer described in this post (routing, rate limiting, authentication, and circuit breaking) without any code changes in downstream services.

Routes are declared in application.yml (or a Java DSL); filters attach cross-cutting policy to each route. SCG integrates natively with Spring Security for JWT validation, Redis for distributed rate-limit counters, and Resilience4j for circuit breaking.

# application.yml - Spring Cloud Gateway declarative route configuration
spring:
  cloud:
    gateway:
      routes:
        # Route 1: user-service - auth + rate limiting
        - id: user-service
          uri: lb://user-service          # lb:// resolves via Eureka/Kubernetes
          predicates:
            - Path=/v1/users/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 100   # 100 req/sec baseline
                redis-rate-limiter.burstCapacity: 200   # allows short bursts
                key-resolver: "#{@userKeyResolver}"     # rate-limit per user ID

        # Route 2: order-service - circuit breaker + header rewrite
        - id: order-service
          uri: lb://order-service
          predicates:
            - Path=/v1/orders/**
          filters:
            - name: CircuitBreaker
              args:
                name: orderCB
                fallbackUri: forward:/fallback/orders
            - AddRequestHeader=X-Gateway-Source, spring-cloud-gateway
            - StripPrefix=1                              # remove /v1 before forwarding

// KeyResolver - rate-limit by authenticated user extracted from JWT claim
@Bean
public KeyResolver userKeyResolver() {
    return exchange -> exchange.getPrincipal()
        .map(java.security.Principal::getName)
        .defaultIfEmpty("anonymous");
}

// Fallback controller - called when circuit breaker trips
@RestController
public class FallbackController {
    @GetMapping("/fallback/orders")
    public ResponseEntity<Map<String, String>> orderFallback() {
        return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
            .body(Map.of("error", "Order service is temporarily unavailable"));
    }
}

The Order Service and User Service require zero code changes: rate limiting, circuit breaking, and JWT extraction are enforced entirely at the gateway layer via configuration.

NGINX as a Reverse Proxy (the layer before the gateway):

# nginx.conf - SSL termination + compression before traffic reaches the gateway
server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate     /etc/ssl/certs/api.crt;
    ssl_certificate_key /etc/ssl/private/api.key;

    gzip on;
    gzip_types application/json text/plain;

    location / {
        proxy_pass http://spring-cloud-gateway:8080;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Forwarded-Proto https;
    }
}

NGINX terminates SSL and compresses responses at the edge; Spring Cloud Gateway enforces auth and routing at the API layer. Each component owns exactly one layer.

For a full deep-dive on Spring Cloud Gateway's filter chain, Spring Security JWT integration, and Resilience4j circuit breaker configuration, a dedicated follow-up post is planned.


📚 Lessons from Production Deployments

Lesson 1: L4 vs. L7 is a latency vs. flexibility trade-off. L4 load balancers route by IP/port and are extremely fast because they don't inspect HTTP content. L7 load balancers parse headers and paths, enabling smart routing, but add roughly 0.5–2 ms per request. For most applications the flexibility of L7 is worth the cost. For extremely latency-sensitive TCP services (databases, game servers), L4 is preferable.

Lesson 2: The health check endpoint is a critical dependency. If /health returns 200 even when the service is degraded (database connection down, dependencies unavailable), the load balancer keeps sending traffic to a broken instance. Implement deep health checks that validate actual functionality, not just HTTP reachability.
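A deep health check, sketched (db_ping and cache_ping are hypothetical stand-ins for real dependency probes):

```python
# Report 200 only when the service's real dependencies work,
# not merely when the HTTP process is up.

def db_ping() -> bool:
    return True   # e.g. run `SELECT 1` against the primary database

def cache_ping() -> bool:
    return True   # e.g. send PING to Redis

def health() -> tuple[int, dict]:
    checks = {"db": db_ping(), "cache": cache_ping()}
    status = 200 if all(checks.values()) else 503
    return status, checks

print(health())  # (200, {'db': True, 'cache': True})
```

If the database check fails, `health()` returns 503 and the load balancer pulls the instance, which is the behavior a shallow "return 200" endpoint silently prevents.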

Lesson 3: Gateway plugins accumulate and slow down requests. Each gateway plugin (auth, rate-limit, logging, transformation) adds processing time to every request. Profile your gateway pipeline regularly and remove plugins that are no longer needed. A gateway with 8 plugins enabled on every route can add 20โ€“50ms of latency that appears as application slowness.

Lesson 4: Don't put your entire auth logic in the gateway. The gateway validates tokens (JWT signature, expiry, audience). It does not know your application's permission model. Fine-grained authorization (can this user edit this order?) belongs in the service itself.
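A sketch of that split, with toy lookups standing in for real JWT validation and a real data store:

```python
# The gateway answers "who are you?"; the service answers "may you do this?".
# Both tables below are illustrative stand-ins.

VALID_TOKENS = {"tok-alice": "alice"}                   # gateway-side: token -> subject
ORDER_OWNERS = {"order-1": "alice", "order-2": "bob"}   # service-side: data it owns

def gateway_authn(token: str):
    """Gateway: signature/expiry/audience check (stubbed as a lookup)."""
    return VALID_TOKENS.get(token)

def service_authz(user: str, order_id: str) -> bool:
    """Service: fine-grained rule - only the owner may edit an order."""
    return ORDER_OWNERS.get(order_id) == user

user = gateway_authn("tok-alice")
print(user, service_authz(user, "order-1"), service_authz(user, "order-2"))
# alice True False
```

The gateway could never make the second decision correctly: it has no access to the order-ownership data, and duplicating that data at the edge creates a consistency problem.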


📌 TLDR: Summary & Key Takeaways

  • Reverse proxy: hides backends, offloads SSL, caches static content. Client never knows your server's real address.
  • Load balancer: distributes connections across instances using round-robin, least-connections, or IP hash.
  • API Gateway: enforces auth, rate limits, routing, and transformation at the HTTP/API layer.
  • Health checks are the load balancer's mechanism to remove failed instances without human intervention.
  • L4 load balancers route by IP/port (fast); L7 route by headers/paths (flexible).

๐Ÿ“ Practice Quiz

  1. What is the primary job of an L7 load balancer over an L4 load balancer?

    • A) L7 is faster because it operates at a lower level
    • B) L7 can route based on HTTP headers and URL paths, not just IP and port
    • C) L7 handles SSL termination while L4 does not
    • D) L7 monitors health checks while L4 does not

    Correct Answer: B. L7 inspects HTTP content, enabling path-based and header-based routing. L4 routes purely on IP address and TCP port, which is faster but less flexible.

  2. A user repeatedly hitting the same shopping cart endpoint exceeds the allowed rate. Which component should enforce the limit?

    • A) Reverse Proxy: it can block requests before they reach the backend
    • B) API Gateway: it enforces per-user or per-endpoint rate limiting policy
    • C) Load Balancer: it drops excess connections
    • D) The application service itself

    Correct Answer: B. API Gateways maintain per-consumer state (via Redis or local counters) and are the correct layer for rate limiting. Reverse proxies lack per-user context; load balancers distribute rather than restrict.

  3. Your team deploys a new microservice for search. What is the minimal addition needed to route /v1/search traffic to it without changing existing services?

    • A) Add a new server to the load balancer pool
    • B) Add a route rule in the API Gateway mapping /v1/search to the new service
    • C) Update the reverse proxy SSL certificate
    • D) Deploy a new load balancer instance

    Correct Answer: B. The API Gateway's routing configuration is the right place to add new route mappings. Other services and infrastructure remain untouched.



Written by Abstract Algorithms (@abstractalgorithms)