How It Works: Under the Hood of the Redis Single-Threaded Engine
Understand epoll multiplexing, memory layouts, ziplists, and eviction policies inside the Redis engine.

Abstract Algorithms
TLDR: Redis achieves sub-millisecond latencies and hundreds of thousands of operations per second per instance (more with pipelining) by executing all commands on a single thread. This avoids thread context switching and lock contention, relying on non-blocking I/O multiplexing (epoll) to handle thousands of concurrent client sockets.
Design Challenge: Scaling Cache Concurrency without Locks
Imagine you are scaling an API gateway that handles 50,000 requests per second. The gateway stores its rate-limit quotas in a centralized cache.
If the cache uses a traditional multithreaded locking architecture (like a synchronized shared hash map), every incoming request thread must acquire a lock on the target key's bucket before incrementing the request counter.
Under high concurrent load, this design runs into lock contention bottlenecks:
- As thread count increases, threads spend more CPU cycles waiting in queues for locks rather than doing useful work.
- The Operating System constantly context-switches threads in and out of CPU cores, consuming memory bandwidth and polluting CPU L1/L2 caches.
- Thread safety bugs (like deadlocks or race conditions) emerge, causing unpredictable latencies.
To solve this, Redis takes a different approach: it executes all commands on a single thread. With no locks, thread context switches, or synchronization overhead, command execution is bounded mostly by memory access speed.
However, this introduces a new challenge: how can a single thread process requests from 10,000 concurrent client connections without blocking? The answer lies in I/O multiplexing.
Core Architecture: The Single-Threaded Event Loop
The Redis engine is built around a non-blocking event loop. Instead of allocating a thread per client connection, Redis delegates socket monitoring to the Operating System kernel.
The architecture is composed of four main components:
- Socket Connections: Client TCP sockets sending commands (e.g., GET, SET).
- I/O Multiplexer (Epoll/Kqueue): A kernel-level system utility that monitors thousands of sockets concurrently and returns only the sockets that have pending data to read.
- Event Demultiplexer: Translates raw socket events into execution tasks (Read, Write, Close).
- File Event Handler: The single-threaded execution loop that processes commands sequentially.
The diagram below shows the component architecture:
graph LR
Sockets[Client Sockets 1-N] -->|Send Data| Multiplexer[I/O Multiplexer - epoll]
Multiplexer -->|Ready Sockets| Queue[Event Queue]
Queue -->|Process Sequentially| Loop[Single-Threaded Event Loop]
Loop -->|Query/Write| Memory[In-Memory Data Store]
This system diagram illustrates the Redis command processing chain. Multiple client sockets send concurrent network packets. The I/O multiplexer (epoll) monitors these connections at the OS kernel level, queueing only the ready sockets. The single-threaded event loop pulls events from the queue one-by-one and executes them against the in-memory data store, avoiding any concurrency lock overhead.
Core Mechanics: Epoll, Multiplexing, and Data Structures
The core execution path of Redis is driven by the ae event library, which wraps OS-specific multiplexing system calls.
1. Sockets Multiplexing with Epoll
On Linux systems, Redis uses the epoll API. The event loop registers client socket file descriptors (FDs) with epoll_ctl.
When the loop runs, it invokes epoll_wait, which blocks until at least one registered socket is ready or the timeout for the next timer event expires. The OS kernel then wakes the thread and returns the list of ready FDs, so the thread never has to busy-wait.
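The register-then-wait cycle described above can be sketched with Python's standard selectors module, which wraps epoll on Linux and kqueue on BSD/macOS. This is a minimal illustration, not Redis's ae library: the helper names serve_ready_sockets and demo, the use of socketpair in place of real TCP clients, and the uppercase "command" are all assumptions made for the example.

```python
import selectors
import socket


def serve_ready_sockets(sel, handle, timeout=1.0):
    """Run one loop iteration: wait for readable sockets, then handle each in turn."""
    handled = 0
    for key, _mask in sel.select(timeout):    # wraps epoll_wait() on Linux
        conn = key.fileobj
        data = conn.recv(4096)                # the kernel reported data, so no blocking
        if data:
            conn.sendall(handle(data))        # requests are processed one at a time
            handled += 1
        else:                                 # empty read: the peer closed the socket
            sel.unregister(conn)
            conn.close()
    return handled


def demo():
    sel = selectors.DefaultSelector()         # picks epoll/kqueue when available
    clients, servers = [], []
    for _ in range(3):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)  # wraps epoll_ctl(ADD)
        clients.append(client)
        servers.append(server_side)

    # Only two of the three clients send anything; the idle one costs nothing.
    clients[0].sendall(b"ping")
    clients[2].sendall(b"hello")

    handled = serve_ready_sockets(sel, handle=bytes.upper)
    replies = [clients[0].recv(4096), clients[2].recv(4096)]

    sel.close()
    for sock in clients + servers:
        sock.close()
    return handled, replies
```

Calling demo() handles the two sockets that have data and skips the idle one, which is the property that lets a single thread watch many connections.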
2. Specialized Memory Layouts
To keep memory access fast, Redis uses custom data layouts that minimize overhead:
- Dict: A hash table implementation with incremental rehashing. When the table grows, Redis migrates buckets to the new table a few at a time during normal command lookups and updates, avoiding one long blocking pause.
- Ziplist: A compact, contiguous byte array used for small lists, hashes, and sorted sets (Redis 7.0 replaced it with the similar listpack encoding). It eliminates pointer overhead and keeps data contiguous in CPU caches.
- Skiplist: A probabilistic alternative to balanced trees, used alongside hash tables to implement Sorted Sets (zset). It supports sorted element queries in $O(\log N)$ average time.
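To make incremental rehashing concrete, here is a toy hash table that moves one bucket to the new table on every get or set. It is a simplified sketch, not Redis's dict.c: the class name IncrementalDict, the one-bucket-per-operation step size, and the grow-when-full trigger are assumptions chosen for readability.

```python
class IncrementalDict:
    """Toy hash table that migrates one bucket per operation while growing."""

    def __init__(self, size=4):
        self.tables = [[[] for _ in range(size)], None]  # [current, rehash target]
        self.rehash_idx = -1   # -1 means no rehash is in progress
        self.used = 0

    def _rehash_step(self):
        """Move a single bucket from the old table to the new one."""
        if self.rehash_idx == -1:
            return
        old, new = self.tables
        for key, value in old[self.rehash_idx]:
            new[hash(key) % len(new)].append((key, value))
        old[self.rehash_idx] = []
        self.rehash_idx += 1
        if self.rehash_idx == len(old):        # every bucket has been migrated
            self.tables = [new, None]
            self.rehash_idx = -1

    def _find(self, key):
        """Look in both tables, since a key may live in either during a rehash."""
        for table in self.tables:
            if table is None:
                continue
            bucket = table[hash(key) % len(table)]
            for i, (existing, _value) in enumerate(bucket):
                if existing == key:
                    return bucket, i
        return None, -1

    def set(self, key, value):
        self._rehash_step()
        bucket, i = self._find(key)
        if bucket is not None:                 # overwrite an existing key
            bucket[i] = (key, value)
            return
        if self.rehash_idx == -1 and self.used >= len(self.tables[0]):
            # Table is full: allocate a table twice the size and start migrating.
            self.tables[1] = [[] for _ in range(len(self.tables[0]) * 2)]
            self.rehash_idx = 0
        # While rehashing, new keys go straight into the new table.
        target = self.tables[1] if self.rehash_idx != -1 else self.tables[0]
        target[hash(key) % len(target)].append((key, value))
        self.used += 1

    def get(self, key, default=None):
        self._rehash_step()
        bucket, i = self._find(key)
        return bucket[i][1] if bucket is not None else default
```

No single call ever copies the whole table, so the cost of growing is spread across many cheap operations instead of one long pause.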
Architectural Blueprint: End-to-End Command Processing
To trace how a command is processed, the sequence diagram below maps a request from socket arrival to client response:
sequenceDiagram
participant Client
participant OS as OS Kernel (epoll)
participant Loop as Redis Event Loop
participant DB as In-Memory DB
Client->>OS: Send "SET key value" over TCP
Note over OS: Mark socket FD as readable
Loop->>OS: Call epoll_wait()
OS-->>Loop: Return active socket FD list
Loop->>Loop: Read bytes from socket buffer
Loop->>Loop: Parse command protocols (RESP)
Loop->>DB: Execute write on key
DB-->>Loop: Acknowledge write
Loop->>Client: Send "OK" response
This sequence diagram traces the lifecycle of a Redis command. The client writes command bytes to a TCP socket. The OS kernel marks the file descriptor as readable. The Redis event loop detects the active FD using epoll_wait(), reads the socket buffer, parses the RESP protocol, executes the update in-memory, and writes the response back to the client socket.
Deep Dive: Ziplists, Skiplists, and Eviction Logic
To scale Redis memory efficiency, we must analyze the internal layouts of its structures.
The Internals of Redis Memory Allocation, RESP, and Defragmentation
Every Redis object is wrapped in a robj structure, which records the type, the encoding, a reference count, and a pointer to the underlying value. To reduce per-element allocations and the fragmentation they cause in the allocator (jemalloc), Redis uses Ziplists for small collections. A ziplist stores each element as contiguous bytes containing the length of the previous entry, the encoding and length of the current entry, and the payload. Because it is a single contiguous block of memory, it avoids the pointer overhead of standard linked lists (which require 24 bytes per node for pointers on 64-bit systems). If a ziplist-encoded hash or sorted set exceeds its configured bounds (by default 512 entries for hashes, 128 for sorted sets, or 64 bytes per value), Redis automatically converts it to a hashtable or skiplist. Lists are stored as a quicklist, which is a linked list of ziplist nodes.
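The entry layout described above (previous-entry length, current length, payload) can be illustrated with a deliberately simplified packer. This is a sketch under stated assumptions, not the real ziplist format: it uses fixed one-byte length fields, a 4-byte header holding only the tail offset, and hypothetical function names.

```python
import struct

HEADER = struct.Struct("<I")   # assumed header: byte offset of the last entry


def pack_entries(values):
    """Pack byte strings into one buffer as [prevlen][len][payload] entries."""
    body = bytearray()
    prev_len = 0               # the first entry has no predecessor
    tail = 0
    for value in values:
        if len(value) > 253:
            raise ValueError("this sketch uses one-byte length fields")
        tail = len(body)
        body += bytes([prev_len, len(value)]) + value
        prev_len = 2 + len(value)          # size of the entry just written
    return HEADER.pack(HEADER.size + tail) + bytes(body)


def iter_forward(buf):
    """Walk entries head to tail using each entry's own length."""
    pos = HEADER.size
    while pos < len(buf):
        length = buf[pos + 1]
        yield buf[pos + 2 : pos + 2 + length]
        pos += 2 + length


def iter_backward(buf):
    """Walk entries tail to head using the stored previous-entry lengths."""
    if len(buf) == HEADER.size:            # empty list
        return
    (pos,) = HEADER.unpack_from(buf, 0)
    while True:
        prev_len, length = buf[pos], buf[pos + 1]
        yield buf[pos + 2 : pos + 2 + length]
        if prev_len == 0:                  # reached the first entry
            break
        pos -= prev_len
```

Three short values fit in 18 bytes here, with two bytes of bookkeeping per entry, whereas a pointer-based list would spend 24 bytes per node on pointers alone. The stored previous-entry length is what allows backward traversal without any pointers.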
Redis parses incoming bytes using the REdis Serialization Protocol (RESP). RESP encodes simple strings, errors, integers, bulk strings, and arrays, marking each with a one-byte prefix (+ for simple strings, - for errors, : for integers, $ for bulk strings, and * for arrays) and terminating each line with CRLF. Because the protocol is human-readable yet length-prefixed, the parser consumes very little CPU.
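A small encoder and parser show why this protocol is cheap to process: every value starts with a type byte, and bulk strings carry their length up front. This is a minimal sketch covering the RESP2 types listed above; the function names and the RespError class are assumptions for the example, and it expects a complete message in the buffer.

```python
class RespError(Exception):
    """Represents an error reply (a line starting with '-')."""


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse(buf, pos=0):
    """Parse one RESP value starting at buf[pos]; return (value, next_pos)."""
    end = buf.index(b"\r\n", pos)
    prefix, line = buf[pos : pos + 1], buf[pos + 1 : end]
    pos = end + 2
    if prefix == b"+":                       # simple string
        return line.decode(), pos
    if prefix == b"-":                       # error, returned rather than raised
        return RespError(line.decode()), pos
    if prefix == b":":                       # integer
        return int(line), pos
    if prefix == b"$":                       # bulk string, length-prefixed
        length = int(line)
        if length == -1:                     # null bulk string
            return None, pos
        return buf[pos : pos + length], pos + length + 2
    if prefix == b"*":                       # array of nested values
        count = int(line)
        if count == -1:                      # null array
            return None, pos
        items = []
        for _ in range(count):
            item, pos = parse(buf, pos)
            items.append(item)
        return items, pos
    raise ValueError(f"unknown RESP type prefix: {prefix!r}")
```

For example, encode_command("SET", "key", "value") produces the bytes a client would put on the wire, and parse turns them back into a list of three bulk strings.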
Memory fragmentation is a significant challenge when keys are updated with varying sizes. Because jemalloc serves allocations from fixed size classes, deleting and resizing keys leaves partially used pages that cannot be returned to the OS. To counter this, Redis implements Active Defragmentation (the activedefrag setting). It scans the keyspace, copies objects out of sparsely used regions into new allocations, updates the pointers, and frees the old memory so that the allocator can release whole pages back to the OS.
Performance Analysis of Multi-Threading in Redis 6+
A common interview question asks: "Is Redis multithreaded now?" The answer is nuanced. Since Redis 6, the engine supports I/O Threads to parallelize network read and write execution.
When I/O threads are enabled (the io-threads setting), the main thread hands pending sockets to them. In Redis 6 the threads only write responses by default; with io-threads-do-reads enabled, an I/O thread also reads the raw socket buffer and parses the RESP command.
However, the actual execution of the command against the in-memory database remains strictly single-threaded on the main thread. Once a reply is computed, writing it to the client socket is delegated back to the I/O threads, combining single-threaded execution safety with multi-threaded I/O scaling.
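The division of labor can be modeled in a few lines: worker threads handle parsing and reply formatting, while every command executes on the calling thread with no locks. This is only a conceptual sketch using a thread pool; the function names, the three supported commands, and the simplified reply format are assumptions, and real Redis coordinates its I/O threads differently.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_request(raw):
    """Runs on an I/O thread: turn raw bytes into a command list."""
    return raw.decode().split()


def format_reply(value):
    """Runs on an I/O thread: serialize a result (simplified, not real RESP)."""
    return str(value).encode() + b"\r\n"


def execute(db, command):
    """Runs only on the main thread, so the dict needs no locking."""
    name, *args = command
    if name == "SET":
        db[args[0]] = args[1]
        return "OK"
    if name == "INCR":
        db[args[0]] = int(db.get(args[0], 0)) + 1
        return db[args[0]]
    if name == "GET":
        return db.get(args[0])
    raise ValueError(f"unsupported command: {name}")


def process_batch(raw_requests, db, io_threads=4):
    with ThreadPoolExecutor(max_workers=io_threads) as pool:
        commands = list(pool.map(parse_request, raw_requests))   # parallel parse
        results = [execute(db, cmd) for cmd in commands]         # serial execute
        return list(pool.map(format_reply, results))             # parallel write
```

Because pool.map preserves input order and execution stays serial, the two INCR calls in a batch like SET n 1, INCR n, INCR n, GET n can never interleave.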
Real-World Implementation: Caching at Stripe
Stripe utilizes large Redis clusters to manage rate limiting for millions of API calls. Since rate limit counters require atomic updates, Stripe uses Redis Lua Scripts:
- Because Redis runs commands sequentially on a single thread, any Lua script is executed atomically.
- No other client commands can execute while a Lua script is running, ensuring counter checks and updates complete without race conditions.
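The check-then-increment logic that such a script must run as one unit looks roughly like the following. This is a plain-Python model of a fixed-window limiter, not Stripe's implementation and not a Lua script: the class name, the window scheme, and the in-process dictionary standing in for Redis keys are all assumptions.

```python
class FixedWindowLimiter:
    """Model of a per-client counter that resets every `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}     # stands in for Redis keys; never expired here

    def allow(self, client_id, now):
        """Read, compare, and increment in one step.

        In Redis, this whole body would run inside a single Lua script so
        that no other client's command can execute between the read and
        the write.
        """
        key = (client_id, int(now // self.window))
        count = self.counters.get(key, 0)
        if count >= self.limit:
            return False
        self.counters[key] = count + 1
        return True
```

If the read and the write were sent as two separate commands from different clients, both could observe a count below the limit and both be admitted; running them as one script removes that window.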
Trade-offs and Failure Modes: Single-Thread Blockage and Slow Queries
The single-threaded design introduces unique failure modes:
- The Blocking Command Trap: If a query runs in $O(N)$ time (e.g., KEYS * or HGETALL on a hash with 10 million elements), the single thread will block for seconds. All other client requests queue up, leading to connection timeouts across your microservices.
- No Multi-Core Utilization: Redis cannot scale beyond a single CPU core for command processing. To utilize multi-core servers, you must run multiple Redis instances on different ports (sharding).
Decision Guide: Redis vs. Memcached vs. KeyDB
Use this guide to choose the right in-memory store for your architecture.
| Feature | Redis | Memcached | KeyDB |
| --- | --- | --- | --- |
| Concurrency Model | Single-threaded event loop | Multithreaded with fine-grained locking | Multithreaded (several event-loop threads sharing one keyspace) |
| Data Structures | Rich (Lists, Sets, Hashes, Sorted Sets) | Simple (Strings / Key-Value only) | Rich (Redis compatible) |
| Atomic Operations | Yes (Lua Scripts, Transactions) | Limited (incr/decr, CAS) | Yes |
| Horizontal Scale | Redis Cluster (Sharding) | Client-side hashing | Multi-Master active replication |
Practical Implementation: Tuning redis.conf for Production
To configure Redis eviction policies and persistence in production:
# Max memory constraint (e.g., 2 GB)
maxmemory 2gb
# Eviction strategy: Evict any key using approximated LRU
maxmemory-policy allkeys-lru
# Approximated LRU sample size (higher = more accurate, but slightly more CPU)
maxmemory-samples 5
With this configuration, once memory usage reaches 2 GB Redis starts evicting keys that are approximately the least recently used, which helps prevent the OS out-of-memory killer from terminating the Redis process.
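The word "approximated" matters: rather than tracking a global LRU order, Redis samples a few keys and evicts the oldest among them. The sketch below models that idea in its simplest form. It is an assumption-laden illustration (the function name, the dictionary of last-access times, and the absence of Redis's candidate pool are all simplifications), with samples playing the role of maxmemory-samples.

```python
import random


def evict_one(last_access, samples=5, rng=random):
    """Evict the least recently used key among a random sample of keys.

    `last_access` maps key -> last access time. Only `samples` keys are
    inspected, so the victim is old but not necessarily the global oldest.
    """
    candidates = rng.sample(list(last_access), min(samples, len(last_access)))
    victim = min(candidates, key=last_access.__getitem__)
    del last_access[victim]
    return victim
```

A larger sample makes the choice closer to true LRU at the cost of inspecting more keys per eviction, which is the trade-off the maxmemory-samples comment describes.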
Lessons Learned: Production Redis Anti-Patterns
Avoid these common Redis mistakes in production:
- Running KEYS *: Never run KEYS * in production. It scans the entire keyspace, blocking the single thread. Use SCAN instead, which iterates through keys in small batches so that no single call blocks the loop for long.
- Unbounded Keyspaces: Always set a Time-To-Live (TTL) on caching keys. Without a TTL, temporary data will accumulate, eventually triggering memory evictions on critical static data.
- Large Payloads: Do not store large objects (e.g., 50 MB PDFs) in Redis. Network serialization overhead blocks the event loop, degrading QPS for all other clients.
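The difference between KEYS and SCAN comes down to cursor-based iteration: each call does a bounded amount of work and returns a cursor for the next call, with 0 meaning the scan is complete. The model below illustrates that contract over a list of hash buckets. It is a simplification with hypothetical function names; real SCAN uses a different cursor scheme so that it stays correct while the table is resized.

```python
def scan(buckets, cursor, count=2):
    """Visit up to `count` buckets; return (next_cursor, keys). 0 means done."""
    end = min(cursor + count, len(buckets))
    keys = [key for bucket in buckets[cursor:end] for key in bucket]
    next_cursor = 0 if end >= len(buckets) else end
    return next_cursor, keys


def scan_all(buckets, count=2):
    """Drive the cursor to completion, as a client loop would."""
    cursor, seen, calls = 0, [], 0
    while True:
        cursor, keys = scan(buckets, cursor, count)
        seen.extend(keys)
        calls += 1
        # Between calls the server is free to run other clients' commands.
        if cursor == 0:
            break
    return seen, calls
```

Each individual call touches at most count buckets, so the time any one request holds the event loop stays small regardless of how large the keyspace is.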
Summary: The Redis Engine Rulebook
- Single Thread: Avoids lock contention and context switching, executing commands at memory speeds.
- Epoll Multiplexing: Leverages the OS kernel to monitor thousands of active sockets without resource exhaustion.
- Atomic execution: Guarantees atomic operations naturally since no other command can run concurrently.
- Ziplists: Compressed data layouts reduce memory usage and optimize CPU cache access.
- Avoid $O(N)$ Operations: Never run blocking commands like KEYS * that stall the single execution thread.