LLD for LRU Cache: Designing a High-Performance Cache
OOP design breakdown: Cache<K,V> interface, SOLID principles, and the HashMap + DoublyLinkedList pattern that powers LRU eviction in Java.
TLDR
An LRU (Least Recently Used) Cache evicts the item that hasn't been accessed the longest when it's full. The classic implementation combines a HashMap (O(1) lookup) with a Doubly Linked List (O(1) move-to-front) for overall O(1) get and put.
The Browser History Analogy
Your browser's "Back" button stores the pages you visited most recently. If you visit 100 pages and your browser only keeps 10 in memory, it drops the oldest one.
LRU says: the item you haven't needed in the longest time is the least valuable, so evict it first.
The classic challenge is implementing this in O(1) for both get and put. A hash map gives O(1) lookup but can't maintain order. A linked list maintains order but gives O(n) lookup. The insight: use both together.
Eviction Policies: Why LRU Makes Sense
A cache is a small, fast store that holds a hot subset of a larger, slower dataset. When your application repeatedly reads the same data from a database, remote API, or disk, caching the results in memory slashes latency from milliseconds to microseconds.
Because caches have limited capacity, they need an eviction policy: a rule for deciding which item to drop when the cache is full and a new item arrives.
| Policy | Full Name | Rule | Best For |
|---|---|---|---|
| FIFO | First In, First Out | Evict the oldest-inserted item | Streaming pipelines |
| LFU | Least Frequently Used | Evict the item with the fewest total accesses | Stable popularity rankings |
| LRU | Least Recently Used | Evict the item not accessed for the longest time | Most interactive workloads |
LRU works well because recency correlates with future demand in typical access patterns. The web page you loaded two seconds ago is far more likely to be re-accessed than one you loaded yesterday. LRU exploits this temporal locality naturally.
Key vocabulary:
- Cache hit: the requested key is found in the cache; fast path, no source fetch needed.
- Cache miss: the key is absent; fetch from the slow source, insert the result into the cache.
- Hit rate: hits ÷ total requests. A well-designed LRU cache often achieves 80–99% hit rates in production.
- Eviction: removing an existing entry to make room for a new one.
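The vocabulary maps directly onto a pair of counters. A minimal sketch (the `HitCounter` class name and its plain-HashMap backing are illustrative, not from the post):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative wrapper that counts hits and misses around a plain map.
public class HitCounter {
    private final Map<String, String> store = new HashMap<>();
    private long hits = 0, misses = 0;

    public void put(String k, String v) { store.put(k, v); }

    public String get(String k) {
        String v = store.get(k);
        if (v != null) hits++; else misses++; // hit vs. miss
        return v;
    }

    // Hit rate = hits / total requests
    public double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        HitCounter c = new HitCounter();
        c.put("a", "1");
        c.get("a"); // hit
        c.get("b"); // miss
        c.get("a"); // hit
        System.out.println(c.hitRate()); // 2 hits out of 3 requests ≈ 0.667
    }
}
```

In a real cache, the miss branch is where the slow-source fetch and the insert-into-cache step would go.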
The HashMap + Doubly Linked List Architecture
```mermaid
flowchart LR
    HashMap["HashMap: key → node pointer"] --> DLL["Doubly Linked List: Head (MRU) … Tail (LRU)"]
```
The two-structure design is the central insight: the HashMap (left) provides O(1) key lookup by storing a direct pointer to each Node in the list, while the Doubly Linked List (right) maintains access order so the most recently used node stays at the head and the least recently used node sits at the tail. Together, both get() and put() complete in O(1), something neither data structure can achieve alone. The arrow represents the dependency: every cache operation starts at the HashMap to retrieve the node reference, then hands off to the DLL for order-preserving position updates.
- HashMap: maps each key to its Node in the list for O(1) access.
- Doubly Linked List: maintains access order. Head = Most Recently Used, Tail = Least Recently Used.
- On get(key): find the node (O(1) via the map), move it to head (O(1) via pointer manipulation).
- On put(key, val) when full: remove the tail node (O(1)), delete its key from the map, insert the new node at head.
Step-by-Step Trace: Capacity = 3
| Operation | List (Head → Tail) | HashMap | Note |
|---|---|---|---|
| put(1, A) | [1] | {1→Node1} | First entry |
| put(2, B) | [2, 1] | {1, 2} | 2 is MRU |
| put(3, C) | [3, 2, 1] | {1, 2, 3} | Full |
| get(1) | [1, 3, 2] | {1, 2, 3} | 1 accessed → moves to head |
| put(4, D) | [4, 1, 3] | {1, 3, 4} | 2 (tail) evicted; 4 inserted at head |
Key: get(1) bumped 1 to head, so 2 became the new tail. When 4 arrives, 2 gets evicted, not 1.
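The trace above can be checked mechanically with Java's built-in access-ordered LinkedHashMap. This is a sketch of the same LRU behaviour, not the HashMap + DLL implementation this post builds later:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TraceCheck {
    static final int CAPACITY = 3;

    // Replays the trace from the table above and returns the final cache state.
    static Map<Integer, String> run() {
        // accessOrder = true keeps entries in LRU order: the eldest entry is the LRU
        Map<Integer, String> cache = new LinkedHashMap<Integer, String>(CAPACITY, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > CAPACITY; // evict the eldest (LRU) entry when over capacity
            }
        };
        cache.put(1, "A");
        cache.put(2, "B");
        cache.put(3, "C");
        cache.get(1);      // promotes key 1 to MRU
        cache.put(4, "D"); // evicts key 2, the LRU entry
        return cache;
    }

    public static void main(String[] args) {
        // Iteration order is LRU → MRU: [3, 1, 4]; key 2 has been evicted
        System.out.println(run().keySet());
    }
}
```

The final key set matches the last row of the table: 2 is gone, 4 sits at the MRU end.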
Cache Entry Lifecycle States
```mermaid
stateDiagram-v2
    [*] --> HOT : put(key, val) inserts at head (MRU)
    HOT --> HOT : get(key) moves it back to head
    HOT --> COLD : newer entries push it toward tail
    COLD --> HOT : get(key) moves it back to head (MRU)
    COLD --> EVICTED : cache full, tail removed on next put()
    EVICTED --> [*]
```
The state machine captures the complete lifecycle of a single cache entry from insertion to eviction. A node enters as HOT (placed at the head/MRU position) and returns to HOT whenever get() is called on it, resetting its eviction clock with each access. Without further access, newer insertions push it down the list toward the tail, transitioning it to COLD. The only exit path is from COLD: when the cache reaches capacity and a new key arrives, put() removes the tail node (always the entry that has gone the longest without any access) and purges its key from the HashMap at the same time.
LRU Cache Operation Flow
Every get and put follows a deterministic decision path. Tracking which branch executes is the key to understanding unexpected evictions.
```mermaid
flowchart TD
    A[Request arrives] --> B{get or put?}
    B -->|"get(key)"| C{Key in HashMap?}
    C -->|No| D["Return -1 (cache miss)"]
    C -->|Yes| E["Move node to head (mark as MRU)"]
    E --> F["Return value (cache hit)"]
    B -->|"put(key, val)"| G{Key exists?}
    G -->|Yes| H["Update value; move node to head"]
    G -->|No| I["Create new node; insert at head"]
    I --> J{Size > capacity?}
    J -->|Yes| K["Remove tail node; delete from HashMap"]
    J -->|No| L[Done]
    K --> L
```
This decision tree captures every execution path through get() and put() in a single view. The left branch (get) is two steps: a HashMap lookup, then either a miss (return -1) or a move-to-head followed by returning the value. The right branch (put) fans wider: if the key already exists, update the value and promote to head; if it is a new key, insert at head, then check whether size now exceeds capacity and evict the tail. The critical invariant the diagram makes visible is that both happy paths (cache hit on get, key-exists on put) converge on the same move-to-head action, so MRU promotion is always the final step of any successful access.
Both operations always leave the target node at the head of the list, the Most Recently Used position. The tail node is always the next eviction candidate.
Notice there is no branching inside the pointer manipulation itself: whether you are moving an existing node or inserting a new one, the same removeNode + insertAtHead helpers do the work. This uniformity is exactly why the sentinel-node design pays off: every node, real or dummy, participates in the same four-pointer relinking sequence.
How the Four OOP Pillars Shape the LRU Cache Design
Object-oriented design is not just syntax; it is a discipline for assigning clear, bounded responsibilities to each class. The LRU cache is a compact example of all four OOP pillars working together in a single design.
| Pillar | Where It Appears in the LRU Cache |
|---|---|
| Encapsulation | LruCache<K,V> hides its HashMap<K, Node<K,V>> and DoublyLinkedList; callers only see get(K key) and put(K key, V value), and the prev/next node pointers are never exposed |
| Abstraction | A Cache<K,V> interface defines the contract; callers do not need to know whether the policy is LRU, LFU, or FIFO |
| Inheritance | LruCache<K,V> implements Cache<K,V>; LfuCache<K,V> implements Cache<K,V>; both satisfy the same interface and are usable wherever Cache is expected |
| Polymorphism | Cache<K,V> browserCache = new LruCache<>(100); the calling code is entirely policy-agnostic |
Encapsulation: The Node<K,V> inner class, with its prev and next pointers, is declared private static. A dedicated DoublyLinkedList helper manages all pointer manipulation. Neither the node structure nor the list internals are visible outside LruCache. This strong boundary means the implementation can swap HashMap for ConcurrentHashMap, change the sentinel-node scheme, or add hit-rate counters without changing a single line of calling code.
Abstraction: The Cache<K,V> interface exposes only the operations callers need: get(), put(), size(), and capacity(). Everything else (eviction logic, node ordering, pointer relinking) is an implementation detail hidden behind the interface boundary. A browser cache, a session store, and a database buffer pool can all be coded against Cache<K,V> without committing to a specific eviction policy.
Inheritance and Polymorphism: Because LruCache and LfuCache both implement Cache<K,V>, a ProductService or SessionManager that declares Cache<K,V> as its dependency can receive either implementation without modification. Swapping the eviction strategy becomes a configuration decision, not a code change.
```mermaid
classDiagram
    direction LR
    class Cache~K,V~ {
        <<interface>>
        +get(key K) Optional~V~
        +put(key K, value V) void
        +size() int
        +capacity() int
    }
    class LruCache~K,V~ {
        -capacity int
        -map HashMap
        -head Node
        -tail Node
        +get(key K) Optional~V~
        +put(key K, value V) void
    }
    class LfuCache~K,V~ {
        -capacity int
        -freqMap HashMap
        +get(key K) Optional~V~
        +put(key K, value V) void
    }
    class Node~K,V~ {
        -key K
        -value V
        -prev Node
        -next Node
    }
    Cache <|.. LruCache : implements
    Cache <|.. LfuCache : implements
    LruCache *-- Node : has-a (private)
```
The class diagram expresses all four OOP pillars in a single picture. Cache<K,V> is the abstract boundary: callers depend on this interface, never on a concrete implementation (abstraction + polymorphism). Both LruCache<K,V> and LfuCache<K,V> implement Cache<K,V> (inheritance), making them drop-in substitutes wherever a Cache reference is declared. The filled-diamond arrow from LruCache to Node<K,V> denotes strong composition: Node is a private inner class whose prev and next pointers never escape LruCache's boundary, enforcing the encapsulation principle that keeps pointer logic invisible to all callers.
Class diagram: LruCache and LfuCache both implement Cache<K,V> (inheritance + polymorphism). Node<K,V> is a private composition inside LruCache, enforcing encapsulation so pointer details never leak to callers.
SOLID Principles in the LRU Cache Design
Getting the algorithm right is only half the job in LLD. The other half is designing classes that are easy to extend, test, and replace independently. Every SOLID principle has a concrete mapping in the LRU cache design.
| Principle | How It Applies to the LRU Cache |
|---|---|
| SRP (Single Responsibility) | DoublyLinkedList manages node ordering only; LruCache coordinates the map + list; Node<K,V> is a plain data container. Each class has exactly one job |
| OCP (Open/Closed) | Adding LFU, MRU, or ARC eviction means writing a new class that implements Cache<K,V>; LruCache itself never changes |
| LSP (Liskov Substitution) | LfuCache<K,V> and LruCache<K,V> both honour the get()/put() contract; either can replace the other without breaking any caller |
| ISP (Interface Segregation) | Cache<K,V> exposes exactly get(), put(), size(), capacity(); callers are not forced to implement eviction callbacks or internal hooks they do not need |
| DIP (Dependency Inversion) | Application code depends on the Cache<K,V> abstraction, not on LruCache<K,V> directly; the eviction policy can be swapped without touching business logic |
SRP in practice: When a pointer-corruption bug surfaces, you look inside DoublyLinkedList. When an eviction fires at the wrong time, you look inside LruCache. The responsibilities never overlap, so the search space for each bug is immediately clear, and the unit test for each class is small and self-contained.
OCP in practice: A product manager requests LFU eviction for the product catalogue while keeping LRU for session data. The answer: write LfuCache<K,V> implements Cache<K,V> and inject it into the product service. The existing LruCache is untouched: closed for modification, open for extension.
DIP in practice: A ProductService that declares private final Cache<String, Product> cache can be unit-tested with a trivial NoOpCache<> stub, load-tested with LruCache<>, and deployed backed by Caffeine, all without changing the service class itself.
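A minimal sketch of that test seam. NoOpCache is the hypothetical stub named above; values are simplified to String so the snippet is self-contained, and fetchFromDb is a stand-in for the real database call:

```java
import java.util.Optional;

// The public contract callers depend on (get/put subset of the post's Cache<K,V>)
interface Cache<K, V> {
    Optional<V> get(K key);
    void put(K key, V value);
}

// Hypothetical stub: always misses, never stores; forces the slow path in tests.
class NoOpCache<K, V> implements Cache<K, V> {
    public Optional<V> get(K key) { return Optional.empty(); }
    public void put(K key, V value) { /* intentionally drop */ }
}

class ProductService {
    private final Cache<String, String> cache;

    ProductService(Cache<String, String> cache) { this.cache = cache; }

    String getProduct(String id) {
        // Cache hit short-circuits; miss falls through to the slow source
        return cache.get(id).orElseGet(() -> fetchFromDb(id));
    }

    // Stand-in for the real database call
    private String fetchFromDb(String id) { return "Product-" + id; }
}

public class DipDemo {
    public static void main(String[] args) {
        // The service never names a concrete cache class
        ProductService service = new ProductService(new NoOpCache<>());
        System.out.println(service.getProduct("42")); // always takes the DB path
    }
}
```

Swapping NoOpCache for a real LRU implementation is a one-line change at the constructor call.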
Interface Contracts and the Eviction Policy Boundary
Defining explicit Java interfaces converts the LRU cache from a standalone algorithm into a reusable design component. Two interfaces define the full contract:
```java
import java.util.Optional;

// The public contract: what all callers see
interface Cache<K, V> {
    Optional<V> get(K key);   // empty Optional on cache miss
    void put(K key, V value); // evicts the LRU entry if already at capacity
    int size();
    int capacity();
}

// The internal eviction contract: what LruCache uses to decide what to remove
interface EvictionPolicy<K> {
    K evict();            // returns the key to remove when the cache is full
    void onAccess(K key); // called on every successful get
    void onInsert(K key); // called on every put
}
```
Why two separate interfaces? Cache<K,V> is the public boundary that callers depend on; it is stable and changes rarely. EvictionPolicy<K> is the internal seam where a new eviction algorithm slots in. Keeping them separate means you can plug in an LfuPolicy<K> or an ArcPolicy<K> without changing a single method signature on the public Cache<K,V> API. LruCache implements Cache<K,V> and uses EvictionPolicy<K> internally; callers never see the second interface at all.
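One possible LruPolicy over that internal contract. The post defines only the interface; this LinkedHashSet-backed implementation is an assumption, restated here so the snippet is self-contained:

```java
import java.util.LinkedHashSet;

interface EvictionPolicy<K> {
    K evict();
    void onAccess(K key);
    void onInsert(K key);
}

// Sketch: LinkedHashSet iteration order doubles as recency order
// (first element = least recently used).
class LruPolicy<K> implements EvictionPolicy<K> {
    private final LinkedHashSet<K> order = new LinkedHashSet<>();

    public void onInsert(K key) { order.add(key); }

    public void onAccess(K key) { // move the key to the MRU end
        order.remove(key);
        order.add(key);
    }

    public K evict() {            // remove and return the LRU key
        K lru = order.iterator().next();
        order.remove(lru);
        return lru;
    }
}

public class PolicyDemo {
    public static void main(String[] args) {
        EvictionPolicy<Integer> p = new LruPolicy<>();
        p.onInsert(1);
        p.onInsert(2);
        p.onInsert(3);
        p.onAccess(1);                 // 1 becomes MRU
        System.out.println(p.evict()); // 2 is now the LRU key
    }
}
```

Note this policy tracks keys only; the post's HashMap + DLL design fuses recency tracking and value storage for the same effect.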
```mermaid
flowchart LR
    Client["Client code (Cache<K,V> ref)"] -->|"get / put"| LruCache["LruCache<K,V> implements Cache<K,V>"]
    LruCache -->|"onAccess / onInsert / evict"| Policy["EvictionPolicy<K> (LruPolicy or LfuPolicy)"]
    LruCache --> HashMap["HashMap<K, Node>"]
    LruCache --> DLL["DoublyLinkedList<Node>"]
```
This flow diagram shows the two-layer interface separation that makes the eviction policy pluggable without touching any caller. Client code holds a Cache<K,V> reference and is entirely unaware that LruCache is on the other end; this is the Dependency Inversion Principle in concrete form. LruCache then delegates all eviction decisions to EvictionPolicy<K> by calling onAccess, onInsert, and evict, while keeping its own HashMap and DoublyLinkedList for the actual data storage and ordering. The payoff: swapping LruPolicy for LfuPolicy is a single-line change at the injection site; neither the client code nor the HashMap/DLL wiring requires any modification.
The client only talks to the Cache<K,V> interface. LruCache delegates eviction decisions to EvictionPolicy<K>, keeping the policy pluggable without any change to the public API or the cache wiring.
Swapping the Eviction Policy: Zero Business Logic Change
The payoff of this design is that the business layer never references a concrete cache class. To switch from LRU to LFU for a high-frequency product catalogue, it is a single-line change at the injection site:
```java
// Before: LRU eviction
Cache<String, Product> cache = new LruCache<>(1000);

// After: LFU eviction; ProductService is unchanged
Cache<String, Product> cache = new LfuCache<>(1000);
```
The ProductService constructor receives Cache<String, Product> and never imports LruCache or LfuCache directly:
```java
public class ProductService {
    private final Cache<String, Product> cache; // depends on abstraction, not implementation

    public ProductService(Cache<String, Product> cache) {
        this.cache = cache;
    }

    public Product getProduct(String id) {
        return cache.get(id).orElseGet(() -> fetchFromDb(id));
    }
}
```
This is DIP and OCP in action simultaneously: the service depends on an abstraction (Cache<K,V>), and the eviction implementation is open for extension (LfuCache) without modifying any existing class.
Extended Class Diagram: DoublyLinkedList and EvictionPolicy
```mermaid
classDiagram
    class Cache~K,V~ {
        <<interface>>
        +get(key K) Optional~V~
        +put(key K, value V) void
        +size() int
        +capacity() int
    }
    class LruCache~K,V~ {
        -int capacity
        -HashMap map
        -Node head
        -Node tail
        +get(key K) Optional~V~
        +put(key K, value V) void
        -removeNode(Node) void
        -insertAtHead(Node) void
    }
    class LfuCache~K,V~ {
        -int capacity
        -HashMap freqMap
        +get(key K) Optional~V~
        +put(key K, value V) void
    }
    class Node~K,V~ {
        -K key
        -V value
        -Node prev
        -Node next
    }
    class DoublyLinkedList~K,V~ {
        -Node head
        -Node tail
        +insertAtHead(Node) void
        +removeTail() Node
        +removeNode(Node) void
    }
    class EvictionPolicy~K~ {
        <<interface>>
        +evict() K
        +onAccess(K) void
        +onInsert(K) void
    }
    Cache <|.. LruCache : implements
    Cache <|.. LfuCache : implements
    LruCache *-- Node : has-a (private)
    LruCache *-- DoublyLinkedList : uses
    LruCache o-- EvictionPolicy : delegates to
```
This expanded class diagram surfaces the responsibilities that the earlier diagram collapsed inside LruCache. DoublyLinkedList<K,V> is now a first-class collaborator with its own three-method interface (insertAtHead, removeTail, removeNode), giving it a single responsibility for all pointer manipulation and making it independently unit-testable without a HashMap in sight. The open-diamond from LruCache to EvictionPolicy<K> shows aggregation rather than composition: the policy is injected, not owned, so it can be swapped at runtime. Reading left to right, the SRP boundary is fully visible: Cache defines the contract, LruCache coordinates the collaborators, DoublyLinkedList manages order, EvictionPolicy selects eviction victims, and Node is a plain data container.
Deep Dive: LRU Cache Java Implementation
```java
import java.util.HashMap;

public class LRUCache {

    private static class Node {
        int key, val;
        Node prev, next;
        Node(int k, int v) { key = k; val = v; }
    }

    private final int capacity;
    private final HashMap<Integer, Node> map = new HashMap<>();

    // Sentinel nodes simplify boundary conditions
    private final Node head = new Node(0, 0); // Most Recent side
    private final Node tail = new Node(0, 0); // Least Recent side

    public LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    public int get(int key) {
        if (!map.containsKey(key)) return -1;
        Node n = map.get(key);
        moveToHead(n);
        return n.val;
    }

    public void put(int key, int val) {
        if (map.containsKey(key)) {
            Node n = map.get(key);
            n.val = val;
            moveToHead(n);
        } else {
            Node n = new Node(key, val);
            map.put(key, n);
            insertAtHead(n);
            if (map.size() > capacity) {
                Node lru = removeTail();
                map.remove(lru.key);
            }
        }
    }

    private void moveToHead(Node n) {
        removeNode(n);
        insertAtHead(n);
    }

    private void removeNode(Node n) {
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    private void insertAtHead(Node n) {
        n.next = head.next;
        n.prev = head;
        head.next.prev = n;
        head.next = n;
    }

    private Node removeTail() {
        Node lru = tail.prev;
        removeNode(lru);
        return lru;
    }
}
```
Sentinel nodes (head and tail) eliminate null checks at list boundaries: every real node always has a prev and next.
LRU Cache get / put Sequence
```mermaid
sequenceDiagram
    participant C as Client
    participant LC as LruCache
    participant HM as HashMap
    participant DL as DoublyLinkedList
    Note over C,DL: get(key) cache hit path
    C->>LC: get(key)
    LC->>HM: get(key)
    HM-->>LC: node reference
    LC->>DL: removeNode(node)
    LC->>DL: insertAtHead(node)
    LC-->>C: Optional.of(value)
    Note over C,DL: put(key, val) eviction path
    C->>LC: put(key, val)
    LC->>HM: containsKey(key)
    HM-->>LC: false (new key)
    LC->>DL: insertAtHead(newNode)
    LC->>LC: size > capacity?
    LC->>DL: removeTail()
    DL-->>LC: lruNode
    LC->>HM: remove(lruNode.key)
    LC-->>C: void
```
The sequence diagram reveals the exact inter-object message choreography for the two most performance-critical paths. On a cache hit (get), LruCache asks HashMap for the node reference, then sends removeNode followed by insertAtHead to DoublyLinkedList (just two pointer-relinking operations) before returning the value to the client. On the eviction path (put with a new key), the message flow extends: after inserting the new node at the head, LruCache checks its own size, calls removeTail() on the list to retrieve the LRU node, and immediately calls remove(lruNode.key) on the HashMap to keep both structures in sync. That final paired operation, removeTail on the list and remove on the map, is the invariant that every correct LRU implementation must maintain.
Spring REST Endpoint: Exposing the LRU Cache via HTTP
The domain model is complete. To wire it into a Spring Boot application, a @RestController wraps the LRUCache service and exposes HTTP endpoints for get and put operations.
```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/cache")
public class LRUCacheController {

    private final LRUCacheImpl<String, String> cache;

    // Inject a fixed-capacity cache via the constructor; swap for a @Bean for configurable capacity
    public LRUCacheController() {
        this.cache = new LRUCacheImpl<>(100);
    }

    /**
     * GET /api/cache/{key}
     * Returns the cached value, or 404 if evicted or never set.
     */
    @GetMapping("/{key}")
    public ResponseEntity<String> get(@PathVariable String key) {
        String value = cache.get(key);
        if (value == null) {
            return ResponseEntity.notFound().build();
        }
        return ResponseEntity.ok(value);
    }

    /**
     * PUT /api/cache/{key}
     * Inserts or updates a key-value pair; evicts the LRU entry if at capacity.
     */
    @PutMapping("/{key}")
    public ResponseEntity<Void> put(
            @PathVariable String key,
            @RequestBody String value) {
        cache.put(key, value);
        return ResponseEntity.ok().build();
    }

    /**
     * GET /api/cache/size
     * Returns the current number of entries in the cache.
     */
    @GetMapping("/size")
    public ResponseEntity<Integer> size() {
        return ResponseEntity.ok(cache.size());
    }
}
```
Key design choices in this endpoint:
- `@RestController` + `@RequestMapping` keep the HTTP concern isolated from the cache domain model; the `LRUCacheImpl` is unaware of HTTP.
- `GET /{key}` promotes the key on every hit (via `cache.get()`), keeping the LRU order accurate across HTTP traffic.
- `PUT /{key}` is idempotent: calling it twice with the same key simply updates the value and marks that key as recently used.
- For production use, replace the constructor-injected `LRUCacheImpl` with a Spring `@Bean` backed by Caffeine Cache (see the Caffeine section below) to get thread safety, metrics, and eviction listeners out of the box.
Trade-offs & Failure Modes: Thread Safety and Cache Invalidation
The basic implementation above is not thread-safe. For a multi-threaded cache (e.g., one shared across API request threads), there are several options:
| Approach | How | Trade-off |
|---|---|---|
| synchronized on all methods | synchronized get(), synchronized put() | Simple; locks the entire cache per operation, fine for low contention |
| ReentrantReadWriteLock | Read lock on get, write lock on put | Higher throughput for read-heavy workloads |
| LinkedHashMap(capacity, 0.75f, true) + synchronized | Java's built-in access-order LRU | Simple but coarse lock |
| Segmented locks (ConcurrentHashMap + partitioned DLL) | Shard by key hash | Best throughput; complex implementation |
For production caches, most teams use a battle-tested library like Caffeine (Java) or functools.lru_cache (Python) rather than rolling their own.
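The ReentrantReadWriteLock row deserves a caveat the table glosses over: in a pure LRU cache, get mutates the recency order, so it cannot safely run under a plain read lock. A sketch (the wrapper class name is illustrative) that takes the write lock on get and reserves the read lock for operations that genuinely do not mutate:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative wrapper; not from the post.
public class RwLockLruCache<K, V> {
    private final Map<K, V> cache;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public RwLockLruCache(int capacity) {
        // accessOrder = true: the eldest entry is always the LRU candidate
        this.cache = new LinkedHashMap<K, V>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // get() reorders the access-ordered map, so it needs the WRITE lock;
    // this is why read/write locking helps less for pure LRU than the table suggests.
    public V get(K key) {
        lock.writeLock().lock();
        try { return cache.get(key); } finally { lock.writeLock().unlock(); }
    }

    // containsKey() does not reorder; a read lock suffices and reads run concurrently.
    public boolean peek(K key) {
        lock.readLock().lock();
        try { return cache.containsKey(key); } finally { lock.readLock().unlock(); }
    }

    public void put(K key, V value) {
        lock.writeLock().lock();
        try { cache.put(key, value); } finally { lock.writeLock().unlock(); }
    }

    public int size() {
        lock.readLock().lock();
        try { return cache.size(); } finally { lock.readLock().unlock(); }
    }
}
```

Libraries like Caffeine sidestep this by buffering access events and replaying them asynchronously, which is one reason the "use a library" advice holds.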
Real-World Applications of LRU Caching
LRU, or an LRU-approximating policy, appears at every layer of the modern computing stack. Understanding the concept lets you configure and reason about these systems rather than treating them as black boxes.
CPU L1/L2/L3 caches: Hardware caches in your processor keep the most recently accessed memory lines close to the ALU. A cache miss here triggers a round-trip to slower DRAM, costing ~100× more cycles. The CPU's replacement policy is LRU-like.
Browser cache: When you load a webpage, the browser caches images, scripts, and stylesheets locally. If the cache hits its size limit, LRU evicts assets from pages you haven't visited recently, keeping your frequently-used sites fast.
CDN edge caches: Services like Cloudflare and Akamai cache static assets at geographically distributed edge nodes. LRU ensures hot content stays near users; cold content is dropped to free capacity for new requests.
Database buffer pools: MySQL's InnoDB buffer pool and PostgreSQL's shared buffer cache both keep recently used disk pages in RAM. LRU eviction means a repeated query hits memory instead of disk, often 100–1000× faster.
Redis with maxmemory-policy allkeys-lru: Redis supports LRU as a configurable eviction strategy. Setting this policy turns Redis into a drop-in application-level LRU cache for any key-value workload.
When LRU is the wrong choice: Batch ETL jobs and sequential table scans fill the cache with data that will never be accessed again, polluting it for other queries. For these workloads, LFU or adaptive policies like ARC or Window-TinyLFU (used by Caffeine) perform significantly better.
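The scan-pollution effect is easy to demonstrate: sequentially scanning more distinct keys than the cache holds means every key is evicted long before it is requested again, so even a second identical pass gets zero hits. A sketch using Java's access-ordered LinkedHashMap as the LRU (method and class names are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ScanPollutionDemo {
    // Returns the hit count for two sequential passes over `keys` distinct keys
    // through an LRU cache of the given capacity.
    static int hitsForTwoScans(int capacity, int keys) {
        Map<Integer, Integer> cache = new LinkedHashMap<Integer, Integer>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > capacity; // LRU eviction
            }
        };
        int hits = 0;
        for (int pass = 0; pass < 2; pass++) {
            for (int k = 0; k < keys; k++) {
                if (cache.get(k) != null) hits++; // cache hit
                else cache.put(k, k);             // miss: "fetch" and insert
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // 100 distinct keys through a 10-entry LRU cache: the second scan hits nothing,
        // because each key is evicted well before it comes around again.
        System.out.println(hitsForTwoScans(10, 100)); // 0
    }
}
```

With capacity at least as large as the key set, the second pass would hit every time; it is the capacity-exceeding sequential pattern that defeats LRU.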
Practice: Extend and Test the LRU Cache
Work through these exercises to internalize how the data structure behaves under different access patterns.
Exercise 1: Trace by hand
Starting with capacity = 2, trace these operations and record both the list state and the HashMap contents after each step:
put(1, "A") → put(2, "B") → get(1) → put(3, "C") → get(2)
Which key is evicted on the final put? Which key does get(2) return: hit or miss?
Hint: After get(1), key 1 moves to head (MRU). put(3, "C") evicts the LRU tail, so key 2 is gone. The final get(2) is a miss.
Exercise 2: Add a getAll() method
Extend the Java implementation with a List<Integer> getAll() method that returns all current keys in MRU-to-LRU order. Walk the linked list from head.next to tail.prev, collecting each node's key.
Exercise 3: Implement with LinkedHashMap
Java's java.util.LinkedHashMap supports access-order iteration natively. Re-implement LRUCache using LinkedHashMap with the constructor flag accessOrder = true and override removeEldestEntry. Compare the resulting code length and complexity against the custom HashMap + DLL approach.
Decision Guide: Choosing the Right Cache Implementation
| Condition | Recommendation |
|---|---|
| Need O(1) get and O(1) put | HashMap + Doubly Linked List, the classic LRU combination |
| Thread-safe, low contention | synchronized methods on get() and put() |
| Thread-safe, read-heavy workload | ReentrantReadWriteLock: concurrent reads, exclusive writes |
| Production use case | Caffeine (Java) or functools.lru_cache (Python): battle-tested, no custom implementation needed |
| Cache miss is expensive | Write-through or write-behind caching to keep the cache warm |
| Future access ≠ recent access | Consider LFU (Least Frequently Used) instead; LRU struggles with one-off cache floods |
Start with synchronized methods for correctness. Profile under production load before switching to ReentrantReadWriteLock. For any new project, start with Caffeine unless the interview or exercise specifically requires a from-scratch implementation.
Caffeine Cache and Spring Cache: Production-Grade LRU Without the Boilerplate
Caffeine is the fastest JVM caching library: a near-optimal LRU/LFU hybrid backed by a W-TinyLFU eviction policy, with concurrent non-blocking reads. Spring Cache abstracts the cache provider behind @Cacheable, @CachePut, and @CacheEvict annotations; you swap Caffeine for Redis or Ehcache by changing one config line.
How they solve the problem in this post: Caffeine delivers the O(1) get and put described in this post, and adds concurrency safety, expiry-based eviction, and hit-rate statistics out of the box. The maximumSize parameter directly maps to the capacity n in the LRU design.
```java
// Part 1: Standalone Caffeine LRU-style cache
// Dependency: com.github.ben-manes.caffeine:caffeine:3.1.8
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class CaffeineDemo {
    public static void main(String[] args) {
        // Build a bounded cache: capacity 3, expire 10 min after last access
        Cache<String, String> cache = Caffeine.newBuilder()
                .maximumSize(3)                                      // LRU capacity
                .expireAfterAccess(java.time.Duration.ofMinutes(10))
                .recordStats()                                       // hit/miss counters
                .build();

        cache.put("a", "Apple");
        cache.put("b", "Banana");
        cache.put("c", "Cherry");

        System.out.println(cache.getIfPresent("a")); // hit: "Apple"; promotes "a" to MRU
        cache.put("d", "Date");                      // capacity exceeded: evicts the LRU entry ("b", not "a")
        System.out.println(cache.getIfPresent("b")); // miss: null (evicted)
        System.out.println(cache.getIfPresent("a")); // hit: "Apple"

        // Hit rate metrics (Micrometer-compatible)
        System.out.println(cache.stats());           // e.g. CacheStats{hitRate=0.667, evictionCount=1}
    }
}
```

```java
// Part 2: Spring Cache + Caffeine: declarative caching in a service
// application.yml:
//   spring.cache.type: caffeine
//   spring.cache.caffeine.spec: maximumSize=500,expireAfterWrite=10m
import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.stereotype.Service;

@Service
public class ProductService {

    // @Cacheable: return the cached value if present; call the method and cache the result on a miss
    @Cacheable(value = "products", key = "#productId")
    public String getProduct(String productId) {
        System.out.println("Cache MISS - fetching from DB for: " + productId);
        return "Product-" + productId; // simulates a slow DB query
    }

    // @CacheEvict: invalidate the entry when the product is updated
    @CacheEvict(value = "products", key = "#productId")
    public void updateProduct(String productId, String newData) {
        System.out.println("Evicting cache for: " + productId);
        // persist newData to the database
    }
}
```

```java
// Part 3: Synchronized LinkedHashMap: thread-safe LRU for custom scenarios
// For cases where you need manual control without Caffeine
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class ThreadSafeLruCache<K, V> {
    private final Map<K, V> cache;

    public ThreadSafeLruCache(int capacity) {
        // accessOrder = true: get() moves an entry to the tail (MRU position), giving pure LRU semantics
        this.cache = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(capacity, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > capacity; // evict the LRU entry when over capacity
                }
            });
    }

    public V get(K key)             { return cache.get(key); }
    public void put(K key, V value) { cache.put(key, value); }
    public int size()               { return cache.size(); }
}
```
Caffeine's W-TinyLFU eviction is statistically superior to pure LRU for workloads with repeated frequency patterns (product catalogue, API responses), while still approximating LRU for streaming access patterns. Use Caffeine directly when you need sub-microsecond latency; use @Cacheable with Spring Cache when you want to switch providers without touching service code.
For a full deep-dive on Caffeine's W-TinyLFU algorithm and Spring Cache tuning, a dedicated follow-up post is planned.
Key Lessons from the LRU Cache Design
Match data structures to constraints. O(1) lookup demands a HashMap. O(1) order-maintenance demands a doubly linked list. Neither structure alone is enough; the LRU cache is a textbook example of combining two structures to satisfy multiple constraints simultaneously.
Sentinel nodes eliminate boundary conditions. Dummy head and tail nodes ensure every real node always has a non-null prev and next. This removes an entire class of if (node.prev == null) guards, making the code shorter and less error-prone.
Recency is a good-enough proxy for value, but not perfect. LRU assumes "recently used = likely needed again." This holds for interactive web workloads, session caches, and CPU instruction streams. It breaks for sequential scan workloads. Know your access pattern before choosing a policy.
Single-threaded correctness does not imply thread safety. The HashMap + DLL combination is not atomic. In a multi-threaded server, interleaved get and put calls will corrupt the list's pointers. Add synchronized methods as a baseline, or delegate to Caffeine for production.
Understand the library before you use it. Caffeine's Window-TinyLFU, Guava's CacheBuilder, and Redis's allkeys-lru all implement variations of this same core concept. Knowing the HashMap + DLL internals makes you a more effective user of every caching library you'll encounter.
TLDR: Summary & Key Takeaways
- LRU evicts the item not accessed for the longest time: ideal when recency correlates with future access probability.
- HashMap + Doubly Linked List gives O(1) get and put.
- Sentinel head/tail nodes simplify insertion and removal at boundaries.
- For thread safety, synchronized methods are a good starting point; use Caffeine or Guava Cache for production.
Related Posts
- Hash Tables Explained: How HashMap Works Under the Hood – the data structure powering the O(1) lookup in every LRU cache implementation.
- LLD for Parking Lot System – another classic low-level design problem that combines multiple data structures into a cohesive system.
- LLD for URL Shortener – caching is a critical optimization layer in URL shortener systems; see how the two designs connect.
- LLD for Elevator System – practice applying LLD thinking to a stateful scheduling problem.

Written by
Abstract Algorithms
@abstractalgorithms
