Write-Time vs Read-Time Fan-Out: How Social Feeds Scale
The celebrity threshold, Redis sorted sets, and the hybrid model every social platform actually ships.
TLDR: Fan-out is the act of distributing one post to many followers' feeds. Write-time fan-out (push) pre-computes feeds at post time: fast reads, but catastrophic write amplification for celebrities. Read-time fan-out (pull) computes feeds on demand: no amplification, but slow reads at scale. Production systems like Twitter/X, Instagram, and Facebook use a hybrid: write-time fan-out for users with ≤ 10K followers, read-time fan-out injected at read time for celebrities above that threshold.
The Social Feed Problem: One Post, Ten Million Inboxes
When Justin Bieber tweets, roughly 300 million accounts are eligible to see it. Your system has milliseconds to decide: do you immediately push that tweet into every follower's pre-built feed cache, or do you wait and assemble each follower's feed the moment they open the app?
This is the fan-out problem: the act of distributing a single write event to many downstream consumers. It is the central architectural decision in every social feed, notification system, and activity stream. Get it wrong and you either melt your database on a celebrity post or make your users wait three seconds every time they check their timeline.
Neither naive approach survives production at scale. The push (write-time) model works brilliantly for normal users but breaks on high-follower accounts. The pull (read-time) model handles celebrities gracefully but destroys read latency as follower graphs grow. The production answer, used by Twitter/X, Instagram, and Facebook, is a hybrid that routes based on a configurable follower-count threshold.
| Model | Write Cost | Read Cost | Who Uses It |
| --- | --- | --- | --- |
| Write-time fan-out (push) | High: N Redis writes per post | O(1): pre-built cache | Normal users (≤ 10K followers) |
| Read-time fan-out (pull) | None | High: query N timelines, merge | Celebrity accounts (> 10K followers) |
| Hybrid | Moderate | Low for most, managed for celebrities | Twitter/X, Instagram, Facebook |
Fan-Out Fundamentals: The Push vs. Pull Decision
Fan-out refers to the branching factor of a write event: when one user posts, the event must "fan out" to all followers who should see it. The decision of when that fan-out happens determines your system's read latency, write throughput, and storage requirements.
Think of it like a newspaper distribution model. A push model is the morning doorstep delivery: every subscriber gets a physical copy at their door before they wake up. Reading is instant because your paper is already there. But if 10 million people subscribe overnight, the printing and delivery cost explodes. A pull model is the newsstand: no delivery cost, but every reader must walk to the stand and assemble their own selection from different sections each morning. Fast to produce, slow to consume.
In software terms:
- Write-time fan-out writes the post reference into each follower's feed cache (typically a Redis sorted set) at the moment of posting. Feed reads are O(1) cache lookups.
- Read-time fan-out defers all computation to read time. When a user requests their feed, the service fetches every followed account's recent posts, merges, and sorts them on the fly.
The break-even point is roughly the celebrity threshold, commonly set between 5,000 and 10,000 followers. Below this threshold, write-time fan-out is cheap enough. Above it, the write amplification becomes unsustainable.
Two Approaches, Two Bottlenecks: Write-Time vs. Read-Time Fan-Out
Write-Time Fan-Out: Paying the Cost at Post Time
When a user with 500 followers posts, a fan-out worker writes 500 entries into Redis, one per follower's sorted set feed cache, with the score set to the post's Unix timestamp. Later, when any of those 500 followers opens their timeline, the system does a single ZREVRANGE on their Redis key: constant time, regardless of who they follow.
The Mermaid diagram below shows the write-time path:
```mermaid
flowchart TD
    classDef svc fill:#f5a623,stroke:#d4880a,color:#000
    classDef db fill:#4a9fd4,stroke:#2d7aad,color:#fff
    classDef cache fill:#cc2936,stroke:#a01e2b,color:#fff
    classDef mq fill:#27ae60,stroke:#1e8449,color:#fff
    classDef client fill:#2c3e50,stroke:#1a252f,color:#fff
    classDef infra fill:#7f8c8d,stroke:#616a6b,color:#fff
    A[("Author Posts")]:::client --> B["Post Service"]:::svc
    B --> C[("Post DB")]:::db
    B --> D["Kafka: post.created"]:::mq
    D --> E["Fan-out Worker"]:::svc
    E --> F[("Feed Cache: Follower 1")]:::cache
    E --> G[("Feed Cache: Follower 2")]:::cache
    E --> H[("Feed Cache: Follower N")]:::cache
    F --> READ[("O(1) Feed Read")]:::client
    G --> READ
    H --> READ
```
Write-time fan-out: the post is fanned out to N follower caches asynchronously via Kafka. Reads are instant.
The failure mode is known as the star problem: a celebrity with 10 million followers triggers 10 million Redis ZADD operations per post. Even at 100 µs each, that is 1,000 seconds of combined write work, and celebrities often post several times a day.
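The star-problem arithmetic above can be sketched directly. The 100 µs per cache write is an illustrative assumption carried over from the text, not a benchmark:

```java
// Illustrative write-amplification arithmetic for the star problem.
// The microseconds-per-write figure is an assumed cost, not a measured one.
public class WriteAmplification {

    // Combined seconds of cache-write work triggered by a single post.
    public static double combinedWriteSeconds(long followers, double microsPerWrite) {
        return followers * microsPerWrite / 1_000_000.0;
    }

    public static void main(String[] args) {
        System.out.println(combinedWriteSeconds(500, 100));        // a normal user: 0.05 s
        System.out.println(combinedWriteSeconds(10_000_000, 100)); // a celebrity: 1000.0 s
    }
}
```

At a 10K-follower threshold the same formula gives one second of combined work per post, which hints at why thresholds in the 5K-10K range are a common starting point.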
Read-Time Fan-Out: Paying the Cost at Read Time
Read-time fan-out inverts the model entirely. Nothing is written to follower caches at post time. Instead, when a user requests their feed, the Feed Service:
- Looks up the user's following list (e.g., from a graph store)
- Queries each followed account's individual post timeline (either from cache or DB)
- Merges all results
- Sorts by timestamp (or ranking score)
- Returns the paginated result
```mermaid
flowchart TD
    classDef svc fill:#f5a623,stroke:#d4880a,color:#000
    classDef db fill:#4a9fd4,stroke:#2d7aad,color:#fff
    classDef cache fill:#cc2936,stroke:#a01e2b,color:#fff
    classDef client fill:#2c3e50,stroke:#1a252f,color:#fff
    A[("Read Request")]:::client --> B["Feed Service"]:::svc
    B --> C["Following List Lookup"]:::svc
    C --> D[("Timeline: Followed User A")]:::db
    C --> E[("Timeline: Followed User B")]:::db
    C --> F[("Timeline: Followed User N")]:::db
    D --> G["Merge + Sort by Timestamp"]:::svc
    E --> G
    F --> G
    G --> H[("Client Feed")]:::client
```
Read-time fan-out: the feed is assembled on every read request by querying each followed user's posts.
This eliminates write amplification entirely. Posting is cheap: write to the post store, done. But reading becomes expensive. If a user follows 1,000 accounts, that is 1,000 lookups per feed request, followed by an N-way merge sort. The latency compounds as the following graph grows.
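The merge step described above is essentially a k-way merge over F sorted streams. A minimal sketch, assuming each followed user's timeline is already sorted newest-first; the `Entry` record and in-memory lists are hypothetical stand-ins for the timeline lookups:

```java
import java.util.*;

public class ReadTimeMerge {

    // Hypothetical timeline entry: a post ID plus its creation timestamp.
    public record Entry(long postId, long timestampMillis) {}

    // Merge F timelines (each sorted newest-first) into one feed of at most `limit` IDs.
    public static List<Long> mergeTimelines(List<List<Entry>> timelines, int limit) {
        // Heap of cursors {timelineIndex, offset}, ordered by timestamp descending.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                Comparator.comparingLong((int[] c) -> -timelines.get(c[0]).get(c[1]).timestampMillis()));
        for (int i = 0; i < timelines.size(); i++) {
            if (!timelines.get(i).isEmpty()) heap.add(new int[]{i, 0});
        }
        List<Long> feed = new ArrayList<>();
        while (!heap.isEmpty() && feed.size() < limit) {
            int[] cur = heap.poll();
            List<Entry> t = timelines.get(cur[0]);
            feed.add(t.get(cur[1]).postId());
            if (cur[1] + 1 < t.size()) heap.add(new int[]{cur[0], cur[1] + 1}); // advance cursor
        }
        return feed;
    }
}
```

The heap keeps the merge at O(total posts × log F) instead of sorting everything, but the cost still grows with F on every single read, which is exactly the bottleneck described above.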
Deep Dive: Inside the Redis Feed Cache and the Async Fan-Out Pipeline
The Internals: Redis Sorted Sets, Kafka Topics, and Fan-Out Workers
The canonical feed cache data structure is a Redis sorted set (ZSET). Each follower's feed is stored as a key like feed:{userId} where each member is a postId and the score is the post's Unix timestamp in milliseconds. This enables:
- `ZADD feed:42 1711720800000 postId:9999` inserts a post reference
- `ZREVRANGE feed:42 0 49 WITHSCORES` fetches the 50 most recent posts
- `ZREMRANGEBYRANK feed:42 0 -501` evicts entries beyond a 500-post cap
The feed cache stores only post IDs (or lightweight references), not full post content. Post content is fetched separately in a second read from a post cache or DB, allowing the feed cache to remain compact and fast.
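The sorted-set semantics the commands above rely on can be modeled in plain Java. This is an in-memory stand-in, not Redis: a real ZSET keys on the member with an attached score and breaks score ties lexicographically by member, which this simplified TreeMap-by-score model does not.

```java
import java.util.*;

public class FeedZSetModel {

    // score (timestamp) -> member (post ID), iterated newest-first like ZREVRANGE.
    // Simplification: equal timestamps collide here; Redis breaks score ties by member.
    private final TreeMap<Long, Long> entries = new TreeMap<>(Comparator.reverseOrder());

    // ZADD feed:{userId} <timestampMillis> <postId>
    public void zadd(long timestampMillis, long postId) {
        entries.put(timestampMillis, postId);
    }

    // ZREVRANGE feed:{userId} 0 count-1: the newest `count` post IDs
    public List<Long> zrevrange(int count) {
        return entries.values().stream().limit(count).toList();
    }

    // ZREMRANGEBYRANK-style cap: drop the oldest entries beyond `cap`
    public void trim(int cap) {
        while (entries.size() > cap) entries.pollLastEntry();
    }
}
```

Note that only post IDs live in this structure, matching the point above: hydration of full post content is a separate batched read.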
Fan-out is decoupled from the post write path via Kafka. The Post Service emits a post.created event to a Kafka topic. One or more Fan-out Consumer groups read from this topic and perform the actual Redis writes. This separation means:
- The author gets an instant write confirmation (synchronous DB write only)
- Fan-out happens asynchronously and is retry-safe via Kafka offset management
- Multiple consumer groups can independently process the same event (e.g., fan-out workers, notification workers, analytics workers)
```
Kafka Topic: post.created
Partition key: authorId (ensures ordered delivery per author)
Consumer Group 1: fan-out-workers (writes to Redis feed caches)
Consumer Group 2: notification-workers (sends push notifications)
Consumer Group 3: analytics-ingest (feeds the engagement pipeline)
```
Cache TTL and eviction: Feed caches are ephemeral. A common strategy is to cap each user's feed at 500-1000 post IDs and set a TTL of 48-72 hours. If the cache misses (user hasn't logged in recently), a cold-start backfill job reconstructs the feed from the post DB on the next read.
Performance Analysis: Read/Write Complexity by Model
| Model | Write Cost per Post | Read Cost per Request | Storage |
| --- | --- | --- | --- |
| Write-time fan-out | O(N) where N = follower count | O(1) Redis ZREVRANGE | N × feed entries per post |
| Read-time fan-out | O(1) | O(F × P) where F = following count, P = posts per user | Post store only |
| Hybrid | O(N) for normal users, O(1) for celebrities | O(1) for cached feed + O(C) celebrity inject | Bounded by threshold |
The write-time model's O(N) write cost is the core bottleneck. For a power-law follower distribution (most users have few followers, a handful have millions), the average case is manageable, but the tail case (celebrities) is catastrophic without the threshold guard.
The read-time model's O(F × P) read cost is fine if F (following count) is small, but degrades badly as users follow thousands of accounts. The merge sort across F timelines is the bottleneck: it must happen before the response can be returned.
The hybrid caps the worst cases of both: celebrities are excluded from write-time fan-out (preventing O(10M) writes), and their posts are injected at read time in a single targeted lookup (O(C), where C is typically 5-50 celebrities per user).
The Hybrid Architecture: Routing Posts by Follower Count
The hybrid model routes the fan-out decision at post time, not at read time. The Fan-out Router, consuming from the post.created Kafka topic, checks the author's follower count against a configurable threshold (typically 10,000). Normal users fan out immediately; celebrity posts skip the write fan-out entirely and are stored in a dedicated Celebrity Post Store instead.
```mermaid
flowchart TD
    classDef svc fill:#f5a623,stroke:#d4880a,color:#000
    classDef db fill:#4a9fd4,stroke:#2d7aad,color:#fff
    classDef cache fill:#cc2936,stroke:#a01e2b,color:#fff
    classDef mq fill:#27ae60,stroke:#1e8449,color:#fff
    classDef client fill:#2c3e50,stroke:#1a252f,color:#fff
    classDef infra fill:#7f8c8d,stroke:#616a6b,color:#fff
    POST[("Author Posts")]:::client --> PS["Post Service"]:::svc
    PS --> POSTDB[("Post DB")]:::db
    PS --> KAFKA["Kafka: post.created"]:::mq
    KAFKA --> ROUTER["Fan-out Router"]:::svc
    ROUTER -- "followers <= 10K" --> FW["Write-Time Fan-out Worker"]:::svc
    ROUTER -- "followers > 10K" --> CELDB[("Celebrity Post Store")]:::db
    FW --> REDIS[("Follower Feed Caches (Redis)")]:::cache
    READ[("Follower Requests Feed")]:::client --> FS["Feed Service"]:::svc
    FS --> REDIS
    FS -- "merge celebrity posts at read" --> CELDB
    FS --> MERGE["Merge + Rank"]:::svc
    MERGE --> READ
```
The hybrid architecture: normal posts fan out at write time to Redis; celebrity posts bypass the fan-out queue and are injected at read time.
At read time, the Feed Service performs two lookups: one ZREVRANGE against the user's Redis feed cache (containing normal-user posts), and one targeted query against the Celebrity Post Store for each celebrity the user follows. The two result sets are merged and ranked before being returned.
How Twitter/X, Instagram, and Facebook Actually Handle Fan-Out
Twitter/X pioneered the hybrid model at scale. The pre-2010 system computed timelines on every read and collapsed under Barack Obama's 2009 inauguration traffic. Engineers rebuilt around write-time fan-out with a Redis-backed timeline store and added the celebrity bypass path (called the "mixed timeline" in their 2013 engineering blog post) after Katy Perry's follower count exposed the write amplification ceiling.
Instagram follows a similar pattern with one optimization: their Fan-out Service uses a tiered consumer group arrangement. High-frequency posters (not just high-follower accounts) are also routed to the read-time path, preventing a single prolific user from flooding the fan-out queue.
Facebook uses a variation called aggregated fan-out: instead of writing individual post IDs to each follower's cache, they write aggregated "story batches" that group multiple posts from the same author. This reduces the number of Redis entries per follower's feed while maintaining freshness, a useful optimization when one followed account posts 50 times a day.
The common thread across all three: eventual consistency for feeds is acceptable. A post appearing in a follower's feed 1-3 seconds after posting is imperceptible to users. This tolerance is what makes async Kafka-based fan-out viable; you do not need synchronous fan-out to guarantee correctness.
Trade-offs and Failure Modes: Write Amplification, Cache Lag, and the Star Problem
Write amplification is the defining failure mode of naive write-time fan-out. Every celebrity post multiplies into millions of Redis writes. At 100 µs per write, a 10M-follower account posting once generates ~1,000 CPU-seconds of Redis work; sustained bursts from trending celebrities can saturate fan-out worker pools entirely, causing lag for normal users.
Cache eviction and cold-start gaps create consistency issues. If a user's feed cache has expired and they log in after a week, their pre-built feed is empty. The cold-start backfill job must reconstruct the feed from the post DB, which can be slow and may miss posts from the celebrity path (which was never written to their cache). The hybrid system must handle this gracefully, typically by always querying the celebrity post store even on cold start.
Follow/unfollow consistency is subtle: when Alice unfollows Bob, Bob's future posts should stop appearing in Alice's feed. For the write-time path, this is handled naturally (fan-out workers check the current follower list). For the celebrity path, the Feed Service filters followed celebrities at read time using the user's current follow graph.
Queue lag under load is the operational risk of Kafka-based async fan-out. During a viral event (a breaking news post from a major account), the post.created topic can back up. Fan-out workers must scale horizontally via Kafka partition count. A common production safeguard is to dedicate separate Kafka consumer groups and worker pools for write-time vs. celebrity posts, preventing one path from starving the other.
Decision Guide: Choosing Your Fan-Out Strategy
| Situation | Recommendation |
| --- | --- |
| Most users have < 5K followers | Pure write-time fan-out; simpler architecture, O(1) reads |
| Users span a wide follower range (some have millions) | Hybrid: write-time below threshold, read-time above it |
| Read latency is critical (< 50ms P99) | Write-time fan-out with Redis feed cache; cold-start backfill required |
| Write throughput is critical; reads can tolerate > 200ms | Read-time fan-out; simpler writes, no fan-out workers needed |
| Celebrity threshold to use | Start at 10K; tune based on write amplification metrics in production |
| Avoid when | Users regularly follow thousands of accounts AND you choose read-time only: the merge sort at read time will be slow |
| Alternative for extreme scale | Aggregated fan-out (group posts per author per time window before writing to cache) |
| Edge case: celebrity follows celebrity | Both are on the read-time path; no double write amplification |
Java: The Hybrid Fan-Out Router in Code
This section shows the two focal decision points: the fan-out routing check and the hybrid feed merge at read time.
FanOutService.java: celebrity threshold routing
```java
@Service
public class FanOutService {

    private static final int CELEBRITY_THRESHOLD = 10_000;

    private final FollowerRepository followerRepository;
    private final FeedCacheRepository feedCache;       // Redis ZSET wrapper
    private final CelebrityPostRepository celebStore;  // Dedicated celebrity store

    public FanOutService(FollowerRepository followerRepository,
                         FeedCacheRepository feedCache,
                         CelebrityPostRepository celebStore) {
        this.followerRepository = followerRepository;
        this.feedCache = feedCache;
        this.celebStore = celebStore;
    }

    public void handlePostCreated(PostCreatedEvent event) {
        long followerCount = followerRepository.countFollowers(event.authorId());
        if (followerCount <= CELEBRITY_THRESHOLD) {
            fanOutToFollowerCaches(event);
        } else {
            // Skip write fan-out; store in the celebrity post store for read-time injection
            celebStore.save(event.authorId(), event.postId(), event.createdAt());
        }
    }

    private void fanOutToFollowerCaches(PostCreatedEvent event) {
        // Paginate followers to avoid loading millions into memory at once
        int page = 0;
        List<Long> followerBatch;
        do {
            followerBatch = followerRepository.findFollowers(event.authorId(), page++, 500);
            for (Long followerId : followerBatch) {
                feedCache.addPost(followerId, event.postId(), event.createdAt().toEpochMilli());
                feedCache.trimFeed(followerId, 500); // Keep at most 500 entries per cache
            }
        } while (!followerBatch.isEmpty());
    }
}
```
FeedService.java: hybrid read merge (cached feed + celebrity posts)
```java
@Service
public class FeedService {

    private static final int CELEBRITY_THRESHOLD = 10_000;

    private final FeedCacheRepository feedCache;
    private final CelebrityPostRepository celebStore;
    private final FollowGraphRepository followGraph;
    private final PostRepository postRepository;

    public FeedService(FeedCacheRepository feedCache, CelebrityPostRepository celebStore,
                       FollowGraphRepository followGraph, PostRepository postRepository) {
        this.feedCache = feedCache;
        this.celebStore = celebStore;
        this.followGraph = followGraph;
        this.postRepository = postRepository;
    }

    public List<Post> getTimeline(long userId, int limit) {
        // 1. Fetch the pre-built cached feed (write-time fan-out posts)
        List<Long> cachedPostIds = feedCache.getRecentPostIds(userId, limit);

        // 2. Find which followed accounts are on the celebrity (read-time) path
        List<Long> followedCelebrities = followGraph.getFollowedCelebrities(userId, CELEBRITY_THRESHOLD);

        // 3. Fetch recent posts from each celebrity directly
        List<PostRef> celebPosts = followedCelebrities.stream()
                .flatMap(celeb -> celebStore.getRecentPosts(celeb, 20).stream())
                .collect(Collectors.toList());

        // 4. Merge all post IDs and deduplicate
        List<Long> mergedIds = Stream.concat(
                cachedPostIds.stream(),
                celebPosts.stream().map(PostRef::postId)
        ).distinct().collect(Collectors.toList());

        // 5. Batch-fetch timestamps once, then sort descending and cap at `limit`
        //    (a per-id repository call inside the comparator would be an N+1 query)
        Map<Long, Long> tsById = postRepository.getTimestamps(mergedIds);
        List<Long> allPostIds = mergedIds.stream()
                .sorted(Comparator.comparingLong(id -> -tsById.getOrDefault(id, 0L)))
                .limit(limit)
                .collect(Collectors.toList());

        // 6. Hydrate post objects in a single batch read
        return postRepository.findAllById(allPostIds);
    }
}
```
Key design decisions in the code:
- Follower pagination (pages of 500) prevents memory pressure during fan-out of moderately popular accounts
- `trimFeed` enforces the per-user cache cap immediately after each write
- Celebrity merge happens at read time with a single query per celebrity, not one per post
- Timestamp lookup and post hydration are batched (`getTimestamps`, `findAllById`) to avoid N+1 queries against the post store
Kafka in Practice: Async Fan-Out Workers That Actually Scale
Apache Kafka is the standard delivery layer for async fan-out in production social systems. The post.created topic carries the event from the Post Service to all downstream consumers. Here is the core Spring Kafka configuration that governs fan-out worker behavior:
```java
@Configuration
public class FanOutKafkaConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, PostCreatedEvent> fanOutContainerFactory(
            ConsumerFactory<String, PostCreatedEvent> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, PostCreatedEvent> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        factory.setConcurrency(8); // 8 threads per pod, one per Kafka partition
        factory.getContainerProperties()
               .setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE); // Commit only after the Redis write succeeds
        return factory;
    }
}
```

```java
@Component
public class FanOutConsumer {

    private final FanOutService fanOutService;

    public FanOutConsumer(FanOutService fanOutService) {
        this.fanOutService = fanOutService;
    }

    @KafkaListener(
        topics = "post.created",
        groupId = "fan-out-workers",
        containerFactory = "fanOutContainerFactory"
    )
    public void consume(PostCreatedEvent event, Acknowledgment ack) {
        fanOutService.handlePostCreated(event);
        ack.acknowledge(); // Commit the offset only once fan-out is complete
    }
}
```
The partition key for post.created should be authorId. This ensures all posts from the same author land on the same partition, preserving per-author ordering and preventing two fan-out workers from racing to update the same set of follower caches simultaneously.
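The ordering guarantee follows from deterministic key hashing: Kafka's default partitioner hashes the key bytes (murmur2) modulo the partition count, so a given key always maps to the same partition. The simplified stand-in below demonstrates only the determinism, not Kafka's actual hash:

```java
public class PartitionRouting {

    // Stand-in for Kafka's default key hashing; only the determinism matters
    // for the per-author ordering argument, not the specific hash function.
    public static int partitionFor(String authorId, int numPartitions) {
        return Math.abs(authorId.hashCode() % numPartitions);
    }
}
```

Because every `post.created` record keyed by the same author lands on the same partition, and a partition is consumed by at most one thread in a consumer group, per-author fan-out is serialized for free.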
For a full deep-dive on Kafka consumer group configuration, retry semantics, and dead-letter queues, see the companion post on System Design HLD: Notification Service.
Lessons Learned: What Goes Wrong in Production
1. Forgetting to handle the celebrity path during cold-start backfill. When a user's feed cache expires and is rebuilt from the post DB, it is easy to only restore write-time fan-out posts. Celebrity posts, which were never written to the user's cache, will be silently absent from the rebuilt feed. Always run the celebrity merge step even during backfill.
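A minimal sketch of the fix, with hypothetical in-memory inputs standing in for the post DB backfill and the celebrity store query: the rebuilt feed is the union of both paths, sorted by timestamp.

```java
import java.util.*;
import java.util.stream.*;

public class ColdStartBackfill {

    // Rebuild a timeline as the union of the write-time backfill (from the post DB)
    // and the celebrity posts that were never fanned out to this user's cache.
    public static List<Long> rebuildTimeline(List<Long> backfilledFromPostDb,
                                             List<Long> celebrityPostIds,
                                             Map<Long, Long> timestampByPostId,
                                             int limit) {
        return Stream.concat(backfilledFromPostDb.stream(), celebrityPostIds.stream())
                .distinct()
                .sorted(Comparator.comparingLong((Long id) -> -timestampByPostId.getOrDefault(id, 0L)))
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

Dropping the second input list is exactly the bug described above: the rebuilt feed would look complete but silently omit every celebrity post.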
2. Using a static celebrity threshold. A hardcoded 10K threshold is a starting point, not a law. Monitor your fan-out worker queue depth and write amplification metrics. Some systems use a dynamic threshold: if the Kafka lag for fan-out-workers exceeds a threshold, the router automatically reclassifies borderline accounts to the read-time path.
3. Not partitioning the fan-out Kafka topic by authorId. Random or round-robin partitioning means two fan-out workers can concurrently process posts from the same author, leading to out-of-order writes into follower feed caches. Partition by authorId to serialize per-author fan-out.
4. Letting the feed cache grow unbounded. Without a TTL and size cap, popular users' feed caches balloon in Redis memory. A 500-post cap per user with a 48-hour TTL is a reasonable default; monitor memory usage and tune accordingly.
5. Conflating "celebrity" with "high-follower-count." Some systems also route accounts with high post frequency (many posts per hour) to the read-time path, even if their follower count is modest. A prolific but moderately followed account can still overwhelm the fan-out queue. Consider a composite routing decision: followerCount > threshold OR postFrequency > rateThreshold.
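The composite routing decision can be captured in a single predicate. Both thresholds below are illustrative defaults, not recommendations:

```java
public class RoutingPolicy {

    // Illustrative thresholds; tune from fan-out queue depth and amplification metrics.
    static final long FOLLOWER_THRESHOLD = 10_000;
    static final double POSTS_PER_HOUR_THRESHOLD = 30.0;

    // Route an author to the read-time (celebrity) path when EITHER signal
    // crosses its threshold: audience size or posting rate.
    public static boolean useReadTimePath(long followerCount, double postsPerHour) {
        return followerCount > FOLLOWER_THRESHOLD || postsPerHour > POSTS_PER_HOUR_THRESHOLD;
    }
}
```

A prolific 500-follower account and a quiet 20K-follower account both land on the read-time path under this policy, which is the point: either signal alone can overwhelm the fan-out queue.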
Summary & Key Takeaways
- Write-time fan-out (push) pre-computes follower feeds at post time using Redis sorted sets. Reads are O(1) but writes are O(N) in follower count. Works well for normal accounts.
- Read-time fan-out (pull) defers feed assembly to read time. Writes are cheap but reads require an N-way merge across all followed users. Works well for celebrities but degrades with large following counts.
- The hybrid model routes by follower count at a configurable threshold (typically 10K). Normal users get write-time fan-out; celebrity posts are injected at read time from a dedicated store.
- Redis sorted sets (scored by timestamp) are the standard feed cache structure. Size-cap entries per user and set a TTL to control memory growth.
- Kafka decouples fan-out from the synchronous post write path, making fan-out retry-safe and independently scalable. Partition by `authorId` to preserve per-author ordering.
- Eventual consistency is acceptable for feeds: a 1-3 second delivery lag is imperceptible to users and enables safe async fan-out.
- One-liner to remember: write-time fan-out buys fast reads by paying a write tax; read-time fan-out skips the tax but charges interest on every read.
Practice Quiz: Fan-Out Decisions Under Pressure
A new account posts for the first time and has 8,000 followers. The celebrity threshold is set to 10,000. Which fan-out path does the system take?
- A) Read-time fan-out: the account is approaching the threshold
- B) Write-time fan-out: 8,000 is below the 10,000 threshold
- C) Both paths run in parallel for safety

Correct Answer: B
A user's feed cache expired 5 days ago. They log in and request their timeline. Which statement correctly describes what happens in a well-implemented hybrid system?
- A) The feed returns empty because the Redis key has been evicted
- B) The cold-start backfill reconstructs the write-time feed from DB, then the celebrity merge runs on top to inject celebrity posts
- C) The system falls back to pure read-time fan-out permanently for this user

Correct Answer: B
Why does the `post.created` Kafka topic use `authorId` as the partition key instead of a random partition?
- A) To ensure all posts from the same author are processed by the same fan-out worker, preventing race conditions on follower caches
- B) To improve compression ratios by grouping similar messages
- C) To allow the celebrity router to route messages before they reach workers

Correct Answer: A
In a Redis sorted set used as a feed cache, what value is used as the score for each post entry, and why?
- A) A random UUID, to prevent collisions across posts
- B) The post's engagement score, to pre-sort by relevance
- C) The post's Unix timestamp in milliseconds, to enable chronological ordering via ZREVRANGE

Correct Answer: C
Open-ended challenge: A celebrity with 50 million followers deletes a post one minute after publishing. During that minute, fan-out workers have already written the post ID into ~2 million follower feed caches (the job was still running). How would you design the post-deletion propagation to ensure the deleted post stops appearing in all follower feeds, without triggering another 50M Redis deletes? What trade-offs does your approach introduce?
Related Posts
- System Design HLD: News Feed (Home Timeline)
- System Design HLD: Notification Service
- System Design HLD: Chat Messaging
Written by
Abstract Algorithms
@abstractalgorithms