High-Level Design: Building a Real-Time Ad Click Aggregator at Scale
Design a high-throughput real-time stream aggregation system that processes billions of events with exactly-once guarantees.

Abstract Algorithms
Helping engineers master software engineering topics.
TLDR: Scaling an ad click aggregator requires processing massive event streams (billions of clicks per day) with exactly-once delivery guarantees. We achieve this using Kafka-based event ingestion, Apache Flink stateful stream aggregation, and timeseries databases.
π System Overview & Real-Time Event Challenge
In the digital advertising industry, click aggregation is the foundation of billing, analytics, and fraud detection. Advertisers pay based on the number of clicks their ads receive. If the system fails to count clicks accurately, advertisers are billed incorrectly, leading to disputes, revenue loss, and audit failures.
The system faces two primary scale challenges:
- Flash Ingestion Surges: Events like the Super Bowl trigger massive click spikes (e.g., hundreds of thousands of clicks per second). The ingestion layer must buffer these writes without dropping events.
- Duplicate Clicks (Deduplication): Network retries, mobile browser reloads, and click-spam bots generate duplicate events. The processing engine must deduplicate these events in real-time, ensuring each user action is counted exactly once.
π Core Requirements and Capacity Estimations
To establish a clear scope, we define the requirements and calculate the system capacity.
Functional Requirements
- Ingest Clicks: Receive user click events from global browsers.
- Aggregate Metrics: Count clicks per ad ID, grouped by 1-minute, 5-minute, and 1-hour windows.
- Query Metrics: Support low-latency dashboard queries displaying ad performance metrics.
- Detect Fraud: Identify and filter duplicate or robotic click-spam patterns.
Non-Functional Requirements
- Exactly-Once Processing: Clicks must be counted exactly once to prevent billing errors.
- Low-Latency Queries: Dashboards must load aggregation metrics in under 500 milliseconds.
- Fault Tolerance: The system must recover state after node crashes without losing data.
Capacity Estimations
Let's estimate the scale of a global ad platform:
- Daily Click Volume: 10,000,000,000 (10 billion) clicks per day.
- Average Clicks/Second: 10,000,000,000 / 86,400 seconds = 115,740 clicks/sec.
- Peak Click Volume: 500,000 clicks/sec during major events.
- Event Size: Each click payload (ad_id, user_id, timestamp, ip, cost) is 200 bytes.
- Raw Data Ingestion Rate (Peak): 500,000 clicks/sec * 200 bytes = 100 MB/sec.
- Aggregated Storage Rate: 1,000,000 active ads. Aggregating counts per ad per minute results in 1,000,000 rows/minute. Row size = 50 bytes. Storage speed = 50 MB/minute.
βοΈ Core Mechanics: API, Schema, and Storage Architecture
Our system separates raw event ingestion from metric query paths.
API Design
We define the endpoints for click collection and dashboard retrieval:
| Endpoint | HTTP Method | Description | Input Parameters | Return Format |
/api/v1/clicks | POST | Ingest a raw user ad click | ad_id, user_id, timestamp, token | HTTP 202 Accepted |
/api/v1/ads/{id}/metrics | GET | Fetch aggregated click metrics | ad_id, resolution (1m/5m/1h), range | JSON timeseries metrics |
Database Schema (Timeseries Store)
We store aggregated metrics in a timeseries-optimized database like ClickHouse or InfluxDB:
| Table Name | Column Name | Data Type | Key Type | Partitioning Strategy |
| AdMetricsMinute | ad_id | VARCHAR(64) | Primary Key | Partitioned by event_time (daily) |
| AdMetricsMinute | event_time | TIMESTAMP | Primary Key | Index on (ad_id, event_time) |
| AdMetricsMinute | click_count | BIGINT | - | - |
| AdMetricsMinute | total_cost | DECIMAL(18,4) | - | - |
Cache Schema (Deduplication Store)
To identify duplicate click events at the edge, we use Redis as an active deduplication cache:
| Key Pattern | Value Format | TTL | Eviction Policy | Purpose |
dedup:{ad_id}:{user_id}:{minute} | 1 | 5 minutes | volatile-lru | Edge cache to detect and drop duplicate clicks |
π Architectural Blueprint: High-Level System Flow
The diagram below maps the components of our ad click aggregator:
graph TD
Client[User Browser] -->|Click Event| LB[Load Balancer]
LB -->|Forward| API[Click Collection Service]
API -->|Write Dedup Key| Redis[Deduplication Redis]
API -->|Buffer Event| Kafka[Apache Kafka Cluster]
Kafka -->|Stream Consumption| Flink[Apache Flink Engine]
Flink -->|Write Aggregations| TSDB[ClickHouse Timeseries DB]
Dashboard[Query Dashboard] -->|Get Metrics| QuerySvc[Query Service]
QuerySvc -->|Read Metrics| TSDB
This system diagram illustrates the architecture of our ad click aggregation platform. Raw click events from browsers are received by a click collection service behind a load balancer. The service runs a quick deduplication check in Redis and immediately appends the event to an Apache Kafka broker. Apache Flink consumes from Kafka, maintaining stateful sliding window aggregations in-memory before flushing the final aggregates to a ClickHouse timeseries database.
π§ Deep Dive: Exactly-Once Processing and Streaming Windows
Guaranteeing exact metrics under flash load requires coordinate states across the processing pipeline.
The Internals of Sliding Windows and Deduplication
To count clicks exactly once, the system uses two layers of defense:
- Edge Deduplication: When a click arrives, the Collection Service builds a fingerprint hash using
(user_id + ad_id + timestamp_minute). It checks if this key exists in Redis using an atomic transaction. If present, it drops the click as a duplicate. - Stream Engine Checkpoints: In case of node crashes inside the streaming engine, Apache Flink uses Chandy-Lamport state snapshotting. Flink injects checkpoint barriers into the event stream. When a partition processor encounters a barrier, it saves its current window state (e.g., intermediate click count) to durable storage (Amazon S3). If a processor crashes, Flink rolls back the stream to the last checkpoint, replaying Kafka events from that offset.
Performance Analysis of Stream Ingestion and Storage Writes
Directly writing raw clicks to a relational database creates an unsustainable IO bottleneck. By using Flink's in-memory windows, we group clicks locally before flushing. For example, 50,000 raw clicks/second are aggregated in RAM per ad. Flink only writes the final sum to ClickHouse once per minute. This reduces database writes from 50,000 IOPS to just 1,000 IOPS, ensuring the timeseries store is not overloaded.
π Write and Read Path Sequences
Write Path Flow (Click Ingestion & Processing)
- The client clicks an ad, triggering a
POST /clicksrequest. - The Collection Service validates the click token and checks the Redis deduplication cache.
- The Collection Service publishes the event to Apache Kafka using
ad_idas the partition key. - Flink reads from Kafka, validates watermarks, and aggregates metrics inside 1-minute time windows.
- Flink flushes the 1-minute aggregates to ClickHouse.
Read Path Flow (Dashboard Query)
- The dashboard client requests metrics for an ad:
GET /ads/{id}/metrics. - The Query Service intercepts the request.
- The Query Service runs a SELECT query against the ClickHouse aggregated tables.
- The query returns timeseries metrics to the client in under 100 milliseconds.
π Real-World Implementation: Ad Click Processing at Scale
Real-world advertising giants like Google and Meta split processing into hot and cold paths:
- The Hot Path (Real-time): Stream aggregators like Flink compute click metrics over short sliding windows to update dashboards and trigger real-time fraud alerts.
- The Cold Path (Batch): Raw click events are saved to Hadoop or Snowflake. Nightly batch jobs reconcile the day's traffic to calculate final advertiser billing.
βοΈ Trade-offs and Failure Modes: Latency vs. Accuracy in Streaming
Choosing a streaming architecture requires balancing latency, accuracy, and operational overhead:
| Metric | Flink Stateful Processing (Kappa) | Spark Structured Streaming (Micro-Batch) | Lambda Architecture (Batch + Stream) |
| Aggregation Latency | Sub-second (real-time processing) | 1-5 seconds (micro-batching) | Low for dashboard, high for billing reconciliation |
| Consistency Guarantee | Exactly-once (Chandy-Lamport checkpoints) | Exactly-once (Transactional logs) | Eventually consistent (Batch overwrites stream) |
| Storage Overhead | Medium (state size in RocksDB) | Low | High (duplicate raw storage in data lake and DB) |
| Operational Complexity | High | Medium | Very High (requires maintaining two codebases) |
Our design uses Flink stateful processing to achieve real-time metrics with exactly-once guarantees.
π§ Decision Guide: Lambda vs. Kappa Architecture
Use this decision table to select the correct streaming architecture for your application.
| Situation | Recommendation | Alternative |
| Billing applications requiring real-time updates and strict audit paths | Kappa Architecture (Flink stateful stream) | Replays source logs to recover state after failures. |
| Simple count metrics where latency is not critical | Micro-Batching (Spark Streaming) | Simpler to operate and scale. |
| Complex machine learning pipelines utilizing raw log histories | Lambda Architecture | Combines batch history with real-time stream feeds. |
π§ͺ Practical Interview Execution: 45-Minute Delivery Strategy
Manage your time during a system design interview using this schedule:
- Minutes 0-5 (Clarify Requirements): Agree on ingestion scale, peak QPS, deduplication constraints, and query latencies.
- Minutes 5-15 (High-Level Architecture): Whiteboard the API, Kafka buffer, Flink processor, and timeseries database.
- Minutes 15-30 (Deep Dive): Explain how Flink handles exactly-once processing using checkpoints and watermarks.
- Minutes 30-40 (Deduplication): Detail how the edge Redis cache prevents double-billing of click spam.
- Minutes 40-45 (Trade-offs): Summarize design trade-offs, comparing Kappa architecture to Lambda configurations.
π οΈ Apache Flink: Stream Windowing Configuration
To configure Apache Flink for exactly-once processing:
- Enable checkpointing with a 10-second interval, using RocksDB as the state backend.
- Set the checkpointing mode to
EXACTLY_ONCE. - Configure the Kafka consumer to read with
read_committedisolation, ensuring only committed transactional messages are aggregated.
π Lessons Learned: Production Streaming Pitfalls
Avoid these standard mistakes when deploying stream aggregators:
- Ignoring Late-Arriving Data: Network delays can cause click events to arrive minutes late. Ensure your Flink watermarks allow for an acceptable late-arrival buffer (e.g., 5 minutes) to avoid dropping valid clicks.
- Unbounded State Growth: If you aggregate metrics over long windows (e.g., 24 hours) without eviction rules, Flink's state size will exceed memory limits, causing out-of-memory crashes.
- Weak Deduplication TTLs: Setting the Redis deduplication TTL too short (e.g., 30 seconds) allows duplicate clicks from network retries to slip through and double-bill clients.
π Summary: The Ad Click Aggregator Cheat Sheet
- Edge Deduplication: Filter click spam early using Redis atomic fingerprint lookups.
- Stateful Streaming: Use Apache Flink to aggregate metrics in-memory before writing to the database.
- Exactly-Once Processing: Enable Flink checkpoints and Kafka transactional commits.
- Timeseries Storage: Flush aggregated metrics to a timeseries database to support low-latency dashboard queries.
- Watermarks: Configure Flink watermarks to handle late-arriving events due to network latency.
AI-generated article quiz
Test your understanding
Ready to test what you just learned?
Generate four focused questions from this article. Answers include immediate explanations.
Guided series path
System Design Interview Prep
Reader feedback
Was this article useful?
Rate it if it helped, then continue with the next deep dive when you are ready.
Sign in to save your rating.
Article metadata