Home/Blog/System Design/High-Level Design: Building a Real-Time Ad Click Aggregator at Scale
System DesignIntermediateβ€’9 min readβ€’

High-Level Design: Building a Real-Time Ad Click Aggregator at Scale

Design a high-throughput real-time stream aggregation system that processes billions of events with exactly-once guarantees.

Abstract Algorithms

Abstract Algorithms

Helping engineers master software engineering topics.

TLDR: Scaling an ad click aggregator requires processing massive event streams (billions of clicks per day) with exactly-once delivery guarantees. We achieve this using Kafka-based event ingestion, Apache Flink stateful stream aggregation, and timeseries databases.


πŸ“– System Overview & Real-Time Event Challenge

In the digital advertising industry, click aggregation is the foundation of billing, analytics, and fraud detection. Advertisers pay based on the number of clicks their ads receive. If the system fails to count clicks accurately, advertisers are billed incorrectly, leading to disputes, revenue loss, and audit failures.

The system faces two primary scale challenges:

  1. Flash Ingestion Surges: Events like the Super Bowl trigger massive click spikes (e.g., hundreds of thousands of clicks per second). The ingestion layer must buffer these writes without dropping events.
  2. Duplicate Clicks (Deduplication): Network retries, mobile browser reloads, and click-spam bots generate duplicate events. The processing engine must deduplicate these events in real-time, ensuring each user action is counted exactly once.

πŸ” Core Requirements and Capacity Estimations

To establish a clear scope, we define the requirements and calculate the system capacity.

Functional Requirements

  • Ingest Clicks: Receive user click events from global browsers.
  • Aggregate Metrics: Count clicks per ad ID, grouped by 1-minute, 5-minute, and 1-hour windows.
  • Query Metrics: Support low-latency dashboard queries displaying ad performance metrics.
  • Detect Fraud: Identify and filter duplicate or robotic click-spam patterns.

Non-Functional Requirements

  • Exactly-Once Processing: Clicks must be counted exactly once to prevent billing errors.
  • Low-Latency Queries: Dashboards must load aggregation metrics in under 500 milliseconds.
  • Fault Tolerance: The system must recover state after node crashes without losing data.

Capacity Estimations

Let's estimate the scale of a global ad platform:

  • Daily Click Volume: 10,000,000,000 (10 billion) clicks per day.
  • Average Clicks/Second: 10,000,000,000 / 86,400 seconds = 115,740 clicks/sec.
  • Peak Click Volume: 500,000 clicks/sec during major events.
  • Event Size: Each click payload (ad_id, user_id, timestamp, ip, cost) is 200 bytes.
  • Raw Data Ingestion Rate (Peak): 500,000 clicks/sec * 200 bytes = 100 MB/sec.
  • Aggregated Storage Rate: 1,000,000 active ads. Aggregating counts per ad per minute results in 1,000,000 rows/minute. Row size = 50 bytes. Storage speed = 50 MB/minute.

βš™οΈ Core Mechanics: API, Schema, and Storage Architecture

Our system separates raw event ingestion from metric query paths.

API Design

We define the endpoints for click collection and dashboard retrieval:

EndpointHTTP MethodDescriptionInput ParametersReturn Format
/api/v1/clicksPOSTIngest a raw user ad clickad_id, user_id, timestamp, tokenHTTP 202 Accepted
/api/v1/ads/{id}/metricsGETFetch aggregated click metricsad_id, resolution (1m/5m/1h), rangeJSON timeseries metrics

Database Schema (Timeseries Store)

We store aggregated metrics in a timeseries-optimized database like ClickHouse or InfluxDB:

Table NameColumn NameData TypeKey TypePartitioning Strategy
AdMetricsMinutead_idVARCHAR(64)Primary KeyPartitioned by event_time (daily)
AdMetricsMinuteevent_timeTIMESTAMPPrimary KeyIndex on (ad_id, event_time)
AdMetricsMinuteclick_countBIGINT--
AdMetricsMinutetotal_costDECIMAL(18,4)--

Cache Schema (Deduplication Store)

To identify duplicate click events at the edge, we use Redis as an active deduplication cache:

Key PatternValue FormatTTLEviction PolicyPurpose
dedup:{ad_id}:{user_id}:{minute}15 minutesvolatile-lruEdge cache to detect and drop duplicate clicks

πŸ“Š Architectural Blueprint: High-Level System Flow

The diagram below maps the components of our ad click aggregator:

graph TD
    Client[User Browser] -->|Click Event| LB[Load Balancer]
    LB -->|Forward| API[Click Collection Service]
    API -->|Write Dedup Key| Redis[Deduplication Redis]
    API -->|Buffer Event| Kafka[Apache Kafka Cluster]
    Kafka -->|Stream Consumption| Flink[Apache Flink Engine]
    Flink -->|Write Aggregations| TSDB[ClickHouse Timeseries DB]
    Dashboard[Query Dashboard] -->|Get Metrics| QuerySvc[Query Service]
    QuerySvc -->|Read Metrics| TSDB

This system diagram illustrates the architecture of our ad click aggregation platform. Raw click events from browsers are received by a click collection service behind a load balancer. The service runs a quick deduplication check in Redis and immediately appends the event to an Apache Kafka broker. Apache Flink consumes from Kafka, maintaining stateful sliding window aggregations in-memory before flushing the final aggregates to a ClickHouse timeseries database.


🧠 Deep Dive: Exactly-Once Processing and Streaming Windows

Guaranteeing exact metrics under flash load requires coordinate states across the processing pipeline.

The Internals of Sliding Windows and Deduplication

To count clicks exactly once, the system uses two layers of defense:

  1. Edge Deduplication: When a click arrives, the Collection Service builds a fingerprint hash using (user_id + ad_id + timestamp_minute). It checks if this key exists in Redis using an atomic transaction. If present, it drops the click as a duplicate.
  2. Stream Engine Checkpoints: In case of node crashes inside the streaming engine, Apache Flink uses Chandy-Lamport state snapshotting. Flink injects checkpoint barriers into the event stream. When a partition processor encounters a barrier, it saves its current window state (e.g., intermediate click count) to durable storage (Amazon S3). If a processor crashes, Flink rolls back the stream to the last checkpoint, replaying Kafka events from that offset.

Performance Analysis of Stream Ingestion and Storage Writes

Directly writing raw clicks to a relational database creates an unsustainable IO bottleneck. By using Flink's in-memory windows, we group clicks locally before flushing. For example, 50,000 raw clicks/second are aggregated in RAM per ad. Flink only writes the final sum to ClickHouse once per minute. This reduces database writes from 50,000 IOPS to just 1,000 IOPS, ensuring the timeseries store is not overloaded.


πŸ“Š Write and Read Path Sequences

Write Path Flow (Click Ingestion & Processing)

  1. The client clicks an ad, triggering a POST /clicks request.
  2. The Collection Service validates the click token and checks the Redis deduplication cache.
  3. The Collection Service publishes the event to Apache Kafka using ad_id as the partition key.
  4. Flink reads from Kafka, validates watermarks, and aggregates metrics inside 1-minute time windows.
  5. Flink flushes the 1-minute aggregates to ClickHouse.

Read Path Flow (Dashboard Query)

  1. The dashboard client requests metrics for an ad: GET /ads/{id}/metrics.
  2. The Query Service intercepts the request.
  3. The Query Service runs a SELECT query against the ClickHouse aggregated tables.
  4. The query returns timeseries metrics to the client in under 100 milliseconds.

🌍 Real-World Implementation: Ad Click Processing at Scale

Real-world advertising giants like Google and Meta split processing into hot and cold paths:

  • The Hot Path (Real-time): Stream aggregators like Flink compute click metrics over short sliding windows to update dashboards and trigger real-time fraud alerts.
  • The Cold Path (Batch): Raw click events are saved to Hadoop or Snowflake. Nightly batch jobs reconcile the day's traffic to calculate final advertiser billing.

βš–οΈ Trade-offs and Failure Modes: Latency vs. Accuracy in Streaming

Choosing a streaming architecture requires balancing latency, accuracy, and operational overhead:

MetricFlink Stateful Processing (Kappa)Spark Structured Streaming (Micro-Batch)Lambda Architecture (Batch + Stream)
Aggregation LatencySub-second (real-time processing)1-5 seconds (micro-batching)Low for dashboard, high for billing reconciliation
Consistency GuaranteeExactly-once (Chandy-Lamport checkpoints)Exactly-once (Transactional logs)Eventually consistent (Batch overwrites stream)
Storage OverheadMedium (state size in RocksDB)LowHigh (duplicate raw storage in data lake and DB)
Operational ComplexityHighMediumVery High (requires maintaining two codebases)

Our design uses Flink stateful processing to achieve real-time metrics with exactly-once guarantees.


🧭 Decision Guide: Lambda vs. Kappa Architecture

Use this decision table to select the correct streaming architecture for your application.

SituationRecommendationAlternative
Billing applications requiring real-time updates and strict audit pathsKappa Architecture (Flink stateful stream)Replays source logs to recover state after failures.
Simple count metrics where latency is not criticalMicro-Batching (Spark Streaming)Simpler to operate and scale.
Complex machine learning pipelines utilizing raw log historiesLambda ArchitectureCombines batch history with real-time stream feeds.

πŸ§ͺ Practical Interview Execution: 45-Minute Delivery Strategy

Manage your time during a system design interview using this schedule:

  1. Minutes 0-5 (Clarify Requirements): Agree on ingestion scale, peak QPS, deduplication constraints, and query latencies.
  2. Minutes 5-15 (High-Level Architecture): Whiteboard the API, Kafka buffer, Flink processor, and timeseries database.
  3. Minutes 15-30 (Deep Dive): Explain how Flink handles exactly-once processing using checkpoints and watermarks.
  4. Minutes 30-40 (Deduplication): Detail how the edge Redis cache prevents double-billing of click spam.
  5. Minutes 40-45 (Trade-offs): Summarize design trade-offs, comparing Kappa architecture to Lambda configurations.

To configure Apache Flink for exactly-once processing:

  • Enable checkpointing with a 10-second interval, using RocksDB as the state backend.
  • Set the checkpointing mode to EXACTLY_ONCE.
  • Configure the Kafka consumer to read with read_committed isolation, ensuring only committed transactional messages are aggregated.

πŸ“š Lessons Learned: Production Streaming Pitfalls

Avoid these standard mistakes when deploying stream aggregators:

  • Ignoring Late-Arriving Data: Network delays can cause click events to arrive minutes late. Ensure your Flink watermarks allow for an acceptable late-arrival buffer (e.g., 5 minutes) to avoid dropping valid clicks.
  • Unbounded State Growth: If you aggregate metrics over long windows (e.g., 24 hours) without eviction rules, Flink's state size will exceed memory limits, causing out-of-memory crashes.
  • Weak Deduplication TTLs: Setting the Redis deduplication TTL too short (e.g., 30 seconds) allows duplicate clicks from network retries to slip through and double-bill clients.

πŸ“Œ Summary: The Ad Click Aggregator Cheat Sheet

  • Edge Deduplication: Filter click spam early using Redis atomic fingerprint lookups.
  • Stateful Streaming: Use Apache Flink to aggregate metrics in-memory before writing to the database.
  • Exactly-Once Processing: Enable Flink checkpoints and Kafka transactional commits.
  • Timeseries Storage: Flush aggregated metrics to a timeseries database to support low-latency dashboard queries.
  • Watermarks: Configure Flink watermarks to handle late-arriving events due to network latency.

AI-generated article quiz

Test your understanding

🧠

Ready to test what you just learned?

Generate four focused questions from this article. Answers include immediate explanations.

Guided series path

System Design Interview Prep

View all lessons β†’
Lesson 27 of 72

Reader feedback

Was this article useful?

Rate it if it helped, then continue with the next deep dive when you are ready.

Sign in to save your rating.