High-Level Design: Scaling a Concert Ticket Booking System under Flash Load
Design a scalable concert ticket booking system that handles massive traffic surges and prevents double-booking.

Abstract Algorithms
TLDR: Designing a high-scale ticket booking system requires balancing high read traffic (seat map lookups) with extreme write concurrency (seat lock attempts) during popular concert drops. We achieve this using Redis-based temporary ticket locks and distributed queues.
System Overview & Scale-Based Design Challenge
Imagine a ticket drop for a global pop star. Within seconds of going live, a stadium with 50,000 seats receives over 1 million concurrent connection requests. If the system is not designed for this traffic, database lock contention becomes a bottleneck immediately.
If multiple users attempt to reserve the same seat simultaneously, and the database relies on standard pessimistic transactions, threads stall waiting for locks. The connection pool starves, and the site crashes. Even worse, if locking is not handled correctly, different users might pay for the same seat, leading to double-bookings and operational failures.
The core challenge of a booking system is decoupling the high-traffic seat selection path from the actual payment transactions. This ensures users do not overload the relational database during seat selection, while guaranteeing that payments are processed securely and without conflicts.
Core Requirements and Capacity Estimation
To establish a clear design scope, we define our requirements and estimate the system capacity.
Functional Requirements
- Search & View Event: Users can search for events and view available seats in real-time.
- Reserve Seats: Users can temporarily hold/lock seats for 10 minutes while they enter payment details.
- Confirm Booking: Users complete payment, converting the hold into a confirmed booking.
- Auto-Release Hold: If the 10-minute payment window expires without payment, the seats are made available to other users.
Non-Functional Requirements
- High Availability: High read availability for event searches and seat map lookups.
- Strict Consistency: No double-bookings allowed; the seat reservation lock must be atomic.
- Low Latency: Seat locks must be acknowledged in under 100 milliseconds under load.
Capacity Estimations
Let's calculate the load for a major ticket release:
- Active Users: 1,000,000 users attempting to buy tickets during the first 5 minutes of a major drop.
- Search/Read Traffic: Each user refreshes the seat map 3 times. Total search requests = 3,000,000. QPS = 3,000,000 / 300 seconds = 10,000 Read QPS.
- Reserve/Write Traffic: 100,000 reservation attempts in the first 5 minutes. QPS = 100,000 / 300 seconds = 333 Write QPS.
- Network Egress: Seat map data size is 100 KB, and this data flows from the servers to clients. Total read data transfer = 10,000 QPS * 100 KB = 1 GB/sec of outbound bandwidth.
- Storage Size: An event has 50,000 seats. Each seat record size is 100 bytes. Active event seat storage size = 5 MB.
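The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures listed in this section; the variable names are illustrative, and 1 KB is taken as 1,000 bytes.

```python
# Back-of-the-envelope capacity figures, using the inputs listed above.
ACTIVE_USERS = 1_000_000
WINDOW_SECONDS = 5 * 60            # first 5 minutes of the drop
REFRESHES_PER_USER = 3
RESERVATION_ATTEMPTS = 100_000
SEAT_MAP_BYTES = 100 * 1_000       # 100 KB, decimal units (assumption)
SEATS_PER_EVENT = 50_000
SEAT_RECORD_BYTES = 100

read_qps = ACTIVE_USERS * REFRESHES_PER_USER / WINDOW_SECONDS
write_qps = RESERVATION_ATTEMPTS / WINDOW_SECONDS
seat_map_bytes_per_sec = read_qps * SEAT_MAP_BYTES
seat_storage_bytes = SEATS_PER_EVENT * SEAT_RECORD_BYTES

print(f"read QPS:            {read_qps:,.0f}")
print(f"write QPS:           {write_qps:,.0f}")
print(f"seat map bandwidth:  {seat_map_bytes_per_sec / 1e9:.1f} GB/s")
print(f"seat storage:        {seat_storage_bytes / 1e6:.1f} MB")
```

These are averages over the 5-minute window; the peak in the first seconds of a drop is likely to be higher.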
Core Mechanics: API, Schema, and Storage Architecture
Our system uses separate read and write paths to scale.
API Design
We define the core REST contract for the booking flow:
| Endpoint | HTTP Method | Description | Input Parameters | Return Format |
| --- | --- | --- | --- | --- |
| /api/v1/events/{id}/seats | GET | Fetch the current available seat map | event_id, show_time | JSON seat coordinate array |
| /api/v1/reservations | POST | Temporarily hold selected seats | event_id, seat_ids, user_id | reservation_id, expires_at |
| /api/v1/bookings | POST | Complete purchase and confirm booking | reservation_id, payment_token | booking_id, status |
Database Schema (Relational Store)
While seat maps are served from cache, the system of record is a relational database (PostgreSQL/MySQL) that enforces consistency:
| Table Name | Column Name | Data Type | Key Type | Indexing Strategy |
| --- | --- | --- | --- | --- |
| Events | event_id | VARCHAR(64) | Primary Key | - |
| Events | name, date | VARCHAR, TIMESTAMP | - | Index on date |
| Seats | seat_id | VARCHAR(64) | Primary Key | - |
| Seats | event_id | VARCHAR(64) | Foreign Key | Composite Index (event_id, status) |
| Seats | row, number | VARCHAR, INT | - | - |
| Seats | status | VARCHAR(16) | - | Enum: AVAILABLE, HELD, BOOKED |
| Bookings | booking_id | VARCHAR(64) | Primary Key | - |
| Bookings | user_id, price | VARCHAR, INT | - | - |
| Bookings | status | VARCHAR(16) | - | Enum: PENDING, CONFIRMED, FAILED |
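The schema above can be sketched as runnable DDL. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL/MySQL, maps `VARCHAR` to `TEXT`, and expresses the status enums as `CHECK` constraints. The index names and sample rows are assumptions made for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE events (
    event_id TEXT PRIMARY KEY,
    name     TEXT NOT NULL,
    date     TEXT NOT NULL
);
CREATE INDEX idx_events_date ON events(date);

CREATE TABLE seats (
    seat_id  TEXT PRIMARY KEY,
    event_id TEXT NOT NULL REFERENCES events(event_id),
    "row"    TEXT NOT NULL,
    number   INTEGER NOT NULL,
    status   TEXT NOT NULL DEFAULT 'AVAILABLE'
             CHECK (status IN ('AVAILABLE', 'HELD', 'BOOKED'))
);
-- Composite index so "available seats for this event" avoids a full scan.
CREATE INDEX idx_seats_event_status ON seats(event_id, status);

CREATE TABLE bookings (
    booking_id TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL,
    price      INTEGER NOT NULL,
    status     TEXT NOT NULL
               CHECK (status IN ('PENDING', 'CONFIRMED', 'FAILED'))
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sample data, just to show the constraints at work.
conn.execute(
    "INSERT INTO events VALUES (?, ?, ?)",
    ("evt1", "Stadium Tour", "2030-01-01T20:00:00Z"),
)
conn.execute(
    'INSERT INTO seats (seat_id, event_id, "row", number) VALUES (?, ?, ?, ?)',
    ("A-1", "evt1", "A", 1),
)
conn.commit()
```

Enforcing the allowed status values in the database means a buggy worker cannot write an unknown state, whatever the caching layer does.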
Cache Schema (Redis Key-Value Design)
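To support fast seat map reads and fast lock resolution during surges, we use two key families. Redis applies its eviction policy (`maxmemory-policy`) per instance rather than per key, so the two policies below imply separate Redis deployments for the cache and the lock store:
| Key Pattern | Value Format | TTL | Eviction Policy | Purpose |
| --- | --- | --- | --- | --- |
| event_map:{event_id} | JSON string of coordinates | 5 seconds | volatile-lru | High-speed cache for read map lookups |
| seat_lock:{event_id}:{seat_id} | user_id string | 10 minutes | noeviction | Distributed lock for seat reservation |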
Architectural Blueprint: High-Level System Flow
The diagram below maps the architecture and components of our booking platform:
graph TD
Client[User Browser] -->|Seat Map Query| CDN[Cloudflare CDN]
CDN -->|Cache Miss| API[API Gateway]
API -->|Read Path| Cache[Redis Cache Cluster]
API -->|Write Path| LockSvc[Locking Service]
LockSvc -->|Acquire Lock| RedisLock[Redis Distributed Lock]
LockSvc -->|Create Temp Hold| RDB[PostgreSQL Primary]
API -->|Confirm Purchase| PaySvc[Payment Service]
PaySvc -->|Publish Event| MQ[Apache Kafka]
MQ -->|Async Update| Worker[Worker Service]
Worker -->|Finalize Booking| RDB
This system diagram illustrates the architecture of our ticketing platform. Read requests for event seat maps are served directly by the CDN or a Redis cache cluster. Write requests (seat locks) are routed to a dedicated Locking Service that evaluates seat availability using Redis-based distributed locks. Once a seat is locked, the payment is processed asynchronously using Kafka messaging, and the final state is written to the PostgreSQL relational database by a worker service.
Deep Dive: Solving Concurrency and Double Booking
Managing concurrency at this scale requires decoupling the locking mechanics from the database transactions.
The Internals of Distributed Locks and DB Transactions
To prevent two users from booking the same seat, we use Redis-based distributed locks, implemented either with the Redlock algorithm or with a single atomic `SET ... NX` command:
- When a user requests a seat lock, the Locking Service executes `SET seat_lock:{event_id}:{seat_id} {user_id} NX PX 600000`. This sets the key only if it does not already exist, with an expiration time of 10 minutes (600,000 ms).
- If the command returns success, the user has acquired the lock. The status of the seat in the relational database is updated to `HELD` using a simple transaction.
- If the command fails, the user is notified immediately that the seat is already locked.
This approach keeps database traffic low. We validate seat availability in Redis memory before executing database transactions, protecting the relational store from load spikes.
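The lock acquisition described above can be sketched as follows. To keep the example runnable without a server, `FakeRedis` is a hypothetical in-memory stand-in that mimics only the `SET key value NX PX ms` behavior; `acquire_seat_lock` is likewise an illustrative name, not part of any library.

```python
import time


class FakeRedis:
    """In-memory stand-in for the one Redis call used below (single-threaded only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expiry timestamp in seconds or None)
        self._clock = clock

    def set(self, key, value, nx=False, px=None):
        """Mimic SET key value [NX] [PX ms]: True on success, None if NX blocks it."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] is not None and entry[1] <= now:
            entry = None  # the previous lock has expired
        if nx and entry is not None:
            return None
        expires_at = now + px / 1000 if px is not None else None
        self._data[key] = (value, expires_at)
        return True


def acquire_seat_lock(client, event_id, seat_id, user_id, ttl_ms=600_000):
    """Return True if user_id now holds the seat, False if someone else does."""
    key = f"seat_lock:{event_id}:{seat_id}"
    return client.set(key, user_id, nx=True, px=ttl_ms) is True


if __name__ == "__main__":
    r = FakeRedis()
    print(acquire_seat_lock(r, "evt1", "A-1", "alice"))  # first caller wins
    print(acquire_seat_lock(r, "evt1", "A-1", "bob"))    # second caller is rejected
```

With the redis-py client, the equivalent call has the same shape: `client.set(key, user_id, nx=True, px=ttl_ms)`. Because the check and the write happen in one command, there is no window in which two callers can both see the key as missing.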
Performance Analysis of Ticket Locking and Queueing
To handle high payment confirmation traffic, we introduce a message queue (e.g., Apache Kafka). When a user clicks "Buy Now" and enters payment details:
- The system publishes a `PaymentInitiated` event to Kafka and returns a `Pending` status to the client, freeing HTTP threads to handle other requests.
- A dedicated Payment Service processes the transaction asynchronously, interacting with external gateways.
- Once payment succeeds, a worker updates the seat status to `BOOKED` in the database, removes the Redis lock, and sends a confirmation email.
If payment fails, the worker deletes the Redis key and resets the seat status to `AVAILABLE`. If the 10-minute window expires, the key's TTL removes the lock on its own, and a cleanup process resets the seat status in the database.
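The outcome handling can be sketched as a small state transition. This is a minimal illustration, assuming plain dicts stand in for the Seats table and the Redis lock store; `apply_payment_result` and its parameters are hypothetical names, not part of the original design.

```python
from enum import Enum


class SeatStatus(str, Enum):
    AVAILABLE = "AVAILABLE"
    HELD = "HELD"
    BOOKED = "BOOKED"


def apply_payment_result(seats, locks, seat_id, user_id, succeeded):
    """Apply a payment outcome to one seat and return its resulting status.

    seats: dict seat_id -> SeatStatus (stand-in for the Seats table)
    locks: dict seat_id -> user_id    (stand-in for the Redis seat locks)
    """
    still_held_by_user = (
        seats.get(seat_id) is SeatStatus.HELD and locks.get(seat_id) == user_id
    )
    if not still_held_by_user:
        # The hold expired or belongs to someone else: leave the seat alone.
        # A successful charge in this state would need a refund elsewhere.
        return seats.get(seat_id)

    seats[seat_id] = SeatStatus.BOOKED if succeeded else SeatStatus.AVAILABLE
    del locks[seat_id]  # the lock is no longer needed in either case
    return seats[seat_id]
```

Checking the lock owner before writing guards against a late payment result overwriting a seat that has already been released and taken by another user.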
Write and Read Path Sequences
Write Path Flow (Seat Locking)
- The client sends a `POST /reservations` request to the API Gateway.
- The Locking Service attempts to acquire a Redis lock for the seat using `SET ... NX`.
- If successful, the Locking Service updates the seat status to `HELD` in the database and writes a temporary record.
- The system returns a success status with a 10-minute expiration countdown.
- If the lock attempt fails, the system returns a conflict error (409) in under 50 milliseconds.
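Since the reservation request carries a list of `seat_ids`, one way to handle the write path is all-or-nothing: hold every requested seat or none. The sketch below assumes a plain dict stands in for Redis (a missing key plays the role of a successful `SET ... NX`, and TTLs are omitted); `reserve_seats` and the 201 success code are assumptions for illustration, while the 409 conflict code comes from the flow above.

```python
def reserve_seats(locks, event_id, seat_ids, user_id):
    """Try to hold every seat in seat_ids; hold none of them if any is taken.

    locks: dict lock_key -> user_id, standing in for the Redis lock store.
    Returns (http_status, list_of_lock_keys_held).
    """
    acquired = []
    # Sorting gives every request the same acquisition order.
    for seat_id in sorted(set(seat_ids)):
        key = f"seat_lock:{event_id}:{seat_id}"
        if key in locks:
            # Conflict: release whatever this request already took.
            for held_key in acquired:
                del locks[held_key]
            return 409, []
        locks[key] = user_id
        acquired.append(key)
    return 201, acquired


if __name__ == "__main__":
    store = {}
    print(reserve_seats(store, "evt1", ["A-1", "A-2"], "alice"))
    print(reserve_seats(store, "evt1", ["A-0", "A-2"], "bob"))
```

Rolling back partial holds keeps a group booking from stranding seats that the user cannot actually buy together.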
Read Path Flow (Seat Map Fetch)
- The client sends a `GET /events/{id}/seats` request.
- The request is intercepted by the CDN. If the seat map cache is warm (less than 5 seconds old), it is returned immediately.
- If it is a cache miss, the request hits the Read Service.
- The Read Service fetches seat availability from Redis, falls back to the database on a miss, and updates the cache.
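The read path above follows the cache-aside pattern with a short TTL. The sketch below is a minimal in-process illustration, assuming a dict stands in for Redis and a caller-supplied function stands in for the database query; `SeatMapCache` and `load_from_db` are hypothetical names.

```python
import time


class SeatMapCache:
    """Cache-aside lookup with a fixed TTL (5 seconds, as in the cache schema)."""

    def __init__(self, load_from_db, ttl_seconds=5.0, clock=time.monotonic):
        self._load = load_from_db   # fallback used on a miss
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}          # event_id -> (seat_map, expires_at)

    def get(self, event_id):
        now = self._clock()
        entry = self._entries.get(event_id)
        if entry is not None and entry[1] > now:
            return entry[0]                      # cache hit
        seat_map = self._load(event_id)          # miss: go to the database
        self._entries[event_id] = (seat_map, now + self._ttl)
        return seat_map


if __name__ == "__main__":
    cache = SeatMapCache(lambda event_id: {"A-1": "AVAILABLE"})
    print(cache.get("evt1"))  # loaded from the fallback
    print(cache.get("evt1"))  # served from the cache
```

A 5-second TTL means the seat map can be slightly stale, which is acceptable here because the lock, not the map, decides who gets the seat.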
Real-World Implementation: Ticketmaster and Ticketfly
Real-world ticket distributors split their systems into three domains:
- Queue-It Integration: Waiting rooms that rate-limit incoming users, protecting downstream APIs from traffic spikes during drops.
- In-Memory Locks: Using caching technologies like Redis or Memcached to handle rapid lock evaluation, so that the database connection pool is not exhausted.
- Payment Handlers: Using asynchronous architectures with Kafka to throttle payment requests, preventing transactional systems from overloading.
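The waiting-room idea in the first item can be illustrated with a FIFO queue that admits a fixed number of users per interval. This is a generic sketch of the rate-limiting concept, not a description of how any commercial waiting-room product works; `WaitingRoom` and `admits_per_tick` are hypothetical names.

```python
from collections import deque


class WaitingRoom:
    """Admit queued users in arrival order, at most admits_per_tick per interval."""

    def __init__(self, admits_per_tick):
        self._queue = deque()
        self._admits_per_tick = admits_per_tick

    def join(self, user_id):
        """Add a user to the back of the queue and return their position (1-based)."""
        self._queue.append(user_id)
        return len(self._queue)

    def tick(self):
        """Release the next batch of users to the booking flow."""
        batch_size = min(self._admits_per_tick, len(self._queue))
        return [self._queue.popleft() for _ in range(batch_size)]


if __name__ == "__main__":
    room = WaitingRoom(admits_per_tick=2)
    for user in ["u1", "u2", "u3"]:
        room.join(user)
    print(room.tick())
    print(room.tick())
```

Capping admissions per interval turns an unbounded spike into a steady request rate that the downstream services can be sized for.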
Trade-offs and Failure Modes: Optimistic vs Pessimistic Locking
Selecting a locking strategy involves balancing consistency and system throughput:
| Strategy | Performance Under Load | Database Impact | Error Handling |
| --- | --- | --- | --- |
| Optimistic Locking | High throughput, low latency | Low (no database locks held) | High retry rate for users (many conflict updates) |
| Pessimistic Locking | Low throughput (threads block) | High (database connection pool starves) | Low error rate, but risks system crashes |
| Redis Distributed Locking | High throughput, very low latency | Extremely low (validations happen in-memory) | Requires managing lock lease renewals |
Our design uses Redis distributed locking to achieve high performance while maintaining strict data consistency.
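For comparison, the optimistic locking row in the table can be sketched with a version column and a conditional `UPDATE`. This is an illustration only: `sqlite3` stands in for the production database, and the `version` column is an assumption added for the example (it is not part of the schema shown earlier).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seats ("
    " seat_id TEXT PRIMARY KEY,"
    " status TEXT NOT NULL,"
    " version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO seats VALUES ('A-1', 'AVAILABLE', 0)")
conn.commit()


def read_seat(conn, seat_id):
    """Return (status, version) for the seat, or None if it does not exist."""
    return conn.execute(
        "SELECT status, version FROM seats WHERE seat_id = ?", (seat_id,)
    ).fetchone()


def try_hold(conn, seat_id, expected_version):
    """Hold the seat only if nobody changed it since expected_version was read."""
    cur = conn.execute(
        "UPDATE seats SET status = 'HELD', version = version + 1 "
        "WHERE seat_id = ? AND status = 'AVAILABLE' AND version = ?",
        (seat_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated means another writer won
```

Two users who read the same version both attempt the update, but only one matches the `WHERE` clause; the other sees zero rows updated and must retry, which is the high retry rate the table refers to.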
Decision Guide: Cache-Aside vs Queue-Based Booking
Use this decision table to guide system design choices based on scale and consistency requirements.
| Situation | Recommendation | Rationale |
| --- | --- | --- |
| Highly anticipated events with extreme traffic spikes | Redis Distributed Locks + Queue-Based Payment | Prevents database overload during flash sales. |
| Regular event scheduling with low concurrent bookings | Standard Relational Database Transactions | Simpler to implement and maintain. |
| High read traffic, but low write volume | CDN Caching + Optimistic Database Locking | Simple caching without distributed lock overhead. |
Practical Interview Execution: 45-Minute Delivery Strategy
When presenting this design in an interview, manage your time using this schedule:
- Minutes 0-5 (Clarify Requirements): Establish scale expectations (Active users, QPS) and write functional requirements on the whiteboard.
- Minutes 5-15 (High-Level Architecture): Sketch the CDN, Gateway, Read/Write split services, and database layers.
- Minutes 15-30 (Deep Dive): Explain how you prevent double-booking using Redis distributed locks. Write out the exact Redis commands and DB tables.
- Minutes 30-40 (Asynchronous Payments): Detail the Kafka payment flow, handling edge cases like network timeouts during gateway calls.
- Minutes 40-45 (Trade-offs): Summarize the design's trade-offs, discussing optimistic locking and partition strategies for database scaling.
Apache Kafka: Messaging Configuration
In high-concurrency booking systems, Apache Kafka is configured with partition keys set to event_id. This ensures that all transactions for a specific concert drop are processed in order by the same worker instance, preventing write conflicts.
We create the topic with a replication factor of 3 and configure the producer with acks=all, so that a write is acknowledged only after the in-sync replicas have received it. This protects the system against broker failures during drops.
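The effect of keying messages by event_id can be illustrated with a simple key-to-partition mapping. This sketch uses CRC32 as a stand-in hash so that it runs with the standard library only; Kafka's default partitioner in the Java client uses a murmur2 hash of the key bytes, so the actual partition numbers will differ. `partition_for` is a hypothetical helper.

```python
import zlib


def partition_for(event_id: str, num_partitions: int) -> int:
    """Map a message key to a partition: same key, same partition, every time."""
    # CRC32 is a stand-in here; it is not the hash Kafka itself uses.
    return zlib.crc32(event_id.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    for key in ["evt-100", "evt-100", "evt-200"]:
        print(key, "->", partition_for(key, num_partitions=12))
```

Because each partition is consumed by one worker in a consumer group, a stable mapping gives per-event ordering. The trade-off is that one very popular event is limited to the throughput of a single partition.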
Lessons Learned: Production Scaling Pitfalls
Avoid these standard mistakes when deploying booking platforms to production:
- Setting Long Lock Durations: Keeping seat locks active for too long (e.g., 30 minutes) allows users to tie up inventory without purchasing, frustrating other customers. Keep locks short (10 minutes max).
- Missing Lock Expiration Handlers: Ensure your lock expiration process automatically resets seat statuses in the database. If the cleanup worker fails, seats can remain locked permanently.
- Direct Database Seat Queries: Never query the relational database directly to build the seat map interface for customers. Use memory-based caches to prevent database crashes.
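The expiration handler mentioned in the second pitfall can be sketched as a periodic sweep. This is a minimal illustration, assuming the hold expiry time is recorded alongside the seat (the schema shown earlier has no such field) and that dicts stand in for the database; `release_expired_holds` is a hypothetical name.

```python
def release_expired_holds(seats, hold_expiry, now):
    """Reset seats whose hold has expired and return the released seat IDs.

    seats:       dict seat_id -> status string (stand-in for the Seats table)
    hold_expiry: dict seat_id -> expiry timestamp in seconds (assumed to be stored)
    now:         current timestamp in seconds
    """
    released = []
    for seat_id, expires_at in list(hold_expiry.items()):
        if expires_at > now:
            continue  # hold is still valid
        del hold_expiry[seat_id]
        # Only HELD seats go back on sale; BOOKED seats are left untouched.
        if seats.get(seat_id) == "HELD":
            seats[seat_id] = "AVAILABLE"
            released.append(seat_id)
    return sorted(released)


if __name__ == "__main__":
    seats = {"A-1": "HELD", "A-2": "HELD"}
    expiry = {"A-1": 600.0, "A-2": 900.0}
    print(release_expired_holds(seats, expiry, now=700.0))
    print(seats)
```

Running the sweep on a schedule, rather than relying only on the Redis TTL, keeps the database status from drifting out of sync with the lock store when a worker crashes.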
Summary: High-Scale Booking Cheat Sheet
- Redis Locks: Use Redis distributed locks for fast, memory-based seat validation.
- Decoupled Paths: Separate the high-traffic seat selection path from the transactional payment system.
- Asynchronous Processing: Use message queues to process payments asynchronously, protecting backend systems from load spikes.
- Short Holds: Limit seat lock durations to 10 minutes to maintain high inventory turnover.
- Read Caches: Serve seat map read queries from CDNs and caches to protect databases from concurrent traffic.