System Design HLD Example: Proximity Service (Yelp)
A practical interview-ready HLD for finding nearby restaurants, hotels, and points of interest.
TLDR: A proximity service (Yelp/Google Places) encodes every business location into a geohash string, indexes it in a B-tree column, and queries the center cell plus its 8 neighbors to guarantee no boundary misses. Redis GEORADIUS caches hot-cell results for <2ms reads. Geographic sharding by geohash prefix keeps write load balanced globally. The hard problems are not the algorithm: they are the cell boundary edge case, the Manhattan hot-cell overload, and cache invalidation on address changes.
You open Yelp and search "coffee near me." Within 100ms, Yelp returns the 10 highest-rated coffee shops within 1 mile of your GPS coordinates, ranked by rating. Behind that response is a spatial index serving 100 million queries per day across 200 million indexed businesses. The naïve approach, a SQL range scan on latitude and longitude columns, fails at this scale: at 200 million businesses a bounding-box query returns tens of thousands of candidates before a single distance calculation, and latency climbs to seconds.
Understanding how to design this system teaches you geospatial indexing, location-aware caching, and geographic sharding: three patterns that appear verbatim in Uber driver matching, Google Maps nearby search, DoorDash delivery zone lookup, and every food delivery app you have used. By the end of this walkthrough you will know why geohash beats raw lat/lon queries, when to use a quadtree instead, and how Redis GEORADIUS eliminates database load for the hottest search cells.
Why a SQL Range Query Can't Find Your Coffee Shop
Actors and Use Cases
| Actor | Role |
| --- | --- |
| End user | Searches "restaurants near me," views business details, submits reviews |
| Business owner | Creates and updates a business profile (name, address, hours, photos) |
| Platform service | Indexes location data; enforces rate limits; aggregates ratings |
Core user journeys in scope:
- Nearby search: User sends current lat/lon + radius + category filter; system returns the top-N ranked businesses sorted by rating
- Business detail: User selects a result; system returns the full profile including rating, hours, and photos served via CDN
- Add/update business: Owner submits a business; system validates, geocodes the address, persists the record, and updates the spatial index
- Write review: User submits a 1-5 star rating and text; system updates the running average rating on the business record
- View photos: User browses business photos delivered from a CDN edge node
Out of Scope (v1 boundary)
| Excluded feature | Why it is out of scope |
| --- | --- |
| Personalized ML ranking (collaborative filtering) | Depends on proximity as a prerequisite; a separate ranking service layer |
| Fraud detection for fake reviews | Separate async pipeline; does not touch the spatial index |
| Real-time traffic or live opening hours | Third-party data enrichment; not part of the location index design |
| Global active-active multi-region writes | Phase 2 evolution; single-region design is sufficient for the interview |
The Core Problem: Two-Dimensional Distance Is Not B-Tree Friendly
A database B-tree index is a one-dimensional sorted structure. A latitude index efficiently finds all rows where lat BETWEEN 37.78 AND 37.80, and a longitude index finds rows where lon BETWEEN -122.42 AND -122.40, but no single B-tree covers both simultaneously. The query planner picks one index and scans millions of rows in memory to apply the second filter. At 200 million businesses a 1-mile bounding-box query returns roughly 50,000 candidates before any distance calculation, and end-to-end latency climbs to seconds.
The solution is a spatial index: a data structure that maps two-dimensional coordinates into a one-dimensional key while preserving spatial locality. Geohashing is the most widely adopted approach at database scale because it maps cleanly to SQL B-tree indexes and enables geographic sharding without additional infrastructure.
Geohashing: How a Location Becomes a Sortable String
Geohash is a geocoding system that recursively divides the Earth into a grid of rectangular cells and encodes each cell as a base32 string. A longer string means a smaller cell and higher coordinate precision.
How encoding works in four steps:
- Divide the latitude range [-90, 90] in half. Encode 0 if the target latitude is in the lower half, 1 if in the upper half.
- Divide the longitude range [-180, 180] the same way.
- Interleave the latitude and longitude bits (they alternate position, starting with longitude).
- Convert every 5 bits to a single base32 character using the alphabet 0123456789bcdefghjkmnpqrstuvwxyz.
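The four encoding steps above can be sketched in a few lines of Python; `geohash_encode` is an illustrative helper written here from scratch, not a library call:

```python
# Minimal geohash encoder following the four steps above.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars = []
    bits, bit_count, even = 0, 0, True   # even bit positions carry longitude
    while len(chars) < precision:
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:                   # upper half -> 1, keep [mid, hi]
            bits = (bits << 1) | 1
            rng[0] = mid
        else:                            # lower half -> 0, keep [lo, mid]
            bits = bits << 1
            rng[1] = mid
        even = not even
        bit_count += 1
        if bit_count == 5:               # every 5 bits -> one base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

print(geohash_encode(0.0, 0.0))            # s00000
print(geohash_encode(37.7874, -122.4080))  # 9q8yyw (Union Square example below)
```

Note how two nearby points produce strings with a shared prefix, which is exactly the property the B-tree query exploits.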
Geohash precision reference table:
| Length (chars) | Cell width | Cell height | Practical use |
| --- | --- | --- | --- |
| 1 | ≈ 5,000 km | ≈ 5,000 km | Continent-level grouping |
| 4 | ≈ 39 km | ≈ 20 km | City-level grouping |
| 6 | ≈ 1.2 km | ≈ 0.6 km | Neighborhood-level; ideal for a 1-mile radius search |
| 8 | ≈ 38 m | ≈ 19 m | Street-level precision |
| 10 | ≈ 1.2 m | ≈ 0.6 m | Address-level precision |
For a 1-mile radius search, precision 6 is the right choice: each cell is roughly 1.2 km × 0.6 km. Union Square, San Francisco at (37.7874°N, 122.4080°W) encodes to geohash 9q8yyw. Blue Bottle Coffee two blocks away encodes to 9q8yyq, a different cell with the shared prefix 9q8yy, reflecting their physical proximity.
The critical insight: Two nearby locations share a geohash prefix. A SQL query WHERE geohash6 LIKE '9q8yy%' retrieves all businesses in that neighborhood in a single index scan, turning the two-dimensional problem back into a one-dimensional prefix lookup.
The boundary problem and its fix: A business sitting precisely on a cell boundary has a geohash that only matches one side of the boundary. Querying only the center cell will silently miss it even if it is within the search radius. The fix is always to query the center cell plus its 8 surrounding neighbors (9 cells total), which guarantees complete coverage for any radius smaller than a single cell width.
Geohash neighborhood grid (center = user location "9q8yyw"):
┌──────────┬──────────┬──────────┐
│  9q8yyr  │  9q8yyx  │  9q8yyz  │
├──────────┼──────────┼──────────┤
│  9q8yyq  │  9q8yyw  │  9q8yyy  │  ← center
├──────────┼──────────┼──────────┤
│  9q8yym  │  9q8yyt  │  9q8yyv  │
└──────────┴──────────┴──────────┘
Querying all 9 cells guarantees no boundary misses within 1 mile.
Standard geohash libraries expose a neighbors() function that returns the 8 adjacent cells in O(1) time (Python: the python-geohash package's geohash.neighbors; Java: com.github.davidmoten:geo).
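When no library is at hand, the 8 neighbors can be derived from first principles: decode the cell to its bounding box, step one cell width/height from the center in each direction, and re-encode. A self-contained sketch (helper names are illustrative; pole and antimeridian wrap-around are ignored, which production libraries do handle):

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def _encode(lat, lon, precision):
    # Standard geohash bit-interleaving encoder (longitude bit first).
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    out, acc, even = [], 0, True
    for i in range(precision * 5):
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        bit = 1 if val >= mid else 0
        rng[1 - bit] = mid            # bit=1 keeps the upper half, bit=0 the lower
        acc = (acc << 1) | bit
        if i % 5 == 4:
            out.append(BASE32[acc])
            acc = 0
        even = not even
    return "".join(out)

def _bounds(cell):
    # Decode a geohash back to its (lat_range, lon_range) bounding box.
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    even = True
    for ch in cell:
        n = BASE32.index(ch)
        for shift in (4, 3, 2, 1, 0):
            bit = (n >> shift) & 1
            rng = lon_rng if even else lat_rng
            rng[1 - bit] = (rng[0] + rng[1]) / 2
            even = not even
    return lat_rng, lon_rng

def neighbors(cell):
    # Offset the cell center by one cell height/width in all 8 directions, re-encode.
    (lat0, lat1), (lon0, lon1) = _bounds(cell)
    clat, clon = (lat0 + lat1) / 2, (lon0 + lon1) / 2
    dlat, dlon = lat1 - lat0, lon1 - lon0
    return [_encode(clat + dy * dlat, clon + dx * dlon, len(cell))
            for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)]

print(sorted(neighbors("9q8yyw")))
```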
The Three-Step Search: Encode, Expand, Rank
Every proximity search, regardless of whether it runs against PostgreSQL, Elasticsearch, or Redis, resolves in three deterministic steps:
Step 1: Encode. Convert the user's GPS coordinates to a geohash at the appropriate precision. Precision 6 for a 1-mile search; precision 5 (cell ≈ 5 km × 5 km) for a 5-mile search; precision 4 for a 25-mile city-wide search.
Step 2: Expand. Compute the 8 neighboring geohash cells using the neighbors() library call. Now you have 9 cells covering the full search area with guaranteed boundary coverage.
Step 3: Query and rank. Execute a prefix scan against the business index for all 9 cells. Pull back the candidate set (typically a few hundred businesses per cell in a dense city), apply the Haversine formula for exact distance filtering, and sort by rating to produce the final ranked list.
-- PostgreSQL: 9-cell geohash IN query (runs against a B-tree index on geohash6)
SELECT *
FROM (
    SELECT
        id, name, lat, lon, rating, review_count,
        ( 6371 * acos(
            cos(radians(:userLat)) * cos(radians(lat))
          * cos(radians(lon) - radians(:userLon))
          + sin(radians(:userLat)) * sin(radians(lat))
        )) AS dist_km
    FROM businesses
    WHERE geohash6 IN (
        '9q8yyw', '9q8yym', '9q8yyq', '9q8yyr',
        '9q8yyt', '9q8yyv', '9q8yyx', '9q8yyy', '9q8yyz'
    )
      AND category = :category
      AND is_active = TRUE
) candidates
WHERE dist_km <= 1.6   -- 1 mile ≈ 1.6 km; alias usable in the outer query only
ORDER BY rating DESC
LIMIT 10;
The geohash6 column carries a standard B-tree index. An IN (9 values) clause performs 9 index seeks, each touching only the matching rows for that cell: typically a few hundred rows in a dense urban cell, a handful in rural cells. The Haversine distance filter runs only on that reduced candidate set, either in the outer query as shown or in the application layer after the fetch. This is deliberate: SQL Haversine without a spatial restriction forces a full-table compute; geohash shrinks the candidate set to a manageable size first.
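When the distance filter is moved to the application layer, it looks like this sketch (field names and the `rank` helper are illustrative, not part of any library):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance on a sphere of radius 6371 km.
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def rank(candidates, user_lat, user_lon, radius_km=1.6, limit=10):
    # Exact-distance filter over the geohash candidate set, then sort by rating.
    in_radius = [c for c in candidates
                 if haversine_km(user_lat, user_lon, c["lat"], c["lon"]) <= radius_km]
    return sorted(in_radius, key=lambda c: c["rating"], reverse=True)[:limit]
```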
Expanding the search radius: If the initial 9-cell search returns fewer than 10 results (rural area), expand to a coarser geohash precision (precision 5, a shorter prefix) and repeat. This "zoom-out" loop runs at most 2-3 times in practice.
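The zoom-out loop can be sketched as follows; `cells_for` and `fetch` are stand-ins for the geohash expansion and the 9-cell database query:

```python
def search_with_zoom_out(lat, lon, cells_for, fetch, min_results=10):
    """Try precision 6 (~1 mi), then 5 (~5 mi), then 4 (~25 mi)."""
    results = []
    for precision in (6, 5, 4):
        cells = cells_for(lat, lon, precision)   # center cell + 8 neighbors
        results = fetch(cells)
        if len(results) >= min_results:
            break                                # enough hits: stop zooming out
    return results[:min_results]
```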
Deep Dive: Capacity, Design Goals, and the Data Structures Behind the Index
Capacity Estimation
| Dimension | Assumption | Derivation |
| --- | --- | --- |
| DAU | 100M users | Industry benchmark for a Yelp-scale service |
| Businesses indexed | 200M | Global active businesses including restaurants, hotels, shops |
| Search QPS (steady) | ~11,500 reads/sec | 100M DAU × 10 searches/day ÷ 86,400 s |
| Search QPS (peak) | ~50,000 reads/sec | 5× steady-state burst (lunch, Friday evening) |
| Write QPS | ~100 writes/sec | New businesses + edits; 100:1 read/write ratio |
| Business record size | ~1 KB | Name, address, lat/lon, geohash6, category, rating, review count |
| Total business storage | ~200 GB | 200M × 1 KB |
| Review storage | ~50 TB | Estimated 100B cumulative reviews × ~500 B each |
| Hot search cache | ~10 GB Redis | Top 5% of cells handle 90% of queries (Zipf distribution) |
| Photos storage | ~500 TB | Served from CDN; out of scope for the index design |
Key insight: This is an overwhelmingly read-heavy workload. The search:write ratio is 100:1, which means the right strategy is to optimize the read path relentlessly (caching, read replicas, and a cache-friendly index structure) rather than optimizing write throughput.
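The table's derivations reduce to a few lines of arithmetic (all inputs are the table's own assumptions):

```python
dau = 100_000_000
searches_per_user_per_day = 10
steady_qps = dau * searches_per_user_per_day / 86_400   # seconds per day
peak_qps = 5 * steady_qps                               # 5x burst factor
storage_gb = 200_000_000 * 1_000 / 1e9                  # 200M businesses x ~1 KB

print(round(steady_qps))   # 11574; the table rounds to ~11,500
print(round(peak_qps))     # 57870; the table rounds down to ~50,000
print(storage_gb)          # 200.0 GB
```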
Design Goals
| Goal | Measurable target | Architecture decision it drives |
| --- | --- | --- |
| Search latency | p95 < 100ms end-to-end | Geohash B-tree index + Redis cache-first search path |
| Cache hit rate | ≥ 90% for major metro cells | GEORADIUS for hot cells; PostgreSQL replica for cold cells |
| Spatial correctness | Zero boundary misses | Always query 9 cells (center + 8 neighbors) |
| Write latency | p95 < 500ms for new business creation | Async spatial index update decoupled from the primary DB write |
| Availability | 99.9% on the search path | Read replicas; Redis Cluster failover; graceful degradation to DB |
| Geographic scale | Global; all 200M businesses indexed | Shard by LEFT(geohash6, 3) → 32³ possible shard keys |
The Internals: Geohash vs. Quadtree Data Structures
Geohash (B-tree on a VARCHAR(6) column):
Every business row stores a geohash6 CHAR(6) column. A standard B-tree index on this column makes WHERE geohash6 IN (9 values) an efficient 9-seek operation with O(log N) per seek. This approach slots into any SQL database without extensions and supports geographic sharding by taking the first 3 characters as a shard key, producing 32³ = 32,768 possible shard ranges that naturally distribute global geography without hotspots.
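A minimal prefix-based shard router might look like this sketch (the 64-shard count and the `shard_for` helper are illustrative assumptions):

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def shard_for(geohash6: str, num_shards: int = 64) -> int:
    # Decode the 3-char prefix (0..32767) and fold it onto the shard ring.
    # Nearby cells usually share the prefix, so a neighborhood's rows co-locate,
    # which keeps the 9-cell query from fanning out across many shards.
    prefix_value = 0
    for ch in geohash6[:3]:
        prefix_value = prefix_value * 32 + BASE32.index(ch)
    return prefix_value % num_shards

print(shard_for("9q8yyw"))  # every 9q8* cell maps to the same shard
```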
Quadtree (in-memory spatial partition tree):
A quadtree recursively partitions a bounding box into 4 equal quadrants. When a quadrant's business count exceeds a threshold (e.g., 100), it splits further. A proximity search traverses the tree to find all quadrants overlapping the search radius. The key advantage is density-adaptive partitioning: a Manhattan quadrant is a much smaller geographic area than a Montana quadrant because they both hold at most 100 businesses. The critical disadvantage is that quadtrees live in memory: they are hard to shard across machines and expensive to rebuild from persistent storage on restart.
Rule of thumb: Geohash for persistent business data backed by a SQL database; quadtree for ephemeral, high-frequency in-memory data (Uber driver positions update every 4 seconds and the full active-driver dataset fits in one machine's RAM).
| Index type | Storage | Update cost | Shard-friendly | Best for |
| --- | --- | --- | --- | --- |
| Geohash + B-tree | SQL column | O(1) UPDATE | Yes (prefix sharding) | Business directories (static) |
| Quadtree | In-memory tree | O(log N) re-insert | Hard | Driver / courier positions (dynamic) |
| PostGIS R-tree | SQL extension | O(log N) | With effort | Complex polygon intersections |
| Redis GEORADIUS | In-memory sorted set | O(log N) GEOADD | Per-node limit | Hot-path cache (<50M points/node) |
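To make the quadtree's split-on-overflow behavior concrete, here is a toy version; the capacity, the flat (x, y) coordinates, and the API are illustrative simplifications:

```python
class QuadTree:
    def __init__(self, x0, y0, x1, y1, capacity=4):
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        self.capacity = capacity
        self.points = []       # (x, y, id) tuples while this node is a leaf
        self.children = []     # 4 sub-quadrants after a split

    def insert(self, x, y, pid):
        if not (self.x0 <= x < self.x1 and self.y0 <= y < self.y1):
            return False                      # outside this quadrant
        if not self.children:
            if len(self.points) < self.capacity:
                self.points.append((x, y, pid))
                return True
            self._split()                     # overflow: subdivide into 4
        return any(c.insert(x, y, pid) for c in self.children)

    def _split(self):
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        self.children = [QuadTree(self.x0, self.y0, mx, my, self.capacity),
                         QuadTree(mx, self.y0, self.x1, my, self.capacity),
                         QuadTree(self.x0, my, mx, self.y1, self.capacity),
                         QuadTree(mx, my, self.x1, self.y1, self.capacity)]
        for p in self.points:                 # push existing points down
            any(c.insert(*p) for c in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        # Collect ids of points inside the query box, pruning disjoint quadrants.
        if qx1 < self.x0 or qx0 >= self.x1 or qy1 < self.y0 or qy0 >= self.y1:
            return []
        hits = [pid for (x, y, pid) in self.points
                if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        for c in self.children:
            hits += c.query(qx0, qy0, qx1, qy1)
        return hits
```

Dense regions trigger more splits and end up with deeper, smaller quadrants, which is exactly the density-adaptive property the comparison table credits to quadtrees.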
Performance Analysis: Where the Search Gets Slow
| Pressure point | Symptom | Root cause | Mitigation |
| --- | --- | --- | --- |
| Dense cell overload (the Manhattan problem) | 9-cell query returns 100K rows | Precision-6 cell is too coarse for dense urban areas | Increase precision to 7 (cell ≈ 150m × 150m) in high-density zones |
| Boundary miss | Known businesses absent from results near cell edges | Searching only the center cell | Always query 9 cells; a non-negotiable correctness requirement |
| Cache miss storm | Latency spikes during concerts or sporting events | Thousands of cold searches hit the same DB cells simultaneously | Pre-warm Redis for upcoming events; increase TTL for major metro cells |
| Write amplification during bulk import | Index update latency spikes | Every INSERT updates DB + spatial index synchronously | Batch spatial index updates via async queue; synchronous only for address changes |
| Stale cached results after business closes | Closed businesses appear in search | Cache TTL (10 min) has not expired after business deactivation | Publish a cache invalidation event on every status change; invalidate search:results:<geohash6>:* |
The largest performance win at this scale is not the choice of spatial algorithm; it is the Redis cache layer in front of the database. A single metropolitan geohash cell queried 10,000 times per second can be served from a single Redis key rather than triggering 10,000 separate PostgreSQL index scans.
High-Level Architecture: From Mobile Client to Ranked Results
flowchart TD
A[Mobile / Web Client] -->|"GET /search?lat&lon&radius&category"| B[API Gateway]
B --> C[Location Search Service]
C -->|"GEORADIUS geo:businesses 1mi ASC"| D[Redis Geo Cache]
D -->|"cache hit ~90%"| E[Ranked Results Response]
D -->|cache miss| F[Business DB Read Replica]
F -->|"geohash6 IN 9 cells + category filter"| G[Businesses Table + geohash6 B-tree index]
G --> C
C -->|"haversine filter + rating sort + SET cache EX 600"| E
E -->|JSON response| A
H[Business Owner] -->|"POST /businesses"| I[Business Service]
I -->|"validate + geocode to lat/lon"| J[Business DB Primary]
I -->|"GEOADD geo:businesses lon lat biz_id"| D
J -->|async domain event| K[Index Updater Service]
K -->|"UPDATE geohash6 + invalidate cache cell"| J
L[Reviewer] -->|"POST /reviews/:id"| M[Review Service]
M -->|"UPDATE businesses SET rating=avg, review_count=count+1"| J
    M -->|"DEL search:results:{geohash6}:*"| D
This diagram shows the read path (top) and write path (bottom) of the proximity service. On the read path, the Location Search Service checks Redis first with GEORADIUS: roughly 90% of searches for popular metropolitan cells return from Redis in under 2ms without touching PostgreSQL. On a cache miss, the service falls back to the Business DB read replica, executes the 9-cell geohash IN query, applies the Haversine filter in the application layer, sorts by rating, caches the result in Redis for 10 minutes, and returns the ranked list. On the write path, new businesses and address updates flow to the PostgreSQL primary through the Business Service, which also issues a GEOADD to keep Redis warm. Review submissions update the aggregate rating on the business record and invalidate the affected cache cells, so rating changes appear in search results within one cache TTL window.
Data Model
-- Businesses table (sharded by LEFT(geohash6, 3) across read replicas)
CREATE TABLE businesses (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
address TEXT NOT NULL,
lat DOUBLE PRECISION NOT NULL,
lon DOUBLE PRECISION NOT NULL,
    geohash6 CHAR(6) NOT NULL, -- precision 6 ≈ 1.2 km × 0.6 km cell
category VARCHAR(64) NOT NULL,
rating NUMERIC(3,2) NOT NULL DEFAULT 0.00,
review_count INT NOT NULL DEFAULT 0,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Primary spatial index β supports the 9-cell IN query
CREATE INDEX idx_businesses_geohash6 ON businesses (geohash6);
-- Composite index β avoids a filter step when category is always provided
CREATE INDEX idx_businesses_geohash6_category ON businesses (geohash6, category);
-- Reviews table
CREATE TABLE reviews (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
business_id UUID NOT NULL REFERENCES businesses(id),
user_id UUID NOT NULL,
rating SMALLINT NOT NULL CHECK (rating BETWEEN 1 AND 5),
body TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_reviews_business ON reviews (business_id, created_at DESC);
Redis key layout:
| Key pattern | Data structure | Value | TTL |
| --- | --- | --- | --- |
| geo:businesses | Sorted Set (GEOADD) | Member = business_id; score = geo-encoded lon/lat | No TTL; persistent warm index |
| search:results:<geohash6>:<category> | String (JSON) | Ranked array of business summaries | 10 minutes |
| business:detail:<id> | Hash | Full business profile fields | 30 minutes |
Write Path: Adding a New Business
Business Owner → POST /businesses
  → Business Service
      → Validate required fields; geocode address → (lat, lon)
      → Compute geohash6 = geohash.encode(lat, lon, precision=6)
      → INSERT INTO businesses (id, name, lat, lon, geohash6, category, ...)  [DB Primary]
      → GEOADD geo:businesses lon lat biz_id                                  [Redis]
      → DEL search:results:<geohash6>:*                                       [Redis invalidation]
  → 201 Created { businessId, name }
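The write-path trace above can be condensed into a service sketch. Here `db`, `redis`, and the injected helpers are stand-ins (a dict and fakes), and `delete_prefix` is an illustrative shortcut; against real Redis you would SCAN matching keys and DEL them:

```python
import uuid

class BusinessService:
    """Write-path sketch: db, redis, and the helpers are injected stand-ins."""
    def __init__(self, db, redis, encode, geocode):
        self.db, self.redis = db, redis
        self.encode, self.geocode = encode, geocode   # geohash + geocoding helpers

    def create_business(self, name, address, category):
        if not (name and address and category):
            raise ValueError("name, address, and category are required")
        lat, lon = self.geocode(address)              # address -> coordinates
        cell = self.encode(lat, lon, 6)               # precision-6 geohash
        biz_id = str(uuid.uuid4())
        self.db[biz_id] = {"name": name, "lat": lat, "lon": lon,
                           "geohash6": cell, "category": category}
        self.redis.geoadd("geo:businesses", lon, lat, biz_id)   # keep cache warm
        self.redis.delete_prefix(f"search:results:{cell}:")     # invalidate cell
        return {"businessId": biz_id, "name": name}             # 201 Created body
```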
Read Path: Searching Nearby Restaurants
User → GET /search?lat=37.787&lon=-122.408&radius=1mi&category=coffee
  → Location Search Service
      → userGeohash = geohash.encode(37.787, -122.408, 6) → "9q8yyw"
      → Check: GET search:results:9q8yyw:coffee
          Cache HIT → deserialize + return top-10 in ~2ms total
          Cache MISS →
              → cells = [userGeohash] + neighbors(userGeohash)   (9 cells)
              → SELECT businesses WHERE geohash6 IN (cells) AND category='coffee'
              → Filter by Haversine distance ≤ 1.6 km in application layer
              → Sort by rating DESC → take top 10
              → SET search:results:9q8yyw:coffee <json> EX 600
  → 200 OK [{ id, name, dist_km, rating, review_count }, ...]
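The hit/miss branch in this trace is the classic cache-aside pattern. A sketch with a plain dict standing in for Redis (`query_db` is a stand-in for the 9-cell SQL):

```python
import json
import time

class SearchCache:
    """Cache-aside sketch for the read path; self.cache stands in for Redis."""
    def __init__(self, query_db, ttl_s=600):
        self.query_db = query_db      # runs the 9-cell query on a miss
        self.ttl_s = ttl_s
        self.cache = {}               # key -> (expires_at, serialized results)

    def search(self, cell, category):
        key = f"search:results:{cell}:{category}"
        entry = self.cache.get(key)
        if entry and entry[0] > time.time():
            return json.loads(entry[1])             # hit: no DB touched
        results = self.query_db(cell, category)     # miss: fall back to the DB
        self.cache[key] = (time.time() + self.ttl_s, json.dumps(results))
        return results
```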
Real-World Application: How Yelp, Uber, and Google Maps Solve Proximity
Real-world systems make different spatial index choices based on their specific update patterns and query shapes:
| Company | Index approach | Why this choice |
| --- | --- | --- |
| Yelp | Geohash column + Elasticsearch geo_point | Business data is mostly static; a B-tree on geohash6 handles bulk search; Elasticsearch layers full-text ranking on top |
| Uber (driver matching) | Custom in-memory quadtree | Driver GPS updates every 4 seconds; writing every update to a DB is too expensive, and the full active-driver dataset fits in one machine's RAM |
| Google Maps / Places | S2 Geometry (Hilbert curve cells) | S2 cells are equal-area spherical polygons with no boundary distortion at poles; Google needs global-scale analytics and precise spherical distance calculations |
| DoorDash / Instacart | PostGIS with R-tree index | Delivery zones are irregular polygons (not circles); PostGIS handles native polygon intersection without manual cell math |
| Airbnb | Elasticsearch geo_distance filter | Accommodation search combines text ranking with geo filtering; Elasticsearch natively combines both in one query |
The common thread: static data at scale → geohash in SQL. Dynamic, high-frequency data → in-memory spatial structure. Complex polygon geometry → PostGIS or S2. Choose the tool that matches the actual query shape and update rate of your data, not the tool with the most impressive name.
Trade-offs & Failure Modes: Boundary Crossings, Hot Cells, and Stale Indexes
Failure Mode 1: Cell Boundary Miss (The Most Common Bug)
A business located 10 meters across a geohash cell boundary will not appear in a single-cell WHERE geohash6 = 'X' query, even if it is well within the user's search radius. In unit tests this is invisible: test data rarely falls exactly on a boundary. In production, real users consistently miss businesses near their location.
Fix: Query 9 cells unconditionally. There is no configuration toggle for this; single-cell proximity queries are incorrect by definition.
Failure Mode 2: The Manhattan Hot-Cell Problem
At precision 6, a single geohash cell in Manhattan or central Tokyo contains thousands of active businesses. A 9-cell query returning 50,000 candidates before the Haversine filter is a latency bomb: the candidate fetch alone can take 200ms, violating the 100ms SLO.
Two mitigations applied together:
- Increase geohash precision dynamically for dense areas. Use precision 7 (cell ≈ 150m × 150m) for cells above a business-count threshold (e.g., >500 businesses per cell). The 9-cell query still runs, but each cell covers roughly 1/32 of the precision-6 area.
- Cache cell results aggressively. In a dense city, the same cell+category combination is queried thousands of times per minute. A 10-minute Redis TTL means the expensive DB scan runs once every 600 seconds, not 600,000 times.
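Both mitigations hinge on picking a precision per request. One illustrative policy combining the radius rule of thumb with the density threshold (both thresholds are assumptions from the text, and `choose_precision` is a hypothetical helper):

```python
def choose_precision(radius_km: float, cell_business_count: int = 0) -> int:
    # Radius -> base precision, per the rule of thumb used throughout this design.
    if radius_km <= 2:        # ~1-mile searches
        base = 6
    elif radius_km <= 10:     # ~5-mile searches
        base = 5
    else:                     # city-wide searches
        base = 4
    # Densify: drop to a finer grid when the center cell is overloaded.
    return base + 1 if cell_business_count > 500 else base
```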
Failure Mode 3: Stale Spatial Index After Business Closes
A business that permanently closes retains its old geohash in the index until invalidated. Users see a closed business as an active search result.
Fix: The Business Service sends a cache invalidation event on every is_active transition, which deletes search:results:<geohash6>:* from Redis. The next query fetches from DB, which now has is_active = FALSE on the row, and the business is excluded from results. Cache freshness within 10 minutes (one TTL window) is the accepted trade-off.
Consistency and Trade-off Summary
| Dimension | Design choice | Rationale |
| --- | --- | --- |
| Search result freshness | Eventual (10-minute cache TTL) | New businesses appear within one TTL window; near-real-time not required |
| Rating consistency | Strong on write (synchronous UPDATE businesses SET rating=...) | Rating is user-visible; stale ratings directly hurt trust |
| Read vs. write scaling | Scale read replicas independently of the primary | 100:1 read/write ratio; replica fan-out absorbs search load with no write path changes |
| Redis vs. PostgreSQL | Redis for hot path; PostgreSQL as source of truth | Redis is ephemeral; PostgreSQL is durable. Never treat Redis as the only copy |
Decision Guide: Geohash, Quadtree, or PostGIS?
| Situation | Recommendation |
| --- | --- |
| Static business directory with >10M records | Geohash column + B-tree index; shard DB by geohash prefix |
| Frequently moving objects (drivers, couriers, scooters) | In-memory quadtree or S2; avoid writing every GPS ping to a relational DB |
| Delivery zones, service areas, or polygon-shaped regions | PostGIS with R-tree; native polygon intersection without manual cell math |
| Redis as primary hot-path index | GEOADD/GEORADIUS; ideal for ≤ 50M points per Redis node |
| Starting fresh with <1M businesses | PostGIS; simpler operational story, rich feature set, no custom encoding logic |
| Dense urban cells causing latency overload | Increase geohash precision from 6 to 7 for affected cells; combine with Redis cell caching |
| Need to combine proximity with full-text search | Elasticsearch geo_point + full-text query in a single request |
| Avoid at any scale | Raw lat/lon range scans without a spatial index at >1M rows |
Practical Example: Searching "Coffee Near Union Square, SF"
A user at Union Square, San Francisco (37.7874°N, 122.4080°W) opens Yelp and taps "Coffee, 1 mile." Here is the full resolution trace:
Step 1: Encode user location:
import geohash  # python-geohash package; pygeohash offers an equivalent encode()
user_geohash = geohash.encode(37.7874, -122.4080, precision=6)
# user_geohash = "9q8yyw"
Step 2: Expand to 9 cells:
all_cells = [user_geohash] + geohash.neighbors(user_geohash)
# all_cells (order varies by library):
# ["9q8yyw", "9q8yym", "9q8yyq", "9q8yyr", "9q8yyt",
#  "9q8yyv", "9q8yyx", "9q8yyy", "9q8yyz"]
# These 9 cells cover roughly 6.7 km² around Union Square
Step 3a: Redis fast path (cache hit, ~90% of requests):
GEORADIUS geo:businesses -122.4080 37.7874 1 mi
ASC COUNT 50 WITHCOORD WITHDIST
-- Returns 48 business IDs sorted by distance
-- Application re-ranks top 50 by rating β returns top 10
-- Total latency: ~1.8ms
Step 3b: Database fallback (cache miss, ~10% of requests):
SELECT *
FROM (
    SELECT id, name, lat, lon, rating,
           (6371 * acos(cos(radians(37.7874)) * cos(radians(lat))
            * cos(radians(lon) - radians(-122.4080))
            + sin(radians(37.7874)) * sin(radians(lat)))) AS dist_km
    FROM businesses
    WHERE geohash6 IN ('9q8yyw','9q8yym','9q8yyq','9q8yyr','9q8yyt',
                       '9q8yyv','9q8yyx','9q8yyy','9q8yyz')
      AND category = 'coffee'
      AND is_active = TRUE
) candidates
WHERE dist_km <= 1.6
ORDER BY rating DESC
LIMIT 10;
-- 9 B-tree index seeks → ~280 candidate rows → ~80 pass the distance filter
-- Total latency: ~25ms
Step 4: Cache the result for subsequent requests:
SET search:results:9q8yyw:coffee
  '[{"id":"biz_001","name":"Blue Bottle Coffee","dist_km":0.12,"rating":4.6}, ...]'
  EX 600
-- Next 10 minutes: all identical queries return from Redis in ~2ms
Latency summary:
- Cache hit path: ~2ms Redis + ~5ms serialization/network = ~7ms total
- Cache miss path: ~25ms SQL + ~5ms network + ~2ms Redis write = ~32ms total
Both paths comfortably meet the 100ms design goal.
Redis Geo Commands: How They Power Proximity Search at Yelp Scale
Redis ships a native geospatial data structure built on geohash encoding, stored internally as a sorted set with geo-encoded scores. For the hot path of a proximity service, Redis handles spatial queries without any additional infrastructure. (Redis 6.2 deprecated GEORADIUS in favor of the more general GEOSEARCH; the examples below use the classic GEORADIUS form, which remains available.)
Indexing a business (on create or address update):
GEOADD geo:businesses -122.4080 37.7874 "biz_001"
-- Format: GEOADD <key> <longitude> <latitude> <member>
-- Redis encodes (lon, lat) as a 52-bit geohash and stores it as the sorted set score
-- Multiple businesses can be added in one call:
GEOADD geo:businesses
-122.4080 37.7874 "biz_001"
-122.4123 37.7901 "biz_002"
-122.4055 37.7856 "biz_003"
Searching within 1 mile of a user:
GEORADIUS geo:businesses -122.4080 37.7874 1 mi
ASC COUNT 50 WITHCOORD WITHDIST
-- Returns list of [member, distance, [lon, lat]] tuples:
-- [
-- ["biz_001", "0.12", [-122.40800, 37.78739]],
-- ["biz_002", "0.31", [-122.41230, 37.79010]],
-- ...
-- ]
-- ASC = sorted nearest-first; COUNT 50 = candidate cap; application re-ranks by rating
Looking up the stored position of a specific business:
GEOPOS geo:businesses "biz_001"
-- Returns: [[-122.40800142288208, 37.78739983518099]]
-- Slight precision loss (~0.6 m) from the 52-bit geohash encoding is acceptable
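The precision loss follows directly from the encoding width: 26 bits per axis divide the equator into 2^26 steps of just under 0.6 meters each:

```python
EARTH_EQUATOR_M = 40_075_017      # Earth's equatorial circumference in meters
steps = 2 ** 26                   # 52-bit geohash = 26 bits per axis
cell_m = EARTH_EQUATOR_M / steps
print(round(cell_m, 3))           # 0.597 m per step at the equator
```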
Expanding the search radius when results are sparse (rural areas):
-- First try: 1 mile
GEORADIUS geo:businesses :lon :lat 1 mi ASC COUNT 10
-- If fewer than 10 results, expand to 5 miles:
GEORADIUS geo:businesses :lon :lat 5 mi ASC COUNT 10
When to use Redis Geo vs. the geohash column in PostgreSQL:
| Dimension | Redis GEORADIUS | PostgreSQL geohash column |
| --- | --- | --- |
| Latency | ~1-2ms (fully in-memory) | ~10-30ms (disk-backed) |
| Dataset size limit | ~50M points per Redis node | Unlimited (sharded across nodes) |
| Durability | AOF/RDB as secondary backup | Primary source of truth |
| Complex filters | Limited (distance + count only) | Full SQL (JOIN, category, is_active, rating) |
| Role in this system | Hot-path cache layer | Source of truth + cold-miss search path |
The production pattern: PostgreSQL is the source of truth for all 200M businesses; Redis geo:businesses holds the warm geo index for the hottest 5-10M businesses in major metros. A Redis miss falls back to PostgreSQL. GEOADD is called synchronously on every business create and address change to keep the Redis index current.
For a full deep-dive on Redis data structures, persistence modes, and cluster configuration, see System Design: Caching and Asynchronism.
Lessons Learned from Building Location-Aware Systems
Always query 9 geohash cells, never just the center. A single-cell query produces systematic missing-data bugs that are invisible in unit tests (test data rarely lands exactly on a boundary) but reproduced continuously by real users near cell edges. Make the 9-cell expansion unconditional.
Cache at the cell level, not the business level. Business-level caches require one invalidation per business per change. Cell-level caches require one key deletion per affected cell β dramatically simpler invalidation logic with a much higher hit rate.
Choose geohash precision based on search radius, not intuition. Precision 6 for 1-mile, precision 5 for 5-mile, precision 4 for 25-mile. Too coarse causes hot-cell overload; too fine requires querying too many cells to cover the radius.
Geohash is approximate; Haversine is exact. The 9-cell query intentionally over-fetches candidates. Always apply the Haversine formula in the application layer to filter to the true radius before ranking. Skipping this step returns businesses at the corners of the bounding grid that are outside the actual radius.
The hard part is not the spatial algorithm; it is the invalidation strategy. Encoding a geohash is trivial (one library call). Knowing when to expire a cached cell after a business closes, moves, or changes category requires deliberate event-driven architecture.
Separate the hot write path from the cold index rebuild. Address changes and new business creation must update the spatial index synchronously (otherwise users immediately see missing results). Bulk re-indexing jobs, metadata enrichment, and rating aggregation can be async; decoupling them prevents import jobs from degrading search latency.
Summary & Key Takeaways: Proximity Service in Five Decisions
- Geohash encodes (lat, lon) into a sortable string that fits natively in a SQL B-tree index. A prefix query across 9 cells is the entire search operation; no spatial extension required.
- Always query 9 cells (center + 8 neighbors): single-cell queries produce correctness bugs at cell boundaries. This is non-negotiable.
- Redis GEORADIUS serves the hot path with ~2ms latency for major metro cells; PostgreSQL is the cold fallback and the sole source of truth. Never treat Redis as the only copy of the location data.
- Shard by geohash prefix (LEFT(geohash6, 3)) to distribute geographic read and write load evenly; 32,768 possible shard keys map naturally to global geography without manual region boundaries.
- The Manhattan hot-cell problem (one cell, thousands of businesses) is solved by increasing geohash precision from 6 to 7 in density-detected zones and caching cell-level search results aggressively in Redis.
Practice Quiz
Why does a standard B-tree index on separate lat and lon columns fail to support efficient proximity queries at 200M businesses?
- A) B-trees do not support floating-point columns
- B) A B-tree is a one-dimensional structure and cannot simultaneously index two spatial dimensions; the planner must apply one filter in memory
- C) Databases do not allow composite indexes across geographic coordinate columns
- D) Lat/lon columns require special geospatial encoding before they can be indexed
Correct Answer: B
You search for "coffee within 1 mile" using only the user's center geohash cell (precision 6). A highly-rated coffee shop 40 meters from the user is not returned. What is the most likely explanation?
- A) The geohash precision is too fine, causing the coffee shop's cell to be excluded
- B) The coffee shop's rating is below the minimum threshold
- C) The coffee shop sits in an adjacent cell, and the search did not include the 8 neighboring cells
- D) Redis GEORADIUS does not support sub-mile radius searches
Correct Answer: C
Yelp receives 40,000 search queries per second for "restaurants in downtown Manhattan." Each query hits PostgreSQL in 30ms. Which single change delivers the greatest latency reduction?
- A) Add a second B-tree index on the rating column alongside the geohash index
- B) Cache search results keyed by geohash6 + category in Redis with a 10-minute TTL
- C) Increase the PostgreSQL connection pool size to 2,000 connections
- D) Migrate all business data from PostgreSQL to a document database
Correct Answer: B
A food-delivery startup must index both restaurant locations (mostly static, 5M records) and active delivery couriers (GPS position updated every 3 seconds, 50K active at peak). Which combination of spatial indexes is most appropriate?
- A) Geohash B-tree column for both β consistent tooling simplifies operations
- B) PostGIS R-tree for restaurants; in-memory quadtree for couriers
- C) Geohash B-tree column for restaurants; in-memory quadtree or Redis GEOADD for couriers
- D) Redis GEOADD for both β low latency is the priority for all location lookups
Correct Answer: C
Open-ended challenge: A sold-out stadium concert begins and 60,000 attendees simultaneously search for "food near [stadium address]", all landing on the same geohash cell. Describe two distinct architectural mechanisms that prevent this event-driven spike from overwhelming the database, identify which one you would deploy first, and explain the reasoning behind that prioritization.
Written by Abstract Algorithms (@abstractalgorithms)