CQRS Pattern: Separating Write Models from Query Models at Scale
Design independent command and query paths to scale reads without weakening write correctness.
Abstract Algorithms · AI-assisted content. This post may have been written or enhanced with AI tools. Please verify critical information independently.
TLDR: CQRS works when read and write workloads diverge, but only with explicit freshness budgets and projection reliability. The hard part is not separating models — it is operating lag, replay, and rollback safely.
An e-commerce platform's order service was running 47-table JOINs to serve its summary page — because reads and writes shared the same normalized model. Dashboards, search, and payment-status polls all hit the same write store. Adding read indexes slowed writes; write locks stalled dashboards. Response times hit 4 seconds. CQRS separates the write model (normalized, enforces invariants) from the read model (denormalized, shaped per consumer) so each path can be optimized independently.
If you design services where reads outnumber writes or different consumers need different data shapes, CQRS is the pattern that lets you scale and tune each side without undermining correctness on the other.
Worked example — one committed write event feeds two independent read models:
```
Order placed → PostgreSQL write store (normalized, source of truth)
                        │ outbox event
            ┌───────────┴────────────┐
            ▼                        ▼
  Customer timeline view      Finance export view
  (Redis, keyed by            (Elasticsearch, keyed by
   customer_id)                SKU + date)
```
Neither read store is written in the request path. Projection workers listen to the event stream and update each view independently with their own checkpoints.
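The fan-out above can be sketched in plain Java. This is an illustrative in-memory sketch, not framework code — the event record, view maps, and checkpoint fields are all hypothetical stand-ins for Redis, Elasticsearch, and per-worker checkpoints:

```java
import java.util.*;

// One committed event fans out to independent read models,
// each advancing its own checkpoint at its own pace.
record OrderPlacedEvent(long eventId, String orderId, String customerId, String sku) {}

class ProjectionDemo {
    // Stand-ins for the Redis timeline view and the Elasticsearch finance view.
    static final Map<String, List<String>> timelineByCustomer = new HashMap<>();
    static final Map<String, Integer> ordersBySku = new HashMap<>();
    static long timelineCheckpoint = 0;
    static long financeCheckpoint = 0;

    static void projectTimeline(OrderPlacedEvent e) {
        timelineByCustomer.computeIfAbsent(e.customerId(), k -> new ArrayList<>())
                          .add("PLACED:" + e.orderId());
        timelineCheckpoint = e.eventId();   // each view tracks its own progress
    }

    static void projectFinance(OrderPlacedEvent e) {
        ordersBySku.merge(e.sku(), 1, Integer::sum);
        financeCheckpoint = e.eventId();
    }

    public static void main(String[] args) {
        List<OrderPlacedEvent> stream = List.of(
            new OrderPlacedEvent(1, "o-1", "cust-9", "SKU-A"),
            new OrderPlacedEvent(2, "o-2", "cust-9", "SKU-B"));
        // The timeline worker may run ahead of the finance worker (or vice versa):
        stream.forEach(ProjectionDemo::projectTimeline);
        stream.subList(0, 1).forEach(ProjectionDemo::projectFinance);
        System.out.println(timelineCheckpoint + " " + financeCheckpoint); // 2 1
    }
}
```

The point of the sketch is the two independent checkpoints: neither view blocks the other, and neither is written in the request path.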
📖 Why CQRS Exists: Protect Write Truth While Shaping Fast Reads
Teams usually reach for CQRS after the same failure pattern repeats: one transactional model is asked to serve invariants, dashboards, search, timelines, and external API reads all at once. The result is often lock contention on the write path, expensive joins on the read path, and emergency cache layers that hide freshness problems instead of fixing them.
In architecture reviews, CQRS should answer four operational questions before anyone sketches a read model:
- Which business rules must stay synchronous on the write path?
- How stale can each read surface be before users notice?
- How will projections replay without corrupting a newer view?
- Which signal tells on-call that reads are behind before support tickets arrive?
| Pressure on the system | CQRS response | What operators still need |
| --- | --- | --- |
| Heavy read fan-out hurts write latency | Separate query store optimized for access patterns | Freshness budget per read surface |
| Search and timelines need denormalized views | Projection workers build purpose-fit models | Replay checkpoints and backfill runbook |
| Write invariants must stay strict | Command side remains the only source of truth | Clear rule that queries never invent state |
| Different teams own different read workloads | Independent projections per domain consumer | Ownership for lag, schema, and recovery |
🔍 The Boundary Model: Command Side, Event Stream, Projection Side
At a practical level, CQRS is not just two databases. It is a contract about where truth is written and how derivative views are built.
| Building block | Responsibility | Failure to avoid |
| --- | --- | --- |
| Command API and validators | Enforce invariants and reject invalid state transitions | Allowing read-side shortcuts to mutate source-of-truth data |
| Transactional write store | Commit the durable business truth | Hiding partial writes behind async cleanup |
| Outbox or change stream | Publish committed change events exactly once from the write boundary | Dual-writing query stores in the request path |
| Projection workers | Convert events into read-optimized views with checkpoints | Losing ordering, checkpoints, or idempotency |
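The outbox row in the table above only works if it shares the write's durability boundary. Here is a minimal in-memory sketch of that guarantee — the class and method names are hypothetical, and a real implementation would be two INSERTs inside one database transaction rather than two Java collections:

```java
import java.util.*;

// Sketch of the transactional-outbox control point: the business row and the
// outbox row become visible together, or not at all. Stands in for:
//   BEGIN; INSERT INTO orders ...; INSERT INTO outbox ...; COMMIT;
class OutboxSketch {
    final Map<String, Long> orders = new HashMap<>();  // write store
    final List<String> outbox = new ArrayList<>();     // outbox table, same DB

    void placeOrder(String orderId, long totalCents) {
        // Validation failure aborts before anything is written:
        // no order row, no outbox event, so no phantom projection update.
        if (totalCents <= 0) throw new IllegalArgumentException("invalid total");
        // "Commit point": both writes happen inside the same boundary.
        orders.put(orderId, totalCents);
        outbox.add("OrderPlaced:" + orderId);
    }

    public static void main(String[] args) {
        OutboxSketch store = new OutboxSketch();
        store.placeOrder("o-1", 4200);
        System.out.println(store.orders.size() + " " + store.outbox.size()); // 1 1
    }
}
```

The contrast with dual-writing is that there is no window where the order exists but the event does not (or vice versa) — the failure the third table row warns about.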
📊 Command-Query Split: Write Side vs Read Side
```mermaid
flowchart LR
    subgraph Write Side
        CMD[Command] --> CH[Command Handler]
        CH --> WDB[(Write Store)]
        WDB --> OBX[Outbox / CDC]
    end
    subgraph Read Side
        PW[Projection Worker] --> RDB[(Query Store)]
        RDB --> QH[Query Handler]
        QH --> QRY[Query Response]
    end
    OBX --> PW
```
This flowchart draws the CQRS architectural boundary in its simplest form: commands flow through a handler to a write store, and a projection worker asynchronously propagates committed changes to a separate query store that serves read responses. The critical visual detail is that the outbox or CDC mechanism sits at the boundary between the write and read subgraphs, making the write-to-read propagation explicit and decoupled. The takeaway is that the query store is always a derivative view — it has no write authority and no business logic, only materialized projections of what the write side has already committed.
⚙️ How the Write Path and Read Path Stay Separate
A healthy CQRS flow looks boring on purpose:
- The command handler validates the requested state change against current write-side rules.
- The transaction commits the write and records an outbox event or change record in the same durability boundary.
- A relay publishes that committed event to projection workers.
- Each projection updates its own read model with a checkpoint or last-event watermark.
- Queries read from the specialized store and expose freshness if they are allowed to be slightly behind.
| Control point | What it protects | Common mistake |
| --- | --- | --- |
| Write authority | Business invariants stay in one place | Letting query code bypass validation |
| Outbox or change stream | Write commit and event emission stay atomic | Publishing to the bus before the transaction commits |
| Projection checkpoint | Replay stays monotonic and resumable | Reprocessing old events without ordering guardrails |
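The checkpoint row above is worth making concrete. This is a minimal sketch of a checkpointed, idempotent projection — names are illustrative, not from any framework:

```java
import java.util.*;

// A projection checkpoint makes replay monotonic and idempotent: events at or
// below the checkpoint are skipped, so redelivery or replay of an old batch
// cannot drag the view backwards.
class CheckpointedProjection {
    long checkpoint = 0;                          // last applied event id
    final Map<String, String> view = new HashMap<>();

    void apply(long eventId, String orderId, String status) {
        if (eventId <= checkpoint) return;        // duplicate or replayed event
        view.put(orderId, status);
        checkpoint = eventId;                     // advance only after the write
    }

    public static void main(String[] args) {
        CheckpointedProjection p = new CheckpointedProjection();
        p.apply(1, "o-1", "PLACED");
        p.apply(2, "o-1", "SHIPPED");
        p.apply(1, "o-1", "PLACED");              // redelivered: ignored
        System.out.println(p.view.get("o-1"));    // SHIPPED
    }
}
```

In a real system the checkpoint is persisted alongside the view (often in the same transaction), so a crashed worker resumes exactly where it left off.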
📊 Write Side Sequence: Command to Projection
```mermaid
sequenceDiagram
    participant C as Client
    participant CA as Command API
    participant CH as Command Handler
    participant WS as Write Store
    participant OB as Outbox
    participant PW as Projection Worker
    C->>CA: POST /orders (command)
    CA->>CH: validate and handle
    CH->>WS: commit to write store
    WS->>OB: record outbox event (same transaction)
    CA-->>C: 202 Accepted
    OB->>PW: relay committed event (async)
    PW->>PW: update read model with checkpoint
```
This sequence diagram traces a single command from the client through validation, transactional commit, outbox recording, and projection update. The key design detail is that the write store records the outbox event in the same transaction as the business write, ensuring the projection worker never misses a change even if the relay crashes between commit and publish. The takeaway is that atomicity between the write and the outbox entry is what makes CQRS eventually consistent rather than occasionally lost.
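The relay step between outbox and projection can be sketched as a poller. This is an illustrative in-memory version — the record, collections, and `relayOnce` method are hypothetical stand-ins for a real relay that polls unpublished outbox rows, publishes to a broker, and marks them sent:

```java
import java.util.*;

// Sketch of the outbox relay: poll unpublished rows, publish, then mark
// published. A crash between publish and mark causes redelivery, which is
// exactly why downstream projections must be idempotent (at-least-once).
class OutboxRelay {
    record OutboxRow(long id, String payload) {}

    final List<OutboxRow> outbox = new ArrayList<>();
    final Set<Long> published = new HashSet<>();
    final List<String> bus = new ArrayList<>();    // stand-in for the broker

    void relayOnce() {
        for (OutboxRow row : outbox) {
            if (published.contains(row.id())) continue; // already relayed
            bus.add(row.payload());                     // publish first...
            published.add(row.id());                    // ...mark after publish
        }
    }

    public static void main(String[] args) {
        OutboxRelay relay = new OutboxRelay();
        relay.outbox.add(new OutboxRow(1, "OrderPlaced:o-1"));
        relay.relayOnce();
        relay.relayOnce();                 // safe to re-run: row already marked
        System.out.println(relay.bus);     // [OrderPlaced:o-1]
    }
}
```

Marking *after* publishing trades duplicates for durability: the relay can crash at any point without losing a committed event, and the projection checkpoint absorbs the duplicates.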
📊 Read Side Sequence: Query to Response
```mermaid
sequenceDiagram
    participant C as Client
    participant QA as Query API
    participant RM as Read Model
    participant QS as Query Store
    C->>QA: GET /orders/timeline
    QA->>RM: query read model
    RM->>QS: fetch from query store
    QS-->>RM: projection data
    RM-->>QA: shaped response
    QA-->>C: timeline response (may lag write by seconds)
```
This sequence diagram shows that a query is a pure read path: the client calls the Query API, which fetches directly from a pre-built projection in the query store, with no contact with the write side or the event bus. The response may lag the write side by seconds, and the sequence explicitly notes that staleness, which is the correct behavior for a CQRS read model. The takeaway is that the query path's simplicity and performance come directly from accepting eventual consistency — the read model never needs to acquire locks or resolve write-side conflicts.
🧠 Deep Dive: Lag, Replay, and Projection Safety
The Internals: Write Authority, Checkpoints, and Read Staleness
The write side is the only place where business invariants should be enforced. Projection workers are downstream materializers; they should never be asked to resolve conflicts that belong to the command model.
That matters during failure. A replayed projection should rebuild a view from committed events, not guess what the latest truth is. Operators usually need three durable markers:
- a source-of-truth commit version or event ID,
- a per-projection checkpoint,
- a freshness budget that maps lag into user impact.
A common failure pattern is to let the application dual-write the transactional store and the read store in the same request. It feels simpler until retries, partial commits, or timeouts produce two truths. CQRS only pays off when the query model is clearly derivative.
Performance Analysis: Metrics That Expose CQRS Trouble Early
| Metric | Why it matters |
| --- | --- |
| Command commit p95 | Shows whether read concerns are leaking back into the write path |
| Projection lag by consumer | Identifies which read surface is drifting, not just that something is behind |
| Stale-read budget burn | Converts lag into business impact for on-call prioritization |
| Replay throughput | Predicts recovery time after projection outage or bad deploy |
Average lag is not enough. One projection serving customer timelines can be healthy while another powering finance exports is hours behind. CQRS observability has to stay projection-specific, otherwise the dashboard looks green while one business surface is effectively down.
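The per-projection lag metric is simple to compute once each consumer exposes its checkpoint. A minimal sketch (method and map names are illustrative):

```java
import java.util.*;

// Projection lag must be computed per consumer: a single global average hides
// one projection that is far behind while the rest are healthy.
class ProjectionLag {
    static Map<String, Long> lagByProjection(long latestCommittedEventId,
                                             Map<String, Long> checkpoints) {
        Map<String, Long> lag = new TreeMap<>();
        checkpoints.forEach((name, cp) -> lag.put(name, latestCommittedEventId - cp));
        return lag;
    }

    public static void main(String[] args) {
        Map<String, Long> checkpoints = Map.of(
            "customer-timeline", 9_998L,   // 2 events behind: healthy
            "finance-export", 4_200L);     // 5,800 events behind: surface is down
        System.out.println(lagByProjection(10_000L, checkpoints));
    }
}
```

Alerting on the maximum per-projection lag (or on each projection's own budget) catches the finance export drifting while the average still looks fine.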
🚨 Operator Field Note: Freshness Budgets Fail Before Correctness Does
In incident reviews, the first visible symptom of CQRS trouble is usually outdated reads, not corrupted writes. Support tickets say a status is old long before anyone proves the write path is wrong.
| Runbook clue | What it usually means | First operator move |
| --- | --- | --- |
| Command succeeded but the read screen is stale | Projection worker is behind or stuck on one poison event | Compare latest committed event ID with the projection checkpoint, then quarantine the failing event |
| Replay backlog grows after deployment | New projection code is slower or incompatible with old events | Freeze expansion and benchmark replay throughput before retrying the rollout |
| One read model is hours behind while others are healthy | Lag is consumer-specific, not broker-wide | Scale or repair the affected projection rather than treating the whole bus as degraded |
| Users only in one region see stale data | Read-store replication or consumer placement is uneven | Check regional checkpoint skew before invalidating global caches |
Operators usually find that the most valuable architecture review artifact is a freshness table per read surface: acceptable lag, pager threshold, and replay procedure.
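That freshness table can live as reviewable code next to the projections it governs. A sketch with illustrative surfaces and thresholds (the values are assumptions, not recommendations):

```java
import java.time.Duration;
import java.util.List;

// Per-surface freshness budget: acceptable lag and the pager threshold are
// declared together, so an architecture review can diff them like code.
record FreshnessBudget(String readSurface, Duration acceptableLag, Duration pageAt) {
    boolean shouldPage(Duration observedLag) {
        return observedLag.compareTo(pageAt) > 0;
    }
}

class FreshnessTable {
    static final List<FreshnessBudget> BUDGETS = List.of(
        new FreshnessBudget("customer-timeline", Duration.ofSeconds(5), Duration.ofSeconds(30)),
        new FreshnessBudget("support-search",    Duration.ofSeconds(30), Duration.ofMinutes(5)),
        new FreshnessBudget("finance-export",    Duration.ofHours(1),    Duration.ofHours(4)));

    public static void main(String[] args) {
        Duration observed = Duration.ofMinutes(2);   // same lag, different verdicts
        BUDGETS.forEach(b ->
            System.out.println(b.readSurface() + " page=" + b.shouldPage(observed)));
    }
}
```

The same two-minute lag pages for the customer timeline but is comfortably within budget for the finance export — which is exactly the distinction a single broker-level metric cannot make.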
📊 CQRS Flow: Commit Once, Project Many
```mermaid
flowchart TD
    A[Command API] --> B[Validate business rule]
    B --> C[Transactional write store]
    C --> D[Outbox or change stream]
    D --> E[Projection worker]
    E --> F[Query store]
    F --> G[API or UI read path]
    E --> H[Projection checkpoint]
    C --> I[Committed version token]
    I --> G
```
This flowchart shows the full CQRS data pipeline from a single command API entry point through transactional write, outbox relay, and projection worker to multiple specialized query stores. Two additional outputs — a committed version token returned to the read path and a projection checkpoint — make the consistency and replay properties of the pipeline explicit. The key takeaway is that a single committed write fans out to as many read models as the system needs, each optimized independently, all derived from the same immutable source of truth in the write store.
🌍 Real-World Scenario: Order Service With Timeline and Search Views
Consider an order platform with three very different read workloads:
- customer-facing order timelines,
- support-agent search,
- finance reconciliation exports.
The write path needs strict invariants around payment capture and fulfillment state. The read side needs different storage and indexing strategies.
| Constraint | Design decision | Trade-off |
| --- | --- | --- |
| Payment and fulfillment state must be correct | PostgreSQL remains the write authority | Write model stays normalized and not optimized for search |
| Support needs flexible search by email, SKU, and carrier | Elasticsearch projection for support queries | Search view may lag behind committed state |
| Customer app needs fast timeline lookups | Redis or document-style read model keyed by customer ID | Another projection to operate and replay |
| Finance needs auditable exports | Batch projection with checkpointed replay | Higher recovery cost if event lineage is weak |
⚖️ Trade-offs and Failure Modes
| Failure mode | Symptom | Root cause | First mitigation |
| --- | --- | --- | --- |
| Dual-write temptation | Writes succeed but one read store disagrees silently | Query store updated directly from request code | Move projection updates behind an outbox or change stream |
| No freshness budget | Teams argue whether stale data is an incident | Lag has no product-defined threshold | Define per-surface freshness SLOs |
| Replay poisoning | Projection cannot recover after bad event or schema change | Events are not versioned or handlers are not idempotent | Add versioned event handlers and quarantined replays |
| Read-store sprawl | Every team adds its own view with no ownership | CQRS used as permission to duplicate data endlessly | Require owner, SLO, and replay plan per projection |
CQRS is worth the cost when read workloads are truly different. If the only goal is maybe faster later, teams usually end up with more systems and the same old ambiguity.
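The replay-poisoning mitigation from the table — quarantine the bad event, keep the projection moving — can be sketched in a few lines. Names are illustrative; real systems route quarantined events to a dead-letter topic or table:

```java
import java.util.*;

// A poison event must not wedge the whole projection: failed events go to a
// dead-letter list for manual replay, and the checkpoint still advances so
// one bad payload cannot freeze every later update.
class QuarantiningProjection {
    long checkpoint = 0;
    final Map<String, String> view = new HashMap<>();
    final List<Long> deadLetter = new ArrayList<>();

    void apply(long eventId, String payload) {
        if (eventId <= checkpoint) return;           // idempotency guard
        try {
            String[] parts = payload.split(":");
            view.put(parts[0], parts[1]);            // throws on malformed payload
        } catch (RuntimeException poison) {
            deadLetter.add(eventId);                 // quarantine, don't block
        } finally {
            checkpoint = eventId;                    // keep the stream moving
        }
    }

    public static void main(String[] args) {
        QuarantiningProjection p = new QuarantiningProjection();
        p.apply(1, "o-1:PLACED");
        p.apply(2, "garbage");                       // poison event
        p.apply(3, "o-1:SHIPPED");                   // still processed
        System.out.println(p.view.get("o-1") + " " + p.deadLetter); // SHIPPED [2]
    }
}
```

Whether skipping a poison event is acceptable is a product decision per projection — a finance export may prefer to halt and page — which is another reason each projection needs an owner and a runbook.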
🧭 Decision Guide: When CQRS Earns Its Complexity
| Situation | Recommendation |
| --- | --- |
| Reads are simple CRUD and freshness must be immediate | Stay with one transactional model |
| Writes require strict invariants but reads diverge heavily | Adopt CQRS for the bounded domain causing pain |
| Search, analytics, and timelines need different shapes | Add projections with explicit lag budgets |
| Team cannot yet operate replay and projection recovery | Delay CQRS until operational tooling exists |
Start with one domain that already hurts, such as orders or billing, and prove that lag, replay, and recovery are manageable before expanding the pattern elsewhere.
🧪 Practical Example: Order Service With Axon Framework
The order scenario from the previous section maps directly to Axon's programming model. Commands flow in through CommandGateway, the aggregate persists events to the event store, and the projection worker builds the timeline read model asynchronously.
```mermaid
flowchart LR
    A[REST POST /orders] --> B[CommandGateway]
    B --> C[OrderAggregate]
    C --> D[EventStore]
    D --> E[EventBus]
    E --> F[@EventHandler Projection Worker]
    F --> G[QueryStore]
    G --> H[QueryGateway]
    H --> I[REST GET /timeline]
```
Maven dependency:

```xml
<dependency>
    <groupId>org.axonframework</groupId>
    <artifactId>axon-spring-boot-starter</artifactId>
    <version>4.9.3</version>
</dependency>
```
Commands and queries as plain records
Commands carry intent; queries carry selection criteria. Neither holds logic.
```java
public record PlaceOrderCommand(String orderId, String customerId, long totalCents) {}

public record GetOrderTimelineQuery(String customerId) {}
```
Write side: aggregate enforces invariants, never touches the read store
AggregateLifecycle.apply() stages the event; Axon persists it to the event store when the command's unit of work commits. The projection worker receives it asynchronously — the aggregate never writes to a query table directly.
```java
@Aggregate
public class OrderAggregate {

    @AggregateIdentifier
    private String orderId;

    protected OrderAggregate() {
        // Required by Axon: event-sourced aggregates are rebuilt reflectively.
    }

    @CommandHandler
    public OrderAggregate(PlaceOrderCommand cmd) {
        AggregateLifecycle.apply(
            new OrderPlacedEvent(cmd.orderId(), cmd.customerId(), cmd.totalCents())
        );
    }

    @EventSourcingHandler
    public void on(OrderPlacedEvent event) {
        this.orderId = event.orderId();
    }
}
```
Read side: projection worker materializes the timeline view
The @EventHandler builds the read model from every committed OrderPlacedEvent. The @QueryHandler answers timeline requests dispatched by QueryGateway — the REST layer never reaches the write store.
```java
@Component
public class OrderTimelineProjection {

    private final OrderTimelineRepository repo;

    public OrderTimelineProjection(OrderTimelineRepository repo) {
        this.repo = repo;
    }

    @EventHandler
    public void on(OrderPlacedEvent event, @Timestamp Instant timestamp) {
        repo.save(new OrderTimelineEntry(
            event.orderId(), event.customerId(), "PLACED", timestamp
        ));
    }

    @QueryHandler
    public List<OrderTimelineEntry> handle(GetOrderTimelineQuery query) {
        return repo.findByCustomerIdOrderByTimestampDesc(query.customerId());
    }
}
```
Query REST controller
```java
@RestController
@RequestMapping("/orders")
public class OrderQueryController {

    private final QueryGateway queryGateway;

    public OrderQueryController(QueryGateway queryGateway) {
        this.queryGateway = queryGateway;
    }

    @GetMapping("/timeline/{customerId}")
    public CompletableFuture<List<OrderTimelineEntry>> getTimeline(
            @PathVariable String customerId) {
        return queryGateway.query(
            new GetOrderTimelineQuery(customerId),
            ResponseTypes.multipleInstancesOf(OrderTimelineEntry.class)
        );
    }
}
```
🛠️ Axon Framework and EventStoreDB: CQRS on the JVM
Axon Framework is a Java framework built specifically for CQRS, event sourcing, and DDD aggregates on Spring Boot. It provides CommandGateway, QueryGateway, @CommandHandler, @QueryHandler, and @EventHandler — the complete wiring for a CQRS command-and-query split. EventStoreDB is a purpose-built append-only event database with server-side projections, used as the event log backend for event-sourced Axon aggregates.
Axon Framework solves the CQRS problem by enforcing the command/query boundary in code rather than convention: commands flow through CommandGateway to aggregates that enforce invariants; queries flow through QueryGateway to projection handlers that serve read models. The framework owns aggregate lifecycle, event serialization, checkpointing, and replay — teams write domain logic, not plumbing.
The code examples in the 🧪 Practical Example section above show the full flow. Below is the minimal wiring that makes the separation enforceable:
```java
// ---- Command side: send a command and get a confirmation ----
@RestController
@RequestMapping("/orders")
public class OrderCommandController {

    private final CommandGateway commandGateway;

    public OrderCommandController(CommandGateway commandGateway) {
        this.commandGateway = commandGateway;
    }

    @PostMapping
    public CompletableFuture<String> placeOrder(@RequestBody PlaceOrderRequest req) {
        // CommandGateway routes to OrderAggregate's @CommandHandler
        // and completes with the aggregate identifier on success.
        return commandGateway.send(
            new PlaceOrderCommand(UUID.randomUUID().toString(),
                                  req.customerId(), req.totalCents())
        );
    }
}
```
```java
// ---- Query side: read the projection built from committed events ----
@RestController
@RequestMapping("/orders")
public class OrderQueryController {

    private final QueryGateway queryGateway;

    public OrderQueryController(QueryGateway queryGateway) {
        this.queryGateway = queryGateway;
    }

    @GetMapping("/timeline/{customerId}")
    public CompletableFuture<List<OrderTimelineEntry>> getTimeline(
            @PathVariable String customerId) {
        // QueryGateway routes to OrderTimelineProjection's @QueryHandler.
        // Never touches the write store.
        return queryGateway.query(
            new GetOrderTimelineQuery(customerId),
            ResponseTypes.multipleInstancesOf(OrderTimelineEntry.class)
        );
    }
}
```
The CommandGateway and QueryGateway beans are auto-configured by axon-spring-boot-starter — no manual wiring needed. Pointing Axon at EventStoreDB replaces a relational (JPA-based) event store with a purpose-built append-only log that supports server-side subscriptions and projection checkpointing.
For a full deep-dive on Axon Framework and EventStoreDB, a dedicated follow-up post is planned.
📚 Lessons Learned
- CQRS is a control boundary between write truth and read convenience, not a blanket microservices rule.
- Freshness budgets and replay tooling matter as much as the schema design.
- Projection-specific lag is a better signal than one global event-bus metric.
- Dual-writing the read model from the request path removes most of CQRS's safety benefits.
- A projection without an owner, checkpoint, and recovery plan is operational debt.
📌 TLDR: Summary & Key Takeaways
- Keep the write model authoritative and let read models stay derivative.
- Define freshness, replay, and rollback rules before scaling projections.
- Observe lag per read surface, not only at the broker or topic level.
- Use CQRS where read shapes materially diverge from write correctness needs.
- Treat projection recovery as a first-class runbook, not an afterthought.
🔗 Related Posts
- Microservices Data Patterns Saga Outbox CQRS And Event Sourcing
- Integration Architecture Patterns Orchestration Choreography And Schema Contracts
- System Design Message Queues And Event Driven Architecture
- System Design Data Modeling And Schema Evolution
- Understanding Consistency Patterns An In Depth Analysis