
CQRS Pattern: Separating Write Models from Query Models at Scale

Design independent command and query paths to scale reads without weakening write correctness.

Abstract Algorithms · 14 min read

AI-assisted content.

TLDR: CQRS works when read and write workloads diverge, but only with explicit freshness budgets and projection reliability. The hard part is not separating models — it is operating lag, replay, and rollback safely.

An e-commerce platform's order service was running 47-table JOINs to serve its summary page — because reads and writes shared the same normalized model. Dashboards, search, and payment-status polls all hit the same write store. Adding read indexes slowed writes; write locks stalled dashboards. Response times hit 4 seconds. CQRS separates the write model (normalized, enforces invariants) from the read model (denormalized, shaped per consumer) so each path can be optimized independently.

If you design services where reads outnumber writes or different consumers need different data shapes, CQRS is the pattern that lets you scale and tune each side without undermining correctness on the other.

Worked example — one committed write event feeds two independent read models:

Order placed → PostgreSQL write store (normalized, source of truth)
           ↓ outbox event
    ┌────────────────────────────────────┐
    ▼                                    ▼
Customer timeline view        Finance export view
(Redis, keyed by customer_id) (Elasticsearch, keyed by SKU + date)

Neither read store is written in the request path. Projection workers listen to the event stream and update each view independently with their own checkpoints.
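
The checkpoint mechanic can be sketched in a few lines. This is a minimal in-memory stand-in (all names hypothetical, and the checkpoint would live in durable storage in a real system): a projection worker applies ordered events to a timeline view, records the last applied sequence, and skips anything at or below it, so an overlapping replay after a crash cannot double-write the view.

```java
import java.util.*;

// Sketch: a projection worker that materializes a customer-timeline view
// from an ordered event stream and tracks its own checkpoint, so replay
// after a crash resumes instead of restarting or double-applying.
public class TimelineProjection {
    public record OrderPlaced(long sequence, String customerId, String orderId) {}

    private final Map<String, List<String>> timelineByCustomer = new HashMap<>();
    private long checkpoint = -1; // last applied event sequence

    // Idempotent apply: events at or before the checkpoint are skipped.
    public void apply(OrderPlaced event) {
        if (event.sequence() <= checkpoint) return;
        timelineByCustomer
            .computeIfAbsent(event.customerId(), k -> new ArrayList<>())
            .add(event.orderId());
        checkpoint = event.sequence(); // would be persisted in a real system
    }

    public long checkpoint() { return checkpoint; }

    public List<String> timeline(String customerId) {
        return timelineByCustomer.getOrDefault(customerId, List.of());
    }
}
```

Because each projection owns its own checkpoint, the Redis and Elasticsearch views above can lag, fail, and recover independently.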

📖 Why CQRS Exists: Protect Write Truth While Shaping Fast Reads

Teams usually reach for CQRS after the same failure pattern repeats: one transactional model is asked to serve invariants, dashboards, search, timelines, and external API reads all at once. The result is often lock contention on the write path, expensive joins on the read path, and emergency cache layers that hide freshness problems instead of fixing them.

In architecture reviews, CQRS should answer four operational questions before anyone sketches a read model:

  • Which business rules must stay synchronous on the write path?
  • How stale can each read surface be before users notice?
  • How will projections replay without corrupting a newer view?
  • Which signal tells on-call that reads are behind before support tickets arrive?

| Pressure on the system | CQRS response | What operators still need |
| --- | --- | --- |
| Heavy read fan-out hurts write latency | Separate query store optimized for access patterns | Freshness budget per read surface |
| Search and timelines need denormalized views | Projection workers build purpose-fit models | Replay checkpoints and backfill runbook |
| Write invariants must stay strict | Command side remains the only source of truth | Clear rule that queries never invent state |
| Different teams own different read workloads | Independent projections per domain consumer | Ownership for lag, schema, and recovery |

🔍 The Boundary Model: Command Side, Event Stream, Projection Side

At a practical level, CQRS is not just two databases. It is a contract about where truth is written and how derivative views are built.

| Building block | Responsibility | Failure to avoid |
| --- | --- | --- |
| Command API and validators | Enforce invariants and reject invalid state transitions | Allowing read-side shortcuts to mutate source-of-truth data |
| Transactional write store | Commit the durable business truth | Hiding partial writes behind async cleanup |
| Outbox or change stream | Publish committed change events from the write boundary at least once | Dual-writing query stores in the request path |
| Projection workers | Convert events into read-optimized views with checkpoints | Losing ordering, checkpoints, or idempotency |

📊 Command-Query Split: Write Side vs Read Side

flowchart LR
  subgraph Write Side
    CMD[Command] --> CH[Command Handler]
    CH --> WDB[(Write Store)]
    WDB --> OBX[Outbox / CDC]
  end
  subgraph Read Side
    OBX --> PW[Projection Worker]
    PW --> RDB[(Query Store)]
    RDB --> QH[Query Handler]
    QH --> QRY[Query Response]
  end

This flowchart draws the CQRS architectural boundary in its simplest form: commands flow through a handler to a write store, and a projection worker asynchronously propagates committed changes to a separate query store that serves read responses. The critical visual detail is that the outbox or CDC mechanism sits at the boundary between the write and read subgraphs, making the write-to-read propagation explicit and decoupled. The takeaway is that the query store is always a derivative view — it has no write authority and no business logic, only materialized projections of what the write side has already committed.

⚙️ How the Write Path and Read Path Stay Separate

A healthy CQRS flow looks boring on purpose:

  1. The command handler validates the requested state change against current write-side rules.
  2. The transaction commits the write and records an outbox event or change record in the same durability boundary.
  3. A relay publishes that committed event to projection workers.
  4. Each projection updates its own read model with a checkpoint or last-event watermark.
  5. Queries read from the specialized store and expose freshness if they are allowed to be slightly behind.

| Control point | What it protects | Common mistake |
| --- | --- | --- |
| Write authority | Business invariants stay in one place | Letting query code bypass validation |
| Outbox or change stream | Write commit and event emission stay atomic | Publishing to the bus before the transaction commits |
| Projection checkpoint | Replay stays monotonic and resumable | Reprocessing old events without ordering guardrails |
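
Step 2, the transactional outbox, is the load-bearing part of this flow. The sketch below uses an in-memory stand-in for a SQL database (all names hypothetical) to show the invariant: the business row and the outbox row are written inside one atomicity boundary, so a committed write can never exist without its event, and a rejected write leaves no stray event behind.

```java
import java.util.*;

// Sketch of the transactional-outbox step: one "transaction" writes the
// business row and the outbox row together, so they commit or fail as a unit.
public class WriteStore {
    public record OutboxEvent(long id, String type, String payload) {}

    private final Map<String, Long> orders = new HashMap<>();   // business table
    private final List<OutboxEvent> outbox = new ArrayList<>(); // outbox table
    private long nextEventId = 1;

    // Validation runs first; if it throws, neither table is touched.
    public synchronized long placeOrder(String orderId, long totalCents) {
        if (orders.containsKey(orderId))
            throw new IllegalStateException("duplicate order: " + orderId);
        orders.put(orderId, totalCents);
        long eventId = nextEventId++;
        outbox.add(new OutboxEvent(eventId, "OrderPlaced", orderId));
        return eventId;
    }

    public List<OutboxEvent> pendingEvents() { return List.copyOf(outbox); }
}
```

In a real system both inserts would share one database transaction; the relay in step 3 reads the outbox table afterward, never the request path.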

📊 Write Side Sequence: Command to Projection

sequenceDiagram
  participant C as Client
  participant CA as Command API
  participant CH as Command Handler
  participant WS as Write Store
  participant OB as Outbox
  participant PW as Projection Worker
  C->>CA: POST /orders (command)
  CA->>CH: validate and handle
  CH->>WS: commit to write store
  WS->>OB: record outbox event (atomic)
  CA-->>C: 202 Accepted (after commit, before projection)
  OB->>PW: relay committed event
  PW->>PW: update read model with checkpoint

This sequence diagram traces a single command from the client through validation, transactional commit, outbox recording, and projection update. The key design detail is that the write store records the outbox event in the same transaction as the business write, ensuring the projection worker never misses a change even if the relay crashes between commit and publish. The takeaway is that atomicity between the write and the outbox entry is what makes CQRS eventually consistent rather than occasionally lost.
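
The relay's crash behavior is worth making concrete. This sketch (hypothetical names, in-memory outbox table) marks a row as sent only after the publish succeeds; a crash between the two re-sends the event on the next pass. That is why outbox delivery is at-least-once, and why projection handlers must be idempotent.

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch of the outbox relay: poll unsent rows, publish, then mark sent.
// A crash after publish but before mark causes a re-send, never a loss.
public class OutboxRelay {
    public record Row(long id, String payload, boolean sent) {}

    private final List<Row> table = new ArrayList<>();

    public void insert(long id, String payload) {
        table.add(new Row(id, payload, false));
    }

    // Returns how many rows were published on this pass.
    public int drain(Consumer<String> publish) {
        int published = 0;
        for (int i = 0; i < table.size(); i++) {
            Row r = table.get(i);
            if (r.sent()) continue;
            publish.accept(r.payload());                      // may throw; row stays unsent
            table.set(i, new Row(r.id(), r.payload(), true)); // mark only after publish
            published++;
        }
        return published;
    }
}
```

A second drain pass publishes nothing, which is the resumability property the checkpoint table in a real relay provides.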

📊 Read Side Sequence: Query to Response

sequenceDiagram
  participant C as Client
  participant QA as Query API
  participant RM as Read Model
  participant QS as Query Store
  C->>QA: GET /orders/timeline
  QA->>RM: query read model
  RM->>QS: fetch from query store
  QS-->>RM: projection data
  RM-->>QA: shaped response
  QA-->>C: timeline response (may lag write by seconds)

This sequence diagram shows that a query is a pure read path: the client calls the Query API, which fetches directly from a pre-built projection in the query store, with no contact with the write side or the event bus. The response may lag the write side by seconds, and the sequence explicitly notes that staleness, which is the correct behavior for a CQRS read model. The takeaway is that the query path's simplicity and performance come directly from accepting eventual consistency — the read model never needs to acquire locks or resolve write-side conflicts.

🧠 Deep Dive: Lag, Replay, and Projection Safety

The Internals: Write Authority, Checkpoints, and Read Staleness

The write side is the only place where business invariants should be enforced. Projection workers are downstream materializers; they should never be asked to resolve conflicts that belong to the command model.

That matters during failure. A replayed projection should rebuild a view from committed events, not guess what the latest truth is. Operators usually need three durable markers:

  • a source-of-truth commit version or event ID,
  • a per-projection checkpoint,
  • a freshness budget that maps lag into user impact.
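
The three markers combine into a simple check. This sketch (hypothetical thresholds, lag measured in events for brevity) turns the distance between the latest committed event ID and a projection's checkpoint into an on-call signal per read surface.

```java
// Sketch of a freshness-budget check: lag is the gap between the latest
// committed event and a projection's checkpoint; the budget maps that lag
// into a per-surface operational status.
public class FreshnessBudget {
    public enum Status { FRESH, DEGRADED, PAGE }

    private final long warnAfter;  // acceptable lag before users notice
    private final long pageAfter;  // pager threshold

    public FreshnessBudget(long warnAfter, long pageAfter) {
        this.warnAfter = warnAfter;
        this.pageAfter = pageAfter;
    }

    public Status evaluate(long latestCommittedId, long projectionCheckpoint) {
        long lag = latestCommittedId - projectionCheckpoint;
        if (lag >= pageAfter) return Status.PAGE;
        if (lag >= warnAfter) return Status.DEGRADED;
        return Status.FRESH;
    }
}
```

Each read surface gets its own budget instance, which is what keeps the customer timeline and the finance export from sharing one misleading global metric.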

A common failure pattern is to let the application dual-write the transactional store and the read store in the same request. It feels simpler until retries, partial commits, or timeouts produce two truths. CQRS only pays off when the query model is clearly derivative.

Performance Analysis: Metrics That Expose CQRS Trouble Early

| Metric | Why it matters |
| --- | --- |
| Command commit p95 | Shows whether read concerns are leaking back into the write path |
| Projection lag by consumer | Identifies which read surface is drifting, not just that something is behind |
| Stale-read budget burn | Converts lag into business impact for on-call prioritization |
| Replay throughput | Predicts recovery time after projection outage or bad deploy |

Average lag is not enough. One projection serving customer timelines can be healthy while another powering finance exports is hours behind. CQRS observability has to stay projection-specific; otherwise the dashboard looks green while one business surface is effectively down.

🚨 Operator Field Note: Freshness Budgets Fail Before Correctness Does

In incident reviews, the first visible symptom of CQRS trouble is usually outdated reads, not corrupted writes. Support tickets say a status is old long before anyone proves the write path is wrong.

| Runbook clue | What it usually means | First operator move |
| --- | --- | --- |
| Command succeeded but the read screen is stale | Projection worker is behind or stuck on one poison event | Compare latest committed event ID with the projection checkpoint, then quarantine the failing event |
| Replay backlog grows after deployment | New projection code is slower or incompatible with old events | Freeze expansion and benchmark replay throughput before retrying the rollout |
| One read model is hours behind while others are healthy | Lag is consumer-specific, not broker-wide | Scale or repair the affected projection rather than treating the whole bus as degraded |
| Users only in one region see stale data | Read-store replication or consumer placement is uneven | Check regional checkpoint skew before invalidating global caches |

Operators usually find that the most valuable architecture review artifact is a freshness table per read surface: acceptable lag, pager threshold, and replay procedure.

📊 CQRS Flow: Commit Once, Project Many

flowchart TD
  A[Command API] --> B[Validate business rule]
  B --> C[Transactional write store]
  C --> D[Outbox or change stream]
  D --> E[Projection worker]
  E --> F[Query store]
  F --> G[API or UI read path]
  E --> H[Projection checkpoint]
  C --> I[Committed version token]
  I --> G

This flowchart shows the full CQRS data pipeline from a single command API entry point through transactional write, outbox relay, and projection worker to a specialized query store. Two additional outputs — a committed version token returned to the read path and a projection checkpoint — make the consistency and replay properties of the pipeline explicit. The key takeaway is that a single committed write can fan out to as many read models as the system needs, each optimized independently, all derived from the same immutable source of truth in the write store.
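
The version-token edge deserves a concrete shape. In this sketch (hypothetical types, not from the original), the write path returns the committed version and the query path reports the checkpoint it served from, so a client can detect "my write is not visible yet" and retry or show a pending state instead of trusting a stale screen.

```java
// Sketch of a staleness-aware read: the query response carries the projection
// checkpoint it was served from, so callers can compare it against the
// committed version token returned by their own write.
public class VersionedRead {
    public record ReadResult<T>(T data, long asOfVersion) {}

    // True when the projection has caught up to the caller's committed write.
    public static <T> boolean reflectsWrite(ReadResult<T> result, long committedVersion) {
        return result.asOfVersion() >= committedVersion;
    }
}
```

This is the cheapest form of read-your-writes in a CQRS system: no synchronous projection, just an honest version comparison at the edge.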

🌍 Real-World Scenario: Order Service With Timeline and Search Views

Consider an order platform with three very different read workloads:

  • customer-facing order timelines,
  • support-agent search,
  • finance reconciliation exports.

The write path needs strict invariants around payment capture and fulfillment state. The read side needs different storage and indexing strategies.

| Constraint | Design decision | Trade-off |
| --- | --- | --- |
| Payment and fulfillment state must be correct | PostgreSQL remains the write authority | Write model stays normalized and not optimized for search |
| Support needs flexible search by email, SKU, and carrier | Elasticsearch projection for support queries | Search view may lag behind committed state |
| Customer app needs fast timeline lookups | Redis or document-style read model keyed by order ID | Another projection to operate and replay |
| Finance needs auditable exports | Batch projection with checkpointed replay | Higher recovery cost if event lineage is weak |

⚖️ Trade-offs & Failure Modes

| Failure mode | Symptom | Root cause | First mitigation |
| --- | --- | --- | --- |
| Dual-write temptation | Writes succeed but one read store disagrees silently | Query store updated directly from request code | Move projection updates behind an outbox or change stream |
| No freshness budget | Teams argue whether stale data is an incident | Lag has no product-defined threshold | Define per-surface freshness SLOs |
| Replay poisoning | Projection cannot recover after bad event or schema change | Events are not versioned or handlers are not idempotent | Add versioned event handlers and quarantined replays |
| Read-store sprawl | Every team adds its own view with no ownership | CQRS used as permission to duplicate data endlessly | Require owner, SLO, and replay plan per projection |
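
The replay-poisoning mitigation can be sketched directly (hypothetical event shape, not from the original): handlers dispatch on an explicit schema version, and an event no handler understands is parked in a quarantine list so one bad record stops only itself, not the whole replay.

```java
import java.util.*;

// Sketch of replay-safe projection handling: versioned dispatch plus a
// quarantine for unknown events, so replay keeps moving past poison records.
public class VersionedProjector {
    public record Event(long id, int schemaVersion, String payload) {}

    private final List<Event> quarantine = new ArrayList<>();
    private final List<String> applied = new ArrayList<>();

    public void handle(Event e) {
        switch (e.schemaVersion()) {
            case 1 -> applied.add("v1:" + e.payload());
            case 2 -> applied.add("v2:" + e.payload()); // upcast or newer shape
            default -> quarantine.add(e);               // park it; keep replaying
        }
    }

    public List<String> applied() { return applied; }

    public List<Event> quarantined() { return List.copyOf(quarantine); }
}
```

Quarantined events still need a human decision, but the on-call choice becomes "inspect one parked record" rather than "unstick a dead projection."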

CQRS is worth the cost when read workloads are truly different. If the only goal is "maybe it will be faster later," teams usually end up with more systems and the same old ambiguity.

🧭 Decision Guide: When CQRS Earns Its Complexity

| Situation | Recommendation |
| --- | --- |
| Reads are simple CRUD and freshness must be immediate | Stay with one transactional model |
| Writes require strict invariants but reads diverge heavily | Adopt CQRS for the bounded domain causing pain |
| Search, analytics, and timelines need different shapes | Add projections with explicit lag budgets |
| Team cannot yet operate replay and projection recovery | Delay CQRS until operational tooling exists |

Start with one domain that already hurts, such as orders or billing, and prove that lag, replay, and recovery are manageable before expanding the pattern elsewhere.

🧪 Practical Example: Order Service With Axon Framework

The order scenario from the previous section maps directly to Axon's programming model. Commands flow in through CommandGateway, the aggregate persists events to the event store, and the projection worker builds the timeline read model asynchronously.

flowchart LR
  A[REST POST /orders] --> B[CommandGateway]
  B --> C[OrderAggregate]
  C --> D[EventStore]
  D --> E[EventBus]
  E --> F[@EventHandler Projection Worker]
  F --> G[QueryStore]
  G --> H[QueryGateway]
  H --> I[REST GET /timeline]

Maven dependency

<dependency>
  <groupId>org.axonframework</groupId>
  <artifactId>axon-spring-boot-starter</artifactId>
  <version>4.9.3</version>
</dependency>

Commands and queries as plain records

Commands carry intent; queries carry selection criteria; events record facts the write side has already committed. None of them holds logic.

public record PlaceOrderCommand(String orderId, String customerId, long totalCents) {}
public record GetOrderTimelineQuery(String customerId) {}
public record OrderPlacedEvent(String orderId, String customerId, long totalCents) {}

Write side: aggregate enforces invariants, never touches the read store

AggregateLifecycle.apply() commits the event to the Axon event store. The projection worker receives it asynchronously — the aggregate never writes to a query table directly.

@Aggregate
public class OrderAggregate {
    @AggregateIdentifier
    private String orderId;

    protected OrderAggregate() {
        // Required by Axon: the framework rebuilds the aggregate from its events
    }

    @CommandHandler
    public OrderAggregate(PlaceOrderCommand cmd) {
        AggregateLifecycle.apply(
            new OrderPlacedEvent(cmd.orderId(), cmd.customerId(), cmd.totalCents())
        );
    }

    @EventSourcingHandler
    public void on(OrderPlacedEvent event) {
        this.orderId = event.orderId();
    }
}

Read side: projection worker materializes the timeline view

The @EventHandler builds the read model from every committed OrderPlacedEvent. The @QueryHandler answers timeline requests dispatched by QueryGateway — the REST layer never reaches the write store.

@Component
public class OrderTimelineProjection {
    private final OrderTimelineRepository repo;

    public OrderTimelineProjection(OrderTimelineRepository repo) {
        this.repo = repo;
    }

    @EventHandler
    public void on(OrderPlacedEvent event, @Timestamp Instant timestamp) {
        repo.save(new OrderTimelineEntry(
            event.orderId(), event.customerId(), "PLACED", timestamp
        ));
    }

    @QueryHandler
    public List<OrderTimelineEntry> handle(GetOrderTimelineQuery query) {
        return repo.findByCustomerIdOrderByTimestampDesc(query.customerId());
    }
}

Query REST controller

@RestController
@RequestMapping("/orders")
public class OrderQueryController {
    private final QueryGateway queryGateway;

    public OrderQueryController(QueryGateway queryGateway) {
        this.queryGateway = queryGateway;
    }

    @GetMapping("/timeline/{customerId}")
    public CompletableFuture<List<OrderTimelineEntry>> getTimeline(
            @PathVariable String customerId) {
        return queryGateway.query(
            new GetOrderTimelineQuery(customerId),
            ResponseTypes.multipleInstancesOf(OrderTimelineEntry.class)
        );
    }
}

🛠️ Axon Framework and EventStoreDB: CQRS on the JVM

Axon Framework is a Java framework built specifically for CQRS, event sourcing, and DDD aggregates on Spring Boot. It provides CommandGateway, QueryGateway, @CommandHandler, @QueryHandler, and @EventHandler — the complete wiring for a CQRS command-and-query split. EventStoreDB is a purpose-built append-only event database with server-side projections, used as the event log backend for event-sourced Axon aggregates.

Axon Framework solves the CQRS problem by enforcing the command/query boundary in code rather than convention: commands flow through CommandGateway to aggregates that enforce invariants; queries flow through QueryGateway to projection handlers that serve read models. The framework owns aggregate lifecycle, event serialization, checkpointing, and replay — teams write domain logic, not plumbing.

The code examples in the 🧪 Practical Example section above show the full flow. Below is the minimal wiring that makes the separation enforceable:

// ---- Command side: send a command and get a confirmation ----
@RestController
@RequestMapping("/orders")
public class OrderCommandController {
    private final CommandGateway commandGateway;

    public OrderCommandController(CommandGateway commandGateway) {
        this.commandGateway = commandGateway;
    }

    @PostMapping
    public CompletableFuture<String> placeOrder(@RequestBody PlaceOrderRequest req) {
        // CommandGateway routes to OrderAggregate's @CommandHandler
        // Returns the aggregate identifier on success
        return commandGateway.send(
            new PlaceOrderCommand(UUID.randomUUID().toString(),
                                  req.customerId(), req.totalCents())
        );
    }
}

// ---- Query side: read the projection built from committed events ----
@RestController
@RequestMapping("/orders")
public class OrderQueryController {
    private final QueryGateway queryGateway;

    public OrderQueryController(QueryGateway queryGateway) {
        this.queryGateway = queryGateway;
    }

    @GetMapping("/timeline/{customerId}")
    public CompletableFuture<List<OrderTimelineEntry>> getTimeline(
            @PathVariable String customerId) {
        // QueryGateway routes to OrderTimelineProjection's @QueryHandler
        // Never touches the write store
        return queryGateway.query(
            new GetOrderTimelineQuery(customerId),
            ResponseTypes.multipleInstancesOf(OrderTimelineEntry.class)
        );
    }
}

The CommandGateway and QueryGateway beans are auto-configured by axon-spring-boot-starter — no manual wiring needed. Adding EventStoreDB as the backend replaces the default JPA event store with a true append-only log that supports server-side subscriptions and projection checkpointing.

For a full deep-dive on Axon Framework and EventStoreDB, a dedicated follow-up post is planned.

📚 Lessons Learned

  • CQRS is a control boundary between write truth and read convenience, not a blanket microservices rule.
  • Freshness budgets and replay tooling matter as much as the schema design.
  • Projection-specific lag is a better signal than one global event-bus metric.
  • Dual-writing the read model from the request path removes most of CQRS's safety benefits.
  • A projection without an owner, checkpoint, and recovery plan is operational debt.

📌 TLDR: Summary & Key Takeaways

  • Keep the write model authoritative and let read models stay derivative.
  • Define freshness, replay, and rollback rules before scaling projections.
  • Observe lag per read surface, not only at the broker or topic level.
  • Use CQRS where read shapes materially diverge from write correctness needs.
  • Treat projection recovery as a first-class runbook, not an afterthought.

Written by Abstract Algorithms (@abstractalgorithms)