
System Design API Design for Interviews: Contracts, Idempotency, and Pagination

Design APIs interviewers trust by focusing on resource boundaries, request contracts, and failure-safe behavior.

Abstract Algorithms · 8 min read

TLDR: In system design interviews, API design is not a list of HTTP verbs. It is a contract strategy: clear resource boundaries, stable request and response shapes, pagination, idempotency, error semantics, and versioning decisions that survive scale and failures.

TLDR: Good API design reduces ambiguity for clients and prevents operational incidents when traffic grows.

Why API Design Is an Architecture Decision, Not a Syntax Exercise

Many candidates treat API design as a mechanical step.

  • "Use REST."
  • "Add GET and POST."
  • "Return JSON."

That is not enough for a strong system design answer.

An API is the boundary between independent systems. Once clients integrate, changing that boundary is expensive. Interviewers listen for whether you think about API contracts as long-lived, evolving interfaces under failure, retries, and partial outages.

If you came from System Design Interview Basics, this post is the deeper follow-up to the step "identify core entities and APIs."

| Weak API answer | Strong API answer |
| --- | --- |
| Lists endpoints quickly | Explains resource model and constraints first |
| Ignores retries and duplicates | Specifies idempotency behavior |
| Omits pagination | Designs for growth and bounded responses |
| Returns generic errors | Defines structured error semantics |

A practical rule: if your API contract does not explicitly handle retries, pagination, and failures, it is not ready for production scale.

๐Ÿ” The API Contract Checklist You Should Apply in Every Interview

You can use a reusable checklist to keep API design systematic.

  1. Define the resource model and identifiers.
  2. Define the core operations per resource.
  3. Define request and response fields with explicit constraints.
  4. Define idempotency and retry behavior.
  5. Define pagination and filtering.
  6. Define error model and status semantics.
  7. Define versioning strategy.

| Contract element | Why it matters | Example |
| --- | --- | --- |
| Resource identity | Avoids accidental duplicate records | order_id, user_id, message_id |
| Idempotency key | Makes retries safe | Idempotency-Key header on create payment |
| Pagination cursor | Prevents unbounded scans | next_cursor for timeline API |
| Error code taxonomy | Improves client handling | INVALID_ARGUMENT, RATE_LIMITED, CONFLICT |
| Versioning | Enables non-breaking evolution | /v1/orders or media-type versioning |

This checklist sounds simple, but it covers most production-grade API risks candidates forget in interviews.

โš™๏ธ API Design Patterns That Prevent Common Failure Modes

Pattern 1: Resource-first endpoint design

Instead of action-heavy endpoints like /createOrder, design around resources:

  • POST /orders
  • GET /orders/{order_id}
  • GET /orders?customer_id=...

This keeps semantics predictable and easier to evolve.

Pattern 2: Idempotent writes for retry safety

Client retries are inevitable during network failures. Without idempotency, retries can create duplicate side effects.

For create operations with financial or inventory impact, require an idempotency key:

| Request | Behavior |
| --- | --- |
| First POST /payments with key abc-123 | Charge created |
| Retry POST /payments with same key abc-123 | Return original result, do not double-charge |
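
The client side of this pattern can be sketched roughly as follows. This is a minimal illustration, not a real payment SDK: `FakePaymentServer`, `post_payment`, and `pay_with_retries` are hypothetical names, and the server is an in-memory stand-in that loses its first response after the charge has already succeeded.

```python
import uuid


class FakePaymentServer:
    """In-memory stand-in for POST /payments (hypothetical, for illustration)."""

    def __init__(self, drop_first_response=True):
        self.charges = {}  # idempotency key -> stored charge result
        self._drop_next = drop_first_response

    def post_payment(self, idempotency_key, amount_cents):
        # Create the charge only if this key has not been processed yet.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = {
                "charge_id": f"ch_{len(self.charges) + 1}",
                "amount_cents": amount_cents,
            }
        if self._drop_next:
            # Simulate the response being lost after the charge succeeded.
            self._drop_next = False
            raise TimeoutError("response lost in transit")
        return self.charges[idempotency_key]


def pay_with_retries(server, amount_cents, max_attempts=3):
    """Generate one key per logical payment and reuse it on every retry."""
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return server.post_payment(key, amount_cents)
        except TimeoutError as exc:
            last_error = exc  # retrying is safe: same key, same logical payment
    raise RuntimeError("payment failed after retries") from last_error
```

The important choice is that the key is generated once, outside the retry loop. A fresh key per attempt would look like a new payment to the server and defeat the purpose.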

Pattern 3: Cursor-based pagination

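Offset pagination (page=1000) becomes slow and unstable at scale: the database typically has to scan and discard every skipped row, and inserts or deletes between requests shift items across page boundaries. Cursor pagination is often better for time-ordered datasets.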

{
  "items": [ ... ],
  "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNi0wMy0xMlQxMjowMDowMFoifQ=="
}
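
The next_cursor value above is base64-encoded JSON holding the last-seen created_at. A minimal sketch of how such a cursor could be produced and consumed is shown here; `encode_cursor`, `decode_cursor`, and `list_page` are hypothetical helpers, the data lives in an in-memory list, and the ordering is assumed to be ascending by created_at.

```python
import base64
import json


def encode_cursor(position):
    """Serialize the last-seen sort position into an opaque token."""
    raw = json.dumps(position, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token):
    """Recover the sort position from a token produced by encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


def list_page(items, cursor=None, limit=2):
    """Return up to `limit` items created strictly after the cursor position.

    `items` must be sorted ascending by "created_at". ISO-8601 UTC strings
    in one fixed format compare correctly as plain strings.
    """
    after = decode_cursor(cursor)["created_at"] if cursor else ""
    remaining = [item for item in items if item["created_at"] > after]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    next_cursor = (
        encode_cursor({"created_at": page[-1]["created_at"]}) if has_more else None
    )
    return {"items": page, "next_cursor": next_cursor}
```

Because the filter is "strictly after", rows sharing the same created_at as the cursor would be skipped. A production cursor usually adds a unique tie-breaker such as the row id to the position.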

Pattern 4: Structured errors

Avoid free-form strings as your main failure contract.

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests",
    "retry_after_ms": 1200
  }
}

Clients can automate retry/backoff behavior only when errors are machine-readable.
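
As a rough illustration of what machine-readable errors make possible, a client could derive its backoff directly from the error body. `retry_delay_seconds` and the `RETRYABLE_CODES` set are assumptions made for this sketch; which codes are actually safe to retry depends on the API.

```python
RETRYABLE_CODES = {"RATE_LIMITED"}  # assumption: only rate limiting is retried


def retry_delay_seconds(body, default_delay=1.0):
    """Return seconds to wait before retrying, or None if retrying is pointless."""
    error = body.get("error") or {}
    if error.get("code") not in RETRYABLE_CODES:
        return None
    retry_after_ms = error.get("retry_after_ms")
    if retry_after_ms is None:
        # The server gave no hint, so fall back to a client-side default.
        return default_delay
    return retry_after_ms / 1000
```

With a free-form error string, the same decision would require fragile text matching on the message.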

Deep Dive: Translating Product Behavior Into Stable API Contracts

API contracts are where product semantics become system boundaries.

The Internals: Validation, Idempotency Store, and Backward Compatibility

At runtime, robust API services usually implement these internal mechanisms:

  • Request validation layer for schema and semantic rules.
  • Idempotency key store for safe retried writes.
  • Serialization logic with explicit field defaults.
  • Contract tests to prevent accidental breaking changes.

A create-order flow often looks like this:

  1. Validate request fields.
  2. Check idempotency key in fast store.
  3. If seen, return previous response.
  4. If new, execute transaction and persist response mapping.

That internal idempotency mapping is often the difference between a resilient API and an incident-prone one.
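
The four steps of the create-order flow can be sketched as below. `OrderService` is a hypothetical in-memory service, and plain dicts stand in for the fast store and the database. The request fingerprint check, which rejects a reused key with a different body, is a common safeguard added here as an assumption; the flow described above does not require it.

```python
import hashlib
import json


class OrderService:
    """Hypothetical in-memory service illustrating the create-order flow."""

    def __init__(self):
        self.idempotency_store = {}  # key -> (request fingerprint, response)
        self.orders = {}

    def create_order(self, idempotency_key, request):
        # 1. Validate request fields (kept minimal in this sketch).
        if not request.get("customer_id") or not request.get("items"):
            return {"error": {"code": "INVALID_ARGUMENT",
                              "message": "customer_id and items are required"}}

        # 2. Check the idempotency key in the fast store.
        canonical = json.dumps(request, sort_keys=True).encode("utf-8")
        fingerprint = hashlib.sha256(canonical).hexdigest()
        seen = self.idempotency_store.get(idempotency_key)
        if seen is not None:
            stored_fingerprint, stored_response = seen
            if stored_fingerprint != fingerprint:
                # Assumption: same key with a different body is a conflict.
                return {"error": {"code": "CONFLICT",
                                  "message": "idempotency key reused with a different request"}}
            # 3. Seen before: return the previous response.
            return stored_response

        # 4. New key: execute the transaction and persist the response mapping.
        order_id = f"ord_{len(self.orders) + 1}"
        self.orders[order_id] = dict(request)
        response = {"order_id": order_id, "status": "CREATED"}
        self.idempotency_store[idempotency_key] = (fingerprint, response)
        return response
```

This sketch is single-threaded. In a real service, steps 2 through 4 need to be atomic (for example, a unique constraint on the key), otherwise two concurrent retries can both pass the check and both create an order.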

Backward compatibility also matters. Once mobile clients are released, forcing instant upgrades is usually unrealistic. That is why additive changes (new optional fields) are safer than breaking changes (renamed required fields).
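
A small sketch of why additive changes are safe, seen from an older client: it reads only the fields it knows and ignores the rest. `OrderView`, `parse_order`, and the `gift_note` field are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderView:
    order_id: str
    status: str
    gift_note: Optional[str] = None  # hypothetical optional field added later


def parse_order(payload):
    """Read only the fields this client knows; ignore anything newer."""
    return OrderView(
        order_id=payload["order_id"],        # required: a rename breaks this line
        status=payload["status"],
        gift_note=payload.get("gift_note"),  # optional: absence is tolerated
    )
```

A response that adds new fields still parses, while a response that renames order_id raises KeyError in every client already deployed.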

Performance Analysis: Contract Shape, Latency, and Client Efficiency

API performance is not only server speed. Contract shape affects client behavior.

  • Large payloads increase bandwidth and client parsing overhead.
  • Chatty APIs (many small calls) increase network round trips.
  • Missing filter support causes over-fetching.
  • Missing projection support causes unnecessary payload size.

| Performance concern | Contract-level fix |
| --- | --- |
| Over-fetching | Add fields projection or specialized read models |
| Large list responses | Use cursor pagination and sensible page limits |
| Retry storms | Return explicit retry hints and enforce idempotency |
| N+1 client calls | Add batch endpoints where meaningful |
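
As one possible shape for the fields projection fix, the server could trim a resource to the top-level fields a client asks for. `project` and the comma-separated `fields` format are assumptions for this sketch, not a standard.

```python
def project(resource, fields=None):
    """Return only the requested top-level fields, or everything if none are named."""
    if not fields:
        return dict(resource)
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    unknown = [name for name in wanted if name not in resource]
    if unknown:
        # Rejecting unknown names surfaces client typos instead of hiding them.
        raise ValueError(f"unknown fields: {', '.join(unknown)}")
    return {name: resource[name] for name in wanted}
```

A request such as GET /v1/orders/{order_id}?fields=order_id,status would then return two fields instead of the full order document.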

In interviews, saying "I will design the API so clients can fetch exactly what they need" demonstrates both performance awareness and API empathy.

API Lifecycle Flow From Client Request to Stable Response

flowchart TD
    A[Client request] --> B[Schema and semantic validation]
    B --> C{Idempotency key present?}
    C -->|Yes| D[Check idempotency store]
    D --> E{Seen before?}
    E -->|Yes| F[Return previous response]
    E -->|No| G[Execute business transaction]
    C -->|No| G
    G --> H[Persist result and emit event]
    H --> I[Return structured response]

This flowchart captures the API contract philosophy: consistent validation, safe retries, and predictable responses.

๐ŸŒ Real-World Applications: Payments, Feeds, and Internal Microservices

Payments API: idempotency is non-negotiable because duplicate charges are business-critical incidents.

Timeline/feed API: pagination and filtering matter most, because reads dominate and data size grows continuously.

Internal microservice APIs: strict schemas, backward compatibility, and explicit error contracts reduce coordination cost between teams.

Different domains stress different contract dimensions, but the checklist remains stable.

โš–๏ธ Trade-offs & Failure Modes: How API Contracts Break at Scale

| Failure mode | Symptom | Root cause | First mitigation |
| --- | --- | --- | --- |
| Duplicate side effects | Double payments or duplicate orders | Non-idempotent retries | Require idempotency keys |
| Pagination inconsistency | Missing or repeated records across pages | Offset pagination on mutable datasets | Cursor-based pagination |
| Client breakage on deploy | Old app versions fail | Breaking response changes | Additive, versioned evolution |
| Ambiguous error handling | Clients retry incorrectly | Unstructured errors | Machine-readable error taxonomy |
| Slow mobile performance | Large payloads and high battery use | Over-fetching | Add projections, filters, and compact views |

The best interview answer names at least one failure mode and one mitigation tied to API contract design.

Decision Guide: REST, RPC, and Contract Complexity

| Situation | Recommendation |
| --- | --- |
| Public web API with diverse clients | REST/HTTP with explicit versioning and error contracts |
| Internal high-throughput service-to-service calls | RPC/gRPC with strict schemas |
| Event-driven ingestion APIs | Async acknowledgment plus idempotent processing |
| Rapidly changing product surface | Stable v1 with additive fields, delayed hard breaks |

If you need protocol-level trade-offs, pair this post with System Design Protocols: REST, RPC, and TCP/UDP.

Practical Example: Design a Create-and-List Orders API

Suppose your interview prompt includes order creation and order history.

You can propose:

  • POST /v1/orders with idempotency key.
  • GET /v1/orders/{order_id} for direct lookup.
  • GET /v1/orders?customer_id=...&cursor=...&limit=... for history.

Define contract constraints:

| Field | Constraint |
| --- | --- |
| customer_id | Required, immutable |
| items[] | At least 1 line item |
| currency | ISO-4217 code |
| limit | 1 to 100 |
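
These constraints could be enforced with a validation step along the following lines. `validate_create_order` and `validate_list_limit` are hypothetical helpers. The currency check only verifies the three-uppercase-letter format rather than looking up real ISO-4217 codes, and immutability of customer_id is an update-time rule that a create-time validator cannot check.

```python
import re

# Format check only: three uppercase letters, not a lookup of real ISO-4217 codes.
CURRENCY_FORMAT = re.compile(r"[A-Z]{3}")


def validate_create_order(body):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    customer_id = body.get("customer_id")
    if not isinstance(customer_id, str) or not customer_id:
        problems.append("customer_id is required")
    items = body.get("items")
    if not isinstance(items, list) or len(items) < 1:
        problems.append("items must contain at least 1 line item")
    currency = body.get("currency")
    if not isinstance(currency, str) or not CURRENCY_FORMAT.fullmatch(currency):
        problems.append("currency must be an ISO-4217 code")
    return problems


def validate_list_limit(limit):
    """Check the page size for GET /v1/orders (1 to 100 inclusive)."""
    is_int = isinstance(limit, int) and not isinstance(limit, bool)
    if is_int and 1 <= limit <= 100:
        return []
    return ["limit must be between 1 and 100"]
```

Returning every problem at once, instead of failing on the first, lets a client fix a malformed request in a single round trip.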

Define errors:

  • INVALID_ARGUMENT for malformed request.
  • CONFLICT for state conflicts.
  • RATE_LIMITED when quotas trigger.
  • INTERNAL for unexpected server errors.
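
One plausible way to pair these codes with HTTP status codes is sketched below. The mapping is a conventional choice rather than something the contract above prescribes, and `error_response` is a hypothetical helper.

```python
from http import HTTPStatus

# Assumed mapping from contract error codes to HTTP status codes.
STATUS_BY_CODE = {
    "INVALID_ARGUMENT": HTTPStatus.BAD_REQUEST,        # 400
    "CONFLICT": HTTPStatus.CONFLICT,                   # 409
    "RATE_LIMITED": HTTPStatus.TOO_MANY_REQUESTS,      # 429
    "INTERNAL": HTTPStatus.INTERNAL_SERVER_ERROR,      # 500
}


def error_response(code, message, **details):
    """Build (http_status, body); unknown codes fall back to INTERNAL."""
    if code not in STATUS_BY_CODE:
        code, message, details = "INTERNAL", "Unexpected server error", {}
    body = {"error": {"code": code, "message": message, **details}}
    return int(STATUS_BY_CODE[code]), body
```

For example, error_response("RATE_LIMITED", "Too many requests", retry_after_ms=1200) yields status 429 with the structured body shown under Pattern 4.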

This is strong interview content because it combines API shape, correctness safety, and client usability.

Lessons Learned

  • API design is about long-lived contracts, not endpoint naming alone.
  • Idempotency and pagination should be first-class concerns in write and list APIs.
  • Structured error semantics improve reliability more than verbose messages.
  • Backward-compatible changes are cheaper than forced version migrations.
  • Good API design reduces both operational incidents and client complexity.

Summary & Key Takeaways

  • Start API design with resource boundaries and request contracts.
  • Build retry safety through idempotent write semantics.
  • Use cursor pagination for large, mutable datasets.
  • Return structured errors that clients can act on.
  • Treat versioning as an evolution strategy, not an afterthought.

๐Ÿ“ Practice Quiz

  1. Why is idempotency critical for write endpoints in distributed systems?

A) It reduces payload size
B) It ensures retries do not create duplicate side effects
C) It removes the need for validation

Correct Answer: B

  2. Which pagination approach is generally safer for large, changing datasets?

A) Offset pagination only
B) Cursor-based pagination
C) No pagination

Correct Answer: B

  3. What is the biggest risk of breaking response fields without versioning?

A) Better performance
B) Client compatibility failures in production
C) Lower storage costs

Correct Answer: B

  4. Open-ended challenge: if your API must support mobile clients with slow networks and strict battery budgets, which contract decisions would you change first and why?
