System Design API Design for Interviews: Contracts, Idempotency, and Pagination
Design APIs interviewers trust by focusing on resource boundaries, request contracts, and failure-safe behavior.
Abstract Algorithms
TLDR: In system design interviews, API design is not a list of HTTP verbs. It is a contract strategy: clear resource boundaries, stable request and response shapes, pagination, idempotency, error semantics, and versioning decisions that survive scale and failures.
TLDR: Good API design reduces ambiguity for clients and prevents operational incidents when traffic grows.
Why API Design Is an Architecture Decision, Not a Syntax Exercise
Many candidates treat API design as a mechanical step.
- "Use REST."
- "Add GET and POST."
- "Return JSON."
That is not enough for a strong system design answer.
An API is the boundary between independent systems. Once clients integrate, changing that boundary is expensive. Interviewers listen for whether you think about API contracts as long-lived, evolving interfaces under failure, retries, and partial outages.
If you came from System Design Interview Basics, this is the deeper follow-up to the step "identify core entities and APIs."
| Weak API answer | Strong API answer |
| --- | --- |
| Lists endpoints quickly | Explains resource model and constraints first |
| Ignores retries and duplicates | Specifies idempotency behavior |
| Omits pagination | Designs for growth and bounded responses |
| Returns generic errors | Defines structured error semantics |
A practical rule: if your API contract does not explicitly handle retries, pagination, and failures, it is not ready for production scale.
The API Contract Checklist You Should Apply in Every Interview
You can use a reusable checklist to keep API design systematic.
- Define the resource model and identifiers.
- Define the core operations per resource.
- Define request and response fields with explicit constraints.
- Define idempotency and retry behavior.
- Define pagination and filtering.
- Define error model and status semantics.
- Define versioning strategy.
| Contract element | Why it matters | Example |
| --- | --- | --- |
| Resource identity | Avoids accidental duplicate records | order_id, user_id, message_id |
| Idempotency key | Makes retries safe | Idempotency-Key header on create payment |
| Pagination cursor | Prevents unbounded scans | next_cursor for timeline API |
| Error code taxonomy | Improves client handling | INVALID_ARGUMENT, RATE_LIMITED, CONFLICT |
| Versioning | Enables non-breaking evolution | /v1/orders or media-type versioning |
This checklist sounds simple, but it covers most production-grade API risks candidates forget in interviews.
API Design Patterns That Prevent Common Failure Modes
Pattern 1: Resource-first endpoint design
Instead of action-heavy endpoints like /createOrder, design around resources:
- POST /orders
- GET /orders/{order_id}
- GET /orders?customer_id=...
This keeps semantics predictable and easier to evolve.
Pattern 2: Idempotent writes for retry safety
Client retries are inevitable during network failures. Without idempotency, retries can create duplicate side effects.
For create operations with financial or inventory impact, require an idempotency key:
| Request | Behavior |
| --- | --- |
| First POST /payments with key abc-123 | Charge created |
| Retry POST /payments with same key abc-123 | Return original result, do not double-charge |
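On the client side, the table above only holds if every retry of the same logical operation carries the same key. The following is a minimal sketch of that rule, assuming a hypothetical `send` callable that performs the HTTP request and a hypothetical `TransientNetworkError` that stands in for timeouts or dropped connections; neither comes from a real library.

```python
import uuid
from typing import Callable, Dict, Optional


class TransientNetworkError(Exception):
    """Hypothetical error standing in for a timeout or dropped connection."""


def create_payment_with_retries(
    send: Callable[[Dict[str, str], dict], dict],
    payload: dict,
    max_attempts: int = 3,
) -> dict:
    """Call `send` until it succeeds, reusing one idempotency key throughout."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Generate the key once per logical operation, not once per attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return send(headers, payload)
        except TransientNetworkError as exc:
            # Retrying is safe only because the key and payload are unchanged.
            last_error = exc
    assert last_error is not None
    raise last_error
```

A real client would usually also wait between attempts; the delay is left out here to keep the key-reuse rule visible.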
Pattern 3: Cursor-based pagination
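Offset pagination (page=1000) becomes slow and unstable at scale: the database typically has to scan and discard every skipped row, and concurrent inserts or deletes shift page boundaries so clients see repeated or missing items. Cursor pagination is often better for time-ordered datasets because each page resumes from the last item the client saw.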
{
  "items": [ ... ],
  "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNi0wMy0xMlQxMjowMDowMFoifQ=="
}
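One way to produce and consume an opaque cursor like the one above is to base64-encode the sort position of the last returned item. This is a sketch under assumptions: the `id` tie-breaker inside the cursor, the function names, and the in-memory list are all illustrative, and timestamps are assumed to be ISO-8601 strings in a single timezone so they sort lexicographically.

```python
import base64
import json
from typing import List, Optional, Tuple


def encode_cursor(created_at: str, item_id: str) -> str:
    """Pack the last item's sort position into an opaque, URL-safe token."""
    raw = json.dumps({"created_at": created_at, "id": item_id}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> Tuple[str, str]:
    data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return data["created_at"], data["id"]


def list_page(items: List[dict], cursor: Optional[str], limit: int) -> dict:
    """Return one page, newest first, plus a cursor for the next page."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # (created_at, id) is the sort key, so ties on created_at break deterministically.
    ordered = sorted(items, key=lambda i: (i["created_at"], i["id"]), reverse=True)
    if cursor is not None:
        position = decode_cursor(cursor)
        # In a database this filter would be a WHERE clause on an indexed key.
        ordered = [i for i in ordered if (i["created_at"], i["id"]) < position]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = encode_cursor(page[-1]["created_at"], page[-1]["id"]) if has_more else None
    return {"items": page, "next_cursor": next_cursor}
```

Keeping the cursor opaque lets the server change what it encodes later without breaking clients, since clients only ever echo it back.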
Pattern 4: Structured errors
Avoid free-form strings as your main failure contract.
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests",
    "retry_after_ms": 1200
  }
}
Clients can automate retry/backoff behavior only when errors are machine-readable.
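As a rough illustration of what machine-readable errors enable, a client can turn the error body above into a retry decision without parsing message text. The set of retryable codes, the function name, and the backoff defaults below are assumptions for this sketch, not part of any specific API.

```python
from typing import Optional

# Assumption: which codes are retryable is a contract decision; this set is illustrative.
RETRYABLE_CODES = {"RATE_LIMITED"}


def retry_delay_ms(
    error_body: dict,
    attempt: int,
    base_ms: int = 200,
    cap_ms: int = 10_000,
) -> Optional[int]:
    """Return how long to wait before retrying, or None if retrying is pointless."""
    error = error_body.get("error", {})
    if error.get("code") not in RETRYABLE_CODES:
        return None
    hint = error.get("retry_after_ms")
    if isinstance(hint, int) and hint >= 0:
        # Prefer the server's explicit hint, bounded by a client-side cap.
        return min(hint, cap_ms)
    # No hint: fall back to exponential backoff based on the attempt number.
    return min(base_ms * (2 ** attempt), cap_ms)
```

With the example payload above, this returns 1200; for a code such as INVALID_ARGUMENT it returns None, because resending the same malformed request cannot succeed.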
Deep Dive: Translating Product Behavior Into Stable API Contracts
API contracts are where product semantics become system boundaries.
The Internals: Validation, Idempotency Store, and Backward Compatibility
At runtime, robust API services usually implement these internal mechanisms:
- Request validation layer for schema and semantic rules.
- Idempotency key store for safe retried writes.
- Serialization logic with explicit field defaults.
- Contract tests to prevent accidental breaking changes.
A create-order flow often looks like this:
- Validate request fields.
- Check idempotency key in fast store.
- If seen, return previous response.
- If new, execute transaction and persist response mapping.
That internal idempotency mapping is often the difference between a resilient API and an incident-prone one.
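The four steps above can be sketched in a few lines. This is a simplified illustration, assuming an in-memory dict in place of the fast store and a hypothetical `OrderService` class; a production version would need the check and the write to be atomic (for example through a unique constraint or conditional write) and would expire old keys.

```python
import uuid
from typing import Dict, Optional, Tuple

Response = Tuple[int, dict]  # (HTTP status, JSON body)


class OrderService:
    def __init__(self) -> None:
        # Assumption: a dict stands in for the idempotency store.
        self._responses: Dict[str, Response] = {}
        self.orders_created = 0

    def create_order(self, idempotency_key: Optional[str], body: dict) -> Response:
        # Step 1: validate request fields before touching any state.
        if not body.get("customer_id") or not body.get("items"):
            return 400, {"error": {"code": "INVALID_ARGUMENT",
                                   "message": "customer_id and items are required"}}
        # Steps 2 and 3: if the key was seen, replay the stored response.
        if idempotency_key is not None and idempotency_key in self._responses:
            return self._responses[idempotency_key]
        # Step 4: execute the transaction and persist the response mapping.
        self.orders_created += 1
        response: Response = (201, {"order_id": str(uuid.uuid4()),
                                    "customer_id": body["customer_id"]})
        if idempotency_key is not None:
            self._responses[idempotency_key] = response
        return response
```

Calling `create_order` twice with the same key returns the same `order_id` and leaves `orders_created` at 1, which is the behavior the retry table in Pattern 2 describes.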
Backward compatibility also matters. Once mobile clients are released, forcing instant upgrades is usually unrealistic. That is why additive changes (new optional fields) are safer than breaking changes (renamed required fields).
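Additive changes stay safe only if clients tolerate fields they do not know and supply defaults for fields they do not receive. A minimal sketch of such a tolerant reader follows; `OrderView`, `gift_note`, and `loyalty_tier` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class OrderView:
    order_id: str
    status: str
    # Hypothetical field added in a later release: optional with a default,
    # so payloads from older servers still parse.
    gift_note: Optional[str] = None


def parse_order(payload: dict) -> OrderView:
    """Build an OrderView, ignoring any fields this client version does not know."""
    known = {f.name for f in fields(OrderView)}
    return OrderView(**{k: v for k, v in payload.items() if k in known})


# An old payload parses with the default; a newer payload with an extra
# field ("loyalty_tier") parses without error.
old = parse_order({"order_id": "o1", "status": "PAID"})
new = parse_order({"order_id": "o1", "status": "PAID",
                   "gift_note": "hi", "loyalty_tier": "gold"})
```

Renaming `status` would break this reader immediately, which is why a rename counts as a breaking change while a new optional field does not.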
Performance Analysis: Contract Shape, Latency, and Client Efficiency
API performance is not only server speed. Contract shape affects client behavior.
- Large payloads increase bandwidth and client parsing overhead.
- Chatty APIs (many small calls) increase network round trips.
- Missing filter support causes over-fetching.
- Missing projection support causes unnecessary payload size.
| Performance concern | Contract-level fix |
| --- | --- |
| Over-fetching | Add fields projection or specialized read models |
| Large list responses | Use cursor pagination and sensible page limits |
| Retry storms | Return explicit retry hints and enforce idempotency |
| N+1 client calls | Add batch endpoints where meaningful |
In interviews, saying "I will design the API so clients can fetch exactly what they need" demonstrates both performance awareness and API empathy.
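Field projection, the first fix in the table, can be as small as a comma-separated query parameter checked against an allow-list. The sketch below assumes a hypothetical `fields` parameter and an illustrative set of order fields; rejecting unknown names is one possible policy, not the only one.

```python
from typing import Optional

# Assumption: the projectable fields are fixed by the contract.
ALLOWED_FIELDS = {"order_id", "status", "total", "items", "shipping_address"}


def project(resource: dict, fields_param: Optional[str]) -> dict:
    """Return only the requested fields, e.g. for GET /orders/{id}?fields=order_id,status."""
    if not fields_param:
        return dict(resource)  # no projection requested: return the full view
    requested = {name.strip() for name in fields_param.split(",") if name.strip()}
    unknown = requested - ALLOWED_FIELDS
    if unknown:
        # A real service would map this to an INVALID_ARGUMENT error response.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {k: v for k, v in resource.items() if k in requested}
```

Validating against an allow-list keeps the projection part of the contract, so clients cannot request internal attributes by guessing names.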
API Lifecycle Flow From Client Request to Stable Response
flowchart TD
A[Client request] --> B[Schema and semantic validation]
B --> C{Idempotency key present?}
C -->|Yes| D[Check idempotency store]
D --> E{Seen before?}
E -->|Yes| F[Return previous response]
E -->|No| G[Execute business transaction]
C -->|No| G
G --> H[Persist result and emit event]
H --> I[Return structured response]
This flowchart captures the API contract philosophy: consistent validation, safe retries, and predictable responses.
Real-World Applications: Payments, Feeds, and Internal Microservices
Payments API: idempotency is non-negotiable because duplicate charges are business-critical incidents.
Timeline/feed API: pagination and filtering matter most, because reads dominate and data size grows continuously.
Internal microservice APIs: strict schemas, backward compatibility, and explicit error contracts reduce coordination cost between teams.
Different domains stress different contract dimensions, but the checklist remains stable.
Trade-offs & Failure Modes: How API Contracts Break at Scale
| Failure mode | Symptom | Root cause | First mitigation |
| --- | --- | --- | --- |
| Duplicate side effects | Double payments or duplicate orders | Non-idempotent retries | Require idempotency keys |
| Pagination inconsistency | Missing or repeated records across pages | Offset pagination on mutable datasets | Cursor-based pagination |
| Client breakage on deploy | Old app versions fail | Breaking response changes | Additive, versioned evolution |
| Ambiguous error handling | Clients retry incorrectly | Unstructured errors | Machine-readable error taxonomy |
| Slow mobile performance | Large payloads and high battery use | Over-fetching | Add projections, filters, and compact views |
The best interview answer names at least one failure mode and one mitigation tied to API contract design.
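To make the pagination inconsistency row concrete, here is a toy reproduction. It assumes integer ids that increase with creation time and an in-memory list in place of a table; the function names are illustrative.

```python
from typing import List, Optional


def offset_page(ids_newest_first: List[int], offset: int, limit: int) -> List[int]:
    """Offset pagination: skip `offset` rows, then take `limit`."""
    return ids_newest_first[offset:offset + limit]


def cursor_page(ids_newest_first: List[int], before_id: Optional[int], limit: int) -> List[int]:
    """Cursor pagination: resume strictly after the last id the client saw."""
    eligible = [i for i in ids_newest_first if before_id is None or i < before_id]
    return eligible[:limit]


rows = [5, 4, 3, 2, 1]            # ids, newest first
page1 = offset_page(rows, 0, 2)   # [5, 4]
rows = [6] + rows                 # a new record arrives between requests

offset_page2 = offset_page(rows, 2, 2)          # [4, 3]: id 4 appears again
cursor_page2 = cursor_page(rows, page1[-1], 2)  # [3, 2]: no repeat
```

The offset version repeats id 4 because the insert shifted every row by one position, while the cursor version is anchored to a value rather than a position.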
Decision Guide: REST, RPC, and Contract Complexity
| Situation | Recommendation |
| --- | --- |
| Public web API with diverse clients | REST/HTTP with explicit versioning and error contracts |
| Internal high-throughput service-to-service calls | RPC/gRPC with strict schemas |
| Event-driven ingestion APIs | Async acknowledgment plus idempotent processing |
| Rapidly changing product surface | Stable v1 with additive fields, delayed hard breaks |
If you need protocol-level trade-offs, pair this post with System Design Protocols: REST, RPC, and TCP/UDP.
Practical Example: Design a Create-and-List Orders API
Suppose your interview prompt includes order creation and order history.
You can propose:
- POST /v1/orders with idempotency key.
- GET /v1/orders/{order_id} for direct lookup.
- GET /v1/orders?customer_id=...&cursor=...&limit=... for history.
Define contract constraints:
| Field | Constraint |
| --- | --- |
| customer_id | Required, immutable |
| items[] | At least 1 line item |
| currency | ISO-4217 code |
| limit | 1 to 100 |
Define errors:
- INVALID_ARGUMENT for malformed request.
- CONFLICT for state conflicts.
- RATE_LIMITED when quotas trigger.
- INTERNAL for unexpected server errors.
This is strong interview content because it combines API shape, correctness safety, and client usability.
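The constraints and error codes above could be enforced by a small validation layer along these lines. This is a sketch with assumptions: the `field` key in each error, the function names, and the three-uppercase-letter check (which tests only the shape of an ISO-4217 code, not membership in the official list) are illustrative. Immutability of `customer_id` applies to later updates, so it is not checked at creation time.

```python
import re
from typing import List, Optional

# Shape check only: three uppercase letters, as ISO-4217 alphabetic codes are.
CURRENCY_SHAPE = re.compile(r"^[A-Z]{3}$")


def _invalid(field: str, message: str) -> dict:
    return {"code": "INVALID_ARGUMENT", "field": field, "message": message}


def validate_create_order(body: dict) -> List[dict]:
    """Return a list of structured errors; an empty list means the body is valid."""
    errors: List[dict] = []
    customer_id = body.get("customer_id")
    if not isinstance(customer_id, str) or not customer_id:
        errors.append(_invalid("customer_id", "customer_id is required"))
    items = body.get("items")
    if not isinstance(items, list) or len(items) < 1:
        errors.append(_invalid("items", "at least 1 line item is required"))
    currency = body.get("currency")
    if not isinstance(currency, str) or not CURRENCY_SHAPE.match(currency):
        errors.append(_invalid("currency", "currency must be an ISO-4217 code"))
    return errors


def check_limit(limit: int) -> Optional[dict]:
    """Validate the list endpoint's page size; None means the value is acceptable."""
    if not isinstance(limit, int) or not 1 <= limit <= 100:
        return _invalid("limit", "limit must be between 1 and 100")
    return None
```

Returning every violation at once, rather than stopping at the first, lets a client fix a malformed request in a single round trip.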
Lessons Learned
- API design is about long-lived contracts, not endpoint naming alone.
- Idempotency and pagination should be first-class concerns in write and list APIs.
- Structured error semantics improve reliability more than verbose messages.
- Backward-compatible changes are cheaper than forced version migrations.
- Good API design reduces both operational incidents and client complexity.
Summary & Key Takeaways
- Start API design with resource boundaries and request contracts.
- Build retry safety through idempotent write semantics.
- Use cursor pagination for large, mutable datasets.
- Return structured errors that clients can act on.
- Treat versioning as an evolution strategy, not an afterthought.
Practice Quiz
- Why is idempotency critical for write endpoints in distributed systems?
A) It reduces payload size
B) It ensures retries do not create duplicate side effects
C) It removes the need for validation
Correct Answer: B
- Which pagination approach is generally safer for large, changing datasets?
A) Offset pagination only
B) Cursor-based pagination
C) No pagination
Correct Answer: B
- What is the biggest risk of breaking response fields without versioning?
A) Better performance
B) Client compatibility failures in production
C) Lower storage costs
Correct Answer: B
- Open-ended challenge: if your API must support mobile clients with slow networks and strict battery budgets, which contract decisions would you change first and why?