Distributing traffic across multiple servers to maximize throughput, minimize latency, and avoid overload. Covers algorithms: round-robin, least connections, IP hashing.
- Design a load balancer for a video streaming service (Netflix-like)
- Implement a consistent hashing-based load balancer that handles node failures
- Design URL shortener with load balancing across 3 regions
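As a rough illustration of two of the algorithms named above (round-robin and least connections), here is a minimal in-memory sketch. The class names and the `acquire`/`release` interface are assumptions made for this example, not a reference to any particular load balancer.

```python
import itertools


class RoundRobinBalancer:
    """Cycles through the servers in a fixed order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Picks the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first minimum, so ties go to the earliest server.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count reflects live load.
        self.active[server] -= 1
```

Round-robin ignores how busy each server is, while least connections needs the caller to report when a request completes; that bookkeeping is the main trade-off between the two.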
Relational vs non-relational databases — when to choose each, trade-offs in consistency, scalability, and schema flexibility.
- Design a social graph — justify SQL vs NoSQL choice
- Design a product catalog for e-commerce: when does NoSQL win?
- Schema design for a multi-tenant SaaS app
Designing operations that can be safely retried without causing unintended side effects. Critical for payment systems, message queues, and distributed APIs.
- Design idempotent payment processing — handle duplicate charges
- Build a retry mechanism for a distributed order service
- Design an email notification system with effectively-once delivery (at-least-once delivery plus deduplication)
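One common way to make a retried operation safe is an idempotency key supplied by the caller. The sketch below is a minimal in-memory illustration; `PaymentService`, `charge`, and the stored result shape are hypothetical names, and a real system would need a durable store with an atomic insert-if-absent.

```python
class PaymentService:
    """Illustrative only: in-memory, single-threaded, not durable."""

    def __init__(self):
        self._results = {}  # idempotency_key -> response returned the first time
        self.charges = []   # side effects that were actually performed

    def charge(self, idempotency_key, account, amount_cents):
        # A retry with the same key returns the stored result
        # instead of charging the account again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((account, amount_cents))
        result = {"status": "charged", "charge_id": len(self.charges)}
        self._results[idempotency_key] = result
        return result
```

Calling `charge("key-1", ...)` twice performs one charge and returns the same response both times; a new key performs a new charge.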
Async communication via Kafka, RabbitMQ, SQS. Topics: pub/sub, dead letter queues, ordering guarantees, consumer groups, and backpressure.
- Design a notification system using Kafka (email, SMS, push)
- Design an order processing pipeline with guaranteed delivery
- Build a job queue for a distributed task scheduler
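Consistency, Availability, and Partition Tolerance: when a network partition occurs, a distributed system must give up either consistency or availability. Understanding CP vs AP systems and the PACELC extension, which adds the latency vs consistency trade-off that applies when there is no partition.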
- Design a banking system — justify your CAP trade-offs
- Compare DynamoDB (AP) vs HBase (CP) for a shopping cart
- Design a distributed key-value store, explain partition handling
API design principles: versioning, pagination, error handling, authentication patterns, rate limiting, and documentation best practices.
- Design the Twitter API — tweets, follows, timeline endpoints
- Design a paginated search API for a 1B-item catalog
- Design a public API with versioning strategy (v1 → v2 migration)
MapReduce, Spark, Flink, Kafka Streams. Lambda vs Kappa architecture. When to process data in bulk vs real-time event streams.
- Design a real-time fraud detection system (stream processing)
- Design a nightly analytics pipeline for 100M daily events
- Design YouTube's view count system (eventual vs real-time)
Cache-aside, write-through, write-behind, read-through. Where to cache: client, CDN, app server, DB. Cache invalidation and stampede problems.
- Design Instagram's feed cache — what to cache, when to invalidate
- Handle cache stampede in a high-traffic flash sale
- Design a multi-layer cache (L1/L2/L3) for a search engine
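The cache-aside pattern listed above can be sketched as follows: the application checks the cache first, loads from the source of truth on a miss, and stores the result with a TTL. `CacheAside`, `loader`, and the injectable `clock` are assumptions for this example; the sketch does not address stampedes, where many callers miss on the same key at once.

```python
import time


class CacheAside:
    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # function that reads from the database
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit, still fresh
        value = self._loader(key)  # miss or expired: read from the source
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self._entries.pop(key, None)
```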
Event-driven HTTP callbacks. Delivery guarantees, retry logic, signature verification, fan-out patterns, and webhooks vs polling vs SSE.
- Design a webhook delivery system with retries and failure handling
- Design GitHub Actions triggers — webhook fan-out to 100k subscribers
- Build webhook signature verification for a payment provider
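Signature verification for webhooks is typically an HMAC over the request body with a shared secret. The sketch below assumes a scheme in which the sender signs `timestamp.body` with HMAC-SHA256 and sends the hex digest; real providers each define their own header names and formats, so treat the layout here as hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, timestamp: str, body: bytes) -> str:
    # Including the timestamp in the signed message lets the receiver
    # reject replays of old, validly signed requests.
    message = timestamp.encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_signature(secret, timestamp, body, signature, now, tolerance_seconds=300):
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    if abs(now - sent_at) > tolerance_seconds:
        return False  # too old (or too far in the future)
    expected = sign_payload(secret, timestamp, body)
    # compare_digest runs in constant time to avoid timing side channels.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The receiver must verify against the raw request bytes, before any JSON parsing or re-serialization, otherwise the digest will not match.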
SLAs, the "nines" (99.9% vs 99.99%), redundancy, failover, health checks, circuit breakers, and designing for high availability across regions.
- Design a 99.99% available global payment API
- Design multi-region failover for a ride-sharing app
- Calculate availability for a system with 5 dependent services
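For the availability calculation, a useful first approximation is that services in series multiply their availabilities, while redundant replicas multiply their failure probabilities. The helper names below are made up for this example, and the formulas assume failures are independent, which is rarely fully true in practice.

```python
import math


def serial_availability(availabilities):
    # Every dependency must be up, so the availabilities multiply.
    return math.prod(availabilities)


def parallel_availability(availabilities):
    # The system is down only if every redundant replica is down.
    return 1 - math.prod(1 - a for a in availabilities)


def downtime_minutes_per_year(availability):
    return (1 - availability) * 365 * 24 * 60


# Five dependent services at 99.9% each give roughly 99.5% overall.
print(round(serial_availability([0.999] * 5), 6))   # 0.99501
# 99.99% allows about 52.56 minutes of downtime per year.
print(round(downtime_minutes_per_year(0.9999), 2))  # 52.56
```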
Horizontal partitioning by key, range, or hash. Hotspot problems, cross-shard queries, rebalancing, and the difference between sharding and partitioning.
- Design sharding strategy for a 10TB user database
- Handle a hot shard in a social media timeline system
- Design cross-shard transaction handling for a banking app
Bloom filter: a probabilistic data structure for membership tests. Space-efficient "maybe yes, definitely no" filter: false positives are possible, false negatives are not. Used in databases, CDNs, and duplicate detection.
- Use a Bloom filter to reduce DB lookups for non-existent users
- Design duplicate URL detection for a web crawler (Googlebot-scale)
- Implement Bloom filter for "safe browsing" malicious URL detection
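The "maybe yes, definitely no" behaviour can be seen in a small sketch. The class below is illustrative: the bit-array size, the number of hashes, and the choice of salted SHA-256 for deriving positions are assumptions picked for readability, not tuned values.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

In the non-existent-user case, a `False` from `might_contain` lets the service skip the database lookup entirely; a `True` still requires the real lookup.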
Stateless services scale horizontally easily; stateful ones need sticky sessions or external state stores. Implications for microservices, k8s, and auth.
- Refactor a stateful session service to be horizontally scalable
- Design stateless auth with JWTs replacing server-side sessions
- Design a real-time game server — where must state live?
Paxos, Raft consensus, gossip protocols, vector clocks, two-phase commit (2PC), Lamport timestamps, and leader election algorithms.
- Design a leader election service using Raft for a distributed DB
- Implement vector clocks for conflict resolution in a distributed KV store
- Design failure detection using gossip protocol
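Vector clocks, mentioned above, let replicas tell whether one update happened before another or whether the two are concurrent and need conflict resolution. The sketch below represents a clock as a dict from node ID to counter; the function names are chosen for this example.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum, used when a node receives another's clock."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither dominates: a genuine conflict
```

For example, two writes that both descend from `{"A": 1}` but were made on different nodes compare as `"concurrent"`, which is the signal to apply a merge policy or surface both versions.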
Single entry point for microservices. Handles auth, rate limiting, routing, SSL termination, request aggregation, and observability (Kong, AWS API GW).
- Design an API gateway for a microservices e-commerce platform
- Add rate limiting + JWT auth to an existing gateway
- Design request aggregation to reduce mobile client round-trips
Forward proxy (client anonymity, filtering) vs reverse proxy (server protection, load balancing, caching). NGINX, HAProxy, and Envoy use cases.
- Configure NGINX as a reverse proxy with SSL termination
- Design a corporate forward proxy with content filtering
- Compare: API Gateway vs reverse proxy — when to use each?
Advanced sharding: dynamic sharding, directory-based sharding, resharding without downtime, celebrity/hotspot problem, and global vs local indexes.
- Shard Discord's message store for 1T+ messages
- Redesign Twitter's tweet storage with sharding by user ID
- Design zero-downtime resharding for a growing startup DB
Real-time communication: polling, long-polling, SSE, WebSockets. Trade-offs in connection overhead, latency, scalability, and browser support.
- Design WhatsApp's real-time messaging (WebSockets vs long polling)
- Design live sports score updates for 10M concurrent users
- Design a collaborative doc editor (Google Docs) — real-time sync
Consistent hashing: a hash ring for distributing keys across nodes with minimal remapping when nodes join/leave. Virtual nodes for balance. Used in Cassandra, DynamoDB, and Memcached client libraries.
- Design a distributed cache using consistent hashing (like Memcached)
- Handle node failure and rebalancing in a Dynamo-style KV store
- Design CDN server selection with consistent hashing
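A hash ring with virtual nodes can be sketched in a few lines. `HashRing`, the `node#i` naming of virtual nodes, and the use of SHA-256 for ring positions are assumptions for this example; production systems vary in all three.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    # 64-bit ring position taken from a SHA-256 digest.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self._vnodes):
            pos = _position(f"{node}#{i}")
            if pos in self._owner:
                continue  # extremely unlikely collision; keep the first owner
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        if not self._positions:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key, wrapping at the end.
        idx = bisect.bisect_right(self._positions, _position(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

When a node is removed, only the keys it owned move to other nodes; keys owned by the surviving nodes keep their placement, which is the "minimal remapping" property.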
Choosing between communication protocols: REST for simplicity, GraphQL for flexible queries, gRPC for performance, tRPC for type-safe TS full-stack.
- Design a GitHub-like API — justify REST vs GraphQL choice
- Build inter-service communication for microservices with gRPC
- Design a mobile app API — optimize for bandwidth with GraphQL
Redis vs Memcached. Cache tiers (CPU L1-L3, app cache, distributed cache). Cache hit ratio, TTL, cold start, and thundering herd problem at scale.
- Design a leaderboard using Redis sorted sets
- Design distributed session storage with Redis cluster
- Handle thundering herd on cache miss for a viral post
Vertical (scale-up) vs horizontal (scale-out) scaling. Auto-scaling policies, database read replicas, stateless service scaling, and cost implications.
- Scale a monolith to 10M users — what breaks first?
- Design auto-scaling for a flash sale that gets 100x traffic spike
- Scale Twitter to handle the Super Bowl second-by-second
LRU, LFU, FIFO, MRU, Random, TTL-based. When to use each, implementation complexity, and how Redis approximates LRU and LFU by sampling keys rather than tracking exact order.
- Implement LRU cache with O(1) get/put (LeetCode #146)
- Implement LFU cache (LeetCode #460)
- Design a CDN cache eviction policy for video segments
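For the LRU prompt, one compact approach in Python leans on `collections.OrderedDict`, which keeps insertion order and supports O(1) reordering. This is a sketch of that approach; a linked list plus hash map written by hand is the usual alternative when the exercise forbids library helpers.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._data = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict the least recently used
```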
Choosing the right DB: OLTP vs OLAP, columnar stores, time-series DBs, graph DBs, full-text search. Replication, leader/follower, read replicas.
- Choose databases for Uber: trips, drivers, payments, analytics
- Design database replication for 99.99% read availability
- Design a time-series DB for IoT sensor data (1M writes/sec)
JSON Web Tokens for stateless auth. Header/payload/signature structure, signing algorithms (HS256 vs RS256), token refresh patterns, and revocation challenges.
- Design JWT-based auth with refresh token rotation
- Handle JWT revocation without a token blacklist at scale
- Design SSO across microservices using JWTs
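The header/payload/signature structure can be made concrete with a standard-library-only HS256 sketch. The function names are invented for this example, the caller passes `now` explicitly, and only the `exp` claim is checked; a production service should rely on a maintained JWT library rather than code like this.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_hs256(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_hs256(token: str, secret: bytes, now: int) -> dict:
    header_b64, payload_b64, sig_b64 = token.split(".")  # ValueError if malformed
    header = json.loads(_b64url_decode(header_b64))
    # Pin the algorithm instead of trusting whatever the token claims.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if "exp" in payload and now >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

Because verification needs only the secret and the token, no server-side session lookup is involved; that is also why revoking a token before its `exp` is hard.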
Microservices vs monolith vs SOA. Service decomposition, inter-service communication, service discovery, circuit breakers, and service mesh (Istio).
- Decompose an e-commerce monolith into microservices
- Design service discovery for 50+ microservices
- Implement circuit breaker pattern for a payment service
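A circuit breaker wraps calls to a dependency and fails fast once that dependency looks unhealthy. The sketch below uses a simple consecutive-failure threshold and a fixed reset timeout; the class name, parameters, and `CircuitOpenError` are assumptions for this example, and the sketch is not thread-safe.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without reaching the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Callers can catch `CircuitOpenError` to return a fallback response, which is one way to get graceful degradation instead of queueing requests behind a failing service.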
Concurrency = dealing with multiple things at once; Parallelism = doing multiple things at once. Threads, goroutines, event loops, locks, deadlocks, race conditions.
- Design a thread-safe rate limiter (TokenBucket with mutex)
- Design concurrent image processing pipeline without race conditions
- Implement optimistic vs pessimistic locking for a booking system
Change data capture (CDC): tracking and propagating DB changes in real time by reading the WAL/binlog. Tools: Debezium, Kafka Connect. Powers data sync, search indexing, cache invalidation.
- Sync PostgreSQL changes to Elasticsearch in real-time using CDC
- Invalidate cache on DB writes using CDC pipeline
- Design an audit log system using CDC (zero-impact on app code)
Atomicity, Consistency, Isolation, Durability. Isolation levels (Read Uncommitted → Serializable), phantom reads, dirty reads, 2PL, MVCC, and BASE vs ACID.
- Design a bank transfer — ensure atomicity across 2 account rows
- Identify and fix phantom read in a ticket reservation system
- Design distributed transactions with Saga pattern (no 2PC)
Content Delivery Networks — edge caching, PoP servers, anycast routing, push vs pull CDN, cache purging, and using CDN for dynamic content.
- Design a CDN strategy for Netflix video delivery
- Design cache invalidation when a user updates their profile photo
- Optimize a global e-commerce site with CDN for 50ms P99 latency
Synchronous (blocking) vs asynchronous (non-blocking) communication. Callbacks, promises, async/await, event-driven architectures, and temporal coupling.
- Convert a sync order processing API to async with callbacks
- Design an image resizing service using async queues
- Design email sending — sync vs async trade-offs in checkout flow
Token Bucket, Leaky Bucket, Fixed Window, Sliding Window Log, Sliding Window Counter. Distributed rate limiting with Redis. Choosing the right algorithm.
- Design rate limiter for Twitter API (user + IP + global limits)
- Implement distributed token bucket rate limiter using Redis
- Design DDoS protection layer with adaptive rate limiting
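Of the algorithms listed above, the token bucket is probably the easiest to sketch: tokens refill at a steady rate up to a capacity, and each request spends one. The class below is a single-process illustration with an injectable clock; the names are made up for this example, and a distributed version would keep the token count and timestamp in a shared store such as Redis and update them atomically.

```python
import threading
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start full so bursts are allowed
        self._last = clock()
        self._lock = threading.Lock()   # guards tokens and timestamp

    def allow(self, cost=1.0):
        with self._lock:
            now = self._clock()
            # Refill lazily based on the time elapsed since the last call.
            elapsed = now - self._last
            self._tokens = min(
                self.capacity, self._tokens + elapsed * self.refill_per_second
            )
            self._last = now
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

Capacity controls the allowed burst size and the refill rate controls the sustained throughput, so the two can be tuned separately.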
RESTful constraints: statelessness, uniform interface, HATEOAS, resource naming, HTTP methods/status codes, and REST maturity model (Richardson).
- Design RESTful API for a blog (CRUD + pagination + filtering)
- Design proper HTTP status codes for a payment API error taxonomy
- Design a HATEOAS-compliant API for a workflow engine
gRPC: Protocol Buffers vs JSON, HTTP/2 vs HTTP/1.1, bidirectional streaming, code generation, browser support limitations, and performance benchmarks.
- Choose REST vs gRPC for: public API, internal microservices, mobile
- Design a real-time bidirectional chat with gRPC streaming
- Migrate a REST internal API to gRPC — justify the decision
Designing for failure: circuit breakers, bulkheads, retries with exponential backoff, timeouts, graceful degradation, chaos engineering, and disaster recovery.
- Design circuit breaker for payment service with graceful degradation
- Design chaos engineering test plan for a ride-sharing app
- Design disaster recovery with RPO < 1min and RTO < 5min
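Retries with exponential backoff, listed above, can be sketched as a small helper. The function name, defaults, and the "full jitter" variant (sleeping a random fraction of the capped delay) are choices made for this example; `sleep` and `rng` are injectable only so the behaviour can be tested without waiting.

```python
import random
import time


def retry_with_backoff(
    operation,
    max_attempts=5,
    base_delay=0.1,
    max_delay=5.0,
    sleep=time.sleep,
    rng=random.random,
):
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles each attempt, up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads retries out so clients do not retry in lockstep.
            sleep(rng() * ceiling)
```

Retrying is only safe when the operation is idempotent, and in real code the `except` clause should be narrowed to errors that are actually transient, such as timeouts.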