Key Question
What makes gRPC different from traditional RPC?
Deep Dive
gRPC is Google's open-source RPC framework built on HTTP/2 and Protocol Buffers. It's not just "RPC with Protobuf": the streaming model fundamentally changes what's possible.
Four Service Types
gRPC defines four kinds of RPC calls, each useful for different patterns.
1. Unary RPC: Standard request-response. Client sends one message, server replies with one.
Client ─── Request ──► Server
       ◄── Response ──
2. Server-Streaming: Client sends one request; server pushes a stream of responses.
Client ─── Request ──► Server
       ◄── Response 1 ──
       ◄── Response 2 ──
       ◄── Response 3 ── (server closes stream)
Useful for: real-time feeds, large result sets, progress updates.
3. Client-Streaming: Client pushes a stream of messages; server replies once.
Client ─── Request 1 ──► Server
       ─── Request 2 ──►
       ─── Request 3 ──►
       ◄── Response ────
Useful for: batch uploads, sensor data, large file uploads.
4. Bidirectional Streaming: Both sides send independent streams simultaneously.
Client ─── Request 1 ──► Server
       ─── Request 2 ──►
       ◄── Response 1 ──
       ─── Request 3 ──►
       ◄── Response 2 ──
       ◄── Response 3 ──
Useful for: chat, real-time collaboration, streaming analytics.
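The four call shapes above can be sketched in plain Python, with iterators standing in for streams. This is only an illustration that assumes no gRPC library at all: the function names and sample data are hypothetical. Python servicers generated for grpcio follow the same pattern (a single request or a request iterator in, a single response or a generator out), with an extra context argument.

```python
from typing import Iterable, Iterator


def get_user(request: str) -> str:
    """Unary: one request in, one response out."""
    return f"user:{request}"


def list_users(prefix: str) -> Iterator[str]:
    """Server-streaming: one request in, a stream of responses out."""
    for name in ("ada", "alan", "grace"):  # hypothetical data set
        if name.startswith(prefix):
            yield name


def upload_readings(readings: Iterable[float]) -> float:
    """Client-streaming: a stream of requests in, one response (the mean) out."""
    total, count = 0.0, 0
    for value in readings:
        total += value
        count += 1
    return total / count if count else 0.0


def chat(messages: Iterable[str]) -> Iterator[str]:
    """Bidirectional: responses are produced while requests are still arriving."""
    for message in messages:
        yield f"ack:{message}"
```

Note that `chat` answers each message in lockstep for simplicity; in a real bidirectional stream the two directions are independent, so the server may send zero or many responses per request.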
Unary vs Streaming: Latency Difference
Unary (one request per item):
Client ─── Req 1 ──► Server
       ◄── Res 1 ──
       ─── Req 2 ──►          (N round trips = 2N network delays)
       ◄── Res 2 ──
       ─── Req 3 ──►
       ◄── Res 3 ──
Streaming (single connection, multiplexed):
Client ─── Req 1 ──► Server
       ─── Req 2 ──►
       ◄── Res 1 ──           (1 RTT setup + stream)
       ─── Req 3 ──►
       ◄── Res 2 ──
       ◄── Res 3 ──
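With unary calls issued one after another, N requests cost N round trips, and each call opens its own HTTP/2 stream with its own headers and trailers (the underlying connection is still reused). With streaming, a single stream is opened once and messages flow over it without per-call overhead, so the sender does not have to wait for a reply before sending the next message.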
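A back-of-the-envelope model makes the difference concrete. It is a rough sketch under stated assumptions: the stream is already open, processing time and bandwidth limits are ignored, and the function names and the numbers in the usage note are illustrative rather than measured.

```python
def sequential_unary_ms(n_messages: int, rtt_ms: float) -> float:
    """Each call waits for its reply before the next is sent: N full round trips."""
    return n_messages * rtt_ms


def streaming_ms(n_messages: int, rtt_ms: float, send_interval_ms: float) -> float:
    """Messages go out back to back on one open stream.

    The last reply arrives one round trip after the last message is sent.
    """
    if n_messages == 0:
        return 0.0
    return (n_messages - 1) * send_interval_ms + rtt_ms
```

With 100 messages, a 20 ms round trip, and one message sent every 0.5 ms, the sequential unary model gives 2000 ms while the streaming model gives 69.5 ms. The gap grows with both N and the round-trip time.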
Why HTTP/2 Matters
| Feature | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Multiplexing | No (one in-flight request per connection) | Yes (multiple streams on one connection) |
| Header compression | No | Yes (HPACK) |
| Server push | No | Yes |
| Binary framing | No | Yes |
| Stream priority | No | Yes |
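gRPC multiplexes many RPC calls over a single TCP connection. This removes HTTP-level head-of-line blocking: a slow response no longer holds up the requests queued behind it. A lost TCP packet still stalls every stream on that connection, though. Because each client needs one connection per backend rather than a pool of them, a service restart also triggers far fewer simultaneous reconnects.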
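The idea behind multiplexing can be sketched as frame interleaving: each message is cut into frames tagged with a stream ID, frames from different streams share the wire, and the receiver reassembles them per stream. This is a toy model, not the real HTTP/2 framing layer; the helper names and the round-robin scheduling are assumptions made for illustration.

```python
from __future__ import annotations

from itertools import zip_longest

Frame = tuple[int, bytes]  # (stream_id, chunk of payload)


def split_frames(stream_id: int, payload: bytes, frame_size: int) -> list[Frame]:
    """Cut one message into fixed-size frames tagged with its stream ID."""
    return [
        (stream_id, payload[i:i + frame_size])
        for i in range(0, len(payload), frame_size)
    ]


def interleave(messages: dict[int, bytes], frame_size: int) -> list[Frame]:
    """Put frames from all streams on one 'wire', round-robin across streams."""
    per_stream = [split_frames(sid, data, frame_size) for sid, data in messages.items()]
    wire: list[Frame] = []
    for round_of_frames in zip_longest(*per_stream):
        wire.extend(frame for frame in round_of_frames if frame is not None)
    return wire


def reassemble(wire: list[Frame]) -> dict[int, bytes]:
    """Receiver side: concatenate chunks per stream ID, in arrival order."""
    messages: dict[int, bytes] = {}
    for stream_id, chunk in wire:
        messages[stream_id] = messages.get(stream_id, b"") + chunk
    return messages
```

For example, interleaving a 10-byte message on stream 1 with a 2-byte message on stream 3 using 4-byte frames puts the short message on the wire second, after only the first frame of the long one, instead of behind the whole long message.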
Real-World gRPC
gRPC is a common choice in:
- Microservices: internal service-to-service communication (e.g., Netflix, Square, Lyft)
- Kubernetes: the kubelet talks to container runtimes (CRI) and storage plugins (CSI) over gRPC; kubectl itself talks to the API server over REST
- etcd: the distributed key-value store exposes its v3 client API over gRPC
- Envoy proxy: uses gRPC for its xDS configuration APIs
A single service definition can mix these call types:
service UserService {
  // Unary
  rpc GetUser (GetUserRequest) returns (User);
  // Server-streaming: get all users matching a filter
  rpc ListUsers (ListUsersRequest) returns (stream User);
  // Bidirectional streaming: real-time user updates
  rpc WatchUsers (stream UserAction) returns (stream UserEvent);
}
Check Your Understanding
- A ride-sharing app needs to send real-time GPS coordinates from driver to server every 100 ms. Which gRPC streaming type should they use?
- Why does HTTP/2 multiplexing reduce the "thundering herd" problem when a microservice restarts?
- You have an existing REST API. Should you rewrite it in gRPC? When is it worth the migration cost?
The "So What?"
gRPC's streaming model isn't just a performance optimization; it enables entirely new patterns: real-time dashboards, streaming ML inference, live collaboration, event sourcing. If you're building microservices today, gRPC is the default choice for internal communication. REST still wins for external/public APIs where browsers and mobile clients are the consumers (but even that is changing with gRPC-Web and Connect).
Exercises: Architectural Models & RPC
Exercise 1: Architecture Choice
A team is building a file-sharing application for 100,000 users. Files are read-heavy (mostly downloads), users are geographically distributed, and there is no budget for centralized infrastructure.
- Which architectural model (client-server, P2P, multi-tier, hybrid) would you recommend?
- What specific design decisions would you make to handle the read-heavy workload?
- What are the top three problems you'd need to solve?
Exercise 2: RPC Call Failure Analysis
A client calls deductBalance(userID: "u42", amount: 50.00) via RPC. The client stub sends the request to the server. For each scenario below, state what happens and whether the client's balance is correct:
- The client stub marshals the request and sends it. The server receives it, unmarshals, calls the function (which deducts $50 from the DB), but the server crashes before sending the response.
- The server receives the request, processes it, and sends the response. The response is lost in the network. The client times out and throws an exception.
- The client sends the request. The server processes it and sends the response. The client receives the response. Everything works, but the network duplicates the request and the server processes it twice.
Exercise 3: Protobuf Field Evolution
You have this protobuf schema deployed in production:
message Order {
  string order_id = 1;
  string user_id = 2;
  float total = 3;
}
You want to add a string coupon_code = 4 field. Some old server instances that are still running don't know about field 4.
- Will old servers crash when they receive a message with coupon_code set?
- A client built from the old schema processes a message that has coupon_code. What does the client see?
- What happens if you later delete field 3 (total) and reuse its number for a new int64 total_cents = 3?
Solutions: Architectural Models & RPC
Exercise 1: Architecture Choice
- Recommended model: Hybrid with P2P as the primary model (users share files directly with each other), plus a small set of super-peers (or a lightweight tracker) for discovery.
  - Pure P2P (BitTorrent-style) handles read-heavy workloads naturally: popular files are replicated across many peers, distributing the download load.
  - A tracker (a lightweight centralized component) solves the discovery problem: where to find each file.
  - A small number of "seed" servers can ensure unpopular files remain available (no single point of failure because seeds are optional).
- Design decisions:
  - Chunk files into pieces so peers can download different chunks from different peers in parallel.
  - Use content-addressed storage (hash of the file content as the identifier) to verify integrity.
  - Implement a tit-for-tat incentive mechanism (you share, you get faster downloads).
- Top three problems:
  - Discovery: How do peers find each other and learn which files are available? (Tracker nodes + DHT)
  - Churn: Peers join and leave constantly. How do you maintain availability of rare files? (Replication factor, redundancy)
  - Trust: How do you prevent peers from serving corrupted data? (Content hashing, cryptographic verification)
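The chunking and content-hashing decisions can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 as the hash and a fixed chunk size; the function names are hypothetical and a real system would use far larger chunks.

```python
from __future__ import annotations

import hashlib


def chunk_hashes(data: bytes, chunk_size: int) -> list[str]:
    """Split a file into fixed-size chunks and return the SHA-256 hash of each.

    The list of hashes acts as a manifest: a peer that trusts the manifest
    can verify every chunk independently, whichever peer it came from.
    """
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def verify_chunk(chunk: bytes, expected_hash: str) -> bool:
    """Accept a downloaded chunk only if its hash matches the manifest entry."""
    return hashlib.sha256(chunk).hexdigest() == expected_hash
```

Because each chunk is verified on its own, a corrupted chunk from one peer can be discarded and fetched again from another without restarting the whole download.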
Exercise 2: RPC Call Failure Analysis
- Correct? No. The server deducted $50 but crashed before the client received confirmation. The client sees a failure and may retry, in which case a non-idempotent deductBalance deducts a second $50. Even without a retry, the client believes the call failed while the database says otherwise. This is the classic "at-most-once vs at-least-once" dilemma. Solution: make deductBalance idempotent (using a request ID or idempotency key).
- Correct? Unknown to the client. The server did deduct $50, exactly once, but the client got an exception and cannot tell whether the deduction happened. This is why exactly-once delivery cannot be guaranteed in a distributed system: a lost response is indistinguishable from a lost request. The client must check the balance or retry with an idempotency key.
- Correct? No. The balance is deducted twice: $100 total instead of $50. The server processed the request twice because the transport layer delivered a duplicate. Solution: deduplication at the server (track recently seen request IDs).
Key insight: In all three cases, the client cannot trivially know the correct balance without additional mechanisms (idempotency keys, at-least-once delivery with dedup, transactional outboxes).
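One way to picture the idempotency-key fix is a server that remembers the result of each request ID and replays it on a duplicate. The sketch below is an in-memory illustration only: the class name, the integer-cents representation, and the request IDs are assumptions. A real service would need to persist the dedup record in the same transaction as the balance update so that it survives a crash.

```python
from __future__ import annotations


class BalanceService:
    """Toy balance store with request-ID deduplication (amounts in cents)."""

    def __init__(self, balances_cents: dict[str, int]) -> None:
        self.balances_cents = dict(balances_cents)
        # request_id -> balance returned for that request
        self._completed: dict[str, int] = {}

    def deduct_balance(self, request_id: str, user_id: str, amount_cents: int) -> int:
        # A retry or network duplicate carries the same request_id:
        # replay the stored result instead of deducting again.
        if request_id in self._completed:
            return self._completed[request_id]

        new_balance = self.balances_cents[user_id] - amount_cents
        if new_balance < 0:
            raise ValueError("insufficient funds")

        self.balances_cents[user_id] = new_balance
        self._completed[request_id] = new_balance
        return new_balance
```

With this in place, the client in all three scenarios can safely retry with the same request ID: the deduction happens once, and every retry returns the same resulting balance.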
Exercise 3: Protobuf Field Evolution
- Will old servers crash? No. Protobuf is designed for forward compatibility. Every field on the wire is prefixed with a tag holding its field number and wire type, so a parser that doesn't know field 4 can still work out how many bytes to skip.
- What does the old client see? Nothing: code generated from the old schema has no coupon_code accessor, so the field is invisible to it. The data is not lost, though. The bytes are kept as unknown fields, and if the client re-serializes the message they are preserved and passed through (in proto3 this has been the case since release 3.5).
- What happens if you delete field 3 and reuse its number? Disaster. Old servers still running will try to read the new total_cents value as the old total field. Because float and int64 use different wire types, they will typically treat the value as missing, so total silently reads as 0; had the wire types matched, they would read garbage. Never reuse a field number. Instead, declare reserved 3; in the new schema, which prevents accidental reuse, and give total_cents a fresh number.
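The skipping behavior follows directly from the wire format, which a small hand-rolled parser can demonstrate. This is a simplified sketch, not the official protobuf runtime: it handles only non-negative varints and the four wire types shown, and all function names are hypothetical. Filing a known field number with an unexpected wire type under unknown fields is an assumption about how parsers commonly behave.

```python
from __future__ import annotations

import struct

VARINT, FIXED64, LENGTH_DELIMITED, FIXED32 = 0, 1, 2, 5


def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while True:
        byte, n = n & 0x7F, n >> 7
        out.append(byte | 0x80 if n else byte)
        if not n:
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Return (value, position after the varint)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def string_field(number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_varint((number << 3) | LENGTH_DELIMITED) + encode_varint(len(data)) + data


def float_field(number: int, value: float) -> bytes:
    return encode_varint((number << 3) | FIXED32) + struct.pack("<f", value)


def varint_field(number: int, value: int) -> bytes:
    return encode_varint((number << 3) | VARINT) + encode_varint(value)


def parse_with_old_schema(buf: bytes) -> tuple[dict, list[int]]:
    """Parse an Order using only fields 1-3 of the old schema.

    Returns the known fields plus the numbers of the fields that were skipped.
    """
    order: dict = {}
    unknown: list[int] = []
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        # The wire type alone tells the parser where the field ends.
        if wire_type == VARINT:
            _, end = decode_varint(buf, pos)
        elif wire_type == FIXED64:
            end = pos + 8
        elif wire_type == LENGTH_DELIMITED:
            length, pos = decode_varint(buf, pos)
            end = pos + length
        elif wire_type == FIXED32:
            end = pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        payload, pos = buf[pos:end], end

        if number in (1, 2) and wire_type == LENGTH_DELIMITED:
            order["order_id" if number == 1 else "user_id"] = payload.decode("utf-8")
        elif number == 3 and wire_type == FIXED32:
            order["total"] = struct.unpack("<f", payload)[0]
        else:
            unknown.append(number)  # unknown number or mismatched wire type
    return order, unknown
```

Feeding this parser a message that includes `string_field(4, "SAVE10")` yields the three known fields and reports field 4 as skipped. Feeding it `varint_field(3, 1950)`, which stands in for a reused `total_cents = 3`, leaves `total` unset, which is the silent data loss that reserving the number avoids.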