Reflection Real Systems · Redis Cluster

Key Question

Where does Redis Cluster deliberately deviate from textbook distributed systems design, and why?

Deep Dive

Redis Cluster is intentionally not a linearizable distributed database. Its design choices are optimized for a specific use case: in-memory caching and low-latency operations at the cost of consistency guarantees.

Deviation 1: No Consensus for Writes

Textbook distributed databases use consensus (Raft, Paxos) or quorums (Dynamo-style) to ensure writes are durable. Redis Cluster does neither. A single master owns a slot, and writes are not confirmed by any other node.

// Redis: single master decides
set(key: string, value: string): boolean {
  const node = this.getNodeForKey(key)
  if (!node || !node.isAlive) return false
  node.data.set(key, value)  // No quorum, no consensus
  return true
}

Contrast with Cassandra, where a write at QUORUM requires floor(N/2) + 1 confirmations before responding. Redis’s approach is faster but provides no durability guarantee.

Deviation 2: Manual Resharding

In Dynamo/Cassandra, adding a node triggers automatic token redistribution. In Redis Cluster, resharding is manual — an operator must run CLUSTER SETSLOT commands to migrate slots. This is not an oversight; it is a design decision to give operators control over data movement.

Redis’s philosophy: predictable manual operations are better than unpredictable automatic ones, especially when moving terabytes of in-memory data.

Deviation 3: No Multi-Key Transactions Across Slots

Redis supports transactions (MULTI/EXEC) and Lua scripts, but they only work atomically within a single slot. Operations that span slots require client-side coordination or Redis’s cross-slot “proxy” mode (which is not truly atomic).

// This works (both keys have same hash tag):
SET {shard:1}:balance:100 50
SET {shard:1}:balance:200 150

// This DOES NOT work atomically:
SET user:100:balance 50   // slot 123
SET user:200:balance 150  // slot 456

Cassandra’s approach is different: it supports lightweight transactions (compare-and-set) on a single partition using Paxos, but also does not support multi-partition transactions.

Deviation 4: No Read Repair

Cassandra and Riak use read repair: when a read detects stale replicas, it corrects them in the background. Redis Cluster has no read repair. If a replica falls behind, reads from it return stale data until replication catches up. The absence of read repair is consistent with Redis’s philosophy: keep the code simple and fast.

Key Takeaways

Redis Cluster trades durability for performance at every decision point.
Manual resharding gives operators control but adds operational burden.
No cross-slot transactions is a hard constraint of the slot model.
Redis is PA/EL + performance — the most extreme example of “availability over consistency” among the systems studied in this module.

Full Source

View or download the complete implementation: redis-cluster.ts

Exercises

If Redis Cluster used synchronous replication (write to master + wait for replica ACK), how much would write latency increase?
Why would automatic resharding be more dangerous for in-memory data than for disk-based data?
Under what conditions might you choose Redis Cluster over Cassandra despite the weaker consistency?

👁️ View Solutions

At minimum, one additional network round trip between master and replica. In the same data center: +0.5–1ms. Across regions: +50–200ms. For a cache that typically handles 10K+ writes/second, this is catastrophic.
Resharding means moving data between nodes. For in-memory data, this means shipping gigabytes of hot data over the network, consuming both bandwidth and CPU on both ends. An automatic rebalance could trigger cascading failures if it saturates the cluster bus. Manual control lets operators schedule migrations during low-traffic windows.
Choose Redis Cluster when: (a) you need sub-millisecond read/write latency, (b) data loss of the last few seconds is acceptable, (c) you need rich data structures (lists, sets, sorted sets, streams) that Cassandra doesn’t provide, (d) your dataset fits entirely in memory.

✏️ Exercises

Redis Cluster — Exercises

Exercise 1

You have a 6-node Redis Cluster (3 masters, 3 replicas). A master node handling slots 5461–10922 crashes. Its replica is promoted. What keys are lost?

Exercise 2

Given the key set: user:100, {shard:a}:profile, order:{2024}:001, session:abc, compute the hash slot for each using CRC16 mod 16384. Which keys would land on the same node if you use hash tags?

Exercise 3

Redis Cluster’s gossip protocol requires each node to maintain its own epoch (configuration version). When a new node joins, its epoch is 0. The cluster leader assigns it a higher epoch. Why is this epoch necessary? (Hint: think about stale routing tables.)

Exercise 4

Compare the slot bitmap approach (Redis) with the token ring approach (Cassandra). Which requires more metadata to route a request? Which handles node addition more smoothly?

👁️ View Solutions

Only the keys that were written to the crashed master but not yet replicated to the replica. Since replication is asynchronous, the window is typically 1–100 milliseconds of writes. The keys on the surviving masters are unaffected.
Hash slot computation:
- CRC16("user:100") mod 16384 = slot X
- CRC16("shard:a") mod 16384 = slot Y (only “shard:a” due to {})
- CRC16("2024") mod 16384 = slot Z (only “2024” due to {})
- CRC16("session:abc") mod 16384 = slot W
- {shard:a}:profile and {shard:a}:name would share the same slot. order:{2024}:001 and order:{2024}:002 would share a different slot.
The epoch prevents stale routing information from overwriting fresh data. If a partitioned node wakes up with an old epoch and broadcasts its slot map, nodes with a higher epoch know to ignore it. This is functionally identical to Raft’s term number.
Redis uses a fixed-size 16384-entry array — O(1) lookup, 2 KB per node. Cassandra’s token ring uses a sorted list of tokens with binary search — O(log N) lookup. Redis is simpler at the cost of manual rebalancing; Cassandra is more complex but handles additions and removals automatically.