Reflection Real Systems · Redis Cluster

Key Question

How does a Redis Cluster node know which node owns a given key, and what happens when the cluster topology changes?

Deep Dive

The redis-cluster.ts file implements a simplified Redis Cluster. The core data structure is straightforward: an array of 16384 entries, each pointing to the master node responsible for that slot.

The Routing Table

class RedisCluster {
  slotTable: (RedisClusterNode | null)[] = new Array(16384).fill(null)

  getNodeForKey(key: string): RedisClusterNode | null {
    const slot = hashSlot(key)
    return this.slotTable[slot]
  }
}

When a client sends a GET or SET, the cluster computes hashSlot(key) and looks up the slot table. If the client is connected to the wrong node (because it has a stale routing table), that node returns a MOVED error with the correct node’s address.

Assigning Slots to Nodes

Slots are assigned in contiguous ranges. In the simulation, we divide 16384 by 3:

const slotsPerNode = Math.floor(16384 / 3)
cluster.assignSlots(m1, Array.from({ length: slotsPerNode }, (_, i) => i))
cluster.assignSlots(m2, Array.from({ length: slotsPerNode }, (_, i) => i + slotsPerNode))

In production, slot assignment is managed by CLUSTER ADDSLOTS or CLUSTER SETSLOT, and there is no automatic rebalancing — an operator must run redis-cli --cluster rebalance.

The Replication Gap

The simulation reveals Redis’s most consequential trade-off:

// After M1 fails and R1 is promoted:
// Keys that were on M1 but not yet replicated to R1 are LOST

The simulation output shows KEY MISSING for keys that were on the failed master. This demonstrates the cost of asynchronous replication: durability is sacrificed for single-digit millisecond write latency.

The Cluster Bus

In production, Redis Cluster nodes communicate over a separate cluster bus (port = service port + 10000). They gossip about:

Node health (ping/pong)
Slot ownership changes
Configuration epochs (a Lamport-style counter)
Failover votes

The cluster bus uses a binary protocol, not Redis’s text protocol, for efficiency.

Key Takeaways

The routing table is a simple array — hash(key) mod 16384 → direct lookup.
MOVED redirects let clients eventually converge on the correct topology.
Replication is asynchronous; failover loses un-replicated writes.
Slot migration is manual and operator-driven.

Full Source

View or download the complete implementation: redis-cluster.ts

Exercises

Run the simulation. Change slotsPerNode to distribute slots unevenly (e.g., 8000, 4000, 4384). What happens to key distribution?
Why does the simulation show KEY MISSING after failover?
Add a REPLICATE command to the simulation that synchronizes data from master to replica after write operations. What trade-off does this introduce?

👁️ View Solutions

The node with 8000 slots stores nearly 50% of all keys, becoming a bottleneck. Uneven slot distribution causes hot spots. This is why Redis Cluster operators monitor slot distribution and rebalance regularly.
The replica (R1) was promoted to master but never received the last writes that M1 processed. Since replication is asynchronous (the write is sent to the replica but not awaited), the window between “client gets ACK” and “replica has the data” is where data loss occurs.
Synchronous replication (wait for replica ACK) adds latency equal to one network round trip per write. This is the trade-off Redis makes: fast writes with potential data loss vs. slower writes with guaranteed durability. Redis calls this WAIT — it exists but is opt-in.

✏️ Exercises

Redis Cluster — Exercises

Exercise 1

You have a 6-node Redis Cluster (3 masters, 3 replicas). A master node handling slots 5461–10922 crashes. Its replica is promoted. What keys are lost?

Exercise 2

Given the key set: user:100, {shard:a}:profile, order:{2024}:001, session:abc, compute the hash slot for each using CRC16 mod 16384. Which keys would land on the same node if you use hash tags?

Exercise 3

Redis Cluster’s gossip protocol requires each node to maintain its own epoch (configuration version). When a new node joins, its epoch is 0. The cluster leader assigns it a higher epoch. Why is this epoch necessary? (Hint: think about stale routing tables.)

Exercise 4

Compare the slot bitmap approach (Redis) with the token ring approach (Cassandra). Which requires more metadata to route a request? Which handles node addition more smoothly?

👁️ View Solutions

Only the keys that were written to the crashed master but not yet replicated to the replica. Since replication is asynchronous, the window is typically 1–100 milliseconds of writes. The keys on the surviving masters are unaffected.
Hash slot computation:
- CRC16("user:100") mod 16384 = slot X
- CRC16("shard:a") mod 16384 = slot Y (only “shard:a” due to {})
- CRC16("2024") mod 16384 = slot Z (only “2024” due to {})
- CRC16("session:abc") mod 16384 = slot W
- {shard:a}:profile and {shard:a}:name would share the same slot. order:{2024}:001 and order:{2024}:002 would share a different slot.
The epoch prevents stale routing information from overwriting fresh data. If a partitioned node wakes up with an old epoch and broadcasts its slot map, nodes with a higher epoch know to ignore it. This is functionally identical to Raft’s term number.
Redis uses a fixed-size 16384-entry array — O(1) lookup, 2 KB per node. Cassandra’s token ring uses a sorted list of tokens with binary search — O(log N) lookup. Redis is simpler at the cost of manual rebalancing; Cassandra is more complex but handles additions and removals automatically.