Reflection Real Systems · Comparison

Key Question

How do you choose between Redis Cluster, Cassandra, MongoDB, and Riak — four systems that all claim to be “distributed databases” but work fundamentally differently?

Deep Dive

The Decision Space

Every distributed database is a set of trade-offs. The right choice depends on your workload pattern, consistency requirements, and operational constraints.

Dimension 1: Write Model

System	Write Model	Conflict Resolution	Write Scalability
Redis Cluster	Single-primary per hash slot	No conflicts (serial)	Linear (add shards)
MongoDB	Single-primary per replica set	No conflicts (serial)	Bottlenecked by primary
Cassandra	Leaderless (any node)	LWW (timestamp)	Linear (add nodes)
Riak	Leaderless (any vnode)	Vector clock / LWW / CRDT	Linear (add nodes)
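
The "single-primary per hash slot" entry for Redis Cluster rests on a fixed mapping from each key to one of 16384 slots: the CRC16 of the key modulo 16384, with only the part inside the first non-empty `{...}` hashed when a hash tag is present. The sketch below illustrates that mapping under the assumption of ASCII-only keys; the names `crc16` and `hashSlot` are illustrative and not taken from comparison.ts or any client library.

```typescript
// CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), the checksum
// Redis Cluster uses for key-to-slot mapping.
// Assumption: keys are ASCII, so each character is treated as one byte.
function crc16(key: string): number {
  let crc = 0;
  for (let i = 0; i < key.length; i++) {
    crc ^= (key.charCodeAt(i) & 0xff) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) !== 0
        ? ((crc << 1) ^ 0x1021) & 0xffff
        : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Map a key to one of the 16384 hash slots. If the key contains a non-empty
// hash tag such as "{user1000}", only the text inside the braces is hashed,
// which lets related keys share a slot (and therefore a primary).
function hashSlot(key: string): number {
  let hashed = key;
  const open = key.indexOf("{");
  if (open !== -1) {
    const close = key.indexOf("}", open + 1);
    if (close > open + 1) {
      hashed = key.slice(open + 1, close);
    }
  }
  return crc16(hashed) % 16384;
}
```

Because every slot has exactly one primary at a time, two writes to the same key always go through the same node in the same order, which is why the table lists no conflict resolution for Redis Cluster.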

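Decision rule: If a single partition must sustain more than roughly 10K writes/sec (a rule of thumb; the real ceiling depends on hardware and payload size), prefer a leaderless system (Cassandra, Riak), where any replica can accept the write. If write integrity (no lost data) is paramount, prefer MongoDB with majority write concern or CRDT-backed Riak. Redis Cluster serializes writes per slot but replicates asynchronously, so an acknowledged write can be lost on failover, and Cassandra's LWW silently discards the losing side of two concurrent writes.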
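
The difference between LWW and CRDT conflict resolution in the table above can be made concrete with two tiny merge functions. This is a minimal sketch, not Cassandra's or Riak's implementation: `LwwRegister`, `GCounter`, and the helper names are hypothetical, and the grow-only counter is only the simplest CRDT.

```typescript
// Last-write-wins register: on merge, the value with the higher timestamp
// survives and the other concurrent write is discarded.
interface LwwRegister<T> {
  value: T;
  timestamp: number;
}

function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  return b.timestamp > a.timestamp ? b : a;
}

// Grow-only counter CRDT: one slot per replica, merged by per-replica maximum,
// so increments accepted on different replicas are all preserved.
type GCounter = Record<string, number>;

function increment(counter: GCounter, replica: string, by: number = 1): GCounter {
  return { ...counter, [replica]: (counter[replica] ?? 0) + by };
}

function mergeGCounter(a: GCounter, b: GCounter): GCounter {
  const merged: GCounter = { ...a };
  for (const replica of Object.keys(b)) {
    merged[replica] = Math.max(merged[replica] ?? 0, b[replica]);
  }
  return merged;
}

function total(counter: GCounter): number {
  return Object.keys(counter).reduce((sum, replica) => sum + counter[replica], 0);
}
```

With LWW, merging two concurrent writes keeps exactly one of them. With the counter, merging `{ a: 2 }` and `{ b: 3 }` yields a total of 5 regardless of merge order, which is the property that makes CRDT-backed Riak a reasonable option when no update may be dropped.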

Dimension 2: Consistency Model

System	Default Consistency	Tunable?	Reads During Partition
Redis Cluster	Primary reads; asynchronous replication, so not strictly strong	Limited (WAIT, replica reads via READONLY)	Only on accessible shards
MongoDB	Strong (primary reads)	Yes (read concern)	Only on accessible replicas
Cassandra	Tunable (CL)	Yes (per-query)	Yes (AP with CL=ONE)
Riak	Tunable (r/w/dw)	Yes (per-query)	Yes (AP by default)

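Decision rule: If your application needs strong consistency (banking, inventory), use a single-primary system with majority-acknowledged writes, or quorum reads and writes in Cassandra or Riak so that R + W > N. CL=ALL also satisfies this, but a single unreachable replica then fails the request. If you can tolerate stale reads in exchange for lower latency and higher availability, use the AP configurations.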
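
The condition behind tunable consistency is that the replicas contacted on a read overlap with the replicas that acknowledged the write. The sketch below checks that overlap for Cassandra-style levels; the `Level` type and function names are illustrative, and only three of Cassandra's consistency levels are modeled. Overlap means a read reaches at least one replica holding the latest acknowledged write; it does not by itself provide transactions or linearizability.

```typescript
type Level = "ONE" | "QUORUM" | "ALL";

// Number of replicas that must respond for a given level and replication factor.
function replicasRequired(level: Level, replicationFactor: number): number {
  switch (level) {
    case "ONE":
      return 1;
    case "QUORUM":
      return Math.floor(replicationFactor / 2) + 1;
    case "ALL":
      return replicationFactor;
  }
}

// True when every read set must intersect every write set (R + W > N).
function readOverlapsWrite(read: Level, write: Level, replicationFactor: number): boolean {
  const r = replicasRequired(read, replicationFactor);
  const w = replicasRequired(write, replicationFactor);
  return r + w > replicationFactor;
}
```

For a replication factor of 3, QUORUM reads with QUORUM writes overlap (2 + 2 > 3) while still tolerating one replica being down; ONE with ONE does not overlap, which is the stale-read trade-off the decision rule describes.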

Dimension 3: Query Model

System	Query Interface	Secondary Indexes	Aggregation
Redis Cluster	Key-Value commands	No (use RediSearch)	Lua scripts
MongoDB	Rich query language	Yes	Aggregation pipeline
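Cassandra	CQL (SQL-like)	Yes (limited)	Limited built-in aggregates (COUNT, SUM, AVG); filtering on non-key columns requires ALLOW FILTERING
Riak	Key-Value only	2i (backend-dependent) or Yokozuna (Solr-based search add-on)	MapReduce (Riak Pipe)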

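Decision rule: If you need complex queries, MongoDB wins. If you need predictable key-based access, any system works. Redis and Riak are key-value stores at their core; don't use them for ad-hoc query workloads unless you are prepared to run a search add-on (RediSearch, Yokozuna) alongside them.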

Dimension 4: Operational Model

System	Point of Failure	Scaling	Repair
Redis Cluster	Primary per shard	Manual resharding	No repair process (replicas resync from the primary; replication is asynchronous)
MongoDB	Primary per replica set	Shard key planning	Oplog window monitoring
Cassandra	None (peer-to-peer)	Linear (add nodes)	Repair needed regularly
Riak	None (peer-to-peer)	Linear (add nodes; vnodes are redistributed)	Anti-entropy (AAE trees)

Decision rule: If you have a small ops team, MongoDB’s operational tooling and cloud support are best. If you need to scale to 100+ nodes, Cassandra’s peer-to-peer design is most proven.

Key Takeaways

Redis Cluster is best for caching/sessions where low latency and per-key ordering matter, data fits in memory, and losing the last few writes on failover is acceptable.
Cassandra is best for high-throughput writes with predictable key access.
MongoDB is best for rich query workloads with moderate write throughput.
Riak is best for maximum availability with automatic conflict resolution via CRDTs.
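
The takeaways above can be read as a rough decision procedure. The sketch below encodes them as a function; the `Workload` fields and the order in which the checks run are assumptions made for illustration, not a rule stated in this module, and real decisions weigh more factors than four booleans.

```typescript
type System = "Redis Cluster" | "Cassandra" | "MongoDB" | "Riak";

// Hypothetical workload description; each field maps to one takeaway.
interface Workload {
  needsRichQueries: boolean;          // ad-hoc filters, aggregation
  needsAutomaticConflictMerge: boolean; // maximum availability, CRDT merges
  writeHeavy: boolean;                // high-throughput writes, key-based reads
  fitsInMemory: boolean;              // cache or session data
}

// Assumed priority: query model first, then availability, then write volume,
// then memory fit. MongoDB is the fallback for general-purpose workloads.
function recommend(workload: Workload): System {
  if (workload.needsRichQueries) return "MongoDB";
  if (workload.needsAutomaticConflictMerge) return "Riak";
  if (workload.writeHeavy) return "Cassandra";
  if (workload.fitsInMemory) return "Redis Cluster";
  return "MongoDB";
}
```

For example, a session store (`fitsInMemory` true, everything else false) maps to Redis Cluster, and an IoT ingest pipeline (`writeHeavy` true) maps to Cassandra.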

Full Source

View or download the complete implementation: comparison.ts

Exercises

1. You’re building a real-time leaderboard for a gaming platform. Which system do you choose? Why?
2. You’re building a global e-commerce product catalog with hundreds of thousands of SKUs. Explain your choice.
3. You’re building a distributed session store for 50 million users. Compare all four systems for this use case.

Solutions

1. Redis Cluster is the clear winner. Leaderboards use sorted sets (ZADD/ZRANGE), and Redis’s sorted set data structure is purpose-built for ranking. MongoDB could work but needs more code. Cassandra and Riak would require building the ranking externally. Redis Cluster handles the memory requirement with sharding.
2. MongoDB is the best choice. Product catalogs need rich queries (search by category, price range, brand). MongoDB’s document model maps naturally to products (varying attributes). Cassandra could work if queries are pre-defined. Redis and Riak require building a separate search index. If the catalog is read-mostly and fits in memory, Redis with RediSearch is also viable.
3. Session store comparison: Redis Cluster wins on speed and simplicity (single-key lookups, TTL expiration). MongoDB has richer tooling but adds latency. Cassandra can handle 50M sessions but has unpredictable read latency (compaction). Riak’s vector clocks are unnecessary for sessions (LWW is fine). All four can work. Redis Cluster is the most common choice for this exact workload.

Module 8: Comparison — Exercises

Exercise 1

Match the scenario to the recommended system:

Scenario	System
A. Real-time chat (1M concurrent users, low latency)	?
B. Product catalog (50K SKUs, faceted search)	?
C. User sessions (50M users, TTL-based expiry)	?
D. IoT sensor data (1M writes/sec, time-series)	?
E. Shopping cart (high consistency, multi-device)	?

Exercise 2

Your team chose Cassandra for a new project. Six months later, reads are getting slower and disk usage is growing faster than expected. What’s likely happening? What do you check?

Exercise 3

Explain the trade-off between Redis Cluster’s “no conflicts” guarantee and its “asynchronous replication” behavior. How can a write be acknowledged and still lost?

Exercise 4

A managed cloud service (MongoDB Atlas, Amazon MemoryDB, Amazon Keyspaces) eliminates operational burden. Does this change the decision tree? When would you still run a system yourself?

Solutions

A → Redis Cluster (pub/sub + sorted sets for presence, in-memory speed). B → MongoDB (rich query, faceted search, varying product attributes). C → Redis Cluster (TTL key expiration, sub-ms lookup). D → Cassandra (write throughput, time-series data model with clustering). E → Riak with CRDTs OR Redis with careful locking — Redis has no conflict resolution; Riak’s CRDT merge preserves all operations.
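Tombstones are the most likely cause. Deletes and expired TTLs in Cassandra write tombstones, which occupy disk space and slow down reads until compaction purges them, and compaction can only purge a tombstone once gc_grace_seconds has elapsed. Check: nodetool tablestats (cfstats is the older alias for the same command; look at SSTable count and tombstones per slice), nodetool compactionstats (compaction backlog), and nodetool tablehistograms for read latency and SSTables touched per read. Fix: raise compaction throughput if compaction is falling behind, review the compaction strategy for the table, and lower gc_grace_seconds only if repair reliably completes within that window. Repair does not remove tombstones; it has to run at least once per gc_grace_seconds so that deleted data does not reappear.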
Redis Cluster guarantees no conflicts within a shard because all operations are serialized through one primary. But the primary replicates asynchronously to its replica. If the primary accepts a write, acknowledges it to the client, and fails before the replica receives it, the write is lost when the replica is promoted. The “no conflict” guarantee is about concurrent access (two clients writing different values), not durability. Mitigation: WAIT blocks until a given number of replicas have acknowledged the write, which narrows the loss window but, as the Redis documentation notes, does not make Redis strongly consistent. If acknowledged writes must never be lost, use a different system.
Managed services change the decision tree significantly: MongoDB Atlas removes ops burden and makes MongoDB viable for more use cases. Amazon MemoryDB (Redis-compatible with durable replication) solves the async-replication-loss problem. Amazon Keyspaces (Cassandra-compatible) removes repair burden. Run your own only when: (a) data sovereignty requires on-premises deployment; (b) your throughput requirements exceed managed service limits; (c) cost matters at scale (at sufficient scale, self-hosting can be cheaper); or (d) you need features the managed service doesn’t support.
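
The failure sequence in the Exercise 3 solution (acknowledge first, replicate later) can be modeled in a few lines. This is a toy in-memory simulation, not Redis code: the `Primary` and `Replica` classes and the explicit `flushTo` step are hypothetical stand-ins for a background replication stream.

```typescript
class Replica {
  readonly data = new Map<string, string>();
}

class Primary {
  readonly data = new Map<string, string>();
  private backlog: Array<[string, string]> = [];

  // Apply the write locally, queue it for replication, and acknowledge
  // immediately without waiting for the replica.
  set(key: string, value: string): "OK" {
    this.data.set(key, value);
    this.backlog.push([key, value]);
    return "OK";
  }

  // Ship queued writes to the replica. In a real system this happens
  // continuously in the background; here it is an explicit step.
  flushTo(replica: Replica): void {
    for (const [key, value] of this.backlog) {
      replica.data.set(key, value);
    }
    this.backlog = [];
  }
}

// Scenario: the first write is replicated, the second is acknowledged but the
// primary "crashes" before replicating it. The promoted replica is all that remains.
function simulateFailover(): { acks: string[]; survivingKeys: string[] } {
  const primary = new Primary();
  const replica = new Replica();

  const acks: string[] = [];
  acks.push(primary.set("session:1", "alice"));
  primary.flushTo(replica);
  acks.push(primary.set("session:2", "bob")); // acknowledged, never replicated

  // Primary fails here; the replica is promoted with whatever it received.
  return { acks, survivingKeys: Array.from(replica.data.keys()) };
}
```

Both writes return "OK" to the client, yet only `session:1` exists after promotion. No conflict occurred at any point, which is why serialization through a single primary and durability are separate guarantees.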