Key Question
When should Cassandra be your default choice for a distributed database?
Deep Dive
Cassandra occupies a specific niche in the distributed database landscape: high-volume writes with survivable consistency.
The Case for Cassandra
Cassandra excels in three specific scenarios:
1. Time-series data at scale
- IoT sensor data: millions of devices write every second.
- Cassandra’s append-heavy write model and TWCS compaction handle this perfectly.
- Each write is a single append to the commit log + memtable. No reads, no seeks.
2. Multi-region deployments
- Cassandra supports active-active replication across data centers.
- Writes in US-EAST go to EU-WEST within seconds.
- Each region accepts writes independently — no cross-region coordination during the write path.
3. High-availability requirements
- No single point of failure. Any node can coordinate any operation.
- Rolling upgrades are standard: take down nodes one at a time, the cluster keeps serving.
When NOT to Use Cassandra
- Transactions across partitions: Cassandra does not support ACID transactions across rows.
- Strongly consistent reads at high throughput: CL=ALL reads are slow. If you need consistent reads for 95% of queries, Cassandra is the wrong choice.
- Ad-hoc queries: Cassandra’s query model requires careful index design. You cannot easily add a new query pattern without creating a new table or index.
- Small datasets (<100 GB per node): Cassandra’s overhead (compaction, gossip, commit log) is significant. For small datasets, PostgreSQL with replication is simpler and faster.
The Operational Truth
Cassandra is not a set-it-and-forget-it database. Running it well requires:
- Monitoring GC pause times (Cassandra is JVM-based)
- Tuning compaction strategies per table
- Managing token distribution (especially when adding nodes)
- Handling “hot spots” where a single partition key absorbs too many writes
The “So What?”
Cassandra’s design philosophy: “availability first, consistency when you need it.” It is the production embodiment of the PA/EL quadrant. If your application can tolerate stale reads in exchange for always-available writes, Cassandra is the most battle-tested option in the ecosystem.
Full Source
View or download the complete implementation: cassandra.ts
Exercises
- A social media analytics platform ingests 100K events/second. Each event is a timestamp + user ID + metric. Would you recommend Cassandra? Why?
- Your startup needs a database for a multiplayer game leaderboard. Cassandra or Redis? Justify.
- What monitoring metrics would you track to detect Cassandra compaction issues before they cause an outage?
👁️ View Solutions
- Yes — this is the ideal Cassandra use case. Time-series data with high write volume. Use TWCS compaction with a 1-day window. Set RF=3 and CL=QUORUM for writes. Reads can be CL=ONE since analytics queries tolerate staleness.
- Cassandra is not the best choice for leaderboards. Redis’s sorted sets (
ZADD/ZRANGE) provide sub-millisecond leaderboard operations. Cassandra would require a custom implementation using counters or a materialized view, with higher latency. Use Redis for the leaderboard and Cassandra for event persistence. - Key metrics: (a) Pending compaction tasks — if this grows, nodes fall behind. (b) Disk space before/after compaction — if SSTables aren’t shrinking, compaction is stuck. (c) Read latency p99 — stale compaction causes reads to scan more SSTables. (d) GC pause time — compaction triggers GC, GC triggers node timeouts.
✏️ Exercises
Apache Cassandra — Exercises
Exercise 1
You have a 6-node Cassandra cluster with RF=3. Node A is the coordinator for a write to key K with CL=QUORUM. The replicas are Nodes B, C, D. Node C is down. How many nodes must acknowledge the write for it to succeed? Which nodes?
Exercise 2
Describe the sequence of events when a Cassandra read at CL=QUORUM discovers that one of the three replicas has a stale value. What happens to the stale replica? What happens to the client?
Exercise 3
Cassandra’s hinted handoff stores hints on the coordinator. If the coordinator fails before delivering the hints, what happens? How does this differ from Riak’s approach?
Exercise 4
A developer configures R+W ≤ RF (e.g., W=ONE, R=ONE with RF=3). They expect “eventual consistency.” Describe a scenario where a read returns a value that was explicitly overwritten 10 seconds ago.
👁️ View Solutions
-
QUORUM =
floor(3/2)+1 = 2nodes. The write succeeds if 2 of the 3 replicas (B, C, D) acknowledge. Since C is down, the coordinator sends to B and D. If both are up:ok: true. The coordinator also stores a hint for C, to be delivered when C recovers. -
(a) The coordinator sends read requests to all RF replicas. (b) Two replicas return
value=v2 @ ts=100. One returnsvalue=v1 @ ts=50. (c) The coordinator seests=100 > ts=50and returnsv2to the client. (d) In the background, the coordinator sends a write to the stale replica:SET key=v2 @ ts=100. (e) Subsequent reads from any replica returnv2. The client sees the latest value immediately. The repair is asynchronous from the client’s perspective. -
The hints are lost. Cassandra’s hinted handoff is stored locally. If the coordinator crashes before delivering them, the data is gone. Riak’s approach is similar — both rely on the coordinator surviving. Neither provides durable hinted handoff. This is why read repair is essential: it catches the inconsistency on the next read.
-
Scenario: A user updates their profile (
SET name="Alice"→ written to Node A only with CL=ONE). Before replication propagates, another user reads from Node B (which still hasname="Alice (old)"). The read succeeds at CL=ONE (only Node B needs to respond) and returns the stale value. Even though the write was 10 seconds ago, gossip may not have propagated yet. This is “eventual” — there is no time guarantee.