Distributed & Decentralized Systems Curriculum
Reflection Real Systems · Riak KV

Key Question

Riak has a small but dedicated production footprint. What lessons emerge from teams that ran it in production, and what killed its mainstream adoption?

Deep Dive

Pitfall 1: The “No Siblings” Surprise

Teams often start with LWW mode in Riak, expecting it to behave like a simple key-value store. The inevitable conflict that slips through to production causes data corruption that’s hard to detect:

// User profile update, LWW mode:
// Client A writes: { "email": "alice@new.com" }   @ T=100
// Client B writes: { "email": "alice@old.com" }   @ T=99
// LWW picks T=100 → "alice@new.com" (correct)

// But with concurrent updates:
// Client A writes: { "name": "Alice" }   @ T=100
// Client B writes: { "email": "bob@test" } @ T=100
// LWW picks whichever timestamp is higher → partial overwrite!
// { "name": "Alice" } OR { "email": "bob@test" }
// NOT: { "name": "Alice", "email": "bob@test" }

Lesson: LWW at the object level loses atomicity. The database doesn’t understand your schema — it doesn’t know that name and email are separate fields. This is why Riak 2.0 added CRDTs (map CRDT avoids this problem).

Pitfall 2: MapReduce at Scale

Riak’s built-in MapReduce (via Riak Pipe) is powerful but dangerous. It streams data from multiple vnodes to a reduce phase on the coordinator. If the result set is large, the coordinator runs out of memory:

// Riak Pipe architecture:
// Map phase: each vnode processes its local data (parallel)
// Reduce phase: coordinator aggregates all results (single node)
// Problem: coordinator memory = O(result count)
// Solution: pipelined reduce (chain reduce across nodes)

Lesson: Large MapReduce jobs in Riak require careful result set sizing. For production analytics, most teams exported data to a dedicated analytics store.

Riak’s Production Footprint

Riak was most successful in:

  1. Telecommunications (Basho’s original market) — high availability requirements, tolerant of eventually consistent data.
  2. Gaming — session storage where vector clock conflicts are rare and LWW works well.
  3. Content management — storing blobs with metadata, where key-based access is natural.

It was least successful in:

  1. Financial systems — too AP, not enough consistency guarantees.
  2. Analytics — no query language made ad-hoc analysis impossible.
  3. High-scale web apps (Facebook, Twitter scale) — Cassandra’s CQL and linear scaling won.

What Killed Riak?

Basho Technologies (the company behind Riak) shut down in 2017. The causes:

  1. Market mismatch: Developers wanted SQL-like query (Cassandra CQL, MongoDB query language), not pure KV with application-level merge logic.
  2. Erlang barrier: Hard to hire Erlang developers. The community was smaller than Java (Cassandra) or C++ (MongoDB).
  3. Basho’s licensing: Open-core model (Riak EE for enterprise features). Amazon, Google, Microsoft built managed services for Cassandra and MongoDB. Riak had no managed cloud service.
  4. Cassandra’s momentum: By 2015, Cassandra had won the Dynamo wars. Netflix, Apple, and Twitter were all-in on Cassandra.

Riak’s Legacy

Riak is still in production at many companies (via the open-source fork, or Basho’s successor company). Its design lives on in:

  • AntidoteDB — CRDT-based geo-replicated database, inspired by Riak.
  • Cassandra’s improved repair — anti-entropy and hinted handoff refinements.
  • Distributed systems education — Riak remains the best teaching tool for Dynamo concepts.

Key Takeaways

  • LWW is dangerous without understanding object-level atomicity.
  • MapReduce at scale requires careful resource management.
  • Riak failed as a business because market fit (query language, hiring, cloud managed services) mattered more than technical purity.
  • Riak’s Dynamo implementation remains the gold standard for teaching distributed systems fundamentals.

Full Source

View or download the complete implementation: riak.ts

Exercises

  1. You’re designing a user profile service. How do you avoid the LWW partial-overwrite problem? What data model would you use?
  2. Why did Basho open-source Riak but charge for Riak EE? How did this compare to MongoDB’s and Cassandra’s approaches?
  3. The “Erlang niche” is often cited as Riak’s weakness. But Erlang’s BEAM VM is arguably better for distributed systems than Java’s JVM. Explain both sides.

👁️ View Solutions

  1. Three approaches: (a) Use CRDT maps — the map CRDT merges independently updated fields without conflict; name and email updates would converge to {name: "Alice", email: "bob@test"}. (b) Use LWW with the whole document — each write sends the FULL document, not partial updates. This works but wastes bandwidth and creates race conditions. (c) Use sibling resolution — accept that Riak will return siblings and merge them at the application layer. This is the most correct but most complex approach.
  2. Basho’s open-core model (free community edition, paid enterprise) was common in the 2010s. MongoDB used the same model (community server free, enterprise + Atlas paid). Cassandra was fully open-source (Apache license) — DataStax made money on support and managed services. MongoDB’s approach (free community + paid Atlas) won because Atlas made MongoDB easy. Riak had no equivalent of Atlas.
  3. Pro-Erlang: BEAM gives you actor-model concurrency, fault isolation (supervisor trees), hot code swapping, and soft real-time GC. These are ideal for a distributed database. Pro-JVM: Java has a massive ecosystem, thousands of libraries, easy hiring, and excellent tooling (JMX, profilers). BEAM is niche. The practical reality is that hiring and ecosystem matter more than VM elegance for a database’s commercial success.

✏️ Exercises

Riak KV — Exercises

Exercise 1

A Riak bucket is configured with n_val=3 and r=2, w=2. Three physical nodes host the data. One node is down. Can you still serve reads? Can you still serve writes? Why?

Exercise 2

You have microservices A, B, and C, all writing to the same key cart:user42. A writes at T=1, B writes concurrently at T=2 (without seeing A’s write), and C writes at T=3 (without seeing either). How many siblings does a subsequent read return? What does each sibling’s vector clock look like?

Exercise 3

How does hinted handoff interact with the w consistency level in Riak? If w=2 and only 1 node is available, does the write succeed?

Exercise 4

Explain why LWW mode in Riak can silently drop data. Give a concrete example involving a counter (not a user profile).


👁️ View Solutions

  1. Yes to both. n_val=3 means the prefList has 3 vnodes. With one node down (hosting ~1/3 of the vnodes), 2 vnodes are available. r=2 and w=2 can both be satisfied by 2 available replicas. However, if the down node has 2 of the 3 vnodes in the prefList (unlikely with random distribution but possible), the quorum wouldn’t be met. This is why vnodes are spread across physical nodes — to minimize this scenario.

  2. Three siblings. A’s clock: [A:1]. B’s clock: [B:1]. C’s clock: [C:1]. Each vector clock is a single-entry clock with the writing service’s counter. None are causally related (each has entries the others don’t). The read must return all three siblings. The client must resolve them — likely by merging the cart contents from all three.

  3. In Riak, hinted handoff counts toward the w acknowledgment. If the coordinator writes to 1 available replica and stores 1 hint (to be delivered later), it has satisfied w=2. The write succeeds even though only 1 node physically stores the data. This is “sloppy quorum” — from the Dynamo paper. The durability, however, is weak: the hint is only on the coordinator and could be lost.

  4. Counter example: two region servers both increment a visitor counter:

    • Server US: counter = 1000 @ T=100
    • Server EU: counter = 500 @ T=101
    • LWW picks T=101 → counter = 500
    • The US counter’s 1000 increments are LOST.

    With a CRDT counter (PN Counter), the result would be 1500 (correct — both increments are preserved). This is the canonical case for Riak CRDTs over LWW.