Distributed & Decentralized Systems Curriculum
Consistency Trade offs · Eventual Consistency

Key Question

Does “eventual consistency” mean I can write data and then immediately fail to read it?

Deep Dive

Short answer: no, not if the system implements session guarantees. Eventual consistency in its pure form means a write might not be visible to anyone for an unbounded time. But that’s terrible for user experience. If you post a comment and then refresh the page to see it missing — that’s a UX disaster.

Enter session guarantees — a family of per-client consistency promises that make eventual consistency feel consistent to each individual user, even though the system as a whole is eventually consistent.

There are four session guarantees, defined by Terry et al. (1994):

1. Read-Your-Writes (RYW)

After a client completes a write, any subsequent read by that same client will see the write’s effect. Implementation options:

  • Sticky sessions: Route the client to the same replica for all operations. The replica has the latest value for this client.
  • Client-supplied timestamp: The client sends its last write timestamp with each read. The server blocks the read until it has caught up to that timestamp.
  • Session token: The server returns a token on each write. The client includes it on reads, and the server uses it to verify it’s caught up.
Without RYW:
  Client: [POST comment "hello"] → [GET comments] → returns []

With RYW:
  Client: [POST comment "hello"] → [GET comments (with token)] → returns ["hello"]

2. Monotonic Reads

After a client reads a version of the data, subsequent reads will never return an older version. This prevents the “time travel” effect:

Bad (no monotonic reads):
  Read 1: comments = ["hello"]     ← from replica R1 (up-to-date)
  Read 2: comments = []            ← from replica R2 (behind)
  User sees: "my comment vanished!"

Good (monotonic reads):
  Read 1: comments = ["hello"]     ← from R1
  Read 2: comments = ["hello"]     ← from R1 (or R2, if caught up)
  User sees: "still there, good"

Implementation: the client passes the highest version it has seen; replicas must serve at least that version.

3. Monotonic Writes

A client’s writes are applied in the order they were issued. This prevents the “lost update” problem:

Bad (no monotonic writes):
  Client intends: write X=1, then write X=2
  Replica R1 receives: X=2 (then X=1)  ← reordered!
  Final state: X=1 (wrong)

Good (monotonic writes):
  Replica R1 receives: X=1, then X=2
  Final state: X=2 (correct)

Implementation: sequence numbers per client. Replicas apply writes in sequence number order.

4. Writes-Follow-Reads

If a client reads value V from an object, any subsequent write to that object must occur at a version that is at least as recent as V. This prevents writing “based on” stale data:

Client reads X=5 from replica R1
Client writes X=10  (intended: X = 5 + 5)
Without writes-follow-reads:
  The write might be applied at a replica that has X=3
  Final state: X=10, but the "+5" was based on X=5, yet the
  "base" at the target replica is X=3, so the update is nonsensical.

Implementation: the client passes the version it read; the write is only accepted if the target replica’s version is at least that version.

The user-facing scenario:

Time ──────────────────────────────────────────────>
User posts comment:    [---POST /comment "cool!"---]
                        |
                     Write accepted by R1 (primary for this session)
                        |
User refreshes page:    [---GET /comments---]
                        |
                     Without RYW: R2 handles request, returns []
                     With RYW: Server ensures client sees own write

Session guarantees transform eventual consistency from “painful for users” to “practically usable.” Each user sees a consistent view of their own actions, even though different users might see slightly different states at the same time.

Check Your Understanding

  1. How would you implement read-your-writes without sticky sessions?
  2. A user reads comments, sees 5 comments, refreshes immediately, and sees 3 comments. Which session guarantee is violated?
  3. Can monotonic reads be implemented without client state? Why or why not?

The “So What?”

Read-your-writes is essential for any user-facing application built on an eventually consistent database. Without it, users will “lose” their own data (posts, comments, cart items) immediately after saving them. Every major eventually consistent system (Cassandra, DynamoDB, Riak) supports session guarantees. When configuring your database client, make sure session guarantees are enabled — they’re usually off by default for performance reasons.


✏️ Exercises

Eventual Consistency — Exercises

Exercise 1

Under the BASE model, is it acceptable for a system to return stale data indefinitely (i.e., never converge)? Explain your answer using the definition of “eventual consistency.”

Exercise 2

You’re building a commenting system on an eventually consistent database. You want to ensure a user always sees their own comment immediately after posting, but you cannot use sticky sessions (the load balancer doesn’t support it). How would you implement read-your-writes?

Exercise 3

In a gossip protocol with 100 nodes, where each round involves each node contacting one random peer (push only), how many rounds are needed for 99% of nodes to receive an update? Assume the update starts at a single node.

Exercise 4

You’re building a distributed shopping cart that must never lose items (even under concurrent writes from different devices). Would you choose LWW or vector clock siblings? Explain your reasoning. Under what conditions would LWW be the better choice?

👁️ View Solutions

Eventual Consistency — Solutions

Solution 1

No, it is not acceptable. The “E” in BASE stands for “Eventual consistency,” which has a specific formal meaning: there exists a time T such that for all t ≥ T, all reads return the same value. The word “eventual” means convergence is guaranteed (though the time bound is unspecified). If the system never converges, it violates even the weak guarantees of BASE.

A system that never converges would be “inconsistent” in the worst sense — different clients would permanently see different values. This is not eventual consistency; it’s just inconsistency. Systems like Dynamo and Cassandra guarantee eventual convergence through anti-entropy, gossip, read repair, and conflict resolution mechanisms. They do not guarantee “when” but they guarantee “that.”

The practical answer: your system must eventually converge. If it doesn’t, you have a bug in your anti-entropy or conflict resolution logic.

Solution 2

You can implement read-your-writes without sticky sessions using client-provided version tokens:

  1. Write phase: When the user posts a comment, the server accepts the write and returns a version token (e.g., a timestamp or a write ID).

    POST /comment {"text": "hello"}
      → Response 201: {"comment_id": 42, "version_token": "T1001"}
  2. Read phase: The client includes the version token in subsequent GET requests:

    GET /comments?after_token=T1001
  3. Server side: The coordinating node checks its replica. If the replica’s version is behind T1001, it either:

    • Forwards the read to a replica that has seen the write (coordination).
    • Blocks the read until the replica catches up via anti-entropy or a direct read-from-primary.
    • Returns a “try again” response (least desirable — but the client can retry).

Alternative approach: Client-side compensation. The client sends the POST, gets the response, and optimistically adds the comment to the local UI state. Even if the subsequent GET doesn’t return it, the UI shows it. This is a UX trick, not true RYW, but it works surprisingly well for commenting systems.

Solution 3

With push-only gossip, the infection spreads roughly exponentially. The number of infected nodes after k rounds starting from 1 infected node follows:

  • Round 0: 1 node infected
  • Round 1: 2 nodes (1 contacts 1)
  • Round k: 2^k nodes infected (before hitting N)

For 99% of 100 nodes (99 nodes infected):

We need: 2^k ≥ 99 k ≥ log₂(99) ≈ 6.62

So 7 rounds are needed.

This is the theoretical ideal (random peer selection with no collisions). In practice, collisions happen (two nodes contact the same peer), so it might take 8-10 rounds. The formula works well when N is large relative to the number of infected nodes.

For the last 1% (node that hasn’t been infected yet): the problem is that late-stage propagation slows down because most peers already have the data, so the chance of contacting an uninfected node decreases. This is the “diminishing returns” of gossip — the first 50% spreads fast, the last 10% takes proportionally longer.

Solution 4

For a shopping cart that must never lose items: choose vector clock siblings.

Here’s why:

  • With LWW, concurrent writes from different devices can cause item loss. If the phone adds “milk” and the web browser adds “eggs” at nearly the same time, only one survives.
  • Vector clocks preserve both writes as siblings, and the merge logic produces the union: {milk, eggs}.
  • No items are lost.

When LWW would be better:

  • When the data is not additive (e.g., updating a profile’s “status message” — you want the latest one, not both).
  • When application-level merge logic is too complex or error-prone.
  • When storage is a concern (siblings can accumulate).
  • When you’re willing to trade data integrity for simplicity.

Most production shopping cart systems actually use a hybrid: they use vector clocks for cart items (additive) but LWW for cart-level metadata like “last modified date.”