Distributed & Decentralized Systems Curriculum
Consistency Trade offs · Eventual Consistency

Key Question

What guarantees do you get when you choose availability over consistency?

Deep Dive

If linearizability is the gold standard for correctness, BASE is the survival strategy for large-scale systems. BASE stands for Basically Available, Soft-state, Eventually consistent. It was coined by Dan Pritchett at eBay in 2008 as a counterpoint to ACID (Atomicity, Consistency, Isolation, Durability) and captures the philosophy behind most NoSQL databases.

B — Basically Available

The system guarantees that it will respond to every request. Not “respond correctly” — just “respond.” A write is never rejected outright (though it might be queued or accepted without full replication). This means the system never tells the client “I’m not ready, try again later.” In practice, this means:

  • The system is partition-tolerant by design.
  • If a node fails, other nodes still accept requests.
  • The write path never blocks on coordination with all replicas.

“Dynamo: it’s always writeable” — that’s the mantra.

S — Soft State

State changes even without client input. Replicas exchange data in the background, so the value on disk at a given node can change due to gossip, anti-entropy, or read repair. This is different from a traditional database where state only changes when a transaction commits.

Consider a DynamoDB table with 3 replicas. Client A writes “cart = {item1}” to replica R1. Without any further client input, the state of R2 and R3 will change as gossip propagates the write:

Replica R1: {item1}        ← client wrote here
Replica R2: {} → {item1}   ← soft state changed via gossip
Replica R3: {} → {item1}   ← soft state changed via gossip

The state “evolves” without clients doing anything.

E — Eventual Consistency

If writes stop coming, all replicas will converge to the same value. The “eventual” window is unbounded but finite. There is NO guarantee about when convergence happens — only that it will happen eventually.

The formal statement: there exists a time T such that for all t ≥ T, all reads to a quorum of replicas return the same value. The key word is “exists” — we don’t know when T is, but it exists.

Dynamo “Add to Cart” Convergence

Amazon’s Dynamo paper describes a shopping cart scenario:

Time ────────────────────────────────────────────────>
Client A adds milk to cart (writes to R1)
Client B adds eggs to cart (writes to R2, partition exists)
                                        [partition heals]
                                        Gossip propagates:
                                        R1 learns about eggs
                                        R2 learns about milk
                                        [convergence: {milk, eggs}]

During the partition, both writes succeed because Dynamo is available. After healing, gossip converges the cart to contain both items. From the user’s perspective, nothing was lost.

Contrast with ACID

ACIDBASE
Strong consistency at commitWeak consistency — will converge
Isolation guaranteedNo isolation guarantee
State persists exactly as committedState is “soft” — can change
Writes may fail (rollback)Writes always accepted (basically available)
Focus: correctnessFocus: availability and partition tolerance

BASE is not “ACID’s opposite” — it’s a different set of trade-offs for a different environment. ACID shines in centralized, trusted networks. BASE shines in distributed, unreliable networks.

Check Your Understanding

  1. What does “soft state” mean in BASE? Give an example.
  2. Under BASE, if writes continue forever, will the system ever converge? Explain.
  3. Why does BASE make the system “basically available” rather than simply “available”?

The “So What?”

BASE is the philosophy behind most of the NoSQL databases you’ll encounter in production: Cassandra, DynamoDB, Riak, CouchDB, Voldemort. Understanding BASE helps you know what to expect: your writes always succeed, but reads might return stale data, and convergence happens eventually. If your application can tolerate temporary inconsistency (social feeds, shopping carts, analytics), BASE is the right choice. If you’re building an ATM network or an airline booking system, think twice.


✏️ Exercises

Eventual Consistency — Exercises

Exercise 1

Under the BASE model, is it acceptable for a system to return stale data indefinitely (i.e., never converge)? Explain your answer using the definition of “eventual consistency.”

Exercise 2

You’re building a commenting system on an eventually consistent database. You want to ensure a user always sees their own comment immediately after posting, but you cannot use sticky sessions (the load balancer doesn’t support it). How would you implement read-your-writes?

Exercise 3

In a gossip protocol with 100 nodes, where each round involves each node contacting one random peer (push only), how many rounds are needed for 99% of nodes to receive an update? Assume the update starts at a single node.

Exercise 4

You’re building a distributed shopping cart that must never lose items (even under concurrent writes from different devices). Would you choose LWW or vector clock siblings? Explain your reasoning. Under what conditions would LWW be the better choice?

👁️ View Solutions

Eventual Consistency — Solutions

Solution 1

No, it is not acceptable. The “E” in BASE stands for “Eventual consistency,” which has a specific formal meaning: there exists a time T such that for all t ≥ T, all reads return the same value. The word “eventual” means convergence is guaranteed (though the time bound is unspecified). If the system never converges, it violates even the weak guarantees of BASE.

A system that never converges would be “inconsistent” in the worst sense — different clients would permanently see different values. This is not eventual consistency; it’s just inconsistency. Systems like Dynamo and Cassandra guarantee eventual convergence through anti-entropy, gossip, read repair, and conflict resolution mechanisms. They do not guarantee “when” but they guarantee “that.”

The practical answer: your system must eventually converge. If it doesn’t, you have a bug in your anti-entropy or conflict resolution logic.

Solution 2

You can implement read-your-writes without sticky sessions using client-provided version tokens:

  1. Write phase: When the user posts a comment, the server accepts the write and returns a version token (e.g., a timestamp or a write ID).

    POST /comment {"text": "hello"}
      → Response 201: {"comment_id": 42, "version_token": "T1001"}
  2. Read phase: The client includes the version token in subsequent GET requests:

    GET /comments?after_token=T1001
  3. Server side: The coordinating node checks its replica. If the replica’s version is behind T1001, it either:

    • Forwards the read to a replica that has seen the write (coordination).
    • Blocks the read until the replica catches up via anti-entropy or a direct read-from-primary.
    • Returns a “try again” response (least desirable — but the client can retry).

Alternative approach: Client-side compensation. The client sends the POST, gets the response, and optimistically adds the comment to the local UI state. Even if the subsequent GET doesn’t return it, the UI shows it. This is a UX trick, not true RYW, but it works surprisingly well for commenting systems.

Solution 3

With push-only gossip, the infection spreads roughly exponentially. The number of infected nodes after k rounds starting from 1 infected node follows:

  • Round 0: 1 node infected
  • Round 1: 2 nodes (1 contacts 1)
  • Round k: 2^k nodes infected (before hitting N)

For 99% of 100 nodes (99 nodes infected):

We need: 2^k ≥ 99 k ≥ log₂(99) ≈ 6.62

So 7 rounds are needed.

This is the theoretical ideal (random peer selection with no collisions). In practice, collisions happen (two nodes contact the same peer), so it might take 8-10 rounds. The formula works well when N is large relative to the number of infected nodes.

For the last 1% (node that hasn’t been infected yet): the problem is that late-stage propagation slows down because most peers already have the data, so the chance of contacting an uninfected node decreases. This is the “diminishing returns” of gossip — the first 50% spreads fast, the last 10% takes proportionally longer.

Solution 4

For a shopping cart that must never lose items: choose vector clock siblings.

Here’s why:

  • With LWW, concurrent writes from different devices can cause item loss. If the phone adds “milk” and the web browser adds “eggs” at nearly the same time, only one survives.
  • Vector clocks preserve both writes as siblings, and the merge logic produces the union: {milk, eggs}.
  • No items are lost.

When LWW would be better:

  • When the data is not additive (e.g., updating a profile’s “status message” — you want the latest one, not both).
  • When application-level merge logic is too complex or error-prone.
  • When storage is a concern (siblings can accumulate).
  • When you’re willing to trade data integrity for simplicity.

Most production shopping cart systems actually use a hybrid: they use vector clocks for cart items (additive) but LWW for cart-level metadata like “last modified date.”