Consistency Trade offs · CRDTs

Key Question

How can replicas converge to the same state without any coordination or consensus?

Deep Dive

CRDT stands for Conflict-free Replicated Data Type. It’s a data structure that multiple replicas can update independently, then merge later, and always end up in the same state — no locking, no leader election, no two-phase commit required.

The Magic: Commutative Operations

The fundamental insight: if all operations on a data type commute (A then B gives the same result as B then A), replicas can apply operations in any order and converge.

Replica A:    start 0  ──[+1]──►  1  ──[+1]──►  2
Replica B:    start 0  ──[+1]──►  1  ──[+1]──►  2
     │                              │
     └────────── MERGE ─────────────┘
                    │
                    ▼
            Final: both see 2

Now what if order differs?

Replica A:    0 ──[+1]──► 1 ──[+2]──► 3
Replica B:    0 ──[+2]──► 2 ──[+1]──► 3
     │                         │
     └───── MERGE ─────────────┘
                   │
                   ▼
           Both see 3 ✓

Addition commutes — order doesn’t matter. CRDTs generalize this property.

Two Flavors of CRDT

Type	Name	How it works
State-based	CvRDT (Convergent)	Replicas exchange full state. Merge function must be commutative, associative, and idempotent.
Operation-based	CmRDT (Commutative)	Replicas exchange operations. Operations must commute. Requires reliable broadcast.

State-based is simpler to implement (just exchange state and merge) but can be bandwidth-heavy (sending full state vs just the delta).

Operation-based is more efficient (send only the op) but requires reliable exactly-once delivery — if an op is lost or duplicated, convergence breaks.

CRDT vs LWW (Last Writer Wins)

A naive approach to convergence is Last Writer Wins (LWW) — each value carries a timestamp, and the highest timestamp wins. Simple, but LWW loses data:

Replica A:  name = "Alice" (t=100)
Replica B:  name = "Bob"   (t=101)   ──► merge → "Bob"

Concurrent writes:
Replica A:  name = "Alex" (t=102)
Replica B:  name = "Bob"  (t=102)   ──► tie-breaker → one is lost

A CRDT counter doesn’t lose increments. If two replicas both increment, both increments are preserved in the merged result.

Check Your Understanding

What does it mean for a merge function to be “commutative, associative, and idempotent”?
What happens if an operation is delivered twice in an operation-based CRDT (CmRDT)?
Why is LWW considered “lossy” compared to CRDT counters?

The “So What?”

CRDTs are the foundation of offline-first applications. Google Docs, Figma, and Notion all use CRDT-like structures to let users edit offline and sync later without conflicts. If you want local-first software that works offline and syncs seamlessly, CRDTs are the enabling technology.

✏️ Exercises

Exercises: CRDTs

Exercise 1: G-Counter Merge

A G-Counter has 3 replicas. Replica A’s state is [5, 0, 0]. Replica B’s state is [0, 3, 2].

What is the state of each replica after they merge (A with B, B with A)?
What is the final counter value?
Now replica C (state [1, 1, 1]) joins the merge. What is the final result?

Exercise 2: G-Counter Limitations

Explain in one sentence why a G-Counter cannot represent decrements.
Could you build a G-Counter where decrement is allowed by subtracting from the local slot? What goes wrong?
What mathematical property of the merge function would break if you allowed arbitrary subtraction?

Exercise 3: OR-Set Concurrent Operations

Three replicas (A, B, C) start with an empty OR-Set.

Replica A: adds “apple” (tag = a1), adds “banana” (tag = a2)
Replica B: adds “apple” (tag = b1), removes “banana” (removes a2 — B observed A’s add)
Replica C: removes “apple” (removes a1, b1 — C observed both)

After all replicas merge:

Is “apple” in the set? Why or why not?
Is “banana” in the set? Why or why not?
How many unique tags are in the final set?

Exercise 4: CRDT vs Consensus

An e-commerce platform uses an inventory CRDT (PN-Counter) to track stock across regions. The current count for item X is 5.

Two regions both sell one unit simultaneously (decrement by 1 each). What does the counter show after merge?
The actual physical inventory is 5 units. After the decrements above, you sell 3 more units in region A and 2 more in region B. A customer checks the count and sees 3. They buy 3 — but the warehouse only has 2 left. How could this over-selling happen?
Would Raft prevent this problem? What trade-off does it introduce?

👁️ View Solutions

Solutions: CRDTs

Exercise 1: G-Counter Merge

After A merges with B:
- A receives [0, 3, 2] from B → elementwise max of [5, 0, 0] and [0, 3, 2] = [5, 3, 2]
- B receives [5, 0, 0] from A → elementwise max of [0, 3, 2] and [5, 0, 0] = [5, 3, 2]
- Both converge to [5, 3, 2]
Final counter value: 5 + 3 + 2 = 10
C joins: max of [5, 3, 2] and [1, 1, 1]:
- max(5, 1) = 5
- max(3, 1) = 3
- max(2, 1) = 2
- Result: [5, 3, 2] — C’s values were all lower, so they don’t change the outcome.

Exercise 2: G-Counter Limitations

A G-Counter’s merge function is elementwise max — taking the maximum of two values can never produce a smaller number, so you can never reduce any slot.
If you subtract from your local slot, the elementwise max during merge would take the larger previous value from another replica. Your decrement would be silently lost. For example: A has [5, 0, 0], A decrements to [4, 0, 0], then merges with B’s [5, 0, 0] (which hadn’t seen the decrement). Result: [5, 0, 0] — the decrement disappeared.
Elementwise max is not invertible. If you allow subtraction, the merge function would need to decrease a value, which max cannot do. The lattice property (join = least upper bound) only works with monotonic increases.

Exercise 3: OR-Set Concurrent Operations

Is “apple” in the set? Yes. C removed tags a1 and b1. But A had already added “apple” with a1, and B had added with b1. Both are removed by C’s operation. However… wait, let’s trace carefully:
- A adds “apple” with tag a1. A adds “banana” with tag a2.
- B adds “apple” with tag b1. B removes “banana” — removes tag a2.
- C removes “apple” — removes tags a1 and b1 (that C observed).
After merge: A has {(“apple”, a1), (“banana”, a2)} ∪ {(“apple”, b1)} — but a1 and b1 were removed by C. However, the merge is a union of internal sets. C’s state has { } (it removed everything it knew about). A has {(“apple”, a1), (“banana”, a2), (“apple”, b1)} only if B’s adds propagated. But B also sent its adds.

Actually, OR-Set merge is set union of (element, tag) pairs. C’s remove of tags a1 and b1 means those tags are excluded from C’s internal set. But when A merges with C, A still has a1 and b1 (A received b1 from B). So the union brings them back. Wait — that breaks the semantics.

Correction: In an OR-Set, when you remove, you add the removed tags to a “tombstone” set (or equivalently, the internal set is {(elem, tag)} and remove means “remove all observed (elem, tag) pairs”). But the merge is union — if another replica still has those tags, they come back.

This is why real OR-Sets use “observed-remove” with version vectors or tombstones. The answer depends on implementation. For the standard OR-Set:

After all merges (union of all (e,t) pairs minus removed pairs):
- Banana’s a2 tag is removed by B, and only A had it. Gone.
- Apple’s a1 is removed by C, but B has b1 (different tag) — wait, a1 is explicitly removed, but a1 was from A. B had b1, not a1.
Let me retrace properly: after full sync:
- All tags: a1 (“apple”), a2 (“banana”), b1 (“apple”)
- Removed tags: a2 (by B), a1 and b1 (by C)
- Surviving tags: none — all three tags are in the remove set.
Final set: empty {}.

This is actually a problem with basic OR-Sets — concurrent adds and removes can delete everything. This is why production OR-Sets often use version vectors or “add wins” semantics more carefully.
Is “banana” in the set? No — tag a2 was removed by B, and no other replica added “banana”. a2 is in the tombstone set.
Unique tags surviving: 0 (everything was removed).

Key insight: This exercise shows a weakness of basic OR-Sets — if a remove operation observes all tags and removes them, the element disappears. Production OR-Sets add version vectors to track causality so that concurrent adds survive.

Exercise 4: CRDT vs Consensus

After both regions decrement: Both decrements are recorded. Region A’s N-counter has [1, 0], Region B’s has [0, 1]. After merge: P = [0, 0], N = [1, 1]. Value = 0 - (1 + 1) = -2. Wait — the initial value was 5, so P should track the initial value. Let me reconsider.

Initial state (both regions): P = [5, 5], N = [0, 0] (each region’s P-counter has “5” for the initial stock). If each sells 1: region A decrements → N_A = [1, 0], region B → N_B = [0, 1].

After merge: P = [5, 5], N = [1, 1]. Value = (5+5) - (1+1) = 10 - 2 = 8. That’s wrong — we only had 5 units initially. This reveals a problem: each region independently initialized P to 5. The initialization is not coordinated.

The real answer: A PN-Counter for inventory must be initialized on one replica and propagated. If both regions independently set P to 5, the merge results in P = max(5,5) = 5 per slot, sum(P) = 10 — double counting. This is why initial state needs coordination.

But ignoring the initialization issue: after both decrements, the counter shows 3 (5 - 1 - 1 = 3).
Over-selling: The counter shows 3 (a stale or concurrent read), but in reality, by the time that customer checks, 5 more units have been sold across both regions (3 + 2). The counter eventually converges to show -2, but during the sync delay, the customer saw 3 and bought 3. The system sold 10 units of a 5-unit inventory.
Would Raft prevent this? Yes — Raft provides strong consistency. All writes go through a leader, so inventory is always accurate. Trade-off: Writes are slower (require majority ack), and if the leader is in a different region, every write incurs cross-region latency. During a network partition, the minority partition cannot sell anything. CRDTs sacrifice accuracy (stale reads) for availability (local writes always succeed).

🎮 Interactive CRDT

Scroll to load interactive visualization…