Time Causality · Vector Clocks

Key Question

How do major distributed storage systems like Amazon Dynamo use vector clocks in production?

Deep Dive

Amazon Dynamo, the pioneering distributed key-value store (described in the 2007 Dynamo paper that inspired DynamoDB, Cassandra, Riak, and Voldemort), uses a variant of vector clocks called version vectors. The key insight is that in a key-value store, you do not need a vector per process — you need a vector per key. Each key has its own version vector tracking which replicas have accepted writes for that specific key.

Dynamo’s approach works as follows:

Each key’s value is associated with a version vector: an array of (node_id, counter) pairs.
When a client writes to a key, the coordinating node increments its counter in the vector and returns the new vector with the written value.
When a client reads a key, the system returns the value(s) and the associated version vector(s).
If multiple version vectors are returned (siblings), they are concurrent — the application must resolve the conflict.

Walk through a Dynamo shopping cart scenario:

Version 1:  User adds "laptop" to cart
            Coordinated by Node A
            Key "cart": value = ["laptop"], version = [(A, 1)]

Version 2a: User adds "mouse" to cart (from home computer, coordinated by Node A)
            Version: [(A, 2)], value = ["laptop", "mouse"]

Version 2b: User adds "keyboard" to cart (from phone, coordinated by Node B)
            The phone's client read version [(A, 1)] earlier (before the mouse was added)
            It writes: value = ["laptop", "keyboard"], version = [(A, 1), (B, 1)]
            (Node B increments its own counter: B starts at 0, adds B=1)

Now the system has two concurrent versions:

Version A: ["laptop", "mouse"] with vector [(A, 2)]
Version B: ["laptop", "keyboard"] with vector [(A, 1), (B, 1)]

Are they concurrent? Compare:

(A, 2) vs (A, 1): 2 > 1 (Version A wins on A’s counter).
But Version A has no entry for B (implicitly 0), while Version B has (B, 1). So B < 1? No, B is missing in A (treated as 0) and B=1 in B. 0 < 1 (Version B wins on B’s counter).

Neither dominates. The versions are concurrent — a conflict.

Resolution strategies:

Dynamo offers two approaches:

Last writer wins (LWW): Use an external timestamp (like the physical clock) to pick one version. Fast, simple, and always returns one value. But it loses data — one of the cart updates disappears. Amazon uses LWW for session data but NOT for shopping carts.
Sibling preservation: Return both versions to the application. The application merges them. For the shopping cart: the application computes the union of both lists — [“laptop”, “mouse”, “keyboard”]. This is a semantic merge: it knows that cart items are additive, so the union is correct. The merged result is written back with a new version vector that DOMINATES both parents: [(A, 2), (B, 1)].

Merged: ["laptop", "mouse", "keyboard"], vector = [(A, 2), (B, 1)]

This new vector is greater than both Version A (A: 2 ≥ 2, B: 1 ≥ 0 ✓) and Version B (A: 2 ≥ 1, B: 1 ≥ 1 ✓). Both prior versions are now in the causal past.

When would a system choose LWW?

LWW is appropriate for:

Data where losing an update is acceptable (e.g., a “last seen” timestamp).
Systems where availability is paramount and conflict resolution is too costly.
Simple use cases where the value is a single scalar that can be overwritten.

Sibling preservation is better for:

Shopping carts, document editing, or any data where merging is meaningful.
Systems where data loss is unacceptable.
Applications that can provide a merge function.

Performance considerations: Version vectors grow with the number of unique nodes that have modified a key. In a cluster of 1000 nodes, a key that has been written to by only 3 nodes has a version vector of length 3, not 1000. Dynamo prunes stale entries (where a node has not written for a long time) to keep vectors small. The practical observation from Dynamo’s production use: version vectors typically stay small (fewer than 10 entries per key) for most workloads.

Check Your Understanding

In the shopping cart scenario, how did the phone client end up writing with a stale version vector [(A, 1)]?
What is the merged version vector after the cart items are merged? Why does it dominate both parent vectors?
Why would a system choose LWW over sibling preservation? Give a concrete example of data that might use each strategy.

The “So What?”

Version vectors (vector clocks per key) are the mechanism behind Dynamo’s “sloppy quorum” and eventual consistency model. The next time you use Amazon DynamoDB (which is based on Dynamo’s design but with managed conflict resolution), Cassandra, or Riak, remember that vector clocks are running under the hood, tracking which replicas have seen which updates. The choice between LWW and sibling preservation is a design decision about whether you value simplicity over data integrity — a tradeoff that every distributed systems engineer must make.

✏️ Exercises

Topic 5: Vector Clocks — Exercises

Exercise 1: Comparing Vectors

Given the following pairs of vector clocks from a 3-process system (processes A, B, C), determine whether the events are causally related (V1 < V2 or V2 < V1) or concurrent (V1 || V2).

Pair	V1	V2
(a)	[2, 1, 4]	[1, 2, 3]
(b)	[3, 0, 2]	[3, 1, 2]
(c)	[1, 2, 3]	[1, 2, 3]
(d)	[2, 2, 2]	[3, 3, 3]
(e)	[2, 0, 0]	[1, 1, 1]

For each pair, state whether V1 → V2, V2 → V1, or V1 || V2, and explain your reasoning.

Exercise 2: Causal Ordering

In a 3-process system, the following events occur:

Process A: local event X → V_A = [1, 0, 0]
Process B: local event Y → V_B = [0, 1, 0]
Process A: sends message to C → V_A = [2, 0, 0], message carries [2, 0, 0]
Process C: receives message from A → V_C = [2, 0, 1]
Process C: sends message to B → V_C = [2, 0, 2], message carries [2, 0, 2]
Process B: receives message from C → V_B = [2, 1, 2] (after max, then increment)

(a) Is event X causally related to event Y? What does the vector clock comparison show? (b) After step 6, what is the complete vector clock state on all three processes? (c) If a new event occurs on Process C after step 5 (without any new messages), what would its vector clock be?

Exercise 3: Dynamo Conflict Scenario

A 2-node Dynamo-style key-value store stores a document with key “profile” = { “name”: “Alice” }. The initial version vector is [(A, 1)].

The following events occur concurrently:

Client 1 reads “profile” from Node A. Gets vector [(A, 1)], value = { “name”: “Alice” }. Client 1 adds field “age”: 30 and writes back to Node A.
Client 2 reads “profile” from Node B. Gets vector [(A, 1)], value = { “name”: “Alice” }. Client 2 adds field “email”: “alice@example.com” and writes back to Node B.

(a) What is the version vector of Client 1’s write? Of Client 2’s write? (b) A third client reads “profile” from Node A. Node A has received both writes through replication. What vectors does the client receive? Are they concurrent? (c) The application uses a merge function that combines fields: the result should include all fields from both versions. What is the merged value and the merged version vector? (d) After the merge is written back, a fourth client reads the key. What vector does it receive, and what does it tell the reader about this version relative to the two original writes?

Exercise 4: LWW vs. Sibling Preservation

A social media application stores user posts in a Dynamo-style database. Each post has a “like count” and a “comments list.”

(a) Which field would benefit from sibling preservation (returning multiple versions for merge)? Which could safely use LWW? Explain your reasoning. (b) If the like count uses LWW and the comments list uses sibling preservation, how does the application handle a read that returns both a single value (for like count) and multiple siblings (for comments)? (c) Design a scenario where LWW on the like count loses exactly one “like” even though the user pressed the like button. (d) Would a vector clock implementation detect this lost update? Why or why not?

👁️ View Solutions

Topic 5: Vector Clocks — Solutions

Solution 1

(a) [2, 1, 4] vs [1, 2, 3]

Position 0: 2 > 1 (V1 wins)
Position 1: 1 < 2 (V2 wins)
Position 2: 4 > 3 (V1 wins)

V1 has elements both greater and less than V2. Neither dominates. V1 || V2 (concurrent).

(b) [3, 0, 2] vs [3, 1, 2]

Position 0: 3 ≤ 3 ✓
Position 1: 0 ≤ 1 ✓
Position 2: 2 ≤ 2 ✓

V1 ≤ V2, and strictly less at position 1. V1 → V2 (V1 happened before V2). Process B processed one more event (or learned about one more) between the two events.

(c) [1, 2, 3] vs [1, 2, 3]

All elements equal:

V1 ≤ V2 (all equal) AND V2 ≤ V1 (all equal). The vectors are identical. These events are causally equivalent — they have the same causal history. They are not concurrent in the sense of being incomparable. This represents two events that have seen exactly the same state from all processes.

(d) [2, 2, 2] vs [3, 3, 3]

Position 0: 2 ≤ 3 ✓
Position 1: 2 ≤ 3 ✓
Position 2: 2 ≤ 3 ✓

V1 ≤ V2, strictly less in all positions. V1 → V2. Event V1 definitely happened before V2.

(e) [2, 0, 0] vs [1, 1, 1]

Position 0: 2 > 1 (V1 wins)
Position 1: 0 < 1 (V2 wins)
Position 2: 0 < 1 (V2 wins)

V1 has element greater (position 0) and elements less (positions 1, 2). Neither dominates. V1 || V2 (concurrent). Process A had more local events, but B and C had events that A did not know about, and vice versa.

Solution 2

(a) X has V = [1, 0, 0] and Y has V = [0, 1, 0].

Compare: [1, 0, 0] vs [0, 1, 0]:

Position 0: 1 > 0 (X wins)
Position 1: 0 < 1 (Y wins)
Position 2: 0 ≤ 0

Neither dominates. X || Y (concurrent). X and Y are independent local events on different processes with no causal connection.

(b) After step 6:

Process A: [2, 0, 0] (unchanged since step 3)
Process B: [2, 1, 2] (computed in step 6)
Process C: [2, 0, 2] (unchanged since step 5)

(c) After step 5, Process C’s vector is [2, 0, 2]. A new local event on C would increment C’s own element: V_C = [2, 0, 3].

Solution 3

(a)

Client 1’s write at Node A: increments A’s counter. Vector = [(A, 2)].

Client 2’s write at Node B: increments B’s counter. Vector = [(A, 1), (B, 1)].

(b) The third client receives both vectors:

Version X: vector [(A, 2)], value = { “name”: “Alice”, “age”: 30 }
Version Y: vector [(A, 1), (B, 1)], value = { “name”: “Alice”, “email”: “alice@example.com” }

Compare [(A, 2)] vs [(A, 1), (B, 1)]:

A: 2 > 1 (X wins on A)
B: X has no B entry (implicitly 0), Y has B=1. 0 < 1 (Y wins on B)

Neither dominates. The versions are concurrent. The client receives siblings.

(c) The merge function combines fields:

Merged value: { “name”: “Alice”, “age”: 30, “email”: “alice@example.com” }

Merged vector: [(A, 2), (B, 1)] — the element-wise maximum of both vectors.

Check that it dominates both parent vectors:

X = [(A, 2)]: A: 2 ≥ 2 ✓ (B implicit 0, merged B=1 ≥ 0 ✓) → merged ≥ X
Y = [(A, 1), (B, 1)]: A: 2 ≥ 1 ✓, B: 1 ≥ 1 ✓ → merged ≥ Y

Both prior versions are in the causal past of the merged version.

(d) The fourth client reads and gets a single version: vector [(A, 2), (B, 1)], value = { “name”: “Alice”, “age”: 30, “email”: “alice@example.com” }.

The vector tells the reader: “This version incorporates everything from Node A up to counter 2 and Node B up to counter 1.” Since the original two writes had vectors [(A, 2)] and [(A, 1), (B, 1)], and both are ≤ this merged vector, the reader knows they have seen all prior versions. No siblings — the conflict was resolved.

Solution 4

(a) Comments list benefits from sibling preservation. If two users concurrently add comments to a post, both comments should appear in the final list. LWW would lose one user’s comment.

Like count could safely use LWW. If two users concurrently like a post, the like count should ideally increase by 2. But with LWW, one like would be lost. Whether this is acceptable depends on whether exact like counts matter. For many applications, an approximate count is acceptable; for others (contests, trending algorithms), it matters.

Better approach: treat the like count as an operation-based CRDT (each like increments a counter, and increments commute) rather than LWW.

(b) The application receives:

For like count (LWW): a single value — say, 42.
For comments (siblings): two versions — [“Great post!”] and [“Nice work!”].

The application merges the comments: [“Great post!”, “Nice work!”] and takes the like count as-is (42). When writing back, the application writes both fields together with a new version vector that dominates the parent vectors.

(c) Scenario:

User A sees like count = 10, vector = [(A, 5)]
User A presses like. The write: like count = 11, vector = [(A, 6)]
Concurrently, User B sees like count = 10, vector = [(A, 5)] (hadn’t received A’s write yet)
User B presses like. The write: like count = 11, vector = [(A, 5), (B, 1)]

If the system uses LWW and compares by timestamp:

User A’s write timestamp: 10:00:01 (Node A clock)
User B’s write timestamp: 10:00:00.500 (Node B clock, which is slightly behind)

LWW picks User A’s write: like count = 11. But there should be 12 (two likes from two users). One like is lost.

(d) Yes, a vector clock implementation would detect this conflict:

User A’s write: [(A, 6)] vs User B’s write: [(A, 5), (B, 1)]
A: 6 > 5 (A wins), B: 0 < 1 (B wins).
Neither dominates: concurrent writes detected!

The system would present both versions to the application. The application would compute: the correct like count is the maximum of both (11) plus the additional likes each version represents. Version A has 1 new like (11 - 10 = 1). Version B has 1 new like (11 - 10 = 1). Total: 10 + 1 + 1 = 12. Vector clocks saved the lost like.