Time Causality · Vector Clocks

Key Question

How does a vector clock detect that two updates happened independently, and why does this matter for data consistency?

Deep Dive

The single most important advantage of vector clocks over Lamport clocks is the ability to detect concurrency definitively. When V_a and V_b are concurrent (neither dominates), we know that the two events happened without knowledge of each other. In a database context, this means two writes to the same key are in conflict — they must be flagged so the system or application can resolve them.

Let us walk through a write-write conflict scenario in a distributed key-value store with 3 nodes:

Node A                    Node B                    Node C
[0,0,0]                   [0,0,0]                   [0,0,0]
  |                          |                          |
  |                          |                          |
w1: write X to key "cart"    |                          |
  |   V = [1,0,0]            |                          |
  |                          |                          |
  |                          w2: write Y to key "cart"
  |                          |   V = [0,1,0]
  |                          |                          |
  |                          |                          |
  |                          |                          r3: read "cart"
  |                          |                          |   returns [1,0,0] (X) and [0,1,0] (Y)
  |                          |                          |   Vectors: [1,0,0] vs [0,1,0]
  |                          |                          |   Neither dominates → CONFLICT
  |                          |                          |

Step-by-step analysis:

Client 1 writes X to key “cart” at Node A. A increments its own counter: V = [1, 0, 0]. This write is acknowledged.
Client 2 (without knowledge of Client 1’s write — maybe the network is partitioned, or Client 2 is connected to Node B) writes Y to the same key “cart” at Node B. B increments its own counter: V = [0, 1, 0]. This write is also acknowledged.
Client 3 reads key “cart” from Node C. Node C, which has received both values through asynchronous replication (or the client specified read-repair), has both versions: X with V = [1, 0, 0] and Y with V = [0, 1, 0].
Node C compares the vectors:
- Is [1,0,0] ≤ [0,1,0]? 1 ≤ 0? NO.
- Is [0,1,0] ≤ [1,0,0]? 1 ≤ 0? NO.
- Neither vector dominates the other. The writes are concurrent.
Node C returns both values to Client 3 as “siblings” — concurrent versions that the application must resolve.

This is the direct contrast to Lamport clocks. With Lamport clocks, the first write might get L=1 and the second L=1 (if both started with clock=0). Or one might be higher if the node had processed more events. In either case, the system cannot distinguish between “one came after the other” and “they are concurrent.”

Why concurrency detection matters for databases:

Without concurrency detection, the database must use a policy like “last writer wins” (LWW). LWW picks one version based on a timestamp or sequence number and discards the other. This guarantees that reads always return a single value, but it can silently lose data.

Concrete scenario:
  "cart" key = { "items": ["shoes"] }     (initial state)
  
  Write 1: add "shirt" → { "items": ["shoes", "shirt"] }   V = [1,0,0]
  Write 2: add "hat"   → { "items": ["shoes", "hat"] }     V = [0,1,0]
  
  LWW (Lamport): keeps whichever has higher clock → loses one item
  Vector clocks: detects conflict → presents both items for merge

With vector clocks, the application receives both versions and merges them: “shirt” and “hat” were both added. The merged result is { “items”: [“shoes”, “shirt”, “hat”] }.

The cost: Vector clocks grow linearly with the number of processes. For each write, the database stores N integers as metadata. For thousands of nodes, this is expensive. Practical systems use version vectors or dotted version vectors to compress this state, but the core idea — per-process counters for conflict detection — remains the same.

Check Your Understanding

In the 3-node write-write conflict scenario above, how does Node C determine that the two writes are concurrent?
What would happen if the same scenario were handled by a system using only Lamport clocks with “last writer wins”?
Why is “presenting both versions to the application” a better outcome than silently picking one?

The “So What?”

Concurrency detection is the killer feature of vector clocks. In real systems, concurrent writes happen more often than you might think: network partitions, offline clients, mobile apps that sync later, and multi-region deployments all produce concurrent updates. Without vector clocks, these updates collide silently and data is lost. With vector clocks, the system surfaces the conflict and lets the application (or the user) decide how to merge. This is the difference between a system that loses your shopping cart items and one that preserves them.

✏️ Exercises

Topic 5: Vector Clocks — Exercises

Exercise 1: Comparing Vectors

Given the following pairs of vector clocks from a 3-process system (processes A, B, C), determine whether the events are causally related (V1 < V2 or V2 < V1) or concurrent (V1 || V2).

Pair	V1	V2
(a)	[2, 1, 4]	[1, 2, 3]
(b)	[3, 0, 2]	[3, 1, 2]
(c)	[1, 2, 3]	[1, 2, 3]
(d)	[2, 2, 2]	[3, 3, 3]
(e)	[2, 0, 0]	[1, 1, 1]

For each pair, state whether V1 → V2, V2 → V1, or V1 || V2, and explain your reasoning.

Exercise 2: Causal Ordering

In a 3-process system, the following events occur:

Process A: local event X → V_A = [1, 0, 0]
Process B: local event Y → V_B = [0, 1, 0]
Process A: sends message to C → V_A = [2, 0, 0], message carries [2, 0, 0]
Process C: receives message from A → V_C = [2, 0, 1]
Process C: sends message to B → V_C = [2, 0, 2], message carries [2, 0, 2]
Process B: receives message from C → V_B = [2, 1, 2] (after max, then increment)

(a) Is event X causally related to event Y? What does the vector clock comparison show? (b) After step 6, what is the complete vector clock state on all three processes? (c) If a new event occurs on Process C after step 5 (without any new messages), what would its vector clock be?

Exercise 3: Dynamo Conflict Scenario

A 2-node Dynamo-style key-value store stores a document with key “profile” = { “name”: “Alice” }. The initial version vector is [(A, 1)].

The following events occur concurrently:

Client 1 reads “profile” from Node A. Gets vector [(A, 1)], value = { “name”: “Alice” }. Client 1 adds field “age”: 30 and writes back to Node A.
Client 2 reads “profile” from Node B. Gets vector [(A, 1)], value = { “name”: “Alice” }. Client 2 adds field “email”: “alice@example.com” and writes back to Node B.

(a) What is the version vector of Client 1’s write? Of Client 2’s write? (b) A third client reads “profile” from Node A. Node A has received both writes through replication. What vectors does the client receive? Are they concurrent? (c) The application uses a merge function that combines fields: the result should include all fields from both versions. What is the merged value and the merged version vector? (d) After the merge is written back, a fourth client reads the key. What vector does it receive, and what does it tell the reader about this version relative to the two original writes?

Exercise 4: LWW vs. Sibling Preservation

A social media application stores user posts in a Dynamo-style database. Each post has a “like count” and a “comments list.”

(a) Which field would benefit from sibling preservation (returning multiple versions for merge)? Which could safely use LWW? Explain your reasoning. (b) If the like count uses LWW and the comments list uses sibling preservation, how does the application handle a read that returns both a single value (for like count) and multiple siblings (for comments)? (c) Design a scenario where LWW on the like count loses exactly one “like” even though the user pressed the like button. (d) Would a vector clock implementation detect this lost update? Why or why not?

👁️ View Solutions

Topic 5: Vector Clocks — Solutions

Solution 1

(a) [2, 1, 4] vs [1, 2, 3]

Position 0: 2 > 1 (V1 wins)
Position 1: 1 < 2 (V2 wins)
Position 2: 4 > 3 (V1 wins)

V1 has elements both greater and less than V2. Neither dominates. V1 || V2 (concurrent).

(b) [3, 0, 2] vs [3, 1, 2]

Position 0: 3 ≤ 3 ✓
Position 1: 0 ≤ 1 ✓
Position 2: 2 ≤ 2 ✓

V1 ≤ V2, and strictly less at position 1. V1 → V2 (V1 happened before V2). Process B processed one more event (or learned about one more) between the two events.

(c) [1, 2, 3] vs [1, 2, 3]

All elements equal:

V1 ≤ V2 (all equal) AND V2 ≤ V1 (all equal). The vectors are identical. These events are causally equivalent — they have the same causal history. They are not concurrent in the sense of being incomparable. This represents two events that have seen exactly the same state from all processes.

(d) [2, 2, 2] vs [3, 3, 3]

Position 0: 2 ≤ 3 ✓
Position 1: 2 ≤ 3 ✓
Position 2: 2 ≤ 3 ✓

V1 ≤ V2, strictly less in all positions. V1 → V2. Event V1 definitely happened before V2.

(e) [2, 0, 0] vs [1, 1, 1]

Position 0: 2 > 1 (V1 wins)
Position 1: 0 < 1 (V2 wins)
Position 2: 0 < 1 (V2 wins)

V1 has element greater (position 0) and elements less (positions 1, 2). Neither dominates. V1 || V2 (concurrent). Process A had more local events, but B and C had events that A did not know about, and vice versa.

Solution 2

(a) X has V = [1, 0, 0] and Y has V = [0, 1, 0].

Compare: [1, 0, 0] vs [0, 1, 0]:

Position 0: 1 > 0 (X wins)
Position 1: 0 < 1 (Y wins)
Position 2: 0 ≤ 0

Neither dominates. X || Y (concurrent). X and Y are independent local events on different processes with no causal connection.

(b) After step 6:

Process A: [2, 0, 0] (unchanged since step 3)
Process B: [2, 1, 2] (computed in step 6)
Process C: [2, 0, 2] (unchanged since step 5)

(c) After step 5, Process C’s vector is [2, 0, 2]. A new local event on C would increment C’s own element: V_C = [2, 0, 3].

Solution 3

(a)

Client 1’s write at Node A: increments A’s counter. Vector = [(A, 2)].

Client 2’s write at Node B: increments B’s counter. Vector = [(A, 1), (B, 1)].

(b) The third client receives both vectors:

Version X: vector [(A, 2)], value = { “name”: “Alice”, “age”: 30 }
Version Y: vector [(A, 1), (B, 1)], value = { “name”: “Alice”, “email”: “alice@example.com” }

Compare [(A, 2)] vs [(A, 1), (B, 1)]:

A: 2 > 1 (X wins on A)
B: X has no B entry (implicitly 0), Y has B=1. 0 < 1 (Y wins on B)

Neither dominates. The versions are concurrent. The client receives siblings.

(c) The merge function combines fields:

Merged value: { “name”: “Alice”, “age”: 30, “email”: “alice@example.com” }

Merged vector: [(A, 2), (B, 1)] — the element-wise maximum of both vectors.

Check that it dominates both parent vectors:

X = [(A, 2)]: A: 2 ≥ 2 ✓ (B implicit 0, merged B=1 ≥ 0 ✓) → merged ≥ X
Y = [(A, 1), (B, 1)]: A: 2 ≥ 1 ✓, B: 1 ≥ 1 ✓ → merged ≥ Y

Both prior versions are in the causal past of the merged version.

(d) The fourth client reads and gets a single version: vector [(A, 2), (B, 1)], value = { “name”: “Alice”, “age”: 30, “email”: “alice@example.com” }.

The vector tells the reader: “This version incorporates everything from Node A up to counter 2 and Node B up to counter 1.” Since the original two writes had vectors [(A, 2)] and [(A, 1), (B, 1)], and both are ≤ this merged vector, the reader knows they have seen all prior versions. No siblings — the conflict was resolved.

Solution 4

(a) Comments list benefits from sibling preservation. If two users concurrently add comments to a post, both comments should appear in the final list. LWW would lose one user’s comment.

Like count could safely use LWW. If two users concurrently like a post, the like count should ideally increase by 2. But with LWW, one like would be lost. Whether this is acceptable depends on whether exact like counts matter. For many applications, an approximate count is acceptable; for others (contests, trending algorithms), it matters.

Better approach: treat the like count as an operation-based CRDT (each like increments a counter, and increments commute) rather than LWW.

(b) The application receives:

For like count (LWW): a single value — say, 42.
For comments (siblings): two versions — [“Great post!”] and [“Nice work!”].

The application merges the comments: [“Great post!”, “Nice work!”] and takes the like count as-is (42). When writing back, the application writes both fields together with a new version vector that dominates the parent vectors.

(c) Scenario:

User A sees like count = 10, vector = [(A, 5)]
User A presses like. The write: like count = 11, vector = [(A, 6)]
Concurrently, User B sees like count = 10, vector = [(A, 5)] (hadn’t received A’s write yet)
User B presses like. The write: like count = 11, vector = [(A, 5), (B, 1)]

If the system uses LWW and compares by timestamp:

User A’s write timestamp: 10:00:01 (Node A clock)
User B’s write timestamp: 10:00:00.500 (Node B clock, which is slightly behind)

LWW picks User A’s write: like count = 11. But there should be 12 (two likes from two users). One like is lost.

(d) Yes, a vector clock implementation would detect this conflict:

User A’s write: [(A, 6)] vs User B’s write: [(A, 5), (B, 1)]
A: 6 > 5 (A wins), B: 0 < 1 (B wins).
Neither dominates: concurrent writes detected!

The system would present both versions to the application. The application would compute: the correct like count is the maximum of both (11) plus the additional likes each version represents. Version A has 1 new like (11 - 10 = 1). Version B has 1 new like (11 - 10 = 1). Total: 10 + 1 + 1 = 12. Vector clocks saved the lost like.

🎮 Interactive Vector Clock

Scroll to load interactive visualization…