Key Question
How do we compare vector clocks to determine whether one event happened before another?
Deep Dive
Vector clocks carry N counters per event. Comparing two vectors requires element-wise comparison. The formal rules are:
- V_a ≤ V_b if every element of V_a is less than or equal to the corresponding element in V_b.
- V_a < V_b if V_a ≤ V_b AND at least one element is strictly less.
- V_a || V_b (concurrent) if some elements of V_a are greater than V_b AND some elements are less than V_b. Equivalently: neither V_a ≤ V_b nor V_b ≤ V_a.
The key property: a → b if and only if V_a < V_b. Unlike Lamport clocks, this works in BOTH directions. If V_a < V_b, then a → b is guaranteed. If V_a || V_b, then a and b are concurrent.
Let us walk through multiple comparison examples:
Example 1: Causal relationship
V_a = [1, 2, 0]
V_b = [1, 3, 1]
Compare element by element:
- Position 0: 1 ≤ 1 ✓
- Position 1: 2 ≤ 3 ✓
- Position 2: 0 ≤ 1 ✓
V_a ≤ V_b. Check for strictness: position 1 (2 < 3) and position 2 (0 < 1). Yes, at least one is strictly less. Therefore V_a < V_b, and a → b.
What happened: Process B and Process C have moved forward since a occurred, meaning they learned about a (or events causally related to a) and did more work.
Example 2: Concurrent events
V_a = [2, 1, 0]
V_b = [1, 2, 0]
Compare:
- Position 0: 2 > 1 (V_a wins)
- Position 1: 1 < 2 (V_b wins)
- Position 2: 0 ≤ 0 (tie, but doesn’t matter)
V_a ≰ V_b (position 1: 1 < 2? No wait… 1 < 2, so V_a[1] < V_b[1]. But V_a[0] > V_b[0]. So neither V_a ≤ V_b nor V_b ≤ V_a holds). Let me be precise:
Is V_a ≤ V_b? V_a[0] = 2 and V_b[0] = 1. 2 ≤ 1? NO. So V_a ≰ V_b. Is V_b ≤ V_a? V_b[1] = 2 and V_a[1] = 1. 2 ≤ 1? NO. So V_b ≰ V_a.
Neither ≤ the other. The events are concurrent (a || b).
What likely happened: Process A and Process B each made progress independently after any common point of knowledge. A processed more events (A’s counter is 2), and B processed more events (B’s counter is 2), but neither heard about the other’s progress.
Example 3: Same vector, same event?
V_a = [3, 2, 1]
V_b = [3, 2, 1]
V_a ≤ V_b (all elements equal) and V_b ≤ V_a (all elements equal). Are these the same event? Not necessarily — they could be two different events that happen to have the same vector (possible if both processes were synchronized at the same point). But importantly, if V_a = V_b, then a and b are NOT concurrent in a causal sense — they have the same causal history. In a well-designed system, events with the same vector are causally equivalent (they have seen the same events from all processes).
Visualizing dominance:
V_b's elements →
1 2 3 4
V_a's ┌─────────────────┐
elems 1│ ≤ ≤ ≤ ≤ V_a ≤ V_b
↓ 2│ ? ≤ ≤ ≤
3│ ? ? ≤ ≤
4│ ? ? ? =
└─────────────────┘
If all comparisons are ≤ or =, V_a ≤ V_b. If any comparison is > (in V_a’s favor) and another is < (in V_b’s favor), they are concurrent.
Example 4: Simple ordering
V_a = [1, 0, 0]
V_b = [2, 1, 0]
V_a ≤ V_b (1 ≤ 2, 0 ≤ 1, 0 ≤ 0) and strictly less. So V_a < V_b, and a → b.
This is the easy case — a is clearly in the causal past of b.
Check Your Understanding
- Compare V_a = [3, 0, 2] and V_b = [2, 3, 1]. Are they concurrent or causally related?
- If V_a = [1, 1, 1] and V_b = [2, 2, 2], what is the relationship?
- What does it mean for two events to have identical vector clocks?
The “So What?”
The formal comparison rules are not just theory — they are the core logic in every system that uses vector clocks. When you implement a vector clock library, the comparison function is the most critical piece: get it wrong, and you will either miss conflicts (data loss) or report false conflicts (user-facing merge screens for no reason). The element-wise comparison is simple, but its implications — the ability to definitively say “these events are concurrent” — are profound.
✏️ Exercises
Topic 5: Vector Clocks — Exercises
Exercise 1: Comparing Vectors
Given the following pairs of vector clocks from a 3-process system (processes A, B, C), determine whether the events are causally related (V1 < V2 or V2 < V1) or concurrent (V1 || V2).
| Pair | V1 | V2 |
|---|---|---|
| (a) | [2, 1, 4] | [1, 2, 3] |
| (b) | [3, 0, 2] | [3, 1, 2] |
| (c) | [1, 2, 3] | [1, 2, 3] |
| (d) | [2, 2, 2] | [3, 3, 3] |
| (e) | [2, 0, 0] | [1, 1, 1] |
For each pair, state whether V1 → V2, V2 → V1, or V1 || V2, and explain your reasoning.
Exercise 2: Causal Ordering
In a 3-process system, the following events occur:
- Process A: local event X → V_A = [1, 0, 0]
- Process B: local event Y → V_B = [0, 1, 0]
- Process A: sends message to C → V_A = [2, 0, 0], message carries [2, 0, 0]
- Process C: receives message from A → V_C = [2, 0, 1]
- Process C: sends message to B → V_C = [2, 0, 2], message carries [2, 0, 2]
- Process B: receives message from C → V_B = [2, 1, 2] (after max, then increment)
(a) Is event X causally related to event Y? What does the vector clock comparison show? (b) After step 6, what is the complete vector clock state on all three processes? (c) If a new event occurs on Process C after step 5 (without any new messages), what would its vector clock be?
Exercise 3: Dynamo Conflict Scenario
A 2-node Dynamo-style key-value store stores a document with key “profile” = { “name”: “Alice” }. The initial version vector is [(A, 1)].
The following events occur concurrently:
- Client 1 reads “profile” from Node A. Gets vector [(A, 1)], value = { “name”: “Alice” }. Client 1 adds field “age”: 30 and writes back to Node A.
- Client 2 reads “profile” from Node B. Gets vector [(A, 1)], value = { “name”: “Alice” }. Client 2 adds field “email”: “alice@example.com” and writes back to Node B.
(a) What is the version vector of Client 1’s write? Of Client 2’s write? (b) A third client reads “profile” from Node A. Node A has received both writes through replication. What vectors does the client receive? Are they concurrent? (c) The application uses a merge function that combines fields: the result should include all fields from both versions. What is the merged value and the merged version vector? (d) After the merge is written back, a fourth client reads the key. What vector does it receive, and what does it tell the reader about this version relative to the two original writes?
Exercise 4: LWW vs. Sibling Preservation
A social media application stores user posts in a Dynamo-style database. Each post has a “like count” and a “comments list.”
(a) Which field would benefit from sibling preservation (returning multiple versions for merge)? Which could safely use LWW? Explain your reasoning. (b) If the like count uses LWW and the comments list uses sibling preservation, how does the application handle a read that returns both a single value (for like count) and multiple siblings (for comments)? (c) Design a scenario where LWW on the like count loses exactly one “like” even though the user pressed the like button. (d) Would a vector clock implementation detect this lost update? Why or why not?
👁️ View Solutions
Topic 5: Vector Clocks — Solutions
Solution 1
(a) [2, 1, 4] vs [1, 2, 3]
- Position 0: 2 > 1 (V1 wins)
- Position 1: 1 < 2 (V2 wins)
- Position 2: 4 > 3 (V1 wins)
V1 has elements both greater and less than V2. Neither dominates. V1 || V2 (concurrent).
(b) [3, 0, 2] vs [3, 1, 2]
- Position 0: 3 ≤ 3 ✓
- Position 1: 0 ≤ 1 ✓
- Position 2: 2 ≤ 2 ✓
V1 ≤ V2, and strictly less at position 1. V1 → V2 (V1 happened before V2). Process B processed one more event (or learned about one more) between the two events.
(c) [1, 2, 3] vs [1, 2, 3]
- All elements equal:
V1 ≤ V2 (all equal) AND V2 ≤ V1 (all equal). The vectors are identical. These events are causally equivalent — they have the same causal history. They are not concurrent in the sense of being incomparable. This represents two events that have seen exactly the same state from all processes.
(d) [2, 2, 2] vs [3, 3, 3]
- Position 0: 2 ≤ 3 ✓
- Position 1: 2 ≤ 3 ✓
- Position 2: 2 ≤ 3 ✓
V1 ≤ V2, strictly less in all positions. V1 → V2. Event V1 definitely happened before V2.
(e) [2, 0, 0] vs [1, 1, 1]
- Position 0: 2 > 1 (V1 wins)
- Position 1: 0 < 1 (V2 wins)
- Position 2: 0 < 1 (V2 wins)
V1 has element greater (position 0) and elements less (positions 1, 2). Neither dominates. V1 || V2 (concurrent). Process A had more local events, but B and C had events that A did not know about, and vice versa.
Solution 2
(a) X has V = [1, 0, 0] and Y has V = [0, 1, 0].
Compare: [1, 0, 0] vs [0, 1, 0]:
- Position 0: 1 > 0 (X wins)
- Position 1: 0 < 1 (Y wins)
- Position 2: 0 ≤ 0
Neither dominates. X || Y (concurrent). X and Y are independent local events on different processes with no causal connection.
(b) After step 6:
- Process A: [2, 0, 0] (unchanged since step 3)
- Process B: [2, 1, 2] (computed in step 6)
- Process C: [2, 0, 2] (unchanged since step 5)
(c) After step 5, Process C’s vector is [2, 0, 2]. A new local event on C would increment C’s own element: V_C = [2, 0, 3].
Solution 3
(a)
Client 1’s write at Node A: increments A’s counter. Vector = [(A, 2)].
Client 2’s write at Node B: increments B’s counter. Vector = [(A, 1), (B, 1)].
(b) The third client receives both vectors:
- Version X: vector [(A, 2)], value = { “name”: “Alice”, “age”: 30 }
- Version Y: vector [(A, 1), (B, 1)], value = { “name”: “Alice”, “email”: “alice@example.com” }
Compare [(A, 2)] vs [(A, 1), (B, 1)]:
- A: 2 > 1 (X wins on A)
- B: X has no B entry (implicitly 0), Y has B=1. 0 < 1 (Y wins on B)
Neither dominates. The versions are concurrent. The client receives siblings.
(c) The merge function combines fields:
Merged value: { “name”: “Alice”, “age”: 30, “email”: “alice@example.com” }
Merged vector: [(A, 2), (B, 1)] — the element-wise maximum of both vectors.
Check that it dominates both parent vectors:
- X = [(A, 2)]: A: 2 ≥ 2 ✓ (B implicit 0, merged B=1 ≥ 0 ✓) → merged ≥ X
- Y = [(A, 1), (B, 1)]: A: 2 ≥ 1 ✓, B: 1 ≥ 1 ✓ → merged ≥ Y
Both prior versions are in the causal past of the merged version.
(d) The fourth client reads and gets a single version: vector [(A, 2), (B, 1)], value = { “name”: “Alice”, “age”: 30, “email”: “alice@example.com” }.
The vector tells the reader: “This version incorporates everything from Node A up to counter 2 and Node B up to counter 1.” Since the original two writes had vectors [(A, 2)] and [(A, 1), (B, 1)], and both are ≤ this merged vector, the reader knows they have seen all prior versions. No siblings — the conflict was resolved.
Solution 4
(a) Comments list benefits from sibling preservation. If two users concurrently add comments to a post, both comments should appear in the final list. LWW would lose one user’s comment.
Like count could safely use LWW. If two users concurrently like a post, the like count should ideally increase by 2. But with LWW, one like would be lost. Whether this is acceptable depends on whether exact like counts matter. For many applications, an approximate count is acceptable; for others (contests, trending algorithms), it matters.
Better approach: treat the like count as an operation-based CRDT (each like increments a counter, and increments commute) rather than LWW.
(b) The application receives:
- For like count (LWW): a single value — say, 42.
- For comments (siblings): two versions — [“Great post!”] and [“Nice work!”].
The application merges the comments: [“Great post!”, “Nice work!”] and takes the like count as-is (42). When writing back, the application writes both fields together with a new version vector that dominates the parent vectors.
(c) Scenario:
- User A sees like count = 10, vector = [(A, 5)]
- User A presses like. The write: like count = 11, vector = [(A, 6)]
- Concurrently, User B sees like count = 10, vector = [(A, 5)] (hadn’t received A’s write yet)
- User B presses like. The write: like count = 11, vector = [(A, 5), (B, 1)]
If the system uses LWW and compares by timestamp:
- User A’s write timestamp: 10:00:01 (Node A clock)
- User B’s write timestamp: 10:00:00.500 (Node B clock, which is slightly behind)
LWW picks User A’s write: like count = 11. But there should be 12 (two likes from two users). One like is lost.
(d) Yes, a vector clock implementation would detect this conflict:
- User A’s write: [(A, 6)] vs User B’s write: [(A, 5), (B, 1)]
- A: 6 > 5 (A wins), B: 0 < 1 (B wins).
- Neither dominates: concurrent writes detected!
The system would present both versions to the application. The application would compute: the correct like count is the maximum of both (11) plus the additional likes each version represents. Version A has 1 new like (11 - 10 = 1). Version B has 1 new like (11 - 10 = 1). Total: 10 + 1 + 1 = 12. Vector clocks saved the lost like.