Distributed Transactions Coordination · Distributed Transactions 2PC

Key Question

What happens to a 2PC transaction if the coordinator crashes?

Deep Dive

Consider this timeline: The coordinator sends PREPARE to all participants. All say YES. The coordinator writes COMMIT to its own log. Then it crashes.

Some participants received the COMMIT message before the crash — they commit. Others did not — they are in the PREPARED state, holding locks, waiting.

Coordinator crashes here!
     |
     v
 Participant 1: received COMMIT → commits
 Participant 2: received PREPARE (said YES), waiting → BLOCKED
 Participant 3: received PREPARE (said YES), waiting → BLOCKED

Participants 2 and 3 are blocked. They cannot guess. If they commit unilaterally, they might violate atomicity when the coordinator restarts and says “actually, abort.” If they abort, they might violate atomicity when the coordinator says “commit.” They must hold their locks and wait — potentially forever.

This is the blocking problem. In practice, blocked transactions hang for as long as the coordinator is down, holding locks that may cascade into application-level outages.

Three-Phase Commit (3PC) tries to fix this by adding an intermediate phase:

Phase 1:   Coordinator → Participants: "Can you commit?"
           Participants → Coordinator: "Yes"
Phase 2:   Coordinator → Participants: "Pre-commit"
           (all now know consensus is forming)
Phase 3:   Coordinator → Participants: "Do commit"

If coordinator crashes after pre-commit, participants can elect a new coordinator and unanimously decide to commit (since they all received pre-commit). No single participant is left guessing.

Trade-off: 3PC assumes a synchronous network — bounded message delays, bounded process pauses. In an asynchronous network (the real world), 3PC can violate safety under network partitions. This is a consequence of the FLP impossibility result. For this reason, 2PC (with its blocking but safe semantics) is far more common in practice.

Check Your Understanding

Why does a participant hold locks while blocked after a coordinator crash?
How does 3PC’s pre-commit phase help avoid blocking?
Why is 3PC less common than 2PC in real systems?

The “So What?”

The 2PC vs 3PC trade-off teaches a fundamental lesson: you can trade liveness for safety, but you can’t have both in an asynchronous system (FLP). Most production systems pick 2PC because safety matters more than avoiding the rare coordinator crash.

✏️ Exercises

Exercises: Distributed Transactions & 2PC

2PC Log State: In the Two-Phase Commit protocol, what exact state does a participant write to its local durable log before responding “YES” to the coordinator’s prepare request? Why must this state be written to a log rather than kept in memory?
Blocking Explained: Consider a 2PC execution where the coordinator crashes after receiving all “YES” responses but before any participant has received the commit/abort decision. Participant 1 is running, Participant 2 is running, and Participant 3 is also running. Can any of these participants safely decide on their own? Explain what happens to each.
Distributed 2PL and Phantoms: A distributed database uses Two-Phase Locking (2PL) for concurrency control. Transaction T1 runs a query: SELECT * FROM accounts WHERE balance > 1000 across two shards. Transaction T2 inserts a new row with balance = 2000 on shard 2. How can a phantom read occur, and what mechanism (in addition to 2PL) is needed to prevent it?

👁️ View Solutions

Solutions: Distributed Transactions & 2PC

Exercise 1

The participant writes PREPARED (or READY) to its log, along with the transaction’s writes (enough information to commit later). This must be durable (fsynced to disk) before the YES response is sent.

Why a log instead of memory: If the participant crashes after sending YES but before receiving the coordinator’s decision, memory is lost. On recovery, the participant reads its log. If the log says PREPARED, it knows it made a promise and must wait for the coordinator. If the log says COMMIT or ABORT, it follows that. Without the log, the participant would have no memory of the promise after restarting.

Exercise 2

No participant can safely decide on their own.

Participant 1 received the PREPARE, said YES, and is now waiting. It has not received a COMMIT or ABORT. It holds locks and is blocked until the coordinator recovers.
Participant 2 is in the same state — blocked.
Participant 3 is also in the same state — blocked.

If any participant unilaterally commits, it might violate atomicity: the coordinator might have decided to abort (e.g., if it suspected a timeout from one participant). If any participant unilaterally aborts, it might violate atomicity: the coordinator might have logged COMMIT before crashing.

All three participants are blocked until the coordinator recovers and reveals the decision.

Exercise 3

Phantom read scenario:

T1’s query SELECT * FROM accounts WHERE balance > 1000 on shard 2 returns no rows.
T2 inserts a new row with balance = 2000 on shard 2 and commits.
If T1 re-runs the same query, it now sees the new row — a “phantom” that appeared from nowhere.

Standard 2PL (locking individual rows) does not prevent this because the new row didn’t exist when T1 first read.

Solution: A predicate lock or next-key lock is needed. In a distributed setting, this means locking the range balance > 1000 on each shard. The lock prevents T2 from inserting rows matching that predicate. In practice, many systems use serializable snapshot isolation (SSI) instead, which detects phantoms at commit time by checking for conflicting predicate reads.