Distributed Transactions Coordination · Distributed Transactions 2PC

Key Question

How does 2PC ensure all-or-nothing atomicity across multiple nodes?

Deep Dive

Two-Phase Commit (2PC) is the classic protocol for distributed atomicity. It has two roles: one coordinator (the transaction manager) and N participants (the data nodes).

Phase 1 — Prepare: The coordinator sends a PREPARE message to every participant. Each participant writes the transaction to its log in a PREPARED state — this log write is the participant’s “promise” that it will commit later, even if it crashes and restarts. Then the participant responds YES (ready to commit) or NO (abort).

Phase 2 — Commit: If ALL participants said YES, the coordinator writes COMMIT to its own log and sends COMMIT to everyone. If ANY participant said NO (or the coordinator times out waiting), the coordinator writes ABORT and sends ABORT to all.

Coordinator                  Participant 1           Participant 2
     |                            |                       |
     |-------- PREPARE ---------->|                       |
     |-------- PREPARE -------------------------------->  |
     |                            |                       |
     |<-------- YES ------------- |                       |
     |<------------------------------- YES -------------- |
     |                            |                       |
     |  (ALL yes → commit)        |                       |
     |                            |                       |
     |-------- COMMIT ----------->|                       |
     |-------- COMMIT ----------------------------------> |
     |                            |                       |
     |<-------- ACK --------------|                       |
     |<------------------------------ ACK ----------------|
     |                            |                       |
     v                            v                       v
   DONE                         DONE                    DONE

The log write before YES is crucial. If a participant crashes after sending YES and before receiving COMMIT, it must check its log on recovery. If the log says PREPARED, it must wait for the coordinator’s decision. If the log says COMMIT or ABORT, it follows that.

The blocking problem: After the coordinator sends COMMIT but before a participant receives it, if the participant crashes, it checks its log, sees PREPARED, and must wait — blocking — until the coordinator tells it what to do. This is the core flaw of 2PC.

Check Your Understanding

What does a participant write to its log before responding YES to a prepare?
What happens if a participant says NO during the prepare phase?
Why does the participant write to its log before responding YES?

The “So What?”

2PC is the foundation for XA transactions (used by databases, message queues, and application servers since the 1990s), and it directly inspired the atomic commit protocols in modern systems like Google Spanner, etcd, and distributed SQL databases.

✏️ Exercises

Exercises: Distributed Transactions & 2PC

2PC Log State: In the Two-Phase Commit protocol, what exact state does a participant write to its local durable log before responding “YES” to the coordinator’s prepare request? Why must this state be written to a log rather than kept in memory?
Blocking Explained: Consider a 2PC execution where the coordinator crashes after receiving all “YES” responses but before any participant has received the commit/abort decision. Participant 1 is running, Participant 2 is running, and Participant 3 is also running. Can any of these participants safely decide on their own? Explain what happens to each.
Distributed 2PL and Phantoms: A distributed database uses Two-Phase Locking (2PL) for concurrency control. Transaction T1 runs a query: SELECT * FROM accounts WHERE balance > 1000 across two shards. Transaction T2 inserts a new row with balance = 2000 on shard 2. How can a phantom read occur, and what mechanism (in addition to 2PL) is needed to prevent it?

👁️ View Solutions

Solutions: Distributed Transactions & 2PC

Exercise 1

The participant writes PREPARED (or READY) to its log, along with the transaction’s writes (enough information to commit later). This must be durable (fsynced to disk) before the YES response is sent.

Why a log instead of memory: If the participant crashes after sending YES but before receiving the coordinator’s decision, memory is lost. On recovery, the participant reads its log. If the log says PREPARED, it knows it made a promise and must wait for the coordinator. If the log says COMMIT or ABORT, it follows that. Without the log, the participant would have no memory of the promise after restarting.

Exercise 2

No participant can safely decide on their own.

Participant 1 received the PREPARE, said YES, and is now waiting. It has not received a COMMIT or ABORT. It holds locks and is blocked until the coordinator recovers.
Participant 2 is in the same state — blocked.
Participant 3 is also in the same state — blocked.

If any participant unilaterally commits, it might violate atomicity: the coordinator might have decided to abort (e.g., if it suspected a timeout from one participant). If any participant unilaterally aborts, it might violate atomicity: the coordinator might have logged COMMIT before crashing.

All three participants are blocked until the coordinator recovers and reveals the decision.

Exercise 3

Phantom read scenario:

T1’s query SELECT * FROM accounts WHERE balance > 1000 on shard 2 returns no rows.
T2 inserts a new row with balance = 2000 on shard 2 and commits.
If T1 re-runs the same query, it now sees the new row — a “phantom” that appeared from nowhere.

Standard 2PL (locking individual rows) does not prevent this because the new row didn’t exist when T1 first read.

Solution: A predicate lock or next-key lock is needed. In a distributed setting, this means locking the range balance > 1000 on each shard. The lock prevents T2 from inserting rows matching that predicate. In practice, many systems use serializable snapshot isolation (SSI) instead, which detects phantoms at commit time by checking for conflicting predicate reads.

🎮 Interactive Two-Phase Commit

Scroll to load interactive visualization…