Distributed & Decentralized Systems Curriculum
Consistency Trade offs ยท CAP Theorem

Key Question

Why canโ€™t you have all three guarantees simultaneously?

Deep Dive

The CAP theorem is provable with a simple thought experiment involving two nodes and a network partition. Letโ€™s walk through it step by step.

Setup

Two nodes, N1 and N2, both maintain a register X (initialized to X=0). They communicate over a network. A partition occurs: all messages between N1 and N2 are dropped.

Before partition:        During partition:
โ”Œโ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”      โ”Œโ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”
โ”‚ N1  โ”‚โ”€โ”€โ”€โ”€โ”‚ N2  โ”‚      โ”‚ N1  โ”‚~~~~โ”‚ N2  โ”‚
โ”‚ X=0 โ”‚    โ”‚ X=0 โ”‚      โ”‚ X=0 โ”‚    โ”‚ X=0 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”˜      โ””โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”˜
                        Network cut โ€” no communication

Two clients: Client A (connected to N1) and Client B (connected to N2).

Step 1: Client A writes to N1

Client A sends write X=1 to N1. N1 accepts the write and updates its local state. X is now 1 on N1.

โ”Œโ”€โ”€โ”€โ”€โ”€โ”    //    โ”Œโ”€โ”€โ”€โ”€โ”€โ”
โ”‚ N1  โ”‚~~~~~~~~โ”‚ N2  โ”‚
โ”‚ X=1 โ”‚         โ”‚ X=0 โ”‚  โ† N2 can't know about the write
โ””โ”€โ”€โ”€โ”€โ”€โ”˜         โ””โ”€โ”€โ”€โ”€โ”€โ”˜
   โ†‘
Client A: write X=1 (acknowledged)

The partition means N2 never receives the update from N1. N2โ€™s state remains X=0.

Step 2: Client B reads from N2

Client B sends read X to N2. Now we face the dilemma.

Goal: Consistency (C) + Availability (A) + Partition Tolerance (P)

We have P (the partition exists).
Now we need C and A simultaneously.

For Availability (A): 
  N2 must respond to Client B's request (it cannot timeout or error out).

For Consistency (C): 
  N2 must return the latest value, which is X=1.

But N2 CANNOT return X=1 because it doesn't know about the write!
N2 only knows X=0.

The Two Impossible Outcomes

ChoiceN2โ€™s responseC?A?Result
Return stale valueX=0NO โ€” violates CYES โ€” A is satisfiedFails CAP
Return error/timeoutโ€errorโ€ or no responseN/A (no data returned)NO โ€” violates AFails CAP

There is NO third option. N2 cannot return X=1 because it doesnโ€™t have it. The partition prevents communication. Therefore, in the presence of a partition, you CANNOT simultaneously satisfy C and A.

Mathematical Summary

The CAP theorem is a proof by contradiction:

  1. Assume a system that provides C, A, and P.
  2. Introduce a partition (P is tolerated, so the system must continue).
  3. Write happens on one side of the partition.
  4. Read happens on the other side.
  5. For A: the read must return a response.
  6. For C: the response must be the latest write.
  7. But the latest write is inaccessible across the partition.
  8. Contradiction: the read cannot satisfy both C and A.

Therefore, no system can simultaneously provide C, A, and P.

Formal statement by Gilbert and Lynch (2002):

โ€œIn a distributed system, it is impossible to guarantee both consistency and availability in the presence of network partitions.โ€

Note: the theorem doesnโ€™t say โ€œyou only get two.โ€ It says โ€œyou cannot have all three.โ€ During a partition, you give up either C or A. When there is no partition, you can have both โ€” which is the common case.

What the Proof Does NOT Say

  • It doesnโ€™t say you canโ€™t have both C and A MOST of the time (you can, when thereโ€™s no partition).
  • It doesnโ€™t choose for you which to sacrifice (thatโ€™s an application-level decision).
  • It doesnโ€™t say systems must choose CP or AP at all times (you can have tunable consistency, like Cassandra).
  • It doesnโ€™t apply to single-node systems (no network, no partition).

Check Your Understanding

  1. Walk through the CAP proof with three nodes instead of two. Does the result change?
  2. In the proof, what happens if N2 waits until the partition heals before responding? Is the system CAP-available?
  3. Could N2 return X=1 if it has a cached/pre-negotiated value of X? What assumptions would this require?

The โ€œSo What?โ€

This proof is mathematically airtight โ€” CAP is a theorem, not an opinion. When someone claims their system has โ€œall three,โ€ theyโ€™re either misunderstanding the definitions, or theyโ€™re describing a single-node system. The proof forces you to accept the fundamental trade-off: when the network fails, you must choose between serving stale data or serving no data. Thereโ€™s no escape. The only way to โ€œhave bothโ€ is to design systems where partitions are so unlikely that youโ€™re willing to risk inconsistency or unavailability when they occur.


โœ๏ธ Exercises

CAP Theorem โ€” Exercises

Exercise 1

A traditional relational database (e.g., PostgreSQL) with a single primary and two asynchronous replicas experiences a network partition that isolates the primary from the replicas. Is this system CP or CA during the partition? Explain what happens.

Exercise 2

A system claims to provide โ€œeventual consistency.โ€ Can it also claim to provide CAP Consistency (linearizability)? Explain the relationship between eventual consistency and CAPโ€™s C.

Exercise 3

During a network partition, an AP system accepts writes on both sides of the partition. When the partition heals, those writes might conflict. Describe two approaches the system can use to resolve these conflicts and converge.

Exercise 4

A single-machine database with no replication experiences a hard drive failure that prevents any reads or writes. Does CAP apply to this scenario? Why or why not? What about a single-machine database that IS replicated (e.g., a single PostgreSQL instance that streams WAL to a standby)?

๐Ÿ‘๏ธ View Solutions

CAP Theorem โ€” Solutions

Solution 1

This system is neither cleanly CP nor AP โ€” itโ€™s a CP system that becomes unavailable during a partition.

Hereโ€™s what happens:

  • The primary is isolated from the replicas.
  • Writes to the primary succeed (no C โ€” replicas donโ€™t have them).
  • Reads from the primary return the latest data (A for primary-connected clients).
  • Reads from the replicas return stale data (C is violated โ€” replicas are behind the primary).
  • Read-only clients connected to replicas see stale data silently.

PostgreSQLโ€™s asynchronous replication does NOT detect or handle partitions explicitly. So during a partition:

  • The system is available (writes succeed on primary, reads on both sides) โ€” โœ“ A
  • But C is violated: replicas serve stale data โ€” โœ— C
  • During the partition, it behaves like AP (inconsistently)

However, if you consider the system as designed (it wasnโ€™t designed for partition tolerance), then when the partition heals, the replicas eventually catch up. But during the partition, it is NOT consistent.

If the replicas were SYNCHRONOUS (waiting for at least one replica to acknowledge), the system would be CP: the primary would block writes during the partition (since the replica canโ€™t acknowledge), and reads from the primary would be consistent but the primary would be effectively unavailable for writes.

Solution 2

No. Eventual consistency and CAP Consistency (linearizability) are different things.

  • CAP Consistency (linearizability) provides a real-time guarantee: once a write completes, every subsequent read must return that value.
  • Eventual consistency guarantees convergence only when writes stop. It does NOT guarantee that a read after a write returns the latest value.

You CANNOT claim CAP Consistency AND eventual consistency for the same operation at the same time. Hereโ€™s why:

Client A writes X=1 (completed)
Client B reads X (starts 1ms after A's write completed)
  - Under CAP-C: B must see X=1
  - Under EC: B might see X=0 (it will eventually converge, maybe after 100ms)

However, a system can provide both guarantees for DIFFERENT operations:

  • Strongly consistent (CAP-C) writes to the catalog table.
  • Eventually consistent reads to the user data table.

Or provide tunable consistency:

  • Cassandra: read at QUORUM โ†’ CAP-C, read at ONE โ†’ eventual consistency.

The key insight: eventual consistency is a WEAKER guarantee than CAP-C. They are on a spectrum, not two sides of a coin.

Solution 3

Approach 1: Last Writer Wins (LWW)

Each write is timestamped. When the partition heals and writes are compared, the write with the latest timestamp wins.

Side A: write X=1 at T=10:00:05
Side B: write X=2 at T=10:00:06

After healing: X=2 wins (T=10:00:06 > 10:00:05)

Simple and always converges. But Side Aโ€™s write (X=1) is silently lost.

Approach 2: Vector Clocks with Sibling Resolution

Each node maintains a version counter. After healing, conflicting versions are detected via vector clocks:

Side A: write X=1 โ†’ version ([A,1])
Side B: write X=2 โ†’ version ([B,1])

Neither dominates the other (A doesn't know about B's write, vice versa).
Resolution: both values are kept as siblings: {1, 2}

On next read, the application merges them:
- Shopping cart: union of items {milk} + {eggs} = {milk, eggs}
- Profile name: ask the user which one to keep
- Counter: sum (if appropriate) or max (if appropriate)

Both approaches are used in production. LWW for simplicity (most systems), siblings for data integrity (Dynamoโ€™s shopping cart).

Solution 4

Part 1: Single-machine database with hard drive failure

CAP does NOT apply to this scenario in the meaningful sense:

  • There is NO network (single machine).
  • Partition Tolerance (โ€œPโ€) is about network partitions โ€” the system cannot be split into groups that cannot communicate.
  • The hard drive failure is a node failure, not a partition.

A single-node system is effectively โ€œCAโ€ trivially:

  • C: Thereโ€™s only one copy of data, so itโ€™s always consistent.
  • A: Itโ€™s available as long as the node is up.
  • P: โ€œPartitionsโ€ are impossible because thereโ€™s only one node.

This is why CAP is only interesting for DISTRIBUTED (multi-node) systems.

Part 2: Single PostgreSQL instance with WAL streaming to standby

CAP DOES apply here because there are two nodes connected by a network:

  • If the primary fails and the standby isnโ€™t promoted: system becomes unavailable (no A).
  • If the network between primary and standby drops: the standby falls behind (no C if reads from standby).
  • If you configure synchronous replication: the primary blocks writes when the standby is unreachable (no A for writes).

So even a โ€œsimpleโ€ primary-standby setup is subject to CAP. The moment you have two nodes connected by a network, CAP applies.