Consistency Trade offs · CAP Theorem

Key Question

What do Consistency, Availability, and Partition Tolerance actually mean in formal terms?

Deep Dive

The CAP theorem states that a distributed data store can only provide two of three guarantees simultaneously: Consistency, Availability, and Partition Tolerance. But the devil is in the definitions. The theorem is mathematically airtight, but only if you use the precise definitions. Let’s nail them down.

C — Consistency (Linearizability)

In CAP, “Consistency” means linearizability (also called “atomic consistency”). Not eventual consistency, not causal consistency — linearizability. Formally:

All nodes see the same data at the same time.
After a write completes, all subsequent reads (from any node) return that value.
The system behaves as if there is a single, global copy of the data.

This is NOT the same as “consistency” in ACID (which refers to application-level invariants, like foreign keys or check constraints). CAP Consistency is strictly about the visibility of writes across replicas.

CAP Consistent System:
Client A writes X=1 to Node 1
Client B reads X from Node 2 → gets X=1

Not CAP Consistent:
Client A writes X=1 to Node 1
Client B reads X from Node 2 → gets X=0 (stale)

A — Availability

“Availability” means every request received by a non-failing node must result in a response. Not “a correct response” — just “a response.” The response could be an error (e.g., “internal server error”), but it cannot be a timeout or a dropped connection.

Key nuance: Availability only applies to non-failing nodes. If a node has crashed, the system isn’t required to respond from that node. But any node that IS running must process every request it receives.

Available System:
Client → Node 2: read X → "X=0" (even if stale)
Client → Node 1: read X → "X=42"

Unavailable System:
Client → Node 2: read X → [timeout, connection refused, no response]

Availability does NOT mean “fast.” A 30-second response time is still “available” in the CAP sense. Availability means “the system doesn’t crash or drop requests.”

P — Partition Tolerance

A “partition” is a break in the network that prevents messages from being delivered between nodes. Partition tolerance means the system continues to operate despite an arbitrary number of messages being dropped or delayed between nodes.

Partition:
  ┌─────┐     ┌─────┐
  │  A  │~~~~~│  B  │
  └─────┘     └─────┘
  Messages between A and B are lost (network cut)

The network is NOT reliable. Partitions happen: a switch fails, a cable is cut, a router software update goes wrong, a network interface card overheats. Partitions are not rare — at Facebook’s scale, a network partition is a daily occurrence.

A system that is partition-tolerant CANNOT fail entirely when a partition occurs. It might degrade, but it survives.

The “Pick Two” Trap

The classic statement is “pick two of three.” But this is misleading. In practice:

If you have a P system (distributed across a network), partitions WILL happen.
Therefore, you must deal with partitions.
During a partition, you must choose: either C or A.

So the real choice is: CP or AP during a partition. You get P whether you want it or not (assuming you’re distributed). And during a partition, you sacrifice either C or A.

The “pick two” statement has caused endless confusion. A more accurate version: “In the presence of a partition, you must choose between consistency and availability.”

Check Your Understanding

In CAP, does “Consistency” mean the same thing as ACID consistency? Explain.
If a server responds with “HTTP 500 Internal Server Error,” is the system CAP-available?
Can a single-node (non-replicated) database be classified under CAP? Why or why not?
Why is “pick CP” different from “pick CA” in a distributed system?

The “So What?”

Most debates about CAP are actually debates about definitions. If two engineers are arguing about whether Cassandra “has consistency,” check if they mean CAP Consistency (linearizability — no, Cassandra doesn’t have it) or “some kind of consistency” (yes, Cassandra provides eventual consistency). Using precise definitions prevents hours of wasted argument. CAP is most useful as a vocabulary for discussing trade-offs, not as a technical check-box.

✏️ Exercises

CAP Theorem — Exercises

Exercise 1

A traditional relational database (e.g., PostgreSQL) with a single primary and two asynchronous replicas experiences a network partition that isolates the primary from the replicas. Is this system CP or CA during the partition? Explain what happens.

Exercise 2

A system claims to provide “eventual consistency.” Can it also claim to provide CAP Consistency (linearizability)? Explain the relationship between eventual consistency and CAP’s C.

Exercise 3

During a network partition, an AP system accepts writes on both sides of the partition. When the partition heals, those writes might conflict. Describe two approaches the system can use to resolve these conflicts and converge.

Exercise 4

A single-machine database with no replication experiences a hard drive failure that prevents any reads or writes. Does CAP apply to this scenario? Why or why not? What about a single-machine database that IS replicated (e.g., a single PostgreSQL instance that streams WAL to a standby)?

👁️ View Solutions

CAP Theorem — Solutions

Solution 1

This system is neither cleanly CP nor AP — it’s a CP system that becomes unavailable during a partition.

Here’s what happens:

The primary is isolated from the replicas.
Writes to the primary succeed (no C — replicas don’t have them).
Reads from the primary return the latest data (A for primary-connected clients).
Reads from the replicas return stale data (C is violated — replicas are behind the primary).
Read-only clients connected to replicas see stale data silently.

PostgreSQL’s asynchronous replication does NOT detect or handle partitions explicitly. So during a partition:

The system is available (writes succeed on primary, reads on both sides) — ✓ A
But C is violated: replicas serve stale data — ✗ C
During the partition, it behaves like AP (inconsistently)

However, if you consider the system as designed (it wasn’t designed for partition tolerance), then when the partition heals, the replicas eventually catch up. But during the partition, it is NOT consistent.

If the replicas were SYNCHRONOUS (waiting for at least one replica to acknowledge), the system would be CP: the primary would block writes during the partition (since the replica can’t acknowledge), and reads from the primary would be consistent but the primary would be effectively unavailable for writes.

Solution 2

No. Eventual consistency and CAP Consistency (linearizability) are different things.

CAP Consistency (linearizability) provides a real-time guarantee: once a write completes, every subsequent read must return that value.
Eventual consistency guarantees convergence only when writes stop. It does NOT guarantee that a read after a write returns the latest value.

You CANNOT claim CAP Consistency AND eventual consistency for the same operation at the same time. Here’s why:

Client A writes X=1 (completed)
Client B reads X (starts 1ms after A's write completed)
  - Under CAP-C: B must see X=1
  - Under EC: B might see X=0 (it will eventually converge, maybe after 100ms)

However, a system can provide both guarantees for DIFFERENT operations:

Strongly consistent (CAP-C) writes to the catalog table.
Eventually consistent reads to the user data table.

Or provide tunable consistency:

Cassandra: read at QUORUM → CAP-C, read at ONE → eventual consistency.

The key insight: eventual consistency is a WEAKER guarantee than CAP-C. They are on a spectrum, not two sides of a coin.

Solution 3

Approach 1: Last Writer Wins (LWW)

Each write is timestamped. When the partition heals and writes are compared, the write with the latest timestamp wins.

Side A: write X=1 at T=10:00:05
Side B: write X=2 at T=10:00:06

After healing: X=2 wins (T=10:00:06 > 10:00:05)

Simple and always converges. But Side A’s write (X=1) is silently lost.

Approach 2: Vector Clocks with Sibling Resolution

Each node maintains a version counter. After healing, conflicting versions are detected via vector clocks:

Side A: write X=1 → version ([A,1])
Side B: write X=2 → version ([B,1])

Neither dominates the other (A doesn't know about B's write, vice versa).
Resolution: both values are kept as siblings: {1, 2}

On next read, the application merges them:
- Shopping cart: union of items {milk} + {eggs} = {milk, eggs}
- Profile name: ask the user which one to keep
- Counter: sum (if appropriate) or max (if appropriate)

Both approaches are used in production. LWW for simplicity (most systems), siblings for data integrity (Dynamo’s shopping cart).

Solution 4

Part 1: Single-machine database with hard drive failure

CAP does NOT apply to this scenario in the meaningful sense:

There is NO network (single machine).
Partition Tolerance (“P”) is about network partitions — the system cannot be split into groups that cannot communicate.
The hard drive failure is a node failure, not a partition.

A single-node system is effectively “CA” trivially:

C: There’s only one copy of data, so it’s always consistent.
A: It’s available as long as the node is up.
P: “Partitions” are impossible because there’s only one node.

This is why CAP is only interesting for DISTRIBUTED (multi-node) systems.

Part 2: Single PostgreSQL instance with WAL streaming to standby

CAP DOES apply here because there are two nodes connected by a network:

If the primary fails and the standby isn’t promoted: system becomes unavailable (no A).
If the network between primary and standby drops: the standby falls behind (no C if reads from standby).
If you configure synchronous replication: the primary blocks writes when the standby is unreachable (no A for writes).

So even a “simple” primary-standby setup is subject to CAP. The moment you have two nodes connected by a network, CAP applies.

🎮 Interactive CAP Theorem

Scroll to load interactive visualization…