Distributed & Decentralized Systems Curriculum
Consistency Trade offs Β· CAP Theorem

Key Question

Can you β€œchoose CA” and ignore partition tolerance?

Deep Dive

One of the most persistent CAP myths is the idea that you can choose CA (Consistency + Availability) and simply hope partitions don’t happen. This is seductive: β€œMy deployment is on a single rack with redundant switches. Partitions are extremely rare. I’ll just skip partition tolerance.”

This reasoning is dangerously wrong. Here’s why.

Partitions Are Inevitable

At scale, network partitions are not rare β€” they are guaranteed. Google’s experience: in a cluster of 10,000 machines, there are dozens of network failures per day. Switches fail. Cables break. Software bugs cause routing tables to corrupt. Power supplies die. A deployment gets misconfigured.

Statistically: if a single machine has a 0.01% chance of a network failure per day, a 10,000-machine cluster has 63% probability of at least one failure per day (1 - 0.9999^10000).

What Happens to a β€œCA” System During a Partition

Let’s say you deploy a replicated database with two nodes on the same rack, with a β€œCA” configuration β€” meaning you intend to have both consistency and availability, and you’re ignoring partitions.

Normal operation:
β”Œβ”€β”€β”€β”€β”€β”β”€β”€β”€β”€β”€β”€β”€β”
β”‚  A  β”‚       β”‚  B  β”‚
β””β”€β”€β”€β”€β”€β”˜       β””β”€β”€β”€β”€β”€β”˜
   ↑              ↑
 Client writes    Client reads
 to A,            from B,
 syncs to B       sees latest

All is well. Then a partition occurs (someone trips over a cable):

During partition:
β”Œβ”€β”€β”€β”€β”€β”   //   β”Œβ”€β”€β”€β”€β”€β”
β”‚  A  β”‚~~~~~~~~β”‚  B  β”‚
β””β”€β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”€β”˜
   ↑               ↑
 Client A       Client B
 writes X=1     reads X 

Now what? The system did not design for partitions (it’s β€œCA”). There is no partition-handling logic. So:

  • The write to A succeeds (A is available).
  • B can’t sync with A.
  • Client B’s read reaches B. B returns X=0 (old value).

The system has violated Consistency. It returned stale data. Your β€œCA” system just became an inconsistent, unavailable mess. You didn’t avoid the partition β€” you just avoided planning for it, which made things worse than if you had chosen CP or AP deliberately.

Illustration: A β€œCA” system’s failure modes

                    Partition occurs
                    β”‚
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚                       β”‚
   Ignore partition       Detect partition
   (no handling)          (handle explicitly)
        β”‚                       β”‚
   β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”            β”Œβ”€β”€β”€β”€β”΄β”€β”€β”€β”€β”
   β”‚ B can't  β”‚            β”‚  CP     β”‚     β”‚  AP     β”‚
   β”‚ sync β†’   β”‚            β”‚(block orβ”‚     β”‚(serve   β”‚
   β”‚ stale    β”‚            β”‚ error)  β”‚     β”‚stale)   β”‚
   β”‚ reads    β”‚            β”‚         β”‚     β”‚         β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
   System silently violates consistency
   Users see data disappear
   Hard to debug ("it worked in testing!")

Real-World Examples of Partition Handling Failure

MongoDB (pre-3.6): MongoDB’s β€œeventually consistent” secondary reads could serve stale data during network issues. Data centers experienced β€œrollback” scenarios where committed writes vanished after a replica set election. MongoDB was designed for availability, but the inconsistency wasn’t always obvious to users who thought they were getting β€œstrong consistency.”

GitHub’s 2018 MySQL partition incident: GitHub experienced a network partition between their MySQL primary and replicas. Reads from replicas returned stale data. Their β€œCA” assumption (partition should be impossible in a single datacenter with redundant networks) was proven wrong.

The Correct Interpretation of β€œCA”

There is NO β€œCA” in a distributed system. There is only:

  • CP: sacrifice availability during partition.
  • AP: sacrifice consistency during partition.
  • Single-node: trivial, no partition possible.

If you hear someone say β€œwe’re a CA system,” translate it as: β€œWe haven’t designed for partitions, and we will silently fail when one occurs.” Or: β€œWe’re a single-node system.”

Can You Minimize Partition Probability?

Yes. You can use:

  • Redundant network paths (multiple switches, diverse cabling).
  • Multiple datacenters (geographic redundancy).
  • Reliable hardware (but even the best hardware fails).

But you cannot eliminate partitions entirely. At Google’s scale, even a 99.999% reliable switch means thousands of failures per year across the fleet.

Check Your Understanding

  1. Why can’t a multi-node distributed system ever be β€œCA”?
  2. What happens to a single-node non-replicated database during a partition? Can it be classified as CA?
  3. A company says β€œwe run on a single AWS availability zone with redundant switches β€” partitions are impossible for us.” Why is this statement risky?
  4. What does β€œP” mean in CAP β€” do you choose to support it, or is it forced on you?

The β€œSo What?”

You cannot opt out of partition tolerance. Partitions are a fact of life in distributed systems, not a corner case. The only question is: what breaks when a partition occurs? If you plan for it, you can choose whether consistency or availability breaks. If you don’t plan for it, both break β€” and you won’t understand why. Engineering for partition tolerance is a fundamental requirement, not an optional feature.


✏️ Exercises

CAP Theorem β€” Exercises

Exercise 1

A traditional relational database (e.g., PostgreSQL) with a single primary and two asynchronous replicas experiences a network partition that isolates the primary from the replicas. Is this system CP or CA during the partition? Explain what happens.

Exercise 2

A system claims to provide β€œeventual consistency.” Can it also claim to provide CAP Consistency (linearizability)? Explain the relationship between eventual consistency and CAP’s C.

Exercise 3

During a network partition, an AP system accepts writes on both sides of the partition. When the partition heals, those writes might conflict. Describe two approaches the system can use to resolve these conflicts and converge.

Exercise 4

A single-machine database with no replication experiences a hard drive failure that prevents any reads or writes. Does CAP apply to this scenario? Why or why not? What about a single-machine database that IS replicated (e.g., a single PostgreSQL instance that streams WAL to a standby)?

πŸ‘οΈ View Solutions

CAP Theorem β€” Solutions

Solution 1

This system is neither cleanly CP nor AP β€” it’s a CP system that becomes unavailable during a partition.

Here’s what happens:

  • The primary is isolated from the replicas.
  • Writes to the primary succeed (no C β€” replicas don’t have them).
  • Reads from the primary return the latest data (A for primary-connected clients).
  • Reads from the replicas return stale data (C is violated β€” replicas are behind the primary).
  • Read-only clients connected to replicas see stale data silently.

PostgreSQL’s asynchronous replication does NOT detect or handle partitions explicitly. So during a partition:

  • The system is available (writes succeed on primary, reads on both sides) β€” βœ“ A
  • But C is violated: replicas serve stale data β€” βœ— C
  • During the partition, it behaves like AP (inconsistently)

However, if you consider the system as designed (it wasn’t designed for partition tolerance), then when the partition heals, the replicas eventually catch up. But during the partition, it is NOT consistent.

If the replicas were SYNCHRONOUS (waiting for at least one replica to acknowledge), the system would be CP: the primary would block writes during the partition (since the replica can’t acknowledge), and reads from the primary would be consistent but the primary would be effectively unavailable for writes.

Solution 2

No. Eventual consistency and CAP Consistency (linearizability) are different things.

  • CAP Consistency (linearizability) provides a real-time guarantee: once a write completes, every subsequent read must return that value.
  • Eventual consistency guarantees convergence only when writes stop. It does NOT guarantee that a read after a write returns the latest value.

You CANNOT claim CAP Consistency AND eventual consistency for the same operation at the same time. Here’s why:

Client A writes X=1 (completed)
Client B reads X (starts 1ms after A's write completed)
  - Under CAP-C: B must see X=1
  - Under EC: B might see X=0 (it will eventually converge, maybe after 100ms)

However, a system can provide both guarantees for DIFFERENT operations:

  • Strongly consistent (CAP-C) writes to the catalog table.
  • Eventually consistent reads to the user data table.

Or provide tunable consistency:

  • Cassandra: read at QUORUM β†’ CAP-C, read at ONE β†’ eventual consistency.

The key insight: eventual consistency is a WEAKER guarantee than CAP-C. They are on a spectrum, not two sides of a coin.

Solution 3

Approach 1: Last Writer Wins (LWW)

Each write is timestamped. When the partition heals and writes are compared, the write with the latest timestamp wins.

Side A: write X=1 at T=10:00:05
Side B: write X=2 at T=10:00:06

After healing: X=2 wins (T=10:00:06 > 10:00:05)

Simple and always converges. But Side A’s write (X=1) is silently lost.

Approach 2: Vector Clocks with Sibling Resolution

Each node maintains a version counter. After healing, conflicting versions are detected via vector clocks:

Side A: write X=1 β†’ version ([A,1])
Side B: write X=2 β†’ version ([B,1])

Neither dominates the other (A doesn't know about B's write, vice versa).
Resolution: both values are kept as siblings: {1, 2}

On next read, the application merges them:
- Shopping cart: union of items {milk} + {eggs} = {milk, eggs}
- Profile name: ask the user which one to keep
- Counter: sum (if appropriate) or max (if appropriate)

Both approaches are used in production. LWW for simplicity (most systems), siblings for data integrity (Dynamo’s shopping cart).

Solution 4

Part 1: Single-machine database with hard drive failure

CAP does NOT apply to this scenario in the meaningful sense:

  • There is NO network (single machine).
  • Partition Tolerance (β€œP”) is about network partitions β€” the system cannot be split into groups that cannot communicate.
  • The hard drive failure is a node failure, not a partition.

A single-node system is effectively β€œCA” trivially:

  • C: There’s only one copy of data, so it’s always consistent.
  • A: It’s available as long as the node is up.
  • P: β€œPartitions” are impossible because there’s only one node.

This is why CAP is only interesting for DISTRIBUTED (multi-node) systems.

Part 2: Single PostgreSQL instance with WAL streaming to standby

CAP DOES apply here because there are two nodes connected by a network:

  • If the primary fails and the standby isn’t promoted: system becomes unavailable (no A).
  • If the network between primary and standby drops: the standby falls behind (no C if reads from standby).
  • If you configure synchronous replication: the primary blocks writes when the standby is unreachable (no A for writes).

So even a β€œsimple” primary-standby setup is subject to CAP. The moment you have two nodes connected by a network, CAP applies.