Consensus Fault Tolerance · Byzantine Fault Tolerance

Key Question

Why do you need 3f+1 nodes to tolerate f Byzantine faults?

Deep Dive

For crash faults, you need 2f+1 nodes to tolerate f failures (a majority). For Byzantine faults, you need 3f+1 — three times as many. Where does this extra overhead come from?

Intuition: In crash fault tolerance, a faulty node says nothing. You can simply ignore it. In BFT, a faulty node can say different things to different people. You can’t just discard its messages because you don’t know which messages are lies.

The 3f+1 proof:

Imagine a system with N nodes, f of which are Byzantine. For safety (agreement), every correct node must receive enough consistent messages to decide.

Worst case: f faulty nodes are unresponsive (crash-like), 
            and f faulty nodes send conflicting information.

Correct nodes:     [C1]  [C2]  [C3]  ...  [C_(N-2f)]
Byzantine (liars): [B1]  [B2]  ...  [B_f]
Byzantine (silent):[S1]  [S2]  ...  [S_f]

A correct node needs responses from N - f nodes (wait for slowest, 
ignore silent failed). Of those N - f responses, up to f could be lies.
So we need: N - f - f > f  (honest responses > remaining Byzantine)
           → N - 2f > f
           → N > 3f
           → N >= 3f + 1

Visual: f=1 requires 4 nodes

f=1, N=4 (minimum viable):
  [C1] [C2] [C3] [B1]
  
  Node C1 waits for N-f = 3 responses (including itself):
    C1, C2, C3 (all honest) → C1 gets 3 matching responses ✓
    C1, C2, B1 (B1 lies) → C1 gets 2 matching, 1 different → still OK
  
  Without B1's response: C1 only has C2 and C3 = 2 = not enough.
  So C1 must wait for 3 responses, including possibly B1.

f=1, N=3 (insufficient):
  [C1] [C2] [B1]
  
  Node C1 waits for N-f = 2 responses (including itself):
    C1, C2 → both honest → agree ✓
    C1, B1 → B1 lies → C1 thinks decision is X, 
                        but C2 heard something else from B1

  With only 3 nodes, one Byzantine can cause confusion. 
  If C1 hears from C2 (honest) about value X, and from B1 about value Y,
  C1 can't form a reliable majority.

Practical overhead:

Faults (f)	Nodes needed (3f+1)	Crash tolerance (2f+1)	Overhead
1	4	3	33% more
2	7	5	40% more
3	10	7	43% more
5	16	11	45% more

BFT costs roughly 50% more nodes than crash fault tolerance for the same fault tolerance. Plus the protocol overhead is much higher (more messages, more phases).

Check Your Understanding

How many nodes are needed to tolerate 3 Byzantine faulty nodes?
Why can’t 4 nodes tolerate 2 Byzantine faults?
In crash fault tolerance, you need 2f+1 nodes. Why does BFT need an extra f nodes?

The “So What?”

BFT is expensive — roughly 3x the fault-tolerance overhead compared to Paxos/Raft (which only need 2f+1 for crash faults). This is why BFT was considered impractical for decades. The 3f+1 bound is the fundamental reason: to survive liars, you need redundancy beyond what crash tolerance requires.

✏️ Exercises

Byzantine Fault Tolerance: Exercises

Exercise 1: 3f+1 Bound

How many nodes are needed to tolerate 3 Byzantine faulty nodes? Show the math.

Exercise 2: PBFT Phases

In PBFT, why is the Prepare phase needed if we already have Pre-prepare? What problem would occur with only Pre-prepare and Commit?

Exercise 3: Proof-of-Work as BFT

How does Proof-of-Work act as a probabilistic BFT mechanism? What happens if an attacker controls 40% of the total hashrate?

👁️ View Solutions

Byzantine Fault Tolerance: Solutions

Exercise 1

N ≥ 3f + 1 = 3(3) + 1 = 10 nodes.

For f = 3, you need at least 10 nodes. With 10 nodes, you can tolerate up to 3 Byzantine faults. With only 9 nodes (3f, but 3f+1 is 10), you’d have 3 correct nodes per “group” — any 3 faulty nodes could group together and confuse the protocol since honest nodes need 2f+1 replies but can’t distinguish liars from honest ones in the worst case.

Compare to crash faults: 2f+1 = 7 nodes needed for f=3 crash tolerance. BFT requires 3 more nodes.

Exercise 2

Without the Prepare phase, a single faulty primary could send different Pre-prepare messages to different replicas, and those replicas would commit different values in the Commit phase — violating agreement.

The Prepare phase ensures within-view consistency: replicas share their Pre-prepare receipts with each other. When a replica collects 2f matching Prepare messages (including its own Pre-prepare), it knows that at least f+1 correct replicas received the same Pre-prepare. Even if the primary is Byzantine, the correct replicas have agreed on the ordering.

The Commit phase then ensures cross-view consistency: that the decision survives view changes.

Without Prepare, a Byzantine primary could send Pre-prepare(“attack”) to half the replicas and Pre-prepare(“retreat”) to the other half. The Commit phase alone can’t detect this because replicas don’t cross-check what they received from the primary.

Exercise 3

Proof-of-Work acts as probabilistic BFT because:

Identity is expensive: Each block requires computational work. An adversary must expend compute power proportional to their hashrate to produce blocks.
Chain selection rule: The longest (most work) chain is considered canonical. To reverse a transaction, the adversary must outpace the honest chain — requiring >50% of total hashrate.
Probabilistic finality: Each confirmation block reduces the probability that an adversary can rewrite history.

With 40% hashrate, an attacker has a significant but not dominant position. The probability of a successful double-spend attack decreases exponentially with confirmations:

After 1 block: ~40% chance (if attacker has pre-mined)
After 3 blocks: ~10% chance
After 6 blocks: ~0.8% chance
After 10 blocks: ~0.02% chance

The attacker can temporarily overtake the honest chain, but the honest chain grows faster on average. The system recommends waiting for enough confirmations (typically 6 for Bitcoin) to make reversal economically infeasible.

This contrasts with classical BFT where finality is instant and deterministic — once 2f+1 nodes commit, the decision is final regardless of adversarial hashrate.