Decentralized Systems · DHTs Kademlia

Key Question

How do you find a file in a peer-to-peer network where no one knows where everything is?

Deep Dive

A Distributed Hash Table (DHT) is a decentralized key-value store spread across many nodes. You PUT a value under a key, and anyone who knows the key can GET the value back — no central server required.

Before DHTs, P2P lookedups were broken in two ways:

Napster (1999):           Gnutella (2000):         DHT (2001+):
     ┌─────┐              ┌──┐ ┌──┐ ┌──┐          ┌──┐ ┌──┐ ┌──┐
     │Index│              │A │⇄│B │⇄│C │          │A │⇄│B │⇄│C │
     └──┬──┘              └──┘ └──┘ └──┘          └──┘ └──┘ └──┘
        │                   │⇄│  │⇄│  │⇄│           │⇄│  │⇄│  │⇄│
   ┌────┴────┐             ┌──┐ ┌──┐ ┌──┐          ┌──┐ ┌──┐ ┌──┐
   ▼         ▼             │D │⇄│E │⇄│F │          │D │⇄│E │⇄│F │
 ┌──┐       ┌──┐           └──┘ └──┘ └──┘          └──┘ └──┘ └──┘
 │A │       │B │
 └──┘       └──┘         Each query floods to      Each node knows
                         every neighbor (O(n))     a few "routes"
Central index = SPOF

Napster: Central index server. Fast, but a single point of failure (and a lawsuit magnet).
Gnutella: Flood queries to all neighbors. TTL field limits spread, but still O(n) messages per query — doesn’t scale.
DHT: Each node stores a small slice of the key space. Queries route through O(log n) hops. No central coordination.

Core operations are dirt simple:

PUT("song.mp3", 192.168.1.5)  → stored on responsible node(s)
GET("song.mp3")                → returns 192.168.1.5

This looks like consistent hashing, and it is — but with a twist. Consistent hashing assumes you know the full ring of nodes. In a P2P network, you join, you leave, you crash. You don’t have a directory. You know maybe 5-10 nodes. DHTs answer: how do you route to the right node when you only know a handful of peers?

The answer depends on the DHT protocol. Chord uses a finger table and clockwise distance. Kademlia uses XOR. Pastry uses prefix matching. They all achieve O(log n) lookups with only O(log n) routing state per node.

Check Your Understanding

What are the scaling properties of Napster, Gnutella, and a DHT (in terms of messages per lookup)?
Why can’t you just use consistent hashing directly in a P2P network?
Draw the communication pattern of each system from memory.

The “So What?”

DHTs are the routing layer of every major decentralized system today — BitTorrent, IPFS, Ethereum’s discovery protocol. Without them, P2P networks either centralize or drown in flooded traffic. They’re the reason “find this data” scales to millions of nodes.

✏️ Exercises

Exercises: DHTs & Kademlia

Exercise 1

In Kademlia with 8-bit IDs, what is the XOR distance between node IDs 11001010 and 11011010?

Exercise 2

Why does Kademlia use parallel queries (α > 1) instead of sequential queries (α = 1)? What’s the trade-off?

Exercise 3

What problem does S/Kademlia’s disjoint path routing solve, and how does it protect against an attacker who controls ~30% of the network?

Exercise 4 (Challenge)

In a Kademlia network with N = 100,000 nodes and k = 20, assume each node has a fully populated routing table. How many distinct nodes can a single node know about (maximum)? How many hops should a lookup take in the worst case?

👁️ View Solutions

Solutions: DHTs & Kademlia

Exercise 1

XOR distance between 11001010 and 11011010.

  11001010
⊕ 11011010
──────────
  00010000  = 16

The distance is 16.

Exercise 2

Parallel queries (α > 1) vs sequential.

Parallel queries (α=3) reduce lookup latency by sending requests simultaneously. If one node is slow or offline, the other α-1 requests proceed independently. The trade-off is bandwidth: each hop sends 3x as many messages. In practice this is a good trade because:

It masks network latency (RTT matters less)
It provides fault tolerance (dead nodes don’t stall the lookup)
The XOR topology means responses fill routing tables faster

Exercise 3

S/Kademlia’s disjoint path routing.

Vanilla Kademlia routes all queries toward the same k closest nodes. If an attacker controls those nodes (via a Sybil or Eclipse attack), they can return false results. S/Kademlia sends lookups along d independent paths to different sets of nodes. If the attacker controls ~30% of the network, the probability of them controlling all d paths is (0.3)^d. With d=3, the attacker has only a 2.7% chance of controlling every path, greatly reducing attack success.

Exercise 4 (Challenge)

Routing table capacity and lookup hops.

Each node has 160 k-buckets, each holding up to k=20 contacts. Maximum nodes known per node: 160 × 20 = 3,200.

In practice, many buckets will be sparsely populated (especially high-distance ranges), so a node typically knows far fewer.

Lookup hops (worst case): log₂(N) = log₂(100,000) ≈ 16.6. So 17 hops maximum.

In practice, with α=3 parallelism and well-populated routing tables, lookups converge in ~10-12 hops.