Union-Find (Disjoint Set)
Efficiently track and merge disjoint sets using union by rank and path compression.
The Mathematics
Equivalence relations, union by rank, and path compression
Disjoint Set Partitions
A disjoint-set data structure maintains a collection of non-overlapping sets that partition a universe :
Each set is identified by a representative element. Two operations: returns the representative, merges two sets.
Union by Rank
Each tree root has a rank (upper bound on height). When merging, attach the shorter tree under the taller:
This keeps tree heights logarithmic:
Path Compression
During , make every node on the path point directly to the root:
Combined with union by rank, the amortized cost per operation is:
How Slow Does α Grow?
for all practical values of (up to ). This makes union-find effectively constant time per operation.
See It Work
Union by rank and path compression on 6 elements
Initialize: each element is its own parent. All ranks are 0.
The Code
Bridge from set operations to Python implementation
Mathematical Formulation
Initialize each element as its own set (self-loop = root)
All ranks start at 0 (single-element trees)
Path compression: make every node on path point directly to root
Union by rank: always attach smaller tree under larger tree's root
Make the smaller tree's root point to the larger tree's root
With both optimizations, amortized cost is inverse Ackermann α(n) ≈ O(1)
Python Implementation
class UnionFind:
def __init__(self, n):
self.parent = list(range(n)) # Each element is its own root
self.rank = [0] * n # Rank (upper bound on height)
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # Path compression
return self.parent[x]
def union(self, x, y):
rx, ry = self.find(x), self.find(y)
if rx == ry: return # Already in same set
if self.rank[rx] < self.rank[ry]:
rx, ry = ry, rx # Ensure rx has higher rank
self.parent[ry] = rx # Attach smaller tree under larger
if self.rank[rx] == self.rank[ry]:
self.rank[rx] += 1 # Increment rank if equal