Union-Find (Disjoint Set)

Efficiently track and merge disjoint sets using union by rank and path compression.

O(α(n))Data Structures
Phase 1f(x)

The Mathematics

Equivalence relations, union by rank, and path compression

Disjoint Set Partitions

A disjoint-set data structure maintains a collection of non-overlapping sets S1,S2,,SkS_1, S_2, \ldots, S_k that partition a universe UU:

SiSj=    ij,iSi=US_i \cap S_j = \emptyset \;\;\forall i \neq j, \quad \bigcup_i S_i = U

Each set is identified by a representative element. Two operations: find(x)\text{find}(x) returns the representative, union(x,y)\text{union}(x, y) merges two sets.

Union by Rank

Each tree root has a rank (upper bound on height). When merging, attach the shorter tree under the taller:

rank(rx)<rank(ry)    parent(rx)=ry\text{rank}(r_x) < \text{rank}(r_y) \;\Rightarrow\; \text{parent}(r_x) = r_y

This keeps tree heights logarithmic:

height(T)log2n\text{height}(T) \leq \lfloor \log_2 n \rfloor

Path Compression

During find(x)\text{find}(x), make every node on the path point directly to the root:

find(x):    xp(x)r    compress    xr\text{find}(x): \;\; x \to p(x) \to \cdots \to r \;\;\xrightarrow{\text{compress}}\;\; x \to r

Combined with union by rank, the amortized cost per operation is:

α(n)(inverse Ackermann — effectively O(1))\alpha(n) \quad \text{(inverse Ackermann — effectively } O(1)\text{)}

How Slow Does α Grow?

α(n)4\alpha(n) \leq 4 for all practical values of nn (up to 222655362^{2^{2^{65536}}}). This makes union-find effectively constant time per operation.

Phase 2

See It Work

Union by rank and path compression on 6 elements

0r=01r=02r=03r=04r=05r=0
Operation: init
x:  parent(x)=x,  rank(x)=0\forall x:\; \text{parent}(x) = x,\; \text{rank}(x) = 0
Step 1 of 8

Initialize: each element is its own parent. All ranks are 0.

Phase 3</>

The Code

Bridge from set operations to Python implementation

Mathematical Formulation

x:  parent(x)=x\forall x:\; \text{parent}(x) = x

Initialize each element as its own set (self-loop = root)

rank(x)=0\text{rank}(x) = 0

All ranks start at 0 (single-element trees)

find(x):xparent(x)root\text{find}(x): x \to \text{parent}(x) \to \cdots \to \text{root}

Path compression: make every node on path point directly to root

rank(rx)<rank(ry)swap\text{rank}(r_x) < \text{rank}(r_y) \Rightarrow \text{swap}

Union by rank: always attach smaller tree under larger tree's root

parent(ry)=rx\text{parent}(r_y) = r_x

Make the smaller tree's root point to the larger tree's root

α(n) amortized per operation\alpha(n) \text{ amortized per operation}

With both optimizations, amortized cost is inverse Ackermann α(n) ≈ O(1)

Python Implementation

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # Each element is its own root
        self.rank = [0] * n            # Rank (upper bound on height)

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return            # Already in same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx            # Ensure rx has higher rank
        self.parent[ry] = rx           # Attach smaller tree under larger
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1         # Increment rank if equal