Union-Find — KLAR

Phase 1f(x)

The Mathematics

Equivalence relations, union by rank, and path compression

Disjoint Set Partitions

A disjoint-set data structure maintains a collection of non-overlapping sets $S_1, S_2, \ldots, S_k$ that partition a universe $U$ :

S_i \cap S_j = \emptyset \;\;\forall i \neq j, \quad \bigcup_i S_i = U

Each set is identified by a representative element. Two operations: $\text{find}(x)$ returns the representative, $\text{union}(x, y)$ merges two sets.

Union by Rank

Each tree root has a rank (upper bound on height). When merging, attach the shorter tree under the taller:

\text{rank}(r_x) < \text{rank}(r_y) \;\Rightarrow\; \text{parent}(r_x) = r_y

This keeps tree heights logarithmic:

\text{height}(T) \leq \lfloor \log_2 n \rfloor

Path Compression

During $\text{find}(x)$ , make every node on the path point directly to the root:

\text{find}(x): \;\; x \to p(x) \to \cdots \to r \;\;\xrightarrow{\text{compress}}\;\; x \to r

Combined with union by rank, the amortized cost per operation is:

\alpha(n) \quad \text{(inverse Ackermann — effectively } O(1)\text{)}

How Slow Does α Grow?

$\alpha(n) \leq 4$ for all practical values of $n$ (up to $2^{2^{2^{65536}}}$ ). This makes union-find effectively constant time per operation.

Phase 2▶

See It Work

Union by rank and path compression on 6 elements

Operation: init

\forall x:\; \text{parent}(x) = x,\; \text{rank}(x) = 0

Step 1 of 8

Initialize: each element is its own parent. All ranks are 0.

Phase 3</>

The Code

Bridge from set operations to Python implementation

Mathematical Formulation

\forall x:\; \text{parent}(x) = x

Initialize each element as its own set (self-loop = root)

\text{rank}(x) = 0

All ranks start at 0 (single-element trees)

\text{find}(x): x \to \text{parent}(x) \to \cdots \to \text{root}

Path compression: make every node on path point directly to root

\text{rank}(r_x) < \text{rank}(r_y) \Rightarrow \text{swap}

Union by rank: always attach smaller tree under larger tree's root

\text{parent}(r_y) = r_x

Make the smaller tree's root point to the larger tree's root

\alpha(n) \text{ amortized per operation}

With both optimizations, amortized cost is inverse Ackermann α(n) ≈ O(1)

Python Implementation

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # Each element is its own root
        self.rank = [0] * n            # Rank (upper bound on height)

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return            # Already in same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx            # Ensure rx has higher rank
        self.parent[ry] = rx           # Attach smaller tree under larger
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1         # Increment rank if equal

Union-Find (Disjoint Set)