K-Means Clustering
Partition data into k clusters by iteratively assigning points and updating centroids.
The Mathematics
WCSS objective, assignment step, update step, and convergence
Objective: Within-Cluster Sum of Squares
K-Means seeks to partition n data points into k clusters C_1, ..., C_k by minimizing the within-cluster sum of squares (WCSS), the total squared distance from each point to its cluster centroid:

$$\mathrm{WCSS} = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2$$

where $\mu_j$ is the centroid (mean) of cluster $C_j$.
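As a concrete check, the objective can be computed directly from an assignment. Here is a minimal sketch in NumPy; the names X (an (n, d) array of points), labels (cluster index per point), and centroids (a (k, d) array) are illustrative, not part of any fixed API:

    import numpy as np

    def wcss(X, labels, centroids):
        # Sum of squared distances from each point to its assigned centroid
        return float(np.sum((X - centroids[labels]) ** 2))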
Lloyd's Algorithm: Two Alternating Steps
1. Assignment Step
Assign each point $x_i$ to the nearest centroid:

$$c_i = \arg\min_{j \in \{1,\dots,k\}} \lVert x_i - \mu_j \rVert^2$$
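In code, the assignment step is a nearest-centroid lookup. A hedged NumPy sketch, assuming X is an (n, d) array and centroids a (k, d) array:

    import numpy as np

    def assign(X, centroids):
        # Distance from every point to every centroid, shape (n, k)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        return dists.argmin(axis=1)  # index of the nearest centroid per point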
2. Update Step
Recompute each centroid as the mean of its assigned points:

$$\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i$$
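The update step is just a per-cluster mean. A minimal sketch, assuming labels come from the assignment step above; note that an empty cluster would need special handling (for example, keeping its old centroid):

    import numpy as np

    def update(X, labels, k):
        # New centroid for each cluster = mean of the points assigned to it
        return np.array([X[labels == j].mean(axis=0) for j in range(k)])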
Convergence
Each step decreases (or at worst maintains) WCSS, and there are only finitely many ways to partition the points, so the algorithm is guaranteed to converge in a finite number of iterations. However, it may converge to a local minimum rather than the global optimum, so the result depends on how the centroids are initialized.
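A common mitigation (scikit-learn exposes it as the n_init parameter) is to run the algorithm from several random initializations and keep the solution with the lowest WCSS. A sketch of that restart loop, assuming the k_means and wcss helpers defined elsewhere on this page:

    best, best_score = None, float("inf")
    for seed in range(10):                      # 10 random restarts
        labels, centroids = k_means(X, k=3, seed=seed)
        score = wcss(X, labels, centroids)
        if score < best_score:                  # keep the lowest-WCSS run
            best, best_score = (labels, centroids), score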
K-Means++ Initialization
Random initialization can lead to poor local minima. K-Means++ selects initial centroids that are spread apart: the first is chosen uniformly at random, and each subsequent centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen. This seeding gives an O(log k)-competitive ratio in expectation.
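The seeding procedure itself is short. A minimal sketch of D²-weighted sampling, not a drop-in replacement for a library implementation:

    import numpy as np

    def kmeans_pp_init(X, k, seed=None):
        rng = np.random.default_rng(seed)
        centroids = [X[rng.integers(len(X))]]        # first centroid: uniform
        for _ in range(k - 1):
            # Squared distance from each point to its nearest chosen centroid
            diffs = X[:, None, :] - np.array(centroids)[None, :, :]
            d2 = np.min((diffs ** 2).sum(axis=2), axis=1)
            probs = d2 / d2.sum()                    # D^2 weighting
            centroids.append(X[rng.choice(len(X), p=probs)])
        return np.array(centroids)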
See It Work
Watch K-Means alternate between assign and update steps
[Interactive demo: a scatter plot of clusters and centroids alongside a chart of WCSS over iterations. The demo initializes 3 centroids at arbitrary positions, then alternates between assign and update steps while tracking WCSS.]
The Code
Bridge from mathematical formulation to Python implementation
Mathematical Formulation
1. Initialize k centroids (randomly or with a heuristic such as K-Means++)
2. Compute the distance from each point to each centroid
3. Assign each point to its nearest centroid
4. Update each centroid to the mean of its assigned points
5. Stop when the centroids no longer change (or a maximum iteration count is reached)
Python Implementation
import numpy as np

def k_means(X, k=3, max_iter=100, tol=1e-6, seed=None):
    rng = np.random.default_rng(seed)
    # Random init: choose k distinct data points as starting centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for iteration in range(max_iter):
        # Assign step
        assignments = []
        for x in X:                                      # For each point
            dists = [np.linalg.norm(x - c) for c in centroids]
            assignments.append(int(np.argmin(dists)))    # Nearest centroid
        assignments = np.array(assignments)
        # Update step
        new_centroids = centroids.copy()
        for j in range(k):                               # For each cluster
            cluster = X[assignments == j]
            if len(cluster) > 0:                         # Skip empty clusters
                new_centroids[j] = cluster.mean(axis=0)  # New centroid
        # Converged when the centroids stop moving
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break                                        # Done
    return assignments, centroids
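A quick usage sketch on synthetic data; the blob locations and sizes are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    # Three Gaussian blobs in 2-D
    X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2))
                   for c in [(0, 0), (5, 5), (0, 5)]])

    labels, centroids = k_means(X, k=3, seed=0)
    print(centroids)   # should land near (0, 0), (5, 5), and (0, 5)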