- Soft clustering: data points can belong to multiple clusters with probability scores
- Models data as a mixture of K Gaussian distributions
- Each cluster has its own mean ($\mu$), covariance ($\Sigma$), and mixing coefficient ($\pi$)
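These parameters combine into the mixture density (standard GMM notation, consistent with the symbols above):

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1$$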
Expectation-Maximization (EM)
- Initialization: Randomly initialize the parameters of the K Gaussians
- E-step: Calculate posterior probabilities (responsibilities), i.e. the probability of each point belonging to each cluster
- M-step: Update means, covariances, and mixing coefficients using responsibility-weighted averages
- Convergence: Repeat until the log-likelihood converges
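The four steps above can be sketched in NumPy. This is an illustrative sketch, not a production implementation: the function name `em_gmm` and the synthetic demo data are my own, a fixed iteration count stands in for a proper log-likelihood convergence check, and real implementations work in log space for numerical stability.

```python
import numpy as np

def em_gmm(X, K, n_iter=100, seed=0):
    """Minimal EM for a K-component Gaussian mixture (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialization: random data points as means, identity covariances, uniform weights
    mu = X[rng.choice(n, K, replace=False)]
    Sigma = np.stack([np.eye(d)] * K)
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to pi_k * N(x_i | mu_k, Sigma_k)
        r = np.empty((n, K))
        for k in range(K):
            diff = X - mu[k]
            inv = np.linalg.inv(Sigma[k])
            norm = ((2 * np.pi) ** d * np.linalg.det(Sigma[k])) ** -0.5
            maha = np.einsum('ij,jk,ik->i', diff, inv, diff)  # squared Mahalanobis distance
            r[:, k] = pi[k] * norm * np.exp(-0.5 * maha)
        r /= r.sum(axis=1, keepdims=True)  # normalize rows to get posteriors
        # M-step: responsibility-weighted updates of pi, mu, Sigma
        Nk = r.sum(axis=0)
        pi = Nk / n
        mu = (r.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            # small diagonal term keeps covariances invertible
            Sigma[k] = (r[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
    return pi, mu, Sigma, r

# Demo on synthetic data: two well-separated 2D clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(6, 1, (100, 2))])
pi, mu, Sigma, r = em_gmm(X, K=2)
```

With well-separated data like this, the recovered means land near the true cluster centers and each row of `r` is a soft assignment summing to 1.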
Advantages
- Provides soft assignments (probabilities)
- Can model elliptical clusters
- More flexible than K-means
- Handles overlapping clusters
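Soft assignments and elliptical clusters are easy to see with scikit-learn's `GaussianMixture` (assuming scikit-learn is available; the overlapping, correlated synthetic data here is my own):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two overlapping elliptical clusters with correlated features
X = np.vstack([
    rng.multivariate_normal([0, 0], [[2.0, 1.5], [1.5, 2.0]], 150),
    rng.multivariate_normal([4, 0], [[2.0, -1.5], [-1.5, 2.0]], 150),
])
# covariance_type="full" lets each component learn its own elliptical shape
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
probs = gmm.predict_proba(X)  # soft assignments: one probability per component
```

Points deep inside one ellipse get a probability near 1 for that component, while points in the overlap region get intermediate probabilities, which K-means' hard assignments cannot express.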
Limitations
- Computationally more expensive than K-means
- Sensitive to initialization
- May converge to local optima