Writing a GMM clustering algorithm in Python
Date: 2024-05-13 09:13:40
Below is a basic Python implementation of the GMM (Gaussian mixture model) clustering algorithm:
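```python
import numpy as np
from scipy.stats import multivariate_normal


class GMM:
    def __init__(self, k, max_iterations=100):
        self.k = k  # number of mixture components
        self.max_iterations = max_iterations

    def fit(self, X):
        n_samples, n_features = X.shape
        # Initialize means, covariance matrices, and weights.
        # Means start at k distinct random samples so they lie within the data range.
        idx = np.random.choice(n_samples, self.k, replace=False)
        self.means = X[idx].astype(float)
        self.covs = np.array([np.eye(n_features) for _ in range(self.k)])
        self.weights = np.ones(self.k) / self.k
        # EM iterations
        for _ in range(self.max_iterations):
            # E-step: responsibilities gamma[n, j] = P(component j | x_n)
            gamma = self._responsibilities(X)
            # M-step: re-estimate the parameters from the responsibilities
            for j in range(self.k):
                gamma_j = gamma[:, j]
                # Weighted mean; gamma_j needs a trailing axis to broadcast over features
                self.means[j] = (gamma_j[:, np.newaxis] * X).sum(axis=0) / gamma_j.sum()
                # Weighted maximum-likelihood covariance (bias=True divides by the sum of weights)
                self.covs[j] = np.cov(X.T, aweights=gamma_j, bias=True)
                # A small diagonal term keeps the covariance positive definite
                self.covs[j] += 1e-6 * np.eye(n_features)
                self.weights[j] = gamma_j.sum() / n_samples
        return self

    def _responsibilities(self, X):
        # Weighted component densities pi_j * N(x | mu_j, Sigma_j), normalized per sample
        weighted = np.zeros((X.shape[0], self.k))
        for j in range(self.k):
            weighted[:, j] = self.weights[j] * multivariate_normal.pdf(
                X, mean=self.means[j], cov=self.covs[j]
            )
        return weighted / weighted.sum(axis=1, keepdims=True)

    def predict(self, X):
        # Assign each sample to the component with the highest posterior probability
        return np.argmax(self._responsibilities(X), axis=1)
```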
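This code implements a basic GMM clustering algorithm: it initializes the means, covariance matrices, and weights, then alternates the E-step and M-step of the EM algorithm. In the E-step, the multivariate Gaussian density of each sample is evaluated under every component, multiplied by that component's weight, and normalized across components, which gives the posterior probability (responsibility) that each sample belongs to each cluster. In the M-step, each cluster's mean and covariance matrix are updated with the responsibility-weighted mean and weighted covariance, and each cluster's weight is set to the sum of its responsibilities divided by the number of samples, that is, its effective share of the data rather than a hard count of assigned samples.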
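One practical caveat with the E-step described above: when a sample lies far from every component, the raw density values can underflow to zero and the normalization divides by zero. A common workaround is to compute the responsibilities in log space. The following is a minimal sketch of that idea, not part of the original answer; the function name `e_step_log` and the toy data are assumptions made for illustration, while `multivariate_normal.logpdf` and `scipy.special.logsumexp` are existing SciPy functions.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal


def e_step_log(X, means, covs, weights):
    """Return responsibilities of shape (n_samples, k), computed in log space.

    Hypothetical helper for illustration; it mirrors the E-step described above.
    """
    n_samples, k = X.shape[0], len(weights)
    log_joint = np.empty((n_samples, k))
    for j in range(k):
        # log(pi_j) + log N(x | mu_j, Sigma_j) for every sample
        log_joint[:, j] = np.log(weights[j]) + multivariate_normal.logpdf(
            X, mean=means[j], cov=covs[j]
        )
    # Log of the mixture density per sample; logsumexp avoids underflow
    log_norm = logsumexp(log_joint, axis=1, keepdims=True)
    return np.exp(log_joint - log_norm)


# Toy data: two well-separated 2-D blobs (synthetic, for illustration only)
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 2)),
    rng.normal(loc=50.0, scale=1.0, size=(100, 2)),
])
means = np.array([[0.0, 0.0], [50.0, 50.0]])
covs = np.array([np.eye(2), np.eye(2)])
weights = np.array([0.5, 0.5])

gamma = e_step_log(X, means, covs, weights)
labels = gamma.argmax(axis=1)  # hard cluster assignment per sample
```

Each row of `gamma` sums to 1, so it could replace the ratio `numerator / denominator` in the E-step without changing the M-step.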