K-means Clustering for Multidimensional Data: Code Implementation
Date: 2023-10-08 11:06:51
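Below is a Python implementation of K-means for multidimensional data. It assumes a dataset X in which every data point has n dimensions, stored as a NumPy array with one row per point: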
```python
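import numpy as np


class Kmeans:
    def __init__(self, n_clusters=8, max_iter=300, tol=1e-4, random_state=1):
        self.n_clusters = n_clusters      # number of clusters
        self.max_iter = max_iter          # maximum number of iterations
        self.tol = tol                    # convergence tolerance
        self.random_state = random_state  # seed for center initialization
        self.labels_ = None
        self.centers_ = None

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        n_samples, n_features = X.shape
        # Pick n_clusters distinct samples as the initial centers.
        rng = np.random.RandomState(self.random_state)
        center_ids = rng.choice(n_samples, self.n_clusters, replace=False)
        centers = X[center_ids]
        for i in range(self.max_iter):
            # Euclidean distance from every sample to every center,
            # shape (n_samples, n_clusters).
            distances = np.linalg.norm(X[:, np.newaxis, :] - centers, axis=-1)
            labels = np.argmin(distances, axis=-1)
            # Move each center to the mean of its members; an empty
            # cluster keeps its old center instead of becoming NaN.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(self.n_clusters)
            ])
            # Stop once the centers no longer move (relative tolerance tol).
            if np.allclose(centers, new_centers, rtol=self.tol):
                break
            centers = new_centers
        self.labels_ = labels
        self.centers_ = centers
        return self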
```
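The distance step in fit relies on NumPy broadcasting, which can be hard to read at first. The following minimal sketch walks through the array shapes on a toy dataset; the data values and the intermediate name diff are made up for illustration and do not come from the original code.

```python
import numpy as np

# Toy data (illustrative values): 4 samples with 3 features each,
# and 2 candidate centers.
X = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [10.0, 10.0, 10.0],
              [11.0, 10.0, 10.0]])
centers = np.array([[0.0, 0.0, 0.0],
                    [10.0, 10.0, 10.0]])

# X[:, np.newaxis, :] has shape (4, 1, 3); subtracting centers (2, 3)
# broadcasts to (4, 2, 3): one difference vector per (sample, center) pair.
diff = X[:, np.newaxis, :] - centers

# The norm over the last axis collapses the feature dimension,
# leaving a (4, 2) matrix of Euclidean distances.
distances = np.linalg.norm(diff, axis=-1)

# Each sample is assigned to the center with the smallest distance.
labels = np.argmin(distances, axis=-1)

print(diff.shape)       # (4, 2, 3)
print(distances.shape)  # (4, 2)
print(labels)           # [0 0 1 1]
```

Because the intermediate array holds n_samples × n_clusters × n_features values, this approach is convenient for small and medium datasets but can use a lot of memory on large ones.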
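In the Kmeans class, n_clusters is the number of clusters, max_iter is the maximum number of iterations, tol is the convergence threshold, and random_state is the random seed. The fit method runs K-means clustering on the input dataset X and stores the resulting cluster labels in labels_ and the cluster centers in centers_.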
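Once the centers are available, they can be used to label new points or to measure how tight the clusters are. The sketch below shows one possible way to do both. The helpers assign_to_nearest and within_cluster_ss are hypothetical names, not part of the class above, and the centers array is a hand-written stand-in for centers_.

```python
import numpy as np


def assign_to_nearest(points, centers):
    """Return, for each point, the index of the closest center (Euclidean)."""
    distances = np.linalg.norm(points[:, np.newaxis, :] - centers, axis=-1)
    return np.argmin(distances, axis=-1)


def within_cluster_ss(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    return float(np.sum((points - centers[labels]) ** 2))


# Hand-written stand-in for the centers_ attribute (illustrative values).
centers = np.array([[0.0, 0.0, 0.0],
                    [10.0, 10.0, 10.0]])
new_points = np.array([[1.0, 1.0, 0.0],
                       [9.0, 10.0, 10.0]])

new_labels = assign_to_nearest(new_points, centers)
wcss = within_cluster_ss(new_points, new_labels, centers)

print(new_labels)  # [0 1]
print(wcss)        # 3.0
```

With the class above, the same helpers would take the centers_ attribute after calling fit on X. A lower within-cluster sum of squares means the points lie closer to their assigned centers.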