Please implement the K-means clustering algorithm in Python
Time: 2024-01-21 09:19:11 Views: 75
Sure, here is a Python implementation of the K-means clustering algorithm:
```python
import numpy as np


class KMeans:
    def __init__(self, k=2, max_iter=100, random_state=None):
        self.k = k
        self.max_iter = max_iter
        self.random_state = random_state

    def _closest(self, X):
        # Index of the nearest centroid (Euclidean distance) for each row of X
        distances = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        return np.argmin(distances, axis=1)

    def fit(self, X):
        # Work on a float copy so centroid means are not truncated for integer input
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(self.random_state)
        # Initialize centroids with k distinct data points chosen at random
        self.centroids = X[rng.choice(X.shape[0], self.k, replace=False)]
        for _ in range(self.max_iter):
            # Assign each data point to the closest centroid
            labels = self._closest(X)
            # Update centroids as the mean of the data points in each cluster
            new_centroids = self.centroids.copy()
            for j in range(self.k):
                members = X[labels == j]
                if len(members) > 0:  # an empty cluster keeps its old centroid
                    new_centroids[j] = members.mean(axis=0)
            # Stop early once the centroids no longer move
            if np.allclose(new_centroids, self.centroids):
                break
            self.centroids = new_centroids
        # Final labels are computed against the final centroids
        self.labels_ = self._closest(X)
        return self

    def predict(self, X):
        # Assign each new point to the nearest learned centroid
        return self._closest(np.asarray(X, dtype=float))
```
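The constructor of the KMeans class accepts the number of clusters k, the maximum number of iterations max_iter, and the random seed random_state. The fit method takes the data X, performs the clustering, and stores the resulting centroids in centroids and the training labels in labels_. The predict method assigns new data points to the nearest learned centroid.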
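To make the two steps that fit alternates between more concrete, here is a minimal sketch of a single assignment-and-update round on a toy dataset. The helper names assign_labels and update_centroids, the data, and the hand-picked starting centroids are all made up for illustration and are not part of the answer above.

```python
import numpy as np


def assign_labels(X, centroids):
    """Return the index of the nearest centroid for every row of X."""
    # distances[i, j] = Euclidean distance from point i to centroid j
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(distances, axis=1)


def update_centroids(X, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = centroids.copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        if len(members) > 0:  # an empty cluster keeps its previous centroid
            new_centroids[j] = members.mean(axis=0)
    return new_centroids


# Toy data (hypothetical): two well-separated groups in 2-D
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
# Hand-picked starting centroids instead of random initialization
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])

labels = assign_labels(X, centroids)
centroids = update_centroids(X, labels, centroids)
print(labels)     # cluster index for each point
print(centroids)  # each centroid is now the mean of its group
```

Repeating these two calls until the centroids stop changing, or until max_iter rounds have run, gives the full algorithm. Fixing the starting centroids here only serves to make the example deterministic; the class above picks them at random from the data.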