K-means clustering algorithm in Python
Time: 2023-09-25 16:06:05
Below is a simple Python implementation of the k-means clustering algorithm. The example data is a set of two-dimensional points:
```python
import numpy as np


class KMeans:
    def __init__(self, k=2, max_iter=100):
        self.k = k                # number of clusters
        self.max_iter = max_iter  # upper bound on the number of update rounds

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # Initialize the centroids to the first k samples
        self.centroids = [X[i] for i in range(self.k)]
        # Iteratively update the centroids
        for _ in range(self.max_iter):
            # Assign each sample to its nearest centroid
            clusters = [[] for _ in range(self.k)]
            for x in X:
                distances = [np.linalg.norm(x - c) for c in self.centroids]
                closest_centroid_idx = np.argmin(distances)
                clusters[closest_centroid_idx].append(x)
            # Move each centroid to the mean of its cluster
            new_centroids = []
            for i in range(self.k):
                if clusters[i]:
                    new_centroids.append(np.mean(clusters[i], axis=0))
                else:
                    # Empty cluster: keep the old centroid, since the mean of no points is NaN
                    new_centroids.append(self.centroids[i])
            # Stop early once the centroids no longer change
            if np.allclose(self.centroids, new_centroids):
                break
            self.centroids = new_centroids

    def predict(self, X):
        y_pred = []
        for x in X:
            distances = [np.linalg.norm(x - c) for c in self.centroids]
            # Cast to a plain int so the labels print as [0, 1, ...]
            y_pred.append(int(np.argmin(distances)))
        return y_pred
```
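The assignment step used in `fit` and `predict` (find the nearest centroid for each sample) can also be written without Python loops. The following is a minimal sketch using NumPy broadcasting; the helper name `assign_labels` is hypothetical and not part of the class above.

```python
import numpy as np


def assign_labels(X, centroids):
    """Return the index of the nearest centroid for every row of X.

    `assign_labels` is an illustrative helper, not part of the KMeans class.
    """
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Broadcasting gives an array of shape (n_samples, k, n_features)
    diffs = X[:, None, :] - centroids[None, :, :]
    # Euclidean distance from every sample to every centroid: shape (n_samples, k)
    distances = np.linalg.norm(diffs, axis=2)
    return distances.argmin(axis=1)


X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
labels = assign_labels(X, centroids=[[1, 2], [10, 2]])
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]
```

This computes the same distances as the list comprehension over `self.centroids`, but for all samples at once, which is usually faster on larger data sets at the cost of an intermediate array of size n_samples × k × n_features.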
Usage example:
```python
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(k=2, max_iter=100)
kmeans.fit(X)
y_pred = kmeans.predict(X)
print(y_pred)
```
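Output. Because the centroids start at the first two samples, `[1, 2]` and `[1, 4]`, which both have x = 1, the run converges to a split by y-coordinate (centroids `[5.5, 1]` and `[5.5, 4]`) rather than separating the x = 1 points from the x = 10 points:
```
[0, 1, 0, 0, 1, 0]
```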
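Starting from the first k samples makes the result depend on the row order of the data. One possible remedy, shown here only as a sketch, is farthest-first (maximin) initialization: take the first sample, then repeatedly pick the sample farthest from the centroids chosen so far. This is a swapped-in technique, not part of the code above, and it is sensitive to outliers. The function names `farthest_first_init` and `kmeans` are hypothetical. The sketch also reports the within-cluster sum of squared distances (often called inertia) so the two starts can be compared.

```python
import numpy as np


def farthest_first_init(X, k):
    """Pick k starting centroids: the first sample, then the farthest remaining ones."""
    X = np.asarray(X, dtype=float)
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance from every sample to its nearest already-chosen centroid
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(d)])
    return np.array(centroids)


def kmeans(X, init, max_iter=100):
    """Run k-means updates from the given starting centroids (max_iter >= 1)."""
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(init, dtype=float)
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every sample
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of each cluster, keeping the old centroid if it is empty
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(centroids, new_centroids):
            break
        centroids = new_centroids
    # Within-cluster sum of squared distances for the last assignment
    inertia = float(((X - centroids[labels]) ** 2).sum())
    return labels, centroids, inertia


X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=float)
labels_a, _, inertia_a = kmeans(X, init=X[:2])                      # first-k start
labels_b, _, inertia_b = kmeans(X, init=farthest_first_init(X, 2))  # farthest-first start
print(labels_a.tolist(), inertia_a)  # [0, 1, 0, 0, 1, 0] 125.5
print(labels_b.tolist(), inertia_b)  # [0, 0, 0, 1, 1, 1] 16.0
```

On this data the farthest-first start reaches the lower-inertia clustering. Randomized starts with several restarts are another common choice; which one suits a given data set is a judgment call.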