K-means++ Code
Date: 2023-10-31 13:12:18
The following is a Python example of K-means++ initialization:
```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans


def kmeans_pp(X, k):
    # Randomly choose the first centroid from the data points
    centroids = [X[np.random.choice(len(X))]]
    # Choose the remaining k-1 centroids
    for _ in range(k - 1):
        # Squared distance from each point to its nearest chosen centroid
        distances = np.array([min(np.linalg.norm(x - c) ** 2 for c in centroids) for x in X])
        # Sample the next centroid with probability proportional to that squared distance
        probabilities = distances / distances.sum()
        centroids.append(X[np.random.choice(len(X), p=probabilities)])
    return np.array(centroids)


# Test on synthetic data
X, _ = make_blobs(n_samples=1000, centers=5, random_state=42)
centroids = kmeans_pp(X, 5)
kmeans = KMeans(n_clusters=5, init=centroids, n_init=1).fit(X)
```
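This code implements the K-means++ initialization step and tests it on a synthetic dataset by passing the chosen centroids to scikit-learn's `KMeans`. The main idea is to keep the initial centroids from landing close to one another: each new centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen, so far-away points are more likely to be picked. Well-spread starting centroids tend to improve both convergence speed and the quality of the final clustering.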
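To make the sampling rule concrete, here is a minimal sketch in pure Python that computes the selection probabilities for a tiny hand-made dataset. The helper name `d2_probabilities` and the three sample points are assumptions chosen for illustration; they are not part of the code above.

```python
def d2_probabilities(points, centroids):
    """Hypothetical helper: probability of each point being picked as the next centroid."""
    def sq_dist(p, c):
        # Squared Euclidean distance between two points given as tuples
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))

    # Squared distance from each point to its nearest already-chosen centroid
    nearest = [min(sq_dist(p, c) for c in centroids) for p in points]
    total = sum(nearest)
    # Normalize so the values sum to 1
    return [d / total for d in nearest]


# Illustrative data: three points on a line, with the first one already chosen as a centroid
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
probs = d2_probabilities(points, centroids=[(0.0, 0.0)])
print(probs)  # [0.0, 0.1, 0.9]
```

The already-chosen point has probability 0, and the point at distance 3 is nine times more likely to be selected than the point at distance 1, which is how the initial centroids end up spread apart.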