Complete code for image segmentation with bisecting k-means clustering, with silhouette coefficient output
Date: 2023-08-11 21:16:05
Below is complete Python code that segments an image with the bisecting k-means clustering algorithm and prints the silhouette coefficients:
``` python
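import numpy as np
from PIL import Image
from sklearn.metrics import silhouette_samples, silhouette_score


def load_image(filename):
    # Convert to RGB so that every pixel is a 3-component vector.
    img = Image.open(filename).convert('RGB')
    return np.array(img)


def save_image(filename, data):
    img = Image.fromarray(np.uint8(np.clip(data, 0, 255)))
    img.save(filename)


def kmeans(data, k, max_iter=100, rng=None):
    # Plain k-means on an (n, d) array; returns (centers, labels).
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    # Initialise the centers from k randomly chosen samples.
    centers = data[rng.choice(len(data), size=k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    for _ in range(max_iter):
        # Squared Euclidean distance from every sample to every center.
        distances = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(distances, axis=1)
        # Keep the old center if a cluster ends up empty.
        new_centers = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


def bisecting_kmeans(data, k, max_iter=100, sample_size=2000, seed=0):
    # Repeatedly split one cluster in two until k clusters exist.
    # Returns one integer label per sample.
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    labels = np.zeros(len(data), dtype=int)
    n_clusters = 1
    while n_clusters < k:
        best = None  # (silhouette of the split, indices moved to the new cluster)
        for c in range(n_clusters):
            idx = np.flatnonzero(labels == c)
            if len(idx) < 3:
                continue  # too small to split and score
            subset = data[idx]
            _, sub = kmeans(subset, 2, max_iter, rng)
            # Score the split on a random subsample (the silhouette is O(n^2)).
            pick = rng.choice(len(idx), size=min(sample_size, len(idx)), replace=False)
            if len(np.unique(sub[pick])) < 2:
                continue  # one half is empty or missing from the subsample
            score = silhouette_score(subset[pick], sub[pick])
            if best is None or score > best[0]:
                best = (score, idx[sub == 1])
        if best is None:
            break  # no cluster can be split any further
        # Apply the split with the highest silhouette.
        labels[best[1]] = n_clusters
        n_clusters += 1
    return labels


if __name__ == '__main__':
    img = load_image('input.jpg')
    data = img.reshape(-1, 3).astype(float)
    labels = bisecting_kmeans(data, 4)

    # Reduce every cluster to a single color and paint its pixels with it.
    segmented = np.empty_like(data)
    for c in np.unique(labels):
        centers, _ = kmeans(data[labels == c], 1)
        segmented[labels == c] = centers[0]
    save_image('output.jpg', segmented.reshape(img.shape))

    # Silhouette coefficients, estimated on a random subsample of the pixels.
    rng = np.random.default_rng(0)
    pick = rng.choice(len(data), size=min(5000, len(data)), replace=False)
    sample, sample_labels = data[pick], labels[pick]
    if 1 < len(np.unique(sample_labels)) < len(pick):
        print('Overall silhouette score:', silhouette_score(sample, sample_labels))
        values = silhouette_samples(sample, sample_labels)
        silhouettes = np.array([values[sample_labels == c].mean()
                                for c in np.unique(sample_labels)])
        print('Silhouette coefficients:', silhouettes)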
```
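This code builds on the earlier bisecting k-means implementation and adds the silhouette coefficient calculation. First, the `bisecting_kmeans` function splits the pixel data into four clusters. Next, the `kmeans` function is run with k = 1 on each cluster to reduce it to a single point (its mean color), and every pixel in that cluster is assigned this color. The data array is then reshaped back into an image and saved as the output image. Finally, the silhouette coefficient of each cluster is computed with scikit-learn and printed to the console.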
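The recoloring step described above can be illustrated in isolation. The following is a minimal sketch, not the exact code of the answer: the helper name `paint_clusters` and the hand-written `labels` array are assumptions made for illustration, and the cluster color is taken directly as the mean of the cluster's pixels, which is what k-means with k = 1 converges to.

```python
import numpy as np


def paint_clusters(pixels, labels):
    """Replace each pixel by the mean color of the cluster it belongs to."""
    pixels = np.asarray(pixels, dtype=float)
    labels = np.asarray(labels)
    painted = np.empty_like(pixels)
    for c in np.unique(labels):
        mask = labels == c
        painted[mask] = pixels[mask].mean(axis=0)
    return painted


# Tiny 2x2 "image": two reddish and two bluish pixels.
image = np.array([[[250, 0, 0], [0, 0, 250]],
                  [[240, 10, 0], [0, 10, 240]]], dtype=np.uint8)
pixels = image.reshape(-1, 3)
labels = np.array([0, 1, 0, 1])  # hypothetical clustering result
segmented = paint_clusters(pixels, labels).reshape(image.shape).astype(np.uint8)
print(segmented)
```

Keeping one label per pixel, rather than separate lists of pixel values, is what makes it possible to write the cluster color back to the correct positions in the image.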
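Note that the silhouette coefficient is computed with the `silhouette_score` function from the scikit-learn library; if the library is not installed, install it first (for example with `pip install scikit-learn`). Also keep in mind that the silhouette coefficient is only defined when there are at least two clusters, and that each cluster should contain at least two samples, since a sample that is alone in its cluster gets a silhouette value of 0.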
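To make these requirements concrete, here is a small NumPy-only sketch of the silhouette definition, s(i) = (b - a) / max(a, b), where a is the mean distance from sample i to the other members of its cluster and b is the smallest mean distance to the members of any other cluster. The function name `silhouette_values` is an assumption for illustration; it builds the full pairwise distance matrix, so it is only meant for small inputs and is not a replacement for scikit-learn's implementation.

```python
import numpy as np


def silhouette_values(points, labels):
    """Return the silhouette value s(i) of every sample (Euclidean distance)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    unique = np.unique(labels)
    if len(unique) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    # Full pairwise distance matrix: only suitable for small inputs.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    values = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton cluster: s(i) is defined as 0
        # a: mean distance to the other members of the same cluster.
        a = dist[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in unique if c != labels[i])
        denom = max(a, b)
        values[i] = (b - a) / denom if denom > 0 else 0.0
    return values


points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels = np.array([0, 0, 1, 1])
values = silhouette_values(points, labels)
print(values.mean())
```

With a single cluster there is no "other cluster" from which to compute b, which is why the function raises an error in that case instead of returning a number.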