DIANA clustering code with visualization
Date: 2023-07-10 09:26:39
Below is a Python example that clusters data with DIANA and visualizes the result:
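```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist, squareform


def split_cluster(D, members):
    """Split one cluster into (remaining, splinter) index lists, DIANA style."""
    remaining = list(members)
    # Seed the splinter group with the point farthest, on average, from the rest
    sub = D[np.ix_(remaining, remaining)]
    avg = sub.sum(axis=1) / (len(remaining) - 1)
    splinter = [remaining.pop(int(np.argmax(avg)))]
    # Keep moving points that are closer to the splinter group than to the rest
    while len(remaining) > 1:
        sub = D[np.ix_(remaining, remaining)]
        d_rest = sub.sum(axis=1) / (len(remaining) - 1)
        d_splinter = D[np.ix_(remaining, splinter)].mean(axis=1)
        gain = d_rest - d_splinter
        best = int(np.argmax(gain))
        if gain[best] <= 0:
            break
        splinter.append(remaining.pop(best))
    return remaining, splinter


def diana(X, k):
    """Divisive clustering: start from one cluster and split until there are k."""
    # Pairwise Euclidean distance matrix
    D = squareform(pdist(X))
    n = len(X)
    # DIANA is top-down: all samples begin in a single cluster
    clusters = [list(range(n))]
    while len(clusters) < k:
        # Split the cluster with the largest diameter (max pairwise distance)
        diameters = [D[np.ix_(c, c)].max() for c in clusters]
        idx = int(np.argmax(diameters))
        if diameters[idx] == 0:
            break  # nothing left to split (single points or duplicates only)
        remaining, splinter = split_cluster(D, clusters.pop(idx))
        clusters.extend([remaining, splinter])
    # Convert the index lists into one integer label per sample
    labels = np.zeros(n, dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels


# Generate a random 2-D dataset
np.random.seed(0)
X = np.random.randn(100, 2)

# Cluster with DIANA
labels = diana(X, 3)

# Plot each cluster in its own color
unique_labels = np.unique(labels)
colors = plt.cm.Spectral(np.linspace(0, 1, len(unique_labels)))
for label, color in zip(unique_labels, colors):
    pts = X[labels == label]
    plt.scatter(pts[:, 0], pts[:, 1], color=color, label=f"cluster {label}")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()
plt.show()
```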
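This code generates a random two-dimensional dataset, partitions it into 3 clusters with the DIANA algorithm, and plots the result, drawing each cluster in a different color.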
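DIANA is a divisive (top-down) method: it repeatedly splits one cluster by letting a "splinter group" break away from it. To make that single step concrete, here is a minimal pure-Python sketch of one split on a hand-made 1-D toy dataset. The names `split_once`, `avg_to`, and `gain`, the toy values, and the absolute-difference distance are all assumptions chosen for illustration, not part of any library.

```python
def split_once(points, dist):
    """One DIANA-style split of `points` into (remaining, splinter) values.

    Hypothetical helper for illustration; expects at least two points.
    """
    remaining = list(range(len(points)))  # work on indices, not values

    def avg_to(i, group):
        # Average distance from point i to the members of group (excluding i)
        others = [j for j in group if j != i]
        return sum(dist(points[i], points[j]) for j in others) / len(others)

    # Seed the splinter group with the point farthest, on average, from the rest
    seed = max(remaining, key=lambda i: avg_to(i, remaining))
    remaining.remove(seed)
    splinter = [seed]

    while len(remaining) > 1:
        # Positive gain: point i is closer to the splinter group than to the rest
        def gain(i):
            return avg_to(i, remaining) - avg_to(i, splinter)

        best = max(remaining, key=gain)
        if gain(best) <= 0:
            break  # no point wants to move any more
        remaining.remove(best)
        splinter.append(best)

    return [points[i] for i in remaining], [points[i] for i in splinter]


# Toy data (assumed): two well-separated groups on a line
points = [1.0, 1.2, 1.1, 8.0, 8.3, 7.9]
remaining, splinter = split_once(points, lambda a, b: abs(a - b))
print(sorted(remaining), sorted(splinter))
# → [1.0, 1.1, 1.2] [7.9, 8.0, 8.3]
```

Here 8.3 has the largest average distance to the other points, so it seeds the splinter group; 8.0 and 7.9 then follow it because each is closer to the splinter group than to the points left behind.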