Python Implementation of Single-Linkage (Shortest-Distance) Clustering
Date: 2023-11-12 11:58:42
The following is a Python implementation of single-linkage (shortest-distance) clustering:
```python
import numpy as np


def euclidean_distance(x1, x2):
    """Return the Euclidean distance between two points."""
    return np.sqrt(np.sum((x1 - x2) ** 2))


class MyAgglomerativeClustering:
    def __init__(self, n_clusters=2):
        if n_clusters < 1:
            raise ValueError("n_clusters must be at least 1")
        self.n_clusters = n_clusters
        self.labels = None

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        n_samples = X.shape[0]
        # The np.int alias was removed in NumPy 1.24; use the builtin int
        self.labels = np.zeros(n_samples, dtype=int)
        # Initialize each point to its own cluster
        clusters = [[i] for i in range(n_samples)]
        # Keep merging clusters until we have the desired number
        # of clusters
        while len(clusters) > self.n_clusters:
            # Find the closest pair of clusters. Single linkage: the
            # distance between two clusters is the smallest distance
            # between any point of one and any point of the other.
            min_distance = np.inf
            for i in range(len(clusters)):
                for j in range(i + 1, len(clusters)):
                    for index_i in clusters[i]:
                        for index_j in clusters[j]:
                            distance = euclidean_distance(X[index_i], X[index_j])
                            if distance < min_distance:
                                min_distance = distance
                                merge_i = i
                                merge_j = j
            # Merge the closest pair of clusters
            clusters[merge_i] += clusters[merge_j]
            del clusters[merge_j]
        # Assign labels: every point gets the index of its final cluster
        for i, cluster in enumerate(clusters):
            for index in cluster:
                self.labels[index] = i
        return self.labels
```
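Here we define a class named MyAgglomerativeClustering whose fit() method implements single-linkage clustering. The basic idea is to start with every point as its own cluster and then repeatedly merge the two closest clusters until the requested number of clusters is reached. Under single linkage, the distance between two clusters is the smallest distance between any point in one and any point in the other.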
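This implementation uses Euclidean distance as the dissimilarity measure. Set the n_clusters parameter to control how many clusters are produced. Because all point-to-point distances are recomputed at every merge step, it is best suited to small datasets.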
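To make the merge procedure and the effect of n_clusters concrete, here is a minimal sketch that uses only the Python standard library (`math.dist` requires Python 3.8 or later). It is not the class above: the function name `single_linkage`, the `metric` parameter, the `manhattan` helper, and the toy points are all hypothetical, introduced only for illustration.

```python
import math


def single_linkage(points, n_clusters=2, metric=math.dist):
    """Group points by repeatedly merging the two closest clusters.

    `metric` is a hypothetical hook (not part of the original class): any
    function that takes two points and returns a non-negative distance.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Start with every point in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance = closest pair of members.
                d = min(metric(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    # Convert the index lists into one label per point.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


def manhattan(p, q):
    """City-block distance, used here only to exercise the metric hook."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Toy data: two tight pairs plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]

labels_3 = single_linkage(points, n_clusters=3)
labels_2 = single_linkage(points, n_clusters=2)
labels_3_manhattan = single_linkage(points, n_clusters=3, metric=manhattan)

print(labels_3)            # [0, 0, 1, 1, 2]
print(labels_2)            # [0, 0, 0, 0, 1]
print(labels_3_manhattan)  # [0, 0, 1, 1, 2]
```

With three clusters, the two tight pairs and the isolated point stay separate. Asking for two clusters forces one more merge, and single linkage joins the two pairs because their closest members are nearer to each other than either pair is to the isolated point.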