Code for clustering the iris dataset in sklearn with the DBSCAN algorithm (number of clusters set to 3)
Posted: 2023-06-12 16:08:18
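The following Python code clusters the iris dataset from sklearn with DBSCAN. Note that DBSCAN has no parameter for the number of clusters: the count follows from `eps` and `min_samples`, so a target of 3 cannot be set directly and has to be approached by tuning those two values.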
```python
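import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import load_iris

# Load the iris data (150 samples, 4 features)
iris = load_iris()
X = iris.data

# Fit DBSCAN; the number of clusters follows from eps and min_samples
dbscan = DBSCAN(eps=0.5, min_samples=5)
dbscan.fit(X)
labels = dbscan.labels_

# Boolean mask marking the core samples
core_samples_mask = np.zeros_like(labels, dtype=bool)
core_samples_mask[dbscan.core_sample_indices_] = True

# Count clusters, ignoring the noise label -1
n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)
print('Estimated number of clusters: %d' % n_clusters_)

# Plot the first two features, one color per cluster
unique_labels = set(labels)
colors = [plt.cm.Spectral(each)
          for each in np.linspace(0, 1, len(unique_labels))]
for k, col in zip(unique_labels, colors):
    if k == -1:
        # Noise points are drawn in black
        col = [0, 0, 0, 1]

    class_member_mask = (labels == k)

    # Core samples: large markers
    xy = X[class_member_mask & core_samples_mask]
    plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=tuple(col),
             markeredgecolor='k', markersize=14)

    # Non-core (border) samples: small markers
    xy = X[class_member_mask & ~core_samples_mask]
    plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=tuple(col),
             markeredgecolor='k', markersize=6)

plt.title('Estimated number of clusters: %d' % n_clusters_)
plt.show()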
```
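The code runs DBSCAN with `eps` set to 0.5 and `min_samples` set to 5. The number of clusters is not specified; it is estimated from the resulting labels, so it may differ from 3. The plot then shows each cluster in its own color using the first two features, with core samples as large markers and non-core samples as small markers. Noise points, if any, are drawn in black.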
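Because DBSCAN has no parameter for the number of clusters, one way to look for a setting that yields three clusters is to scan `eps` and count the clusters at each value. The sketch below is only an illustration: the helper `count_clusters` and the search range from 0.2 to 1.0 are assumptions, not part of the original answer, and there is no guarantee that any value in the range produces exactly three clusters.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import load_iris


def count_clusters(X, eps, min_samples=5):
    """Return (number of clusters, number of noise points) for one DBSCAN run."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(X).labels_
    # The label -1 marks noise and is not counted as a cluster
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    n_noise = int(np.sum(labels == -1))
    return n_clusters, n_noise


X = load_iris().data

# Hypothetical search grid; the range is an assumption, not a recommendation
results = {}
for eps in np.round(np.arange(0.2, 1.05, 0.05), 2):
    n_clusters, n_noise = count_clusters(X, float(eps))
    results[float(eps)] = (n_clusters, n_noise)
    print('eps=%.2f  clusters=%d  noise=%d' % (eps, n_clusters, n_noise))

# eps values (if any) that happen to yield exactly 3 clusters
eps_for_three = [eps for eps, (n, _) in results.items() if n == 3]
print('eps values giving 3 clusters:', eps_for_three)
```

If `eps_for_three` comes back empty, varying `min_samples` as well, or standardizing the features first, are other options worth trying. When exactly three clusters are a hard requirement, an algorithm that accepts the count directly, such as `KMeans(n_clusters=3)`, may be a better fit.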