Python DBSCAN code
Date: 2024-02-18 15:29:32
Below is a simple Python implementation of DBSCAN (a density-based clustering algorithm):
```python
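import numpy as np


class DBSCAN:
    """Minimal DBSCAN with a brute-force neighbourhood search."""

    def __init__(self, eps=1, min_samples=5):
        self.eps = eps                  # neighbourhood radius
        self.min_samples = min_samples  # neighbours (self included) needed for a core point

    def fit(self, X):
        # Keep the data on the instance so region_query can reach it.
        self.X = np.asarray(X, dtype=float)
        self.n_samples = self.X.shape[0]
        # -1 means "noise or not yet assigned"; cluster ids are 0, 1, 2, ...
        self.labels = np.full(self.n_samples, -1, dtype=int)
        self.visited = np.zeros(self.n_samples, dtype=bool)
        cluster_idx = 0
        for i in range(self.n_samples):
            if self.visited[i]:
                continue
            self.visited[i] = True
            neighbours = self.region_query(self.X[i])
            if len(neighbours) < self.min_samples:
                # Noise for now; a later cluster may still claim it as a border point.
                self.labels[i] = -1
            else:
                self.expand_cluster(self.X, i, neighbours, cluster_idx)
                cluster_idx += 1
        return self.labels

    def expand_cluster(self, X, i, neighbours, cluster_idx):
        self.labels[i] = cluster_idx
        j = 0
        # neighbours grows while we iterate, so use an index instead of a for loop.
        while j < len(neighbours):
            neighbour = neighbours[j]
            if not self.visited[neighbour]:
                self.visited[neighbour] = True
                new_neighbours = self.region_query(X[neighbour])
                if len(new_neighbours) >= self.min_samples:
                    # neighbour is itself a core point: keep expanding through it.
                    neighbours = neighbours + new_neighbours
            if self.labels[neighbour] == -1:
                # Unassigned or previously marked as noise: join this cluster.
                self.labels[neighbour] = cluster_idx
            j += 1

    def region_query(self, x):
        # Indices of all samples closer than eps to x (x itself included).
        distances = np.linalg.norm(self.X - x, axis=1)
        return np.flatnonzero(distances < self.eps).tolist()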
```
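Here, the `fit` method implements the main DBSCAN loop, `region_query` finds the neighbours of a point (all points within distance `eps`), and `expand_cluster` grows a cluster outward from a starting point. In `fit`, we iterate over every point in the dataset; if a point has not been visited yet, we look up its neighbours. If it has fewer than `min_samples` neighbours, the point is labelled as noise (that is, -1); otherwise the point is added to a cluster and that cluster is expanded.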
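To see the procedure described above on concrete data, here is a minimal standalone sketch that uses only the standard library. It is not the class from the answer: the function name `dbscan`, the toy `points`, and the parameter values are all assumptions made for illustration. It follows the same steps (neighbourhood query, noise check, cluster expansion) and treats -1 as "noise or not yet assigned".

```python
from math import dist  # Euclidean distance, available since Python 3.8


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(points)
    labels = [-1] * n      # -1 = noise or not yet assigned
    visited = [False] * n

    def region_query(i):
        # Brute-force neighbourhood of point i; the point itself is included.
        return [j for j in range(n) if dist(points[i], points[j]) < eps]

    cluster_idx = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        pending = region_query(i)
        if len(pending) < min_samples:
            continue       # stays -1 unless a later cluster claims it
        labels[i] = cluster_idx
        while pending:
            j = pending.pop()
            if not visited[j]:
                visited[j] = True
                neighbours = region_query(j)
                if len(neighbours) >= min_samples:
                    pending.extend(neighbours)  # j is a core point: keep growing
            if labels[j] == -1:
                labels[j] = cluster_idx         # core or border point joins the cluster
        cluster_idx += 1
    return labels


# Hypothetical toy data: two tight groups and one isolated point.
points = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.0),
          (5.0, 5.0), (5.0, 5.5), (5.5, 5.0),
          (10.0, 10.0)]
labels = dbscan(points, eps=1.0, min_samples=3)
print(labels)  # → [0, 0, 0, 1, 1, 1, -1]
```

The two groups each contain three mutually close points, so every member is a core point when `min_samples=3`; the isolated point has only itself as a neighbour and stays labelled as noise.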