Implementing the ST-DBSCAN clustering algorithm in code
Time: 2023-07-13 11:09:22 Views: 275
Below is a Python implementation of the ST-DBSCAN clustering algorithm:
```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def stdbscan(data, eps, min_pts):
    """
    ST-DBSCAN algorithm implementation.

    Parameters:
        data (ndarray): Input data, one row per point.
        eps (float): The maximum distance between two samples for them to be
            considered as in the same neighborhood.
        min_pts (int): The number of samples in a neighborhood (the point
            itself included) for a point to be considered as a core point.

    Returns:
        labels (ndarray): Cluster labels for each point. -1 for noise points.
    """
    # Compute pairwise distance matrix
    dist_mat = squareform(pdist(data))

    # Initialize variables (label 0 means "not yet assigned")
    num_pts = data.shape[0]
    visited = np.zeros(num_pts, dtype=bool)
    labels = np.zeros(num_pts, dtype=int)
    cluster_id = 0

    # Iterate over all points
    for i in range(num_pts):
        if not visited[i]:
            visited[i] = True
            # Get neighbors within eps distance (this includes i itself)
            neighbors = np.where(dist_mat[i] <= eps)[0]
            # Check if there are enough neighbors
            if len(neighbors) < min_pts:
                labels[i] = -1  # Noise point (may become a border point later)
            else:
                cluster_id += 1
                labels[i] = cluster_id
                # Expand cluster
                j = 0
                while j < len(neighbors):
                    neighbor = neighbors[j]
                    if not visited[neighbor]:
                        visited[neighbor] = True
                        # Get neighbors within eps distance
                        new_neighbors = np.where(dist_mat[neighbor] <= eps)[0]
                        # Core point: queue its neighbors for expansion
                        if len(new_neighbors) >= min_pts:
                            neighbors = np.concatenate((neighbors, new_neighbors))
                    # Assign unassigned points and former noise points
                    # (now border points) to the current cluster
                    if labels[neighbor] <= 0:
                        labels[neighbor] = cluster_id
                    j += 1
    return labels
```
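Here, `data` is the input data, `eps` is the maximum distance threshold (two points farther apart than this value are not in each other's neighborhood), and `min_pts` is the minimum density threshold (a point whose neighborhood, counting the point itself, contains fewer than this many points is not a core point). The function returns the cluster label of each point, with -1 marking noise points. Note that the function applies a single `eps` to the Euclidean distance computed over all columns of `data`, so a time column, if present, is treated as just one more coordinate.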
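ST-DBSCAN is commonly formulated with two thresholds, one on spatial distance and one on time difference, so that two points count as neighbors only when they are close in both. The following is a minimal sketch of that idea, not a reproduction of every detail of the published algorithm. The names `st_dbscan_sketch`, `coords`, `times`, `eps_space`, and `eps_time` are illustrative assumptions and do not come from the code above.

```python
import numpy as np
from collections import deque
from scipy.spatial.distance import cdist


def st_dbscan_sketch(coords, times, eps_space, eps_time, min_pts):
    """Cluster points that are dense in both space and time (simplified sketch).

    coords: (n, d) array of spatial coordinates.
    times: (n,) array of timestamps in any consistent numeric unit.
    Returns an (n,) integer array: 1, 2, ... for clusters, -1 for noise.
    """
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    n = coords.shape[0]

    # A pair is adjacent only if it is close in space AND close in time.
    space_ok = cdist(coords, coords) <= eps_space
    time_ok = np.abs(times[:, None] - times[None, :]) <= eps_time
    adjacent = space_ok & time_ok  # every point is adjacent to itself

    # Core points have at least min_pts adjacent points (self included).
    is_core = adjacent.sum(axis=1) >= min_pts
    labels = np.full(n, -1, dtype=int)  # everything starts as noise
    cluster_id = 0

    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue
        # Start a new cluster from an unlabeled core point.
        cluster_id += 1
        labels[i] = cluster_id
        queue = deque([i])
        while queue:
            p = queue.popleft()
            for q in np.flatnonzero(adjacent[p]):
                if labels[q] == -1:
                    labels[q] = cluster_id
                    # Only core points propagate the cluster further;
                    # border points are labeled but not expanded.
                    if is_core[q]:
                        queue.append(q)
    return labels


# Two groups at the same location but far apart in time, plus one outlier.
coords = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0]])
times = np.array([0.0, 1.0, 2.0, 100.0, 101.0, 102.0, 50.0])
labels = st_dbscan_sketch(coords, times, eps_space=0.5, eps_time=5.0, min_pts=3)
print(labels.tolist())  # [1, 1, 1, 2, 2, 2, -1]
```

With a purely spatial threshold the first six points would fall into one cluster, since they share the same locations. The temporal threshold is what separates them into two clusters here.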