Give me a raw KMeans dataset, and then give me
Time: 2024-02-01 12:16:07
A raw KMeans dataset:
```
[
[2.0, 3.0],
[2.5, 5.0],
[1.0, 1.5],
[5.5, 6.5],
[7.0, 7.5],
[6.0, 5.0],
[4.5, 7.0],
[3.5, 4.5],
[2.5, 4.5],
[5.5, 8.0],
[5.0, 5.5],
[5.0, 7.0],
[4.0, 5.0],
[6.5, 6.0]
]
```
Here is a KMeans implementation as well. You can use the following Python code to run the KMeans algorithm:
```python
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]  # a 2-D point such as [2.0, 3.0] or (2.0, 3.0)


def distance(p1: Point, p2: Point) -> float:
    """Euclidean distance between two 2-D points."""
    return ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5


def kmeans(data: List[Point], k: int) -> Tuple[List[List[Point]], List[Point]]:
    # Pick k distinct data points as the initial centroids.
    centroids = random.sample(data, k)
    assignments = [-1] * len(data)
    while True:
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = []
        for point in data:
            distances = [distance(point, centroid) for centroid in centroids]
            new_assignments.append(distances.index(min(distances)))
        # Converged once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: move each centroid to the mean of its assigned points.
        for i in range(k):
            assigned_points = [p for p, a in zip(data, assignments) if a == i]
            if not assigned_points:
                continue  # keep the old centroid; avoids dividing by zero on an empty cluster
            centroids[i] = [
                sum(p[0] for p in assigned_points) / len(assigned_points),
                sum(p[1] for p in assigned_points) / len(assigned_points),
            ]
    # Group the points by their final cluster index.
    clusters = [[] for _ in range(k)]
    for i, point in enumerate(data):
        clusters[assignments[i]].append(point)
    return clusters, centroids
```
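You can use the code above to cluster the dataset. The first argument of `kmeans(data, k)`, `data`, is the dataset, and the second argument, `k`, is the number of clusters. The function returns the list of clusters together with the final centroids. Because the initial centroids are chosen at random, different runs may produce different clusters.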
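To make the loop easier to follow, here is a minimal sketch of a single KMeans pass (one assignment step followed by one update step) on the dataset above with `k = 2`. The helper names `assign` and `update` are hypothetical and only used for illustration, and the initial centroids are hand-picked (the first and fifth points) rather than sampled at random, so that the result is reproducible.

```python
from typing import List, Sequence

Point = Sequence[float]

data = [
    [2.0, 3.0], [2.5, 5.0], [1.0, 1.5], [5.5, 6.5], [7.0, 7.5],
    [6.0, 5.0], [4.5, 7.0], [3.5, 4.5], [2.5, 4.5], [5.5, 8.0],
    [5.0, 5.5], [5.0, 7.0], [4.0, 5.0], [6.5, 6.0],
]


def assign(points: List[Point], centroids: List[Point]) -> List[int]:
    """Assignment step: index of the nearest centroid for every point."""
    result = []
    for x, y in points:
        # Squared Euclidean distance is enough to find the nearest centroid.
        sq = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
        result.append(sq.index(min(sq)))
    return result


def update(points: List[Point], assignments: List[int],
           centroids: List[Point]) -> List[List[float]]:
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == i]
        if not members:
            new_centroids.append(list(old))  # empty cluster: keep old centroid
            continue
        new_centroids.append([
            sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members),
        ])
    return new_centroids


# Assumption: hand-picked starting centroids instead of random.sample.
initial_centroids = [data[0], data[4]]
assignments = assign(data, initial_centroids)
new_centroids = update(data, assignments, initial_centroids)

print(assignments)    # [0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1]
print(new_centroids)  # second centroid is [5.625, 6.5625]
```

After this one pass, six points fall in the first cluster, whose centroid moves to roughly (2.58, 3.92), and eight fall in the second. The full algorithm simply repeats these two steps until the assignments stop changing.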