Importing longitude/latitude data and k-means clustering on geographic distances: code
Date: 2023-11-05 16:04:37
The following code imports longitude/latitude data and runs k-means clustering on it:
```python
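from math import radians, cos, sin, asin, sqrt

import pandas as pd
from sklearn.cluster import KMeans

# Load the data; the CSV is expected to have the columns
# '经度' (longitude) and '纬度' (latitude), in decimal degrees
df = pd.read_csv('经纬度数据.csv')


def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # Convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # Haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    # Clamp to 1.0 so floating-point rounding cannot push asin out of its domain
    c = 2 * asin(min(1.0, sqrt(a)))
    r = 6371  # mean radius of the Earth, in kilometres
    return c * r


# Build the pairwise distance matrix (n x n haversine calls).
# Plain lists give positional access regardless of the DataFrame index.
lons = df['经度'].tolist()
lats = df['纬度'].tolist()
dist_matrix = [
    [haversine(lons[i], lats[i], lons[j], lats[j]) for j in range(len(df))]
    for i in range(len(df))
]

# Run k-means. Note: KMeans treats each row of dist_matrix as a feature
# vector, so points are grouped by how similar their distance profiles are,
# not by the haversine metric directly.
kmeans = KMeans(n_clusters=5, random_state=0).fit(dist_matrix)
labels = kmeans.labels_

# Attach the cluster labels to the original data and save the result
df['cluster'] = labels
df.to_csv('聚类结果.csv', index=False)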
```
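In this code, `df` is a DataFrame containing the longitude and latitude columns, `haversine` computes the great-circle distance in kilometres between two longitude/latitude points, `dist_matrix` stores the pairwise distances between all points, `kmeans` is the fitted clustering object, and `labels` holds the cluster index assigned to each point. Because the distance matrix itself is passed to `KMeans`, each point is described by its distances to every point in the data set. Finally, the labels are added to the original data as a `cluster` column and saved to a file.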
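The pairwise distance step can be tried on its own without a CSV file. The following is a minimal sketch, not part of the original answer: the helper name `pairwise_distances_km` and the three sample coordinates (approximate positions of Beijing, Shanghai and Guangzhou) are assumptions chosen for illustration. It relies on the distance matrix being symmetric with a zero diagonal, so only the upper triangle is computed and then mirrored.

```python
from math import radians, cos, sin, asin, sqrt

EARTH_RADIUS_KM = 6371  # mean Earth radius, the same value the answer's code uses


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding
    return 2 * asin(min(1.0, sqrt(a))) * EARTH_RADIUS_KM


def pairwise_distances_km(points):
    """Hypothetical helper: symmetric distance matrix for a list of (lon, lat) pairs."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]  # diagonal stays 0.0
    for i in range(n):
        for j in range(i + 1, n):  # compute the upper triangle only
            d = haversine(*points[i], *points[j])
            matrix[i][j] = d
            matrix[j][i] = d  # mirror into the lower triangle
    return matrix


# Illustrative (lon, lat) pairs: roughly Beijing, Shanghai, Guangzhou
sample_points = [(116.4074, 39.9042), (121.4737, 31.2304), (113.2644, 23.1291)]
sample_matrix = pairwise_distances_km(sample_points)

for row in sample_matrix:
    print([round(d, 1) for d in row])
```

Computing only half of the matrix roughly halves the number of `haversine` calls. The result has the same shape as `dist_matrix`, so it could be passed to the clustering step in the same way.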