Implementing the k-means algorithm in Python with Euclidean, Manhattan, and cosine distance

The following Python implementation of k-means supports Euclidean distance, Manhattan distance, and cosine distance:

```python
import numpy as np


# Euclidean distance
def euclidean_distance(x1, x2):
    return np.sqrt(np.sum((x1 - x2) ** 2))


# Manhattan distance
def manhattan_distance(x1, x2):
    return np.sum(np.abs(x1 - x2))


# Cosine distance (1 - cosine similarity)
def cosine_distance(x1, x2):
    dot_product = np.dot(x1, x2)
    norm_x1 = np.linalg.norm(x1)
    norm_x2 = np.linalg.norm(x2)
    if norm_x1 == 0 or norm_x2 == 0:
        # The angle to a zero vector is undefined; by convention here,
        # return 1.0 (the same value as for orthogonal vectors).
        return 1.0
    return 1 - dot_product / (norm_x1 * norm_x2)


DISTANCES = {
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
    "cosine": cosine_distance,
}


class KMeans:
    def __init__(self, k=3, max_iters=100, distance="euclidean"):
        if distance not in DISTANCES:
            raise ValueError(f"unknown distance: {distance!r}")
        self.k = k
        self.max_iters = max_iters
        self.distance = distance

    def initialize_centroids(self, X):
        # Pick k distinct samples as initial centroids (no duplicates).
        indices = np.random.choice(X.shape[0], self.k, replace=False)
        return X[indices].astype(float)

    def closest_centroid(self, sample, centroids):
        # Index of the centroid nearest to the sample under the chosen metric.
        distance_fn = DISTANCES[self.distance]
        distances = np.zeros(self.k)
        for i, centroid in enumerate(centroids):
            distances[i] = distance_fn(sample, centroid)
        return np.argmin(distances)

    def create_clusters(self, X, centroids):
        # Assign each sample index to the cluster of its nearest centroid.
        clusters = [[] for _ in range(self.k)]
        for sample_i, sample in enumerate(X):
            centroid_i = self.closest_centroid(sample, centroids)
            clusters[centroid_i].append(sample_i)
        return clusters

    def calculate_centroids(self, X, clusters, prev_centroids):
        # New centroid = mean of the samples assigned to the cluster.
        centroids = np.zeros((self.k, X.shape[1]))
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = np.mean(X[cluster], axis=0)
            else:
                # Keep the old centroid for an empty cluster to avoid NaN.
                centroids[i] = prev_centroids[i]
        return centroids

    def get_cluster_labels(self, clusters, X):
        # Convert the per-cluster index lists into one integer label per sample.
        y_pred = np.zeros(X.shape[0], dtype=int)
        for cluster_i, cluster in enumerate(clusters):
            for sample_i in cluster:
                y_pred[sample_i] = cluster_i
        return y_pred

    def predict(self, X):
        centroids = self.initialize_centroids(X)
        for _ in range(self.max_iters):
            clusters = self.create_clusters(X, centroids)
            prev_centroids = centroids
            centroids = self.calculate_centroids(X, clusters, prev_centroids)
            # Stop once the centroids no longer move.
            if np.all(centroids == prev_centroids):
                break
        return self.get_cluster_labels(clusters, X)
```

Usage example:

```python
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

X, y = make_blobs(centers=3, n_samples=500, random_state=1)

kmeans = KMeans(k=3, max_iters=100, distance="euclidean")
y_pred = kmeans.predict(X)

plt.scatter(X[:, 0], X[:, 1], c=y_pred)
plt.title("K-Means Clustering")
plt.show()
```

The `distance` parameter can be set to `"euclidean"`, `"manhattan"`, or `"cosine"` to select Euclidean, Manhattan, or cosine distance. Note that the centroid update always uses the arithmetic mean, which minimizes the within-cluster cost only for squared Euclidean distance; with the other two metrics it acts as a heuristic.
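To make the difference between the three metrics concrete, the short sketch below evaluates each one on a single pair of vectors. The vectors `a` and `b` are illustrative assumptions, not data from the example above; `b` is deliberately chosen as a scaled copy of `a`.

```python
import numpy as np

# Illustrative vectors (an assumption for this sketch): b = 2 * a.
a = np.array([1.0, 2.0])
b = np.array([2.0, 4.0])

# Euclidean: square root of the summed squared differences, sqrt(1 + 4).
euclidean = np.sqrt(np.sum((a - b) ** 2))

# Manhattan: sum of the absolute differences, 1 + 2.
manhattan = np.sum(np.abs(a - b))

# Cosine distance: 1 minus the cosine of the angle between a and b.
cosine = 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean, manhattan, cosine)
```

Euclidean and Manhattan distance come out at about 2.24 and 3, while the cosine distance is approximately 0 because the two vectors point in the same direction. With `distance="cosine"`, samples are therefore grouped by direction rather than by magnitude.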
