K-meanPython
时间: 2023-11-05 10:03:26 浏览: 80
python实现的k-means算法
5星 · 资源好评率100%
K-means is a popular clustering algorithm used in data science and machine learning. It is used to group similar data points together based on their similarities. The algorithm works by initially selecting k random data points as centroids, and then iteratively assigning each data point to the nearest centroid and updating the centroid based on the newly assigned data points.
Here is an example implementation of K-means algorithm in Python:
```python
import numpy as np
import matplotlib.pyplot as plt
# Generate random data
np.random.seed(0)
X = np.vstack((np.random.randn(100, 2) * 0.75 + np.array([1, 0]),
np.random.randn(100, 2) * 0.25 + np.array([-0.5, 0.5]),
np.random.randn(100, 2) * 0.5 + np.array([-0.5, -0.5])))
# Define K-means function
def k_means(X, k=3, max_iters=100):
centroids = X[np.random.choice(X.shape[0], k, replace=False)]
for i in range(max_iters):
# Assign each data point to the nearest centroid
distances = np.sqrt(((X - centroids[:, np.newaxis])**2).sum(axis=2))
labels = np.argmin(distances, axis=0)
# Update centroids
for j in range(k):
centroids[j] = X[labels == j].mean(axis=0)
return centroids, labels
# Run K-means on data
centroids, labels = k_means(X, k=3)
# Plot results
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(centroids[:, 0], centroids[:, 1], s=100, c='red', marker='X')
plt.show()
```
In this example, we first generate some random data consisting of three clusters. We then define the `k_means` function, which takes in the data `X`, the number of clusters `k`, and the maximum number of iterations to run `max_iters`. In the function, we initialize the centroids randomly, and then iteratively assign each data point to the nearest centroid and update the centroids based on the newly assigned data points.
We then run the `k_means` function on the data and plot the results. The scatter plot shows the data points colored by their cluster assignment, and the red X's show the final centroids for each cluster.
阅读全文