Machine learning: modeling and prediction on the breast cancer dataset with the KNN algorithm (code example)
Date: 2024-09-29 10:07:02 Views: 53
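K-Nearest Neighbors (KNN) is an instance-based learning method that can be used for both classification and regression: a new sample is labeled according to the training samples closest to it. To make predictions on the breast cancer dataset, you can follow the steps below to write simple Python code with the scikit-learn library.
First, import the necessary libraries: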
```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
```
Then load the breast cancer dataset:
```python
data = load_breast_cancer()
X = data.data
y = data.target
```
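Before modeling, it can help to look at what was loaded. The following is a minimal sketch, assuming only the `target_names` attribute of the object returned by `load_breast_cancer` and NumPy's `bincount`; it prints the shape of the feature matrix, the class names, and how many samples fall into each class.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target

# Shape of the feature matrix: (number of samples, number of features)
print("Feature matrix shape:", X.shape)

# Class names; the position of each name matches the integer label in y
print("Class names:", list(data.target_names))

# Number of samples per class, indexed by label (0, 1)
class_counts = np.bincount(y)
print("Samples per class:", class_counts)
```

In this dataset label 0 means malignant and label 1 means benign, which matters when reading any per-class metric later on.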
Split the data into a training set and a test set (here 80% for training and 20% for testing):
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
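The two classes in this dataset are not equally frequent, so a purely random split can shift the class ratio slightly between the training and test sets. As an optional variation (not part of the original answer), the sketch below passes `stratify=y` to `train_test_split` so that both subsets keep roughly the same class proportions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# stratify=y keeps the benign/malignant ratio similar in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Train / test sizes:", X_train.shape[0], X_test.shape[0])
# y is 0/1, so the mean is the fraction of samples labeled 1 (benign)
print("Benign fraction (train):", y_train.mean())
print("Benign fraction (test):", y_test.mean())
```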
Create and train the KNN classifier, for example with k=5 neighbors:
```python
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
```
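Because KNN classifies by distance, features measured on larger numeric scales can dominate the neighbor search. A common variation, shown here only as a sketch and not as part of the original answer, is to standardize the features inside a pipeline so the scaler is fitted on the training data only. Whether this helps on a particular split is an empirical question, so compare the score with the unscaled model rather than assuming an improvement.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline standardizes each feature (zero mean, unit variance)
# using statistics from the training data, then applies KNN with k=5
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
scaled_knn.fit(X_train, y_train)

# score() returns the mean accuracy on the given data
scaled_accuracy = scaled_knn.score(X_test, y_test)
print("Accuracy with scaling:", scaled_accuracy)
```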
Finally, make predictions on the test set and compute the model's accuracy:
```python
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```
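Accuracy alone does not show which kind of mistake the model makes, and for a medical dataset a malignant case predicted as benign is usually the more costly error. The sketch below, an addition to the original steps, uses scikit-learn's `confusion_matrix` and `classification_report` to break the result down per class.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Rows are true labels, columns are predicted labels
# (index 0 = malignant, index 1 = benign)
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall, and F1 score
report = classification_report(y_test, y_pred, target_names=data.target_names)
print(report)
```

In the printed matrix, the entry in row 0, column 1 counts malignant samples that were predicted as benign.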
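This is a basic code example of modeling and prediction with KNN on the breast cancer dataset. Running it prints the model's accuracy on the held-out test set.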
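The value k=5 above is only an example. One way to choose k from the data is cross-validation on the training set; the sketch below uses `GridSearchCV` for this. The candidate list of odd values from 1 to 15 is an arbitrary assumption for illustration, not a recommendation from the original answer.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate neighbor counts (illustrative choice)
param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 11, 13, 15]}

# 5-fold cross-validation on the training set only;
# the test set is not touched during the search
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_k = search.best_params_["n_neighbors"]
print("Best k:", best_k)
print("Mean cross-validated accuracy:", search.best_score_)

# By default GridSearchCV refits the best model on the full training set,
# so score() evaluates that refitted model on the test set
test_accuracy = search.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```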