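Given a diabetes dataset, build prediction models using k-nearest neighbors, decision tree, Bayesian, and support vector machine models, compare each model's accuracy, precision, recall, and other metrics, and write the code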
Posted: 2024-11-30 10:29:08
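In Python, scikit-learn (`sklearn`) provides k-nearest neighbors (`KNeighborsClassifier`), decision tree (`DecisionTreeClassifier`), naive Bayes (`GaussianNB`, used here as the Bayesian model), and support vector machine (`SVC`) classifiers, which can be used to build prediction models on the diabetes dataset and evaluate their performance. Here is a simple example, assuming you have a dataset named `diabetes.csv` whose binary label column is `Outcome`: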
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Load the data
data = pd.read_csv('diabetes.csv')
X = data.drop('Outcome', axis=1)
y = data['Outcome']

# Split into training and test sets before any preprocessing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize the features: fit the scaler on the training set only,
# so that no test-set statistics leak into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Models to compare
models = {
    'KNN': KNeighborsClassifier(),
    'Decision Tree': DecisionTreeClassifier(random_state=42),
    'Naive Bayes': GaussianNB(),
    # probability=True enables predict_proba; the metrics below do not need it
    'SVM': SVC(probability=True),
}

# Train and evaluate each model
results = {}
for model_name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Compute the metrics (class 1 is treated as the positive class)
    acc = accuracy_score(y_test, y_pred)
    prec = precision_score(y_test, y_pred)
    rec = recall_score(y_test, y_pred)
    results[model_name] = {'Accuracy': acc, 'Precision': prec, 'Recall': rec}

# Print the results
for model_name, metrics in results.items():
    print(f"{model_name}:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.2f}")
```
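To make the three metrics in the comparison concrete, the sketch below computes them by hand from confusion-matrix counts on a small made-up label vector. The helper name `binary_metrics` and the toy labels are illustrative assumptions, not part of the original answer. With class 1 as the positive class, the values should agree with what `accuracy_score`, `precision_score`, and `recall_score` return with their default settings.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = len(pairs) - tp - fp - fn                                    # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when no positives are predicted or present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {'Accuracy': accuracy, 'Precision': precision, 'Recall': recall}


# Toy labels, made up for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]

toy_metrics = binary_metrics(y_true, y_pred)
print(toy_metrics)
```

In this toy case the classifier never raises a false alarm but misses half of the positive cases, which is why precision and recall diverge even though accuracy looks reasonable. For a medical screening task such as diabetes prediction, that gap is usually worth checking in addition to accuracy.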
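A single 80/20 split can give a noisy ranking on a small dataset, so one possible extension is to compare the same four models with stratified k-fold cross-validation. The sketch below is an illustration under stated assumptions rather than part of the original answer: it uses synthetic data from `make_classification` as a stand-in for `diabetes.csv`, wraps each model in a pipeline so that the scaler is refit inside every fold, and the fold count and random seeds are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for the diabetes data (assumption: 8 features, binary label)
X, y = make_classification(n_samples=300, n_features=8, random_state=42)

models = {
    'KNN': KNeighborsClassifier(),
    'Decision Tree': DecisionTreeClassifier(random_state=42),
    'Naive Bayes': GaussianNB(),
    'SVM': SVC(),
}

# Stratified folds keep the class ratio roughly equal in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scoring = ['accuracy', 'precision', 'recall']

cv_results = {}
for name, model in models.items():
    # The pipeline refits the scaler on the training part of each fold
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_validate(pipeline, X, y, cv=cv, scoring=scoring)
    # Average each metric over the folds
    cv_results[name] = {m: scores[f'test_{m}'].mean() for m in scoring}

for name, metrics in cv_results.items():
    print(name, {m: round(float(v), 2) for m, v in metrics.items()})
```

To run this on the real data, replace the `make_classification` call with the `X` and `y` loaded from `diabetes.csv`; the rest of the loop should work unchanged.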