UCI Thyroid Disease Logistic Regression Model
Time: 2023-09-25 10:16:34
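Thyroid disease is a common endocrine disorder that can cause a wide range of physical symptoms. The UCI dataset collects medical measurements from thyroid patients, including TSH, T3, and T4 levels. We can use a logistic regression model to predict whether a person has thyroid disease.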
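For reference, binary logistic regression models the probability of disease as a sigmoid of a linear combination of the measurements. The symbols below (feature vector $\mathbf{x}$, weight vector $\mathbf{w}$, bias $b$) are notation introduced here for illustration and do not come from the original post.

$$
P(y = 1 \mid \mathbf{x}) = \sigma(\mathbf{w}^\top \mathbf{x} + b) = \frac{1}{1 + e^{-(\mathbf{w}^\top \mathbf{x} + b)}}
$$

A positive prediction is typically made when this probability exceeds 0.5. With more than two classes, the sigmoid is replaced by a softmax over one linear score per class.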
Below is a simple Python implementation:
```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the data
data = pd.read_csv('thyroid.csv')
# Convert categorical variables to numeric codes
data['sex'] = data['sex'].map({'M': 0, 'F': 1})
# Binary flags are stored as 'f'/'t'; map each column exactly once
# (mapping an already-converted column again would turn every value into NaN)
binary_cols = ['thyroid_surgery', 'query_on_thyroxine', 'on_antithyroid_medication', 'sick', 'pregnant']
for col in binary_cols:
    data[col] = data[col].map({'f': 0, 't': 1})
# Group the diagnoses: 0 = negative, 1 = hypothyroid, 2 = hyperthyroid, 3 = goitre
data['class'] = data['class'].map({'negative': 0, 'compensated hypothyroid': 1, 'primary hypothyroid': 1, 'secondary hypothyroid': 1, 'hyperthyroid': 2, 'T3 toxic': 2, 'goitre': 3})
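# Split the dataset into training and test sets
X = data.drop(['class'], axis=1)
y = data['class']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train the logistic regression model
# (max_iter is raised from the default of 100 so the solver has room to converge on unscaled features)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# Predict on the test set
y_pred = model.predict(X_test)
# Compute the model accuracy
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)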
```
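In this example, we first read the UCI thyroid disease dataset and convert the categorical variables to numeric values. We then split the data into training and test sets and use a logistic regression model for training and prediction. Finally, we compute the model's accuracy.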
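Two practical issues may come up with this workflow: `LogisticRegression` cannot be fitted on missing values, and accuracy alone can look good when most patients are negative. The sketch below shows one possible way to handle both, using median imputation, feature scaling, a stratified split, and a per-class report. It runs on a small synthetic table, so the column names `TSH` and `T3`, the generated values, and the binary label are all assumptions made for illustration, not real patient data or the actual UCI file layout. If the real CSV marks missing values with a placeholder such as `?`, passing `na_values='?'` to `pd.read_csv` would turn them into NaN first.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the thyroid table (hypothetical values, not real data)
rng = np.random.default_rng(0)
n = 400
labels = rng.integers(0, 2, size=n)
demo = pd.DataFrame({
    'TSH': rng.normal(2.0, 1.0, n) + 3.0 * labels,  # shifted upward for the positive class
    'T3': rng.normal(2.0, 0.5, n) - 0.5 * labels,   # shifted downward for the positive class
    'sex': rng.choice(['M', 'F'], size=n),
    'class': labels,
})
# Blank out some measurements to mimic missing lab values
demo.loc[rng.choice(n, size=40, replace=False), 'TSH'] = np.nan

# Same categorical encoding as in the example above
demo['sex'] = demo['sex'].map({'M': 0, 'F': 1})

X = demo.drop(columns=['class'])
y = demo['class']
# stratify=y keeps the class proportions the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Impute missing values, standardize the features, then fit the classifier
pipeline = make_pipeline(
    SimpleImputer(strategy='median'),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)
pipeline.fit(X_train, y_train)

y_pred = pipeline.predict(X_test)
# Precision, recall, and F1 for each class instead of a single accuracy number
report = classification_report(y_test, y_pred)
print(report)
```

Wrapping the steps in a pipeline means the imputer and scaler are fitted on the training split only, so no information from the test set leaks into preprocessing.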