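Using the Chapter 5 dataset horseColicTraining.txt from the archive, train a logistic regression model and measure its prediction accuracy under a fixed setup (an 8:2 split between training data and test data). Then process the dataset by removing any 5 attributes, train another logistic regression model, and measure the new model's prediction accuracy.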
Time: 2023-05-29 20:07:53
Here is a Python implementation:
```python
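import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the dataset: tab-separated, label in the last column, features in the rest
data = np.loadtxt('horseColicTraining.txt', delimiter='\t')
X = data[:, :-1]
y = data[:, -1]

# 8:2 train/test split, then train the baseline model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# max_iter is raised above the default of 100 because the features are unscaled
# and the solver may otherwise stop before converging
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# score() returns accuracy: the fraction of test samples classified correctly
accuracy = model.score(X_test, y_test)
print('Original model accuracy:', accuracy)

# Remove 5 attributes (the column indices are an arbitrary choice)
X_new = np.delete(X, [0, 3, 4, 5, 6], axis=1)

# Same test_size and random_state, so both models are split on the same rows
X_train_new, X_test_new, y_train_new, y_test_new = train_test_split(X_new, y, test_size=0.2, random_state=0)
model_new = LogisticRegression(max_iter=1000)
model_new.fit(X_train_new, y_train_new)

# Accuracy of the model trained on the reduced feature set
accuracy_new = model_new.score(X_test_new, y_test_new)
print('Accuracy after removing 5 attributes:', accuracy_new)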
```
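The task allows any 5 attributes to be removed, while the code above hard-codes the indices 0, 3, 4, 5 and 6. A minimal sketch of picking the columns at random with a fixed seed follows. `drop_random_columns` is a hypothetical helper, and the 21-column stand-in matrix is an assumption about the file's layout, not something read from the file.

```python
import numpy as np

def drop_random_columns(X, k=5, seed=0):
    """Return X without k randomly chosen columns, plus the dropped indices."""
    rng = np.random.default_rng(seed)
    # Sample k distinct column indices; sorting only makes the result easier to read
    dropped = np.sort(rng.choice(X.shape[1], size=k, replace=False))
    return np.delete(X, dropped, axis=1), dropped

# Stand-in matrix with 21 feature columns (assumed layout, for illustration only)
X_demo = np.arange(3 * 21, dtype=float).reshape(3, 21)
X_reduced, dropped = drop_random_columns(X_demo, k=5, seed=0)
print(X_reduced.shape, dropped)
```

Recording `dropped` alongside the accuracy makes it possible to tell which attributes were removed in a given run.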
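The reported output is:
```
Original model accuracy: 0.7798507462686567
Accuracy after removing 5 attributes: 0.7313432835820896
```
The model trained without the 5 attributes is about 5 percentage points less accurate (roughly 0.73 versus 0.78). This comes from a single split with `random_state=0`, and the exact figures depend on the copy of the dataset and the scikit-learn version, so a rerun may not reproduce them. Note that 0.7798... equals 209/268, which implies a test set of 268 samples; if your copy of the file has 299 rows, as the commonly distributed version does, a 20% test split holds only 60 samples and the numbers will differ.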
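The comparison above rests on a single 8:2 split, so a gap of a few points could be due to which rows happened to land in the test set. One way to check, sketched below, is to average accuracy over many stratified 8:2 splits. `mean_accuracy` is a hypothetical helper, and the data comes from `make_classification` as a synthetic stand-in; to use it on the real task, pass in the `X` and `y` loaded from horseColicTraining.txt instead.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score

def mean_accuracy(X, y, n_splits=20, seed=0):
    """Mean and standard deviation of accuracy over repeated stratified 8:2 splits."""
    cv = StratifiedShuffleSplit(n_splits=n_splits, test_size=0.2, random_state=seed)
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
    return scores.mean(), scores.std()

# Synthetic stand-in data (assumption): 300 rows, 21 features, binary label
X_demo, y_demo = make_classification(
    n_samples=300, n_features=21, n_informative=8, random_state=0
)

full_mean, full_std = mean_accuracy(X_demo, y_demo)
reduced_mean, reduced_std = mean_accuracy(
    np.delete(X_demo, [0, 3, 4, 5, 6], axis=1), y_demo
)
print(f"all attributes:       {full_mean:.3f} +/- {full_std:.3f}")
print(f"5 attributes removed: {reduced_mean:.3f} +/- {reduced_std:.3f}")
```

If the difference between the two means is small relative to the standard deviations, the single-split result should not be read as evidence that the removed attributes matter.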