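Using R, answer the following questions on the Puromycin dataset: 1. Data modeling 2. Model validation 3. Model evaluation (1) View the predicted results (2) View the actual results (3) Discretize the predictions (4) View the discretized predictions (5) Build a confusion matrix (6) Compute the accuracy (7) Print the prediction accuracy 4. Model optimization (1) Data cleaning (2) Feature selection (3) Feature extraction (4) Build the optimized model 5. Model prediction: predict with the optimized model.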
Time: 2023-06-10 17:08:56
1. Data modeling:
First, load the Puromycin dataset:
```R
data(Puromycin)
```
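Before modeling, it can help to confirm what the dataset contains. The sketch below uses only base R. It checks the column names, the stored order and spelling of the `state` levels (they are lowercase, which matters when class labels are compared later), the class counts, and whether any values are missing.

```R
data(Puromycin)

# Structure: 23 observations of conc (numeric), rate (numeric), state (factor)
str(Puromycin)

# Factor levels in their stored order; glm() treats the first level as "failure"
levels(Puromycin$state)

# Class counts, to check how balanced the two groups are
table(Puromycin$state)

# Number of missing values per column
colSums(is.na(Puromycin))
```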
Then split the dataset into a training set and a test set:
```R
library(caret)
set.seed(123)
splitIndex <- createDataPartition(Puromycin$state, p = .70, list = FALSE, times = 1)
train <- Puromycin[ splitIndex,]
test <- Puromycin[-splitIndex,]
```
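If caret is not available, a stratified split can be approximated in base R. This is a minimal sketch, not a drop-in replacement: it will not reproduce the exact rows chosen by `createDataPartition`, and the names `rows_by_state`, `train_b`, and `test_b` are hypothetical, chosen to avoid clashing with `train` and `test` above.

```R
data(Puromycin)
set.seed(123)

# Row indices grouped by class, so each class is sampled separately
rows_by_state <- split(seq_len(nrow(Puromycin)), Puromycin$state)

# Draw about 70% of the rows of each class for training
train_idx <- unlist(lapply(rows_by_state, function(i) {
  sample(i, size = round(0.7 * length(i)))
}))

train_b <- Puromycin[train_idx, ]
test_b  <- Puromycin[-train_idx, ]

# Class counts in each part
table(train_b$state)
table(test_b$state)
```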
Next, fit a logistic regression model:
```R
model <- glm(state ~ conc, data = train, family = "binomial")
```
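When the response is a factor, a binomial `glm()` treats the first level as failure and every other level as success. For `state`, whose levels are `treated` and `untreated`, the fitted probabilities are therefore P(state == "untreated"). The sketch below checks this by refitting with an explicit 0/1 response. It fits on the full data purely for illustration, and `fit_factor` and `fit_binary` are hypothetical names.

```R
data(Puromycin)

# Fit with the factor response, as in the model above
fit_factor <- glm(state ~ conc, data = Puromycin, family = binomial)

# Same model with an explicit 0/1 response, where 1 means "untreated"
fit_binary <- glm(as.integer(state == "untreated") ~ conc,
                  data = Puromycin, family = binomial)

# TRUE when both fits agree, i.e. the factor fit predicts P(untreated)
same_fit <- all.equal(unname(coef(fit_factor)), unname(coef(fit_binary)))
same_fit
```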
2. Model validation:
First, view the predicted results and the actual results:
```R
predicted <- predict(model, newdata=test, type="response")
predicted
actual <- test$state
actual
```
Then discretize the predictions:
```R
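# glm() models the second factor level, so predicted is P(state == "untreated")
predictedClass <- factor(ifelse(predicted > 0.5, "untreated", "treated"), levels = levels(actual))
predictedClass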
```
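Because the same thresholding step is needed for every model, it can be wrapped in a small helper. This is a sketch under the assumption that the probabilities refer to the second factor level, as they do for a binomial `glm()`. The function name `to_class` and its arguments are hypothetical.

```R
# Convert probabilities of the second level into a factor of class labels.
# prob:      numeric vector of P(second level)
# lv:        character vector of the two class labels, in factor-level order
# threshold: probabilities strictly above this map to the second level
to_class <- function(prob, lv, threshold = 0.5) {
  factor(ifelse(prob > threshold, lv[2], lv[1]), levels = lv)
}

demo_class <- to_class(c(0.1, 0.5, 0.9), c("treated", "untreated"))
demo_class
```

With the objects defined earlier, a call such as `to_class(predicted, levels(actual))` would give labels that can be compared directly with `actual`.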
Next, view the discretized predictions:
```R
table(predictedClass)
```
Then build the confusion matrix:
```R
library(caret)
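# confusionMatrix() requires two factors that share the same levels
confusionMatrix(data = factor(predictedClass, levels = levels(actual)), reference = actual)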
```
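A confusion matrix can also be built with base R's `table()`, which is a useful cross-check on the caret output. The labels below are hypothetical values made up for illustration. They are not results from the model above.

```R
lv <- c("treated", "untreated")

# Hypothetical actual and predicted labels, for illustration only
actual_demo <- factor(c("treated", "treated", "untreated",
                        "untreated", "treated", "untreated"), levels = lv)
pred_demo   <- factor(c("treated", "untreated", "untreated",
                        "untreated", "treated", "treated"), levels = lv)

# Rows are predictions, columns are the true classes
cm <- table(Predicted = pred_demo, Actual = actual_demo)
cm

# Accuracy is the share of cases on the diagonal
acc_demo <- sum(diag(cm)) / sum(cm)
acc_demo
```

Here four of the six hypothetical predictions match, so `acc_demo` is 4/6.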
Compute the accuracy:
```R
accuracy <- sum(predictedClass == actual)/length(actual)
accuracy
```
Print the prediction accuracy:
```R
paste("Accuracy:", round(accuracy*100,2), "%")
```
3. Model optimization:
Before optimizing the model, we first clean the data:
```R
# Puromycin has no missing values, so na.omit() leaves the data unchanged here
Puromycin <- na.omit(Puromycin)
```
Next, for feature selection and feature extraction, we use stepwise regression:
```R
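library(MASS)
# Give stepAIC() a scope so it can add rate (the only other predictor) as well as drop conc
stepAIC(model, scope = list(lower = ~ 1, upper = ~ conc + rate), direction = "both")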
```
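Stepwise selection by AIC amounts to comparing the AIC of candidate models. With only two predictors available, the comparison can be written out directly in base R. This is a minimal sketch that fits on the full data purely for illustration, and `m_conc`, `m_conc_rate`, and `aic_tab` are hypothetical names.

```R
data(Puromycin)

# Candidate models: conc alone, and conc plus rate
m_conc      <- glm(state ~ conc,        data = Puromycin, family = binomial)
m_conc_rate <- glm(state ~ conc + rate, data = Puromycin, family = binomial)

# One row per model; a lower AIC indicates a better fit/complexity trade-off
aic_tab <- AIC(m_conc, m_conc_rate)
aic_tab
```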
Finally, build the optimized model:
```R
# Puromycin has only the columns conc, rate, and state, so the extra predictor is rate
model2 <- glm(state ~ conc + rate, data = train, family = "binomial")
```
4. Model prediction:
Predict with the optimized model:
```R
predicted2 <- predict(model2, newdata=test, type="response")
predictedClass2 <- ifelse(predicted2 > 0.5, "untreated", "treated")
accuracy2 <- sum(predictedClass2 == actual)/length(actual)
paste("Accuracy2:", round(accuracy2*100,2), "%")
```