We have seen the cars2018.csv data in the 'Example: Fuel Consumption' lecture.
Task 1: Install {keras} (and TensorFlow) on your computer. Make sure its link to the Python library is also established. (This is one of the key steps for this assignment. It may take some time if you do not have the right libraries installed. Here is a reference link: https://keras.rstudio.com/reference/install_keras.html)
Task 2: Divide the data randomly into 80% training and 20% test, and fit a neural network model on fuel consumption (mpg) with one hidden dense layer. Choose the number of hidden units, and select the right activation function and modelling parameters.
Task 3: Fit another neural network with 2 hidden dense layers, i.e. stack another hidden layer on top of the previous neural net. Choose the number of hidden units for the second layer. Compare the results with the output from Task 2.
Task 4: Compare the model performance with the models discussed in the lecture, i.e. the lasso, rpart, random forest, and xgboost models.
Task 5: Change the response variable from mpg to Liters/KM (i.e. 235.15/mpg), and re-fit a neural network model. Compare performance as in Task 4.
Please provide the code for Task 4 of this assignment.
Time: 2024-02-14 15:09:40
Sure. Here is the code for Task 4, which trains and evaluates the lasso, rpart, random forest, and xgboost models:
# Load packages
library(caret)         # createDataPartition()
library(keras)         # neural networks (Tasks 2 and 3)
library(glmnet)        # lasso
library(rpart)         # regression tree
library(randomForest)  # random forest
library(xgboost)       # gradient boosting
# Load data
# Assumes cars2018.csv is in the working directory, that mpg is its first column,
# and that all remaining columns are numeric predictors
cars2018 <- read.csv("cars2018.csv")
# Split once (80% training, 20% test) so every model is scored on the same test rows;
# the seed makes the split reproducible
set.seed(123)
train_index <- createDataPartition(cars2018$mpg, p = 0.8, list = FALSE)
train_data <- cars2018[train_index, ]
test_data <- cars2018[-train_index, ]
x_train <- as.matrix(train_data[, -1])
x_test <- as.matrix(test_data[, -1])
# Task 2 - Fit a neural network model with one hidden layer
model1 <- keras_model_sequential() %>%
  layer_dense(units = 10, activation = "relu", input_shape = ncol(x_train)) %>%
  layer_dense(units = 1)
compile(model1, optimizer = "adam", loss = "mean_squared_error", metrics = "mean_absolute_error")
history1 <- fit(model1, x_train, train_data$mpg, epochs = 50, batch_size = 32,
                validation_split = 0.2, verbose = 0)
# Score the network on the held-out test set
nn1_pred <- as.numeric(predict(model1, x_test))
nn1_mse <- mean((nn1_pred - test_data$mpg) ^ 2)
nn1_mae <- mean(abs(nn1_pred - test_data$mpg))
# Task 3 - Fit a neural network model with two hidden layers
model2 <- keras_model_sequential() %>%
  layer_dense(units = 10, activation = "relu", input_shape = ncol(x_train)) %>%
  layer_dense(units = 5, activation = "relu") %>%
  layer_dense(units = 1)
compile(model2, optimizer = "adam", loss = "mean_squared_error", metrics = "mean_absolute_error")
history2 <- fit(model2, x_train, train_data$mpg, epochs = 50, batch_size = 32,
                validation_split = 0.2, verbose = 0)
nn2_pred <- as.numeric(predict(model2, x_test))
nn2_mse <- mean((nn2_pred - test_data$mpg) ^ 2)
nn2_mae <- mean(abs(nn2_pred - test_data$mpg))
# Task 4 - Compare the model performance with other models
# Every model below reuses the train/test split created above
# Lasso model (alpha = 1), with lambda chosen by 10-fold cross-validation
glmnet_fit <- cv.glmnet(x_train, train_data$mpg, alpha = 1, nfolds = 10)
lasso_pred <- as.numeric(predict(glmnet_fit, newx = x_test, s = "lambda.min"))
lasso_mse <- mean((lasso_pred - test_data$mpg) ^ 2)
lasso_mae <- mean(abs(lasso_pred - test_data$mpg))
# rpart model (regression tree)
rpart_fit <- rpart(mpg ~ ., data = train_data, method = "anova")
rpart_pred <- predict(rpart_fit, newdata = test_data)
rpart_mse <- mean((rpart_pred - test_data$mpg) ^ 2)
rpart_mae <- mean(abs(rpart_pred - test_data$mpg))
# Random forest model
rf_fit <- randomForest(mpg ~ ., data = train_data, ntree = 500)
rf_pred <- predict(rf_fit, newdata = test_data)
rf_mse <- mean((rf_pred - test_data$mpg) ^ 2)
rf_mae <- mean(abs(rf_pred - test_data$mpg))
# XGBoost model
xgb_train <- xgb.DMatrix(data = x_train, label = train_data$mpg)
xgb_test <- xgb.DMatrix(data = x_test, label = test_data$mpg)
xgb_params <- list(objective = "reg:squarederror", max_depth = 3, eta = 0.1,
                   subsample = 0.5, colsample_bytree = 0.5)
xgb_fit <- xgb.train(params = xgb_params, data = xgb_train, nrounds = 100)
xgb_pred <- predict(xgb_fit, newdata = xgb_test)
xgb_mse <- mean((xgb_pred - test_data$mpg) ^ 2)
xgb_mae <- mean(abs(xgb_pred - test_data$mpg))
# Compare model performance on the shared test set, neural networks included
model_mse <- c(nn1_mse, nn2_mse, lasso_mse, rpart_mse, rf_mse, xgb_mse)
model_mae <- c(nn1_mae, nn2_mae, lasso_mae, rpart_mae, rf_mae, xgb_mae)
model_perf <- data.frame(Model = c("NN (1 hidden layer)", "NN (2 hidden layers)",
                                   "Lasso", "rpart", "Random Forest", "XGBoost"),
                         MSE = model_mse, MAE = model_mae)
# Print the table with the lowest test MSE first
print(model_perf[order(model_perf$MSE), ])
That is the code for Task 4, covering the training and evaluation of the lasso, rpart, random forest, and xgboost models. Hope it helps!
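The script above repeats the same two formulas (test-set MSE and MAE) for every model. One way to cut that repetition is a small helper, sketched below in base R. The function name `eval_metrics`, the model labels, and the numbers are all made up for illustration; they are not part of the original answer and are not cars2018 results.

```r
# Hypothetical helper (an assumption, not from the original answer):
# returns test-set MSE and MAE for one model's predictions.
eval_metrics <- function(pred, actual) {
  # as.numeric() flattens one-column matrices, such as glmnet or keras predictions
  err <- as.numeric(pred) - actual
  c(MSE = mean(err ^ 2), MAE = mean(abs(err)))
}

# Toy values for illustration only
actual <- c(20, 25, 30, 35)
preds <- list(
  "Model A" = c(21, 24, 31, 33),
  "Model B" = c(20, 27, 30, 36)
)

# One row per model, sorted so the lowest test MSE comes first
perf <- do.call(rbind, lapply(preds, eval_metrics, actual = actual))
perf <- data.frame(Model = rownames(perf), perf, row.names = NULL)
perf <- perf[order(perf$MSE), ]
print(perf)
```

With the real models, each entry of `preds` would be one model's prediction vector for the shared test set (for example `rf_pred` or `xgb_pred`), and `actual` would be `test_data$mpg`.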