Help me fix the errors in the following code: library(nycflights13) library(dplyr) flights <- na.omit(flights, cols = c("dep_time")) flights$dep_interval <- ifelse(flights$dep_time >= 601 & flights$dep_time <= 1200, "上午", ifelse(flights$dep_time > 1200 & flights$dep_time <= 1800, "下午", ifelse(flights$dep_time > 1800 & flights$dep_time <= 2400, "晚上", ifelse(flights$dep_time > 1 & flights$dep_time <= 600, "凌晨", NA)))) flights_new <- flights %>% na.omit() %>% group_by(year, month, day, dep_interval) %>% summarize("delay_mean" = mean(arr_delay, na.rm = TRUE), "delay_var" = var(arr_delay, na.rm = TRUE), "dest_count" = n_distinct(dest)) flights_select <- flights_new %>% filter(mean_arr_delay <= 30, num_airports >= 50) %>% arrange(desc(mean_arr_delay))
Posted: 2024-03-07 14:50:17
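The error is in the `filter()` call: it refers to `mean_arr_delay` and `num_airports`, but neither column exists. `summarize()` names its outputs `delay_mean`, `delay_var`, and `dest_count`, so `filter()` and `arrange()` must use `delay_mean` for the mean arrival delay and `dest_count` for the number of distinct destinations. Two further points are worth checking. Base R's `na.omit()` has no `cols` argument for data frames; the extra argument is ignored, so rows with an `NA` in any column are dropped, not only those with a missing `dep_time`. The condition `dep_time > 1` also leaves a departure at 00:01 (`dep_time == 1`) with `NA` as its interval.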
Replace the last line with the following code:
```R
flights_select <- flights_new %>%
  filter(delay_mean <= 30, dest_count >= 50) %>%
  arrange(desc(delay_mean))
```
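Here `mean_arr_delay` is changed to `delay_mean` and `num_airports` to `dest_count`, so that both names match the columns created by the earlier `summarize()` call.
The code should now run and produce a data frame of day-and-interval groups whose mean arrival delay is at most 30 minutes and which serve at least 50 distinct destinations, sorted by mean delay in descending order.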
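The earlier steps of the snippet can also be tightened. The following is a minimal base R sketch on a made-up toy data frame, not the real `flights` table: it drops rows only where `dep_time` is missing, and it labels the time of day with `cut()` so that the interval boundaries are stated once. The helper name `label_dep_interval` and the English labels (standing in for the original's string labels) are assumptions made for illustration.

```r
# Toy stand-in for nycflights13::flights (values are made up for illustration).
toy <- data.frame(
  dep_time  = c(1, 600, 601, 1200, 1201, 1800, 1801, 2400, NA),
  arr_delay = c(5, NA, 12, 0, -3, 40, 7, 15, NA)
)

# Drop rows only where dep_time is missing; rows with a missing
# arr_delay are kept and can be handled later with na.rm = TRUE.
toy <- toy[!is.na(toy$dep_time), ]

# Hypothetical helper: label HHMM times using right-closed intervals
# (0, 600], (600, 1200], (1200, 1800], (1800, 2400].
label_dep_interval <- function(dep_time) {
  as.character(cut(
    dep_time,
    breaks = c(0, 600, 1200, 1800, 2400),
    labels = c("early morning", "morning", "afternoon", "evening")
  ))
}

toy$dep_interval <- label_dep_interval(toy$dep_time)
print(toy)
```

Because the intervals are right-closed and start at 0, a departure at 00:01 (`dep_time == 1`) falls into the first interval instead of becoming `NA`.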
Related questions
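1. The data come from package::nycflights13. (1) Join flights and planes so that the result keeps all observations in flights together with the matching observations in planes, and name the result flights_planes (5 points). (2) Select the observations in flights_planes whose tailnum contains the letter W, and save the resulting data frame as the new flights_planes (5 points). (3) Remove the observations in flights_planes with missing engines or seats, then group the data by engines and seats and compute each group's total distance and arr_delay in 2013; the results must not be NA. Save the result as the new data set flights_sum (10 points). (4) Using flights_sum from the previous part, compute the total arrival delay and total flight distance for each value of engines (5 points).
(1) Join flights and planes and save the result as flights_planes: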
```R
library(nycflights13)
flights_planes <- merge(flights, planes, by = "tailnum", all.x = TRUE)
```
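To illustrate what `all.x = TRUE` does, here is a small sketch on made-up data frames (the `_toy` names and values are hypothetical). It also shows a side effect worth knowing: `flights` and `planes` both have a `year` column, and `merge()` renames non-key columns that share a name with the default suffixes `.x` and `.y`.

```r
# Made-up miniature versions of flights and planes.
flights_toy <- data.frame(
  tailnum  = c("N1", "N2W", "N3"),
  year     = c(2013L, 2013L, 2013L),   # year of the flight
  distance = c(100, 200, 300)
)
planes_toy <- data.frame(
  tailnum = c("N1", "N2W"),
  year    = c(1999L, 2004L),           # year of manufacture
  engines = c(2L, 2L)
)

# Left join: every row of flights_toy is kept, matched or not.
joined <- merge(flights_toy, planes_toy, by = "tailnum", all.x = TRUE)
print(joined)

# "N3" has no match in planes_toy, so its plane columns are NA.
print(joined[joined$tailnum == "N3", ])
```

In the joined result, `year.x` is the flight year and `year.y` is the plane's year; the `suffixes` argument of `merge()` can be used to pick clearer names.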
(2) Select the observations in flights_planes whose tailnum contains the letter W, and save them as the new flights_planes:
```R
flights_planes <- flights_planes[grep("W", flights_planes$tailnum), ]
```
(3) Remove the observations in flights_planes with missing engines or seats, group by engines and seats, compute each group's total distance and arr_delay in 2013, and save the result as the new data set flights_sum:
```R
flights_sum <- na.omit(flights_planes[c("engines", "seats", "distance", "arr_delay")])
flights_sum <- aggregate(cbind(distance, arr_delay) ~ engines + seats, data = flights_sum, FUN = sum)
```
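Note that `na.omit()` on all four columns also discards rows where only `arr_delay` is missing, so their `distance` is left out of the totals. Whether that is intended depends on how the task is read. The sketch below, on made-up data, compares that approach with one possible alternative that removes only rows with missing `engines` or `seats` and then sums with `na.rm = TRUE`; the names `fp`, `strict`, `kept`, and `lenient` are hypothetical.

```r
# Made-up miniature version of flights_planes.
fp <- data.frame(
  engines   = c(2, 2, 2, NA, 4),
  seats     = c(100, 100, 100, 150, 300),
  distance  = c(500, 700, 900, 400, 1000),
  arr_delay = c(10, NA, -5, 3, 20)
)

# Approach used above: drop every row with an NA in any column.
strict <- aggregate(cbind(distance, arr_delay) ~ engines + seats,
                    data = na.omit(fp), FUN = sum)

# Alternative: drop rows only when engines or seats is missing,
# then ignore NAs inside each sum. na.action = na.pass stops
# aggregate() from silently removing the remaining NA rows.
kept <- fp[!is.na(fp$engines) & !is.na(fp$seats), ]
lenient <- aggregate(cbind(distance, arr_delay) ~ engines + seats,
                     data = kept, FUN = sum,
                     na.rm = TRUE, na.action = na.pass)

print(strict)
print(lenient)
```

Both results are free of `NA`, but the `distance` total for the two-engine group differs because the second approach keeps the flight whose `arr_delay` is missing.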
(4) Using flights_sum from the previous part, compute the total arrival delay and total flight distance for each value of engines:
```R
aggregate(cbind(distance, arr_delay) ~ engines, data = flights_sum, FUN = sum)
```
3. Consider the table FLIGHT given below. Write SQL queries for the questions below: a. Display details of all flights starting from Delhi. b. Display details of flights that have more than 4 flights operating. c. Display flight codes, starting place, destination, number of flights in descending order of number of flights. d. Display destinations along with flight codes of all the destinations starting with ‘A’.
Sure, here are the SQL queries for the questions. The FLIGHT table is not reproduced here, so the column names used below (FLIGHT_CODE, STARTING_PLACE, DESTINATION, NUMBER_OF_FLIGHTS) are assumed:
a. To display details of all flights starting from Delhi:
```
SELECT * FROM FLIGHT WHERE STARTING_PLACE = 'Delhi';
```
b. To display details of flights that have more than 4 flights operating:
```
SELECT * FROM FLIGHT WHERE NUMBER_OF_FLIGHTS > 4;
```
c. To display flight codes, starting place, destination, number of flights in descending order of number of flights:
```
SELECT FLIGHT_CODE, STARTING_PLACE, DESTINATION, NUMBER_OF_FLIGHTS
FROM FLIGHT
ORDER BY NUMBER_OF_FLIGHTS DESC;
```
d. To display destinations along with flight codes of all the destinations starting with ‘A’:
```
SELECT FLIGHT_CODE, DESTINATION
FROM FLIGHT
WHERE DESTINATION LIKE 'A%';
```