Rewriting the code in R
Posted: 2023-07-24 11:33:49
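Sure. Below is the code rewritten in R. Install and load the required packages first: dplyr and ggplot2. The flights and weather tables are assumed here to be the ones shipped with the nycflights13 package; if yours come from another source, load them in its place.
```R
library(dplyr)
library(ggplot2)
# Assumption: flights and weather are the nycflights13 data sets
library(nycflights13)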
# Select the needed variables from the flights table to create flight1
flight1 <- flights %>%
  select(year, month, day, hour, origin, dep_delay, distance, carrier)
# Select the needed variables from the weather table to create weather1
weather1 <- weather %>%
  select(year, month, day, hour, origin, humid, wind_speed)
# Inner-join flight1 and weather1, then randomly sample 100000 rows
set.seed(123) # fix the random seed so the sample is reproducible
flight_weather <- inner_join(flight1, weather1,
                             by = c("year", "month", "day", "hour", "origin")) %>%
  sample_n(100000)
# Compute the mean departure delay for each origin airport, sorted in descending order
longest_delay <- flight_weather %>%
  group_by(origin) %>%
  summarize(avg_delay = mean(dep_delay, na.rm = TRUE)) %>%
  arrange(desc(avg_delay))
# Smoothed curves of departure delay (dep_delay, y-axis) against wind speed
# (wind_speed, x-axis), one curve per origin, all in the same plot
ggplot(flight_weather, aes(x = wind_speed, y = dep_delay, color = origin)) +
  geom_smooth() +
  theme_bw() +
  xlab("Wind Speed") +
  ylab("Departure Delay")
# Scatter plots of departure delay (dep_delay, y-axis) against wind speed
# (wind_speed, x-axis), in 3 side-by-side panels, one per origin
flight_weather %>%
  ggplot(aes(x = wind_speed, y = dep_delay)) +
  geom_point() +
  facet_wrap(~ origin, ncol = 3) +
  theme_bw() +
  xlab("Wind Speed") +
  ylab("Departure Delay")
# Bar chart of the monthly distribution of flights in flight_weather:
# x-axis is the month, y-axis is each month's share of all flights
flight_weather %>%
  group_by(month) %>%
  summarize(count = n()) %>%
  mutate(prop = count / sum(count)) %>%
  # factor(month) gives one labelled tick per month instead of a continuous axis
  ggplot(aes(x = factor(month), y = prop)) +
  geom_col(fill = "steelblue") +
  theme_bw() +
  xlab("Month") +
  ylab("Proportion of Flights")
# Box plots of flight distance by month in flight_weather: x-axis is the month,
# y-axis is the distance, with months ordered from lowest to highest median distance
flight_weather %>%
  # reorder() sorts the factor levels by the per-month median of distance
  mutate(month = reorder(factor(month), distance, FUN = median)) %>%
  ggplot(aes(x = month, y = distance)) +
  geom_boxplot(fill = "steelblue") +
  theme_bw() +
  xlab("Month") +
  ylab("Flight Distance")
```
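The last step, ordering months by median distance, is easy to get subtly wrong, so here is a minimal sketch of how the ordering can be checked on toy data. The `toy` data frame and its values are made up for illustration and are not part of the flights data. The sketch compares `reorder()` with a pattern such as `unique(month)[order(tapply(distance, month, median))]`, which can mis-order the levels because `unique()` returns values in order of appearance while `tapply()` returns groups in sorted order.

```R
# Toy data (hypothetical values): three months with clearly different medians
toy <- data.frame(
  month    = c(3, 3, 1, 1, 2, 2),
  distance = c(100, 200, 900, 1100, 400, 600)
)

# Per-month medians, returned in sorted month order: 1 -> 1000, 2 -> 500, 3 -> 150
medians <- tapply(toy$distance, toy$month, median)

# reorder() sets the factor levels by ascending median
month_by_median <- reorder(factor(toy$month), toy$distance, FUN = median)

# Indexing unique() by order() mixes two different orderings of the months
naive_levels <- unique(toy$month)[order(medians)]

print(levels(month_by_median))  # months sorted by median distance
print(naive_levels)             # not sorted by median distance
```

On this toy data the lowest median belongs to month 3, so a correct ordering has to start with "3". Running the same comparison on a small slice of the real data is a quick way to confirm that the box plot's x-axis is sorted as intended.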
Hope this code helps!