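1. This question uses the nycflights13 data package (24 points).
(1) From the `flights` table, select the following variables: (year, month, day, hour, origin, dep_delay, distance, carrier), and save the resulting table as `flight1`.
(2) From the `weather` table, select the following variables: (year, month, day, hour, origin, humid, wind_speed), and save the resulting table as `weather1`.
(3) Inner-join `flight1` and `weather1` on their shared variables, randomly draw 100,000 rows, and save the result as `flight_weather`. (Hint: use the `sample_n()` function, sampling without replacement.)
(4) Using `flight_weather`, sort the three origin airports by mean departure delay in descending order and store the result in `longest_delay`. Display the result.
(5) In a single plot, draw smoothed curves of wind speed `wind_speed` (x-axis) against departure delay `dep_delay` (y-axis), grouped by origin (`origin`).
(6) In three side-by-side panels, one per origin (`origin`), draw scatter plots of wind speed `wind_speed` (x-axis) against departure delay `dep_delay` (y-axis).
(7) Using `flight_weather`, draw a bar chart of the number of flights per month, with month on the x-axis and each month's share of all flights on the y-axis.
(8) Using `flight_weather`, draw a boxplot of flight distance for each month, with month on the x-axis and distance on the y-axis, and reorder the months on the x-axis from lowest to highest median distance.
R language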

Posted: 2024-03-13 10:45:16 Views: 82

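(1) Select the following variables from the `flights` table and save the new table as `flight1`:

```R
library(nycflights13)
library(dplyr)  # provides select(), inner_join(), sample_n() and %>%

flight1 <- select(flights, year, month, day, hour, origin, dep_delay, distance, carrier)
```

(2) Select the following variables from the `weather` table and save the new table as `weather1`:

```R
weather1 <- select(weather, year, month, day, hour, origin, humid, wind_speed)
```

(3) Inner-join `flight1` and `weather1` on their shared variables, then randomly draw 100,000 rows and save the result as `flight_weather`. The join has to come first: sampling each table separately before joining would discard most matching rows, and `weather1` has at most one row per airport per hour, far fewer than 100,000, so it cannot be sampled at that size without replacement.

```R
set.seed(123)  # make the random sample reproducible
flight_weather <- flight1 %>%
  inner_join(weather1, by = c("year", "month", "day", "hour", "origin")) %>%
  sample_n(100000)  # samples without replacement by default
```

(4) Using `flight_weather`, sort the three origin airports by mean departure delay in descending order, store the result in `longest_delay`, and display it. Because `dep_delay` contains missing values, the mean is computed with `na.rm = TRUE`.

```R
longest_delay <- flight_weather %>%
  group_by(origin) %>%
  summarise(mean_delay = mean(dep_delay, na.rm = TRUE)) %>%  # ignore missing delays
  arrange(desc(mean_delay))
longest_delay
```

(5) In a single plot, draw smoothed curves of `wind_speed` (x-axis) against `dep_delay` (y-axis), one curve per origin:

```R
library(ggplot2)

ggplot(flight_weather, aes(x = wind_speed, y = dep_delay, color = origin)) +
  geom_smooth(se = FALSE) +
  labs(x = "Wind speed", y = "Departure delay")
```

(6) In three side-by-side panels, one per origin, draw scatter plots of `wind_speed` (x-axis) against `dep_delay` (y-axis). Faceting produces the three panels directly, so no loop or extra package is needed:

```R
ggplot(flight_weather, aes(x = wind_speed, y = dep_delay)) +
  geom_point() +
  facet_wrap(~ origin, ncol = 3) +  # one panel per origin airport
  labs(x = "Wind speed", y = "Departure delay")
```

(7) Using `flight_weather`, draw a bar chart of flights per month, with month on the x-axis and each month's share of all flights on the y-axis:

```R
monthly_flights <- flight_weather %>%
  group_by(month) %>%
  summarise(flights = n()) %>%
  mutate(prop = flights / sum(flights))  # share of all sampled flights

ggplot(monthly_flights, aes(x = factor(month), y = prop)) +
  geom_col() +
  labs(x = "Month", y = "Proportion of flights")
```

(8) Using `flight_weather`, draw a boxplot of flight distance for each month, with month on the x-axis and distance on the y-axis, ordering the months from lowest to highest median distance. `reorder()` sorts the factor levels by the per-month median:

```R
flight_weather %>%
  mutate(month = reorder(factor(month), distance, FUN = median)) %>%
  ggplot(aes(x = month, y = distance)) +
  geom_boxplot() +
  labs(x = "Month", y = "Distance")
```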
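
Steps (3) and (4) can be checked on a small scale. The sketch below uses two made-up tables, `f` and `w` (hypothetical stand-ins for `flight1` and `weather1`, not real nycflights13 data), to show that an inner join keeps only flights with a matching weather record, that sampling is applied to the joined table, and that a missing delay turns the mean into `NA` unless `na.rm = TRUE` is set.

```r
library(dplyr)

# Hypothetical toy tables standing in for flight1 and weather1
f <- tibble(
  hour      = c(1, 1, 2, 3),
  origin    = c("JFK", "JFK", "LGA", "EWR"),
  dep_delay = c(5, NA, 12, -3)
)
w <- tibble(
  hour       = c(1, 2, 4),
  origin     = c("JFK", "LGA", "EWR"),
  wind_speed = c(10, 6, 8)
)

# Inner join: the EWR flight at hour 3 has no weather row, so it is dropped
joined <- inner_join(f, w, by = c("hour", "origin"))

# Sample without replacement from the joined table, not from the inputs
set.seed(1)
sampled <- sample_n(joined, 2)

# A single NA makes the plain mean NA; na.rm = TRUE ignores it
mean_raw   <- mean(joined$dep_delay)
mean_clean <- mean(joined$dep_delay, na.rm = TRUE)
```

Here `joined` has three rows, `mean_raw` is `NA`, and `mean_clean` is 8.5, the average of the two non-missing delays (5 and 12).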
