A Detailed Guide to 7 Methods for Time Series Forecasting in Python - CSDN文库

Updated 2023-05-03


Data Preparation

Download the dataset (passenger counts for the JetRail high-speed rail service).

Suppose we have to solve a time series problem: given the past two years of data (August 2012 to August 2014), forecast the passenger count for the next 7 months.

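import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the raw hourly data
df = pd.read_csv('train.csv')

# Inspect the first rows and the overall size (print so this also works in a script)
print(df.head())
print(df.shape)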

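The code above loads the hourly passenger counts for the two years from 2012 to 2014. To make the differences between the methods easier to explain, we build a smaller dataset and aggregate it at the daily level.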
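To make "aggregating at the daily level" concrete, here is a minimal sketch on made-up data. The values and the start date are hypothetical and chosen only so the daily means are easy to check by hand; the column name `Count` mirrors the one used in the article's code.

```python
import pandas as pd

# Two days of hypothetical hourly counts: 0..23 on day one, 100..123 on day two.
idx = pd.date_range('2012-08-25', periods=48, freq=pd.Timedelta(hours=1))
counts = list(range(24)) + list(range(100, 124))
hourly = pd.Series(counts, index=idx, name='Count')

# Resample to calendar days and average the 24 hourly values in each day.
daily = hourly.resample('D').mean()
print(daily)
```

Each daily value is simply the mean of that day's 24 hourly observations, so the 48 hourly rows collapse to 2 daily rows.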

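Build a dataset from the data for August 2012 through December 2013.

Create train and test sets for modeling. The first 14 months (August 2012 to October 2013) are used as training data, and the last two months (November 2013 to December 2013) are used as test data.

Aggregate the dataset at the daily level.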

import pandas as pd
import matplotlib.pyplot as plt

# Subset the dataset: the first 11856 hourly rows run through the end of 2013
df = pd.read_csv('train.csv', nrows=11856)

# Create the train and test sets: the first 10392 rows run through the end of October 2013
# .copy() avoids pandas' SettingWithCopyWarning when columns are added below
train = df[0:10392].copy()
test = df[10392:].copy()

# Aggregate the dataset at the daily level
# In the format string, %Y is a 4-digit year and %y a 2-digit year
df['Timestamp'] = pd.to_datetime(df['Datetime'], format='%d-%m-%Y %H:%M')
df.index = df['Timestamp']
# Resample by day and take the mean; numeric_only=True skips the non-numeric columns
df = df.resample('D').mean(numeric_only=True)

train['Timestamp'] = pd.to_datetime(train['Datetime'], format='%d-%m-%Y %H:%M')
train.index = train['Timestamp']
train = train.resample('D').mean(numeric_only=True)

test['Timestamp'] = pd.to_datetime(test['Datetime'], format='%d-%m-%Y %H:%M')
test.index = test['Timestamp']
test = test.resample('D').mean(numeric_only=True)

# Plot the training and test data on the same axes
train.Count.plot(figsize=(15, 8), title='Daily Ridership', fontsize=14)
test.Count.plot(figsize=(15, 8), title='Daily Ridership', fontsize=14)
plt.show()

We plot the training and test data together to see how the series changes over time.
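The row offsets 10392 and 11856 used for the split can be cross-checked against calendar dates. The sketch below is not part of the original article: it uses a synthetic stand-in for train.csv with random counts, and it assumes the series is strictly hourly with no missing rows, so that the stated boundary (row 10392 is the end of October 2013) determines the start time. Under that assumption, splitting by date gives the same row counts as splitting by position.

```python
import numpy as np
import pandas as pd

# Assumption: strictly hourly data with no gaps, so "row 10392 marks the end
# of October 2013" fixes the timestamp of the first row.
start = pd.Timestamp('2013-11-01') - pd.Timedelta(hours=10392)

# Synthetic stand-in for train.csv (hypothetical random counts).
idx = pd.date_range(start, periods=11856, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
hourly = pd.DataFrame({'Count': rng.integers(0, 100, size=len(idx))}, index=idx)

# Split by calendar date instead of by row position.
# Date-string slices on a DatetimeIndex include the whole end day.
train_hourly = hourly.loc[:'2013-10-31']
test_hourly = hourly.loc['2013-11-01':'2013-12-31']

# Aggregate each part to daily means.
train_daily = train_hourly.resample('D').mean()
test_daily = test_hourly.resample('D').mean()

print(start)
print(len(train_hourly), len(test_hourly))
print(len(train_daily), len(test_daily))
```

Splitting by date is less fragile than hard-coded row offsets if the file ever gains or loses rows, although the positional split in the article is equivalent as long as the no-gaps assumption holds.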
