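Working with the data table resultym.csv in Python. The table has two columns, Date (YYYY/MM) and TotalPrice; its first row is the header, and the first 27 rows contain data. On that basis, a Python script should repeatedly call a PyTorch LSTM forecasting model built on resultym.csv (the model is implemented by reading the CSV, normalizing TotalPrice, splitting the data into training and test sets at a ratio of 0.8, converting the split data to PyTorch tensors, defining the hyperparameters, model, and optimizer, and finally training the model). The model should predict TotalPrice from Date. The first predicted y_test_pred is assigned to B26, the second to B27, the third to B28, and so on until the value of B50 has been computed. After each prediction, one row is appended to the end of the table: Date continues the existing year/month sequence (2023/03, 2023/04, 2023/05, ..., 2025/01) and TotalPrice is set to 2222222.5. The script stops once the value for row 50 has been computed.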
Time: 2023-12-06 16:44:33 Views: 93
The following code implements this:
```python
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
# Read the CSV table
df = pd.read_csv('resultym.csv')
# Min-max normalize the TotalPrice column to [0, 1]
df['TotalPrice'] = (df['TotalPrice'] - df['TotalPrice'].min()) / (df['TotalPrice'].max() - df['TotalPrice'].min())
# Split into training and test sets (80% / 20%, in row order)
train_size = int(len(df) * 0.8)
train_df = df[:train_size]
test_df = df[train_size:]
# Convert to PyTorch tensors. Inputs are (batch, seq_len, features);
# targets are (batch, 1) so they match the model output and MSELoss
# does not silently broadcast them against each other.
train_x = torch.tensor(train_df['TotalPrice'].values).view(-1, 1, 1).float()
train_y = torch.tensor(train_df['TotalPrice'].values).view(-1, 1).float()
test_x = torch.tensor(test_df['TotalPrice'].values).view(-1, 1, 1).float()
test_y = torch.tensor(test_df['TotalPrice'].values).view(-1, 1).float()
# Define the hyperparameters, the model, and the optimizer
input_size = 1
output_size = 1
hidden_size = 32
num_layers = 2
learning_rate = 0.01
num_epochs = 100
class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(LSTM, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)
    def forward(self, x):
        # Zero initial hidden and cell states: (num_layers, batch, hidden_size)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # Map the output of the last time step to the prediction
        out = self.fc(out[:, -1, :])
        return out
model = LSTM(input_size, hidden_size, num_layers, output_size)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
# Train the model
for epoch in range(num_epochs):
    outputs = model(train_x)
    optimizer.zero_grad()
    loss = criterion(outputs, train_y)
    loss.backward()
    optimizer.step()
    if epoch % 10 == 0:
        print("Epoch: %d, loss: %1.5f" % (epoch, loss.item()))
# Predict in a loop and append one row after each prediction.
# Start from the last real (normalized) value; later steps feed back the
# previous prediction, because the appended rows hold only the placeholder.
last_value = float(test_df['TotalPrice'].values[-1])
for i in range(24):
    # Predict the next value from the most recent one
    with torch.no_grad():
        x_test = torch.tensor([last_value]).view(1, 1, 1).float()
        y_test_pred = model(x_test).item()
    # Append a placeholder row dated one month after the current last row
    date = pd.to_datetime(test_df['Date'].iloc[-1], format='%Y/%m') + pd.DateOffset(months=1)
    date_str = date.strftime('%Y/%m')
    new_row = pd.DataFrame({'Date': date_str, 'TotalPrice': 2222222.5}, index=[len(df)])
    df = pd.concat([df, new_row], axis=0)
    # Write prediction i to position 24 + i (iloc[25-1] for i == 0 up to
    # iloc[48-1] for i == 23). A single .iloc call is required: chained
    # indexing (df.iloc[n]['TotalPrice'] = ...) assigns to a temporary copy.
    df.iloc[24 + i, df.columns.get_loc('TotalPrice')] = y_test_pred
    last_value = y_test_pred
    # Refresh the test set and its tensors
    test_df = df[train_size:]
    test_x = torch.tensor(test_df['TotalPrice'].values).view(-1, 1, 1).float()
    test_y = torch.tensor(test_df['TotalPrice'].values).view(-1, 1).float()
    # Retrain the model (the training tensors themselves do not change)
    for epoch in range(num_epochs):
        outputs = model(train_x)
        optimizer.zero_grad()
        loss = criterion(outputs, train_y)
        loss.backward()
        optimizer.step()
    print("Prediction %d: %1.5f" % (i+1, y_test_pred))
```
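The code first reads resultym.csv and normalizes the TotalPrice column. It then splits the data into training and test sets at a ratio of 0.8 and converts them to PyTorch tensors. Next it defines the LSTM model, the loss function, and the optimizer, and trains the model.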
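The preprocessing steps just described can be sketched without any third-party dependency. This is a minimal sketch under assumptions, not the answer's implementation: the helper names (`min_max_scale`, `inverse_scale`, `make_windows`, `chronological_split`) and the sample prices are hypothetical. `make_windows` builds lagged input/target pairs, which is a swapped-in technique; the listing uses each value as its own target.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return the bounds needed to undo it."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("cannot min-max scale a constant series")
    return [(v - lo) / span for v in values], lo, hi


def inverse_scale(scaled_value, lo, hi):
    """Map a value from [0, 1] back to the original units."""
    return scaled_value * (hi - lo) + lo


def make_windows(series, window):
    """Pair each run of `window` consecutive values with the value after it."""
    count = len(series) - window
    inputs = [series[i:i + window] for i in range(count)]
    targets = [series[i + window] for i in range(count)]
    return inputs, targets


def chronological_split(items, ratio=0.8):
    """Split without shuffling, so the test part is the most recent data."""
    cut = int(len(items) * ratio)
    return items[:cut], items[cut:]


# Hypothetical sample data, not taken from resultym.csv
prices = [100.0, 150.0, 200.0, 250.0, 300.0]
scaled, lo, hi = min_max_scale(prices)
inputs, targets = make_windows(scaled, window=2)
train_inputs, test_inputs = chronological_split(inputs)
train_targets, test_targets = chronological_split(targets)
```

Keeping `lo` and `hi` around is what allows a model output in the [0, 1] range to be converted back to a price with `inverse_scale` before it is written to the table.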
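The script then enters the loop that predicts and inserts data. Each iteration feeds the most recent value of the series to the trained model to obtain y_test_pred. A new row is then appended to the table: pd.to_datetime converts the last date string to a datetime, pd.DateOffset moves it forward by one month, and the result is formatted back into a date string. The inserted row is {'Date': date_str, 'TotalPrice': 2222222.5}, with TotalPrice fixed at 2222222.5. The prediction is then written to the corresponding TotalPrice cell in the table. Finally, the test set is refreshed, the model is retrained, and the current prediction is printed.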
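The date step can also be illustrated on plain 'YYYY/MM' strings, without pandas. The sketch below is only an illustration: `next_month` and `extend_dates` are hypothetical helpers, and the starting date 2023/02 is an assumption inferred from the question, which says the appended dates begin at 2023/03.

```python
def next_month(date_str):
    """Return the 'YYYY/MM' string one month after date_str."""
    year, month = (int(part) for part in date_str.split("/"))
    month += 1
    if month > 12:
        # Roll over from December to January of the next year
        year, month = year + 1, 1
    return f"{year:04d}/{month:02d}"


def extend_dates(last_date, count):
    """Return the next `count` months following last_date."""
    dates = []
    current = last_date
    for _ in range(count):
        current = next_month(current)
        dates.append(current)
    return dates


# Assumed last existing date; each new date gets the fixed placeholder price
new_dates = extend_dates("2023/02", 3)
placeholder_rows = [{"Date": d, "TotalPrice": 2222222.5} for d in new_dates]
```

Under that assumption, reaching 2025/01 takes 23 appended rows, which is worth comparing against the number of loop iterations the script actually runs.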
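Note that the TotalPrice update has to target a specific row on each iteration: prediction i (counting from 0) goes to position 24 + i, that is iloc[25-1] on the first pass through iloc[48-1] on the last. The write should go through a single indexer call on df, because chained indexing of the form df.iloc[n]['TotalPrice'] = value assigns to a temporary copy and leaves df unchanged.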
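The relationship between the loop counter, the DataFrame position, and the spreadsheet cell can be checked with a few lines of arithmetic. This sketch assumes the header occupies sheet row 1 and TotalPrice is column B, as the question describes; the constant and function names are hypothetical.

```python
FIRST_TARGET_POSITION = 24  # from the listing: iloc[25-1] when i == 0
HEADER_ROWS = 1             # assumed: one header row at the top of the sheet


def target_position(i):
    """0-based DataFrame position written on iteration i."""
    return FIRST_TARGET_POSITION + i


def sheet_cell(position, column="B"):
    """Spreadsheet cell for a 0-based DataFrame position (position 0 is row 2)."""
    return f"{column}{position + HEADER_ROWS + 1}"


def iterations_to_reach(last_sheet_row):
    """Number of iterations needed so the final write lands on last_sheet_row."""
    last_position = last_sheet_row - HEADER_ROWS - 1
    return last_position - FIRST_TARGET_POSITION + 1


first_cell = sheet_cell(target_position(0))
last_cell_after_24 = sheet_cell(target_position(23))
needed_for_b50 = iterations_to_reach(50)
```

Under these assumptions the first prediction lands in B26 as requested, 24 iterations end at B49, and filling B50 would take 25 iterations.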