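Python code for forecasting with a Transformer model on existing time series data, with visualization
Time: 2024-02-20 19:56:32 Views: 175
Applying the Transformer to time series forecasting
Below is a simple Python example that uses a Transformer for time series forecasting. It covers data preparation, model definition, training, and prediction with visualization: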
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader

# Prepare the data: the first CSV column is the timestamp index,
# and the first remaining column is the univariate series to forecast
df = pd.read_csv('data.csv', index_col=0)
values = df.values[:, :1].astype(np.float32)
train_size = 200
seq_len = 10
# Fit the scaler on the training split only, so test statistics do not leak in
scaler = MinMaxScaler()
train_data = scaler.fit_transform(values[:train_size]).astype(np.float32)
test_data = scaler.transform(values[train_size:]).astype(np.float32)


# Dataset of sliding windows: seq_len past steps -> the next step
class TimeSeriesDataset(Dataset):
    def __init__(self, data, seq_len):
        self.data = torch.from_numpy(data)
        self.seq_len = seq_len

    def __len__(self):
        return len(self.data) - self.seq_len

    def __getitem__(self, idx):
        x = self.data[idx:idx + self.seq_len]  # (seq_len, n_features)
        y = self.data[idx + self.seq_len]      # (n_features,)
        return x, y


# Model definition
class TransformerModel(nn.Module):
    def __init__(self, input_size, output_size, num_layers, hidden_size,
                 num_heads, dropout, max_len=512):
        super().__init__()
        # Embeds every time step of the input window into the model dimension
        self.encoder = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
        )
        # Embeds the decoder input (the last observed step of the window)
        self.decoder = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
        )
        # Learned positional embedding; without it the encoder ignores step order
        self.pos = nn.Parameter(torch.randn(1, max_len, hidden_size) * 0.02)
        self.transformer = nn.Transformer(
            d_model=hidden_size,
            nhead=num_heads,
            num_encoder_layers=num_layers,
            num_decoder_layers=num_layers,
            dropout=dropout,
            batch_first=True,
        )
        # Projects the decoder output to the forecast value
        self.fc_out = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        src = self.encoder(x) + self.pos[:, :x.size(1)]
        # The decoder sees only the last observed step, never the target itself
        tgt = self.decoder(x[:, -1:, :])   # (batch, 1, hidden_size)
        out = self.transformer(src, tgt)   # (batch, 1, hidden_size)
        return self.fc_out(out[:, -1])     # (batch, output_size)


# Train the model
model = TransformerModel(input_size=1, output_size=1, num_layers=1,
                         hidden_size=32, num_heads=2, dropout=0.1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
train_dataset = TimeSeriesDataset(train_data, seq_len=seq_len)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0.0
    for x, y in train_loader:
        optimizer.zero_grad()
        y_pred = model(x)
        loss = criterion(y_pred, y)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item() * x.size(0)
    print('Epoch [{}/{}], Loss: {:.4f}'.format(
        epoch + 1, num_epochs, epoch_loss / len(train_dataset)))

# Predict (one step ahead, from true history) and visualize the results;
# this requires the test split to be longer than seq_len
model.eval()
x_test = torch.from_numpy(np.stack(
    [test_data[i:i + seq_len] for i in range(len(test_data) - seq_len)]))
with torch.no_grad():
    y_pred = model(x_test).numpy()
y_test = test_data[seq_len:]
# Map predictions and ground truth back to the original scale
y_pred = scaler.inverse_transform(y_pred)
y_test = scaler.inverse_transform(y_test)
index = df.index[train_size + seq_len:]
df_plot = pd.DataFrame({'test': y_test[:, 0], 'pred': y_pred[:, 0]}, index=index)
df_plot.plot(title='Transformer one-step-ahead forecast')
plt.show()
```
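The prediction step slides a fixed-length window over the series. When future true values are not available, the window can instead be extended with the model's own predictions. The following is a minimal sketch of that idea, not part of the original answer: `recursive_forecast` and `predict_next` are hypothetical names, and `predict_next` stands for any callable that maps a window to the next value (for the Transformer it would wrap a call to the model).

```python
import numpy as np

def recursive_forecast(predict_next, history, steps):
    """Forecast `steps` values by feeding each prediction back into the window.

    predict_next: hypothetical callable, window (1-D array) -> next value.
    history: the most recent observed values, used as the first window.
    """
    window = np.asarray(history, dtype=np.float32)
    preds = []
    for _ in range(steps):
        nxt = float(predict_next(window))
        preds.append(nxt)
        # Drop the oldest value and append the new prediction
        window = np.concatenate([window[1:], np.array([nxt], dtype=np.float32)])
    return np.array(preds)

# Stand-in "model" for illustration only: next value = last value + 1
preds = recursive_forecast(lambda w: w[-1] + 1, history=[1, 2, 3], steps=3)
print(preds)  # [4. 5. 6.]
```

Because every step builds on earlier predictions, errors tend to accumulate over long horizons, so this kind of forecast is usually less accurate than one-step-ahead prediction from true history.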
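The forecasting script reads its data from a CSV file named "data.csv", whose first column is the timestamp and whose remaining columns hold the time series values. The code works on a univariate series; for multivariate data, both the data preparation and the model definition need to be adjusted accordingly.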
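As a rough illustration of the data-preparation change for the multivariate case, the sketch below builds windows that keep all feature columns as input while predicting a single target column. The helper `make_multivariate_windows` and its `target_col` parameter are assumptions made for this example, not part of the original code.

```python
import numpy as np

def make_multivariate_windows(data, seq_len, target_col=0):
    """Build (X, y) pairs from a 2-D array of shape (time, n_features).

    X[i] holds seq_len consecutive rows with all features;
    y[i] is the target column's value at the step right after that window.
    """
    data = np.asarray(data, dtype=np.float32)
    if data.ndim != 2:
        raise ValueError("expected an array of shape (time, n_features)")
    n = len(data) - seq_len
    if n <= 0:
        raise ValueError("series must be longer than seq_len")
    X = np.stack([data[i:i + seq_len] for i in range(n)])
    y = data[seq_len:, target_col:target_col + 1]
    return X, y

# Toy data: 20 time steps, 3 features; row r is [3r, 3r+1, 3r+2]
toy = np.arange(60, dtype=np.float32).reshape(20, 3)
X, y = make_multivariate_windows(toy, seq_len=5, target_col=2)
print(X.shape, y.shape)  # (15, 5, 3) (15, 1)
```

With windows shaped like this, the model's `input_size` would be the number of feature columns (3 here) while `output_size` stays 1. Scaling the target column with its own scaler keeps the later `inverse_transform` call straightforward.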