Code for 5-class classification of univariate time series with 48 time steps, using an LSTM model in PyTorch
Date: 2023-06-15 20:08:13
Below is example code that uses PyTorch and an LSTM model to classify univariate time series of 48 time steps into 5 classes:
```python
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
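# Define the LSTM model
class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(LSTM, self).__init__()
        self.hidden_size = hidden_size
        # batch_first defaults to False: input is (seq_len, batch, input_size)
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: one series of shape (48,) -> (seq_len=48, batch=1, input_size=1)
        lstm_out, _ = self.lstm(x.view(len(x), 1, -1))
        # Classify from the output at the last time step: shape (1, output_size)
        output = self.fc(lstm_out[-1])
        return output
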
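# Data preprocessing
def preprocess_data(data):
    # Truncate or zero-pad each series to 48 time steps
    if len(data) > 48:
        data = data[:48]
    elif len(data) < 48:
        data = np.concatenate([data, np.zeros(48 - len(data))])
    # Standardize per sample; the epsilon avoids division by zero for constant series
    data = (data - np.mean(data)) / (np.std(data) + 1e-8)
    return data
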
# Load the dataset
data = np.load('data.npy')  # shape: (num_samples, num_timesteps)
labels = np.load('labels.npy')  # shape: (num_samples,)
# Preprocess every sample
data = np.array([preprocess_data(sample) for sample in data])
# Split into training and test sets (first 80% for training, no shuffling)
train_size = int(0.8 * len(data))
train_data, train_labels = data[:train_size], labels[:train_size]
test_data, test_labels = data[train_size:], labels[train_size:]
# Hyperparameters
input_size = 1
hidden_size = 32
output_size = 5
lr = 0.001
epochs = 100
# Model, loss function, and optimizer
model = LSTM(input_size, hidden_size, output_size)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=lr)
# Train the model (one sample per optimizer step)
for epoch in range(epochs):
    running_loss = 0.0
    for i in range(len(train_data)):
        optimizer.zero_grad()
        output = model(torch.tensor(train_data[i], dtype=torch.float32))
        target = torch.tensor([int(train_labels[i])], dtype=torch.long)
        loss = criterion(output.view(1, -1), target)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    # Average loss over all training samples in this epoch
    print('Epoch %d loss: %.3f' % (epoch+1, running_loss/len(train_data)))
# Evaluate the model on the test set
correct = 0
total = 0
with torch.no_grad():
    for i in range(len(test_data)):
        output = model(torch.tensor(test_data[i], dtype=torch.float32))
        # Index of the largest of the 5 class scores
        predicted = output.view(-1).argmax().item()
        total += 1
        correct += int(predicted == test_labels[i])
print('Accuracy: %.2f%%' % (100 * correct / total))
```
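To make the tensor shapes in the model easier to follow, here is a minimal sketch that pushes one random 48-point series through an `nn.LSTM` and an `nn.Linear` layer of the same sizes. The random input and the variable names (`series`, `last_step`, `logits`) are assumptions for illustration only; they stand in for a real sample from `data.npy`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

seq_len, input_size, hidden_size, num_classes = 48, 1, 32, 5

# Same layer sizes as the example; batch_first is left at its default (False)
lstm = nn.LSTM(input_size, hidden_size)
fc = nn.Linear(hidden_size, num_classes)

# Random stand-in for one univariate series (assumption, not real data)
series = torch.randn(seq_len)            # shape (48,)
x = series.view(seq_len, 1, -1)          # (seq_len, batch, input_size) = (48, 1, 1)

lstm_out, (h_n, c_n) = lstm(x)           # lstm_out: (48, 1, 32)
last_step = lstm_out[-1]                 # output at the final time step: (1, 32)
logits = fc(last_step)                   # class scores: (1, 5)
predicted_class = logits.argmax(dim=1).item()

print(x.shape, lstm_out.shape, logits.shape)
```

For a single-layer, unidirectional LSTM, `lstm_out[-1]` equals the final hidden state `h_n[0]`, so either can feed the classifier.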
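Note that the dataset is stored as NumPy arrays: `data` holds the input time series and `labels` holds the corresponding class labels, which `nn.CrossEntropyLoss` expects as integer class indices from 0 to 4. During preprocessing, each series is truncated or zero-padded to 48 time steps and then standardized to zero mean and unit variance. Training uses the cross-entropy loss and the Adam optimizer and prints the average loss for each epoch; afterwards, the model's accuracy is computed on the test set.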
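The training loop described above updates the model one sample at a time. One possible variation, sketched here under assumptions, is to train in mini-batches with `DataLoader` and `batch_first=True`. The class name `BatchedLSTM`, the batch size of 16, the two epochs, and the synthetic random data are all hypothetical choices for illustration and are not part of the original code.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Synthetic stand-in for data.npy / labels.npy (assumption, not the real dataset)
data = rng.standard_normal((64, 48)).astype(np.float32)
labels = rng.integers(0, 5, size=64)

class BatchedLSTM(nn.Module):  # hypothetical name
    def __init__(self, input_size=1, hidden_size=32, output_size=5):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: (batch, 48) -> (batch, 48, 1) so each time step has one feature
        lstm_out, _ = self.lstm(x.unsqueeze(-1))
        # Use the output at the last time step of every sequence: (batch, 5)
        return self.fc(lstm_out[:, -1, :])

dataset = TensorDataset(torch.from_numpy(data), torch.from_numpy(labels).long())
loader = DataLoader(dataset, batch_size=16, shuffle=True)

model = BatchedLSTM()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

epoch_losses = []
for epoch in range(2):
    running_loss = 0.0
    for batch_x, batch_y in loader:
        optimizer.zero_grad()
        logits = model(batch_x)
        loss = criterion(logits, batch_y)
        loss.backward()
        optimizer.step()
        # Weight by batch size so the epoch average is per sample
        running_loss += loss.item() * batch_x.size(0)
    epoch_losses.append(running_loss / len(dataset))
    print(f'Epoch {epoch + 1} loss: {epoch_losses[-1]:.3f}')
```

Batching usually reduces the number of optimizer steps per epoch and lets `shuffle=True` reorder samples each epoch; whether it improves accuracy on a given dataset would need to be checked empirically.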