PyTorch code that reads tabular data, separates features from labels, builds training and test sets, and uses an LSTM model to classify 48-time-point univariate time series into 5 classes
Posted: 2023-06-15 16:07:56
Below is code based on the PyTorch framework that reads tabular data, separates features from labels, builds training and test sets, and then applies an LSTM model to classify univariate time series of 48 time points into 5 classes.
```python
import torch
import torch.nn as nn
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import Dataset, DataLoader

# Read the table: 48 feature columns (one per time step) plus a label column
data = pd.read_csv('data.csv')

# Separate features and labels
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values

# Scale features to [0, 1]
sc = MinMaxScaler(feature_range=(0, 1))
X = sc.fit_transform(X)

# Split into training and test sets (ordered 80/20 split)
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Convert to tensors; unsqueeze adds the feature dimension -> (N, 48, 1)
X_train = torch.Tensor(X_train).unsqueeze(2)
X_test = torch.Tensor(X_test).unsqueeze(2)
y_train = torch.Tensor(y_train).type(torch.LongTensor)
y_test = torch.Tensor(y_test).type(torch.LongTensor)

# Define the dataset and loaders
class TimeSeriesDataset(Dataset):
    def __init__(self, data, targets):
        self.data = data
        self.targets = targets

    def __getitem__(self, index):
        return self.data[index], self.targets[index]

    def __len__(self):
        return len(self.data)

train_dataset = TimeSeriesDataset(X_train, y_train)
test_dataset = TimeSeriesDataset(X_test, y_test)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# Define the LSTM model: classify each sequence from the last time step's hidden state
class LSTMModel(nn.Module):
    def __init__(self, input_size=1, hidden_size=64, num_layers=1, num_classes=5):
        super(LSTMModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # Zero-initialized hidden and cell states, on the same device as the input
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.lstm(x, (h0, c0))
        out = self.fc(out[:, -1, :])
        return out

# Hyperparameters and device
input_size = 1
hidden_size = 64
num_layers = 2
num_classes = 5
learning_rate = 0.001
num_epochs = 100
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Instantiate the model
model = LSTMModel(input_size, hidden_size, num_layers, num_classes).to(device)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
for epoch in range(num_epochs):
    for i, (data, targets) in enumerate(train_loader):
        data = data.to(device)
        targets = targets.to(device)

        # Forward pass
        outputs = model(data)
        loss = criterion(outputs, targets)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 10 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch + 1, num_epochs, i + 1, len(train_loader), loss.item()))

# Evaluate on the test set
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for data, targets in test_loader:
        data = data.to(device)
        targets = targets.to(device)
        outputs = model(data)
        _, predicted = torch.max(outputs, 1)
        total += targets.size(0)
        correct += (predicted == targets).sum().item()
print('Accuracy of the model on the test data: {} %'.format(100 * correct / total))
```
In the code above, the CSV file is read and split into features and labels, the features are normalized with `MinMaxScaler`, and the data is divided into training and test sets with an ordered 80/20 slice. The arrays are then converted to tensors, wrapped in a `TimeSeriesDataset` class, and fed to the model through `DataLoader` instances.
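The script assumes `data.csv` holds 48 feature columns (one per time step) followed by a single label column with classes 0 through 4. A quick way to sanity-check that layout is to generate a synthetic table in the same shape (the row count and column names here are illustrative assumptions, not part of the original code):

```python
import numpy as np
import pandas as pd

# Build a synthetic table: 200 rows, 48 time-step columns plus one label column
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 48))
labels = rng.integers(0, 5, size=200)  # five classes: 0..4

df = pd.DataFrame(features, columns=[f"t{i}" for i in range(48)])
df["label"] = labels
df.to_csv("data.csv", index=False)

# Reload and separate columns the same way the training script does
data = pd.read_csv("data.csv")
X = data.iloc[:, :-1].values  # shape (200, 48)
y = data.iloc[:, -1].values   # shape (200,)
print(X.shape, y.shape)
```

If your real table puts the label in a different column, adjust the `iloc` slices accordingly.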
Next, an `LSTMModel` class implements the LSTM, the hyperparameters and device are defined, and the model is instantiated. The loss function and optimizer are then set up and the model is trained. Finally, the model is evaluated on the test set and its accuracy is printed.
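Once trained, the model can score a single new sequence by shaping it as `(batch=1, seq_len=48, features=1)` and taking the argmax over the class logits. A minimal sketch, using a freshly constructed (untrained) model with the same hidden size and layer count as above, so the predicted class here is arbitrary:

```python
import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self, input_size=1, hidden_size=64, num_layers=2, num_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.lstm(x)  # hidden/cell states default to zeros
        return self.fc(out[:, -1, :])

model = LSTMModel()
model.eval()  # inference mode

# One new 48-step univariate sequence; in practice, apply the same
# fitted MinMaxScaler to it before building the tensor
sequence = torch.randn(1, 48, 1)
with torch.no_grad():
    logits = model(sequence)             # shape (1, 5)
    predicted_class = logits.argmax(dim=1).item()
print(predicted_class)
```

In a real pipeline you would load trained weights with `model.load_state_dict(...)` first; the random input here only demonstrates the tensor shapes involved.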