min_scaler = preprocessing.MinMaxScaler(feature_range=(0,1)).fit(train_data)
Posted: 2024-05-17 07:15:47
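This code uses the preprocessing module of scikit-learn to set up min-max normalization for the training data `train_data` and assigns the fitted transformer object to the variable `min_scaler`. Specifically, `preprocessing.MinMaxScaler()` takes the parameter `feature_range=(0,1)`, which means each feature will be scaled to the range [0, 1]. The `fit(train_data)` call then computes the minimum and maximum of each feature in the training data and stores them in the transformer object. Note that `fit()` does not transform anything by itself. To obtain normalized values, call `min_scaler.transform()` on the training data and on the test data, so that both are scaled with the statistics learned from the training set.
Related questions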
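A minimal sketch of the fit/transform split described above. The toy arrays `train_data` and `test_data` are made-up values for illustration only; the scaler learns per-column minima and maxima from the training data and reuses them on new data:

```python
import numpy as np
from sklearn import preprocessing

# Toy data, assumed for illustration: 3 training samples with 2 features.
train_data = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
test_data = np.array([[2.0, 40.0]])

min_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1)).fit(train_data)

print(min_scaler.data_min_)  # per-feature minimum learned from train_data
print(min_scaler.data_max_)  # per-feature maximum learned from train_data

train_scaled = min_scaler.transform(train_data)
test_scaled = min_scaler.transform(test_data)
print(train_scaled)
print(test_scaled)  # a test value above the training maximum maps above 1
```

Because the statistics come from the training set only, test values outside the training range are not clipped to [0, 1] by default.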
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
data = pd.read_csv('车辆:274序:4结果数据.csv')
x = data[['车头间距', '原车道前车速度']].values
y = data['本车速度'].values
train_size = int(len(x) * 0.7)
test_size = len(x) - train_size
x_train, x_test = x[0:train_size,:], x[train_size:len(x),:]
y_train, y_test = y[0:train_size], y[train_size:len(y)]
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
model = Sequential()
model.add(LSTM(50, input_shape=(2, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(x_train.reshape(-1, 2, 1), y_train, epochs=100, batch_size=32, validation_data=(x_test.reshape(-1, 2, 1), y_test))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper right')
plt.show()
train_predict = model.predict(x_train.reshape(-1, 2, 1))
test_predict = model.predict(x_test.reshape(-1, 2, 1))
train_predict = scaler.inverse_transform(train_predict)
train_predict = train_predict.reshape(-1, 1)
y_train = scaler.inverse_transform([y_train])
test_predict = scaler.inverse_transform(test_predict)
y_test = scaler.inverse_transform([y_test])
plt.plot(y_train[0], label='train')
plt.plot(train_predict[:,0], label='train predict')
plt.plot(y_test[0], label='test')
plt.plot(test_predict[:,0], label='test predict')
plt.legend()
plt.show()
This raises the following error:
Traceback (most recent call last):
  File "C:\Users\马斌\Desktop\NGSIM_data_processing\80s\lstmtest.py", line 42, in <module>
    train_predict = scaler.inverse_transform(train_predict)
  File "D:\python\python3.9.5\pythonProject\venv\lib\site-packages\sklearn\preprocessing\_data.py", line 541, in inverse_transform
    X -= self.min_
ValueError: non-broadcastable output operand with shape (611,1) doesn't match the broadcast shape (611,2)
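This error occurs because `scaler` was fitted on `x_train`, which has two feature columns, so `scaler.inverse_transform()` only accepts input of shape `(n, 2)`. `train_predict` has shape `(611, 1)`, and subtracting the two-element `self.min_` from it in place cannot be broadcast, hence the `ValueError`. The same problem applies to `[y_train]`, which has shape `(1, 611)`.
In this code the target `y` (the `本车速度` column) was never scaled, so the model's predictions are already in the original units and no inverse transform is needed. You can drop the four `inverse_transform` calls and flatten the predictions instead:
```
train_predict = train_predict.reshape(-1)  # already in original units; flatten to 1-D for plotting
test_predict = test_predict.reshape(-1)    # same for the test predictions
```
Then plot `y_train`, `train_predict`, `y_test`, and `test_predict` directly, without the `[0]` and `[:,0]` indexing.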
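If the target is to be scaled as well, one option is a second scaler fitted on the target column alone, so that its `inverse_transform` accepts `(n, 1)` predictions. The sketch below uses made-up arrays and a stand-in for the model output; the names `x_scaler`, `y_scaler`, and `scaled_predict` are hypothetical and not from the original question:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up stand-ins for the question's arrays: 2 input features, 1 target.
x_train = np.array([[5.0, 1.0], [10.0, 2.0], [15.0, 3.0], [20.0, 4.0]])
y_train = np.array([10.0, 20.0, 30.0, 40.0])

x_scaler = MinMaxScaler(feature_range=(0, 1))
y_scaler = MinMaxScaler(feature_range=(0, 1))

x_scaled = x_scaler.fit_transform(x_train)                 # shape (4, 2)
y_scaled = y_scaler.fit_transform(y_train.reshape(-1, 1))  # shape (4, 1)

# Stand-in for model.predict(...): an (n, 1) array in the scaled target space.
scaled_predict = np.array([[0.0], [0.5], [1.0]])

# The two-feature scaler cannot invert a one-column array.
try:
    x_scaler.inverse_transform(scaled_predict)
except ValueError as err:
    print("x_scaler rejects (n, 1) input:", err)

# y_scaler was fitted on one column, so an (n, 1) input is accepted.
predict = y_scaler.inverse_transform(scaled_predict).reshape(-1)
print(predict)
```

Keeping one scaler per role avoids mixing up the statistics of the inputs with those of the target.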
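Write a Python Flask sales forecasting system. The system has a suanfa.py file that does the following: read shuju.csv (24 rows with three attributes: Year, Month, and TotalPrice), normalize TotalPrice with a scaler, then define a function `def split_data(data, lookback):` that splits the dataset into a test set (0.2) and a training set (0.8), with `data_raw = data.to_numpy()` and `lookback = 4`. Convert the resulting test and training sets to PyTorch tensors, define the hyperparameters, define the model `model=LSTM()`, the loss function, and the optimizer (Adam), train the model, compute the MSE, and save the model. There is a predict.html file containing a date selector and a sales forecast button: after choosing a year and a month, the user clicks the button, the system calls the saved model to forecast sales for the selected month, and the result is returned to a result box below the date selector. There is an app.py file that defines the routes. Write complete, detailed code using Flask, Bootstrap, and LayUI.
This is a fairly complete Flask sales forecasting system with a front-end page and a back-end model. The page loads the Bootstrap and LayUI front-end frameworks, and the LSTM model used for forecasting is implemented in PyTorch.
suanfa.py file: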
```python
import numpy as np
import pandas as pd
import torch
from sklearn.preprocessing import MinMaxScaler


def split_data(data, lookback):
    """Build sliding windows of `lookback` rows and split them 80/20 into train/test."""
    data_raw = data.to_numpy()
    data = []
    # create all possible sequences of length lookback
    for index in range(len(data_raw) - lookback):
        data.append(data_raw[index: index + lookback])
    data = np.array(data)
    test_set_size = int(np.round(0.2 * data.shape[0]))
    train_set_size = data.shape[0] - test_set_size
    # Inputs: the first lookback - 1 rows of each window (all columns).
    # Targets: only the last column (TotalPrice) of the final row, so the
    # target shape (N, 1) matches the model's output_dim of 1.
    x_train = torch.from_numpy(np.array(data[:train_set_size, :-1, :]))
    y_train = torch.from_numpy(np.array(data[:train_set_size, -1, -1:]))
    x_test = torch.from_numpy(np.array(data[train_set_size:, :-1, :]))
    y_test = torch.from_numpy(np.array(data[train_set_size:, -1, -1:]))
    return x_train.float(), y_train.float(), x_test.float(), y_test.float()


class LSTM(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        self.lstm = torch.nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = torch.nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Zero initial hidden and cell states for every sequence in the batch.
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # Map the hidden state of the last time step to the output.
        out = self.fc(out[:, -1, :])
        return out


if __name__ == '__main__':
    data = pd.read_csv('shuju.csv')
    # Only TotalPrice is scaled; Year and Month are passed through unscaled.
    scaler = MinMaxScaler(feature_range=(-1, 1))
    data['TotalPrice'] = scaler.fit_transform(data['TotalPrice'].values.reshape(-1, 1)).ravel()
    x_train, y_train, x_test, y_test = split_data(data[['Year', 'Month', 'TotalPrice']], 4)
    input_dim = 3
    hidden_dim = 12
    num_layers = 1
    output_dim = 1
    num_epochs = 1000
    model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
    loss_fn = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    for t in range(num_epochs):
        y_pred = model(x_train)
        loss = loss_fn(y_pred, y_train)
        if t % 100 == 0:
            print("Epoch ", t, "MSE: ", loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # Report the MSE on the held-out windows, then save the trained weights.
    model.eval()
    with torch.no_grad():
        print("Test MSE: ", loss_fn(model(x_test), y_test).item())
    torch.save(model.state_dict(), 'model_lstm.pth')
```
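To make the windowing in `split_data` easier to follow, here is a small sketch of the array shapes involved. It uses a made-up 8-row array in place of `data.to_numpy()`, and it assumes the target is only the last column (TotalPrice) of the final row in each window:

```python
import numpy as np

lookback = 4
# Stand-in for data.to_numpy(): 8 rows, 3 columns (Year, Month, TotalPrice).
data_raw = np.arange(24, dtype=float).reshape(8, 3)

# One window per start index, each holding `lookback` consecutive rows.
windows = np.array([data_raw[i:i + lookback] for i in range(len(data_raw) - lookback)])
print(windows.shape)  # (4, 4, 3): 8 - 4 windows, each 4 rows by 3 columns

x = windows[:, :-1, :]   # first lookback - 1 rows of each window
y = windows[:, -1, -1:]  # last column of the final row in each window
print(x.shape, y.shape)  # (4, 3, 3) (4, 1)
```

With the 24 rows of shuju.csv and `lookback = 4`, the same logic yields 20 windows, which the 0.8/0.2 split divides into 16 training and 4 test samples.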
predict.html file:
```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Sales Forecasting System</title>
  <link rel="stylesheet" href="https://cdn.bootcdn.net/ajax/libs/layui/2.5.7/css/layui.min.css">
  <link rel="stylesheet" href="https://cdn.bootcdn.net/ajax/libs/twitter-bootstrap/4.5.3/css/bootstrap.min.css">
  <script src="https://cdn.bootcdn.net/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
  <script src="https://cdn.bootcdn.net/ajax/libs/layui/2.5.7/layui.min.js"></script>
  <script src="https://cdn.bootcdn.net/ajax/libs/twitter-bootstrap/4.5.3/js/bootstrap.min.js"></script>
</head>
<body>
  <div class="container">
    <div class="row justify-content-center mt-5">
      <div class="col-md-6">
        <div class="form-group">
          <label for="year">Year:</label>
          <select class="form-control" id="year">
            <option value="2014">2014</option>
            <option value="2015">2015</option>
            <option value="2016">2016</option>
            <option value="2017">2017</option>
            <option value="2018">2018</option>
            <option value="2019">2019</option>
            <option value="2020">2020</option>
          </select>
        </div>
        <div class="form-group">
          <label for="month">Month:</label>
          <select class="form-control" id="month">
            <option value="1">1</option>
            <option value="2">2</option>
            <option value="3">3</option>
            <option value="4">4</option>
            <option value="5">5</option>
            <option value="6">6</option>
            <option value="7">7</option>
            <option value="8">8</option>
            <option value="9">9</option>
            <option value="10">10</option>
            <option value="11">11</option>
            <option value="12">12</option>
          </select>
        </div>
        <div class="form-group">
          <button class="btn btn-primary" onclick="predict()">Forecast sales</button>
        </div>
        <div class="form-group">
          <label for="result">Forecast result:</label>
          <input type="text" class="form-control" id="result" disabled>
        </div>
      </div>
    </div>
  </div>
  <script>
    // Send the selected year and month to /predict and show the returned value.
    function predict() {
      var year = $('#year').val();
      var month = $('#month').val();
      $.ajax({
        url: '/predict',
        method: 'POST',
        data: {
          'year': year,
          'month': month
        },
        success: function(response) {
          $('#result').val(response);
        }
      });
    }
  </script>
</body>
</html>
```
app.py file:
```python
from flask import Flask, render_template, request, jsonify
import numpy as np
import pandas as pd
import torch
from sklearn.preprocessing import MinMaxScaler

import suanfa

app = Flask(__name__)


@app.route('/')
def index():
    return render_template('predict.html')


@app.route('/predict', methods=['POST'])
def predict():
    # The selected year and month are read from the form, but the model below
    # always predicts the step that follows the last rows of shuju.csv.
    year = int(request.form['year'])
    month = int(request.form['month'])
    data = pd.read_csv('shuju.csv')
    # Refit the scaler as in suanfa.py so inverse_transform matches training.
    scaler = MinMaxScaler(feature_range=(-1, 1))
    data['TotalPrice'] = scaler.fit_transform(data['TotalPrice'].values.reshape(-1, 1)).ravel()
    input_dim = 3
    hidden_dim = 12
    num_layers = 1
    output_dim = 1
    model = suanfa.LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
    model.load_state_dict(torch.load('model_lstm.pth'))
    model.eval()
    # Use the last lookback - 1 = 3 rows as the input sequence: shape (1, 3, 3).
    test_inputs = torch.from_numpy(np.array(data[['Year', 'Month', 'TotalPrice']][-3:].values))
    test_inputs = test_inputs.reshape(1, -1, 3).float()
    y_pred = scaler.inverse_transform(model(test_inputs).detach().numpy())
    # y_pred has shape (1, 1); return the single value rounded to 2 decimals.
    return jsonify(round(float(y_pred[0, 0]), 2))


if __name__ == '__main__':
    app.run(debug=True)
```
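The `/predict` route fits a new `MinMaxScaler` on shuju.csv for each request. One alternative, sketched here under assumptions, is to save the fitted scaler at training time and load it when serving. The `scaler.pkl` file name and the toy `total_price` values are hypothetical, and a temporary directory is used only to keep the example self-contained:

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up TotalPrice values standing in for the column in shuju.csv.
total_price = np.array([120.0, 150.0, 90.0, 200.0]).reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(total_price)

with tempfile.TemporaryDirectory() as tmp_dir:
    scaler_path = os.path.join(tmp_dir, 'scaler.pkl')  # hypothetical file name

    # Training side: store the fitted scaler next to the model weights.
    with open(scaler_path, 'wb') as f:
        pickle.dump(scaler, f)

    # Serving side: load it back instead of refitting on every request.
    with open(scaler_path, 'rb') as f:
        loaded = pickle.load(f)

scaled_output = np.array([[1.0]])  # stand-in for a model output in scaled space
restored = loaded.inverse_transform(scaled_output)
print(restored)  # the top of the scaled range maps back to the training maximum
```

This keeps the scaling statistics tied to the data the model was actually trained on, even if shuju.csv changes later.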
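Note: before running, place `shuju.csv` and `model_lstm.pth` in the same directory as `app.py`, and put `predict.html` in a `templates/` subdirectory, because Flask's `render_template` looks for templates there by default. Run `suanfa.py` first to generate `model_lstm.pth`. The required libraries (Flask, PyTorch, pandas, and scikit-learn) can be installed with the following command: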
```shell
pip install flask torch pandas scikit-learn
```