Using PyTorch, design a fully connected neural network to classify a dataset stored in a DataFrame
Time: 2024-03-10 07:47:24 Views: 221
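Sure. Below is a basic fully connected neural network in PyTorch for classifying a dataset stored in a DataFrame. Because it uses a single sigmoid output and binary cross-entropy loss, it handles binary (two-class) classification.
First, import the required libraries: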
```python
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
```
Next, load the dataset and preprocess it:
```python
# Load the dataset (assumes numeric feature columns plus a 'label' column)
data = pd.read_csv('data.csv')
# Encode the label column as integer category codes (0, 1, ...)
data['label'] = data['label'].astype('category').cat.codes
# Separate features and labels
X = data.drop('label', axis=1)
y = data['label']
# Split into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Convert to PyTorch tensors; BCELoss expects float targets
X_train = torch.tensor(X_train.values, dtype=torch.float32)
X_test = torch.tensor(X_test.values, dtype=torch.float32)
y_train = torch.tensor(y_train.values, dtype=torch.float32)
y_test = torch.tensor(y_test.values, dtype=torch.float32)
```
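To make the label-encoding step concrete, here is a minimal sketch on a hypothetical toy DataFrame (the column names and values are made up for illustration). It shows what `astype('category').cat.codes` produces and adds a check that only two classes are present, which the single-output sigmoid network in this answer assumes.

```python
import pandas as pd

# Hypothetical toy data; column names and values are made up for illustration.
toy = pd.DataFrame({
    "feature": [0.1, 0.4, 0.35, 0.8],
    "label": ["cat", "dog", "cat", "dog"],
})

# Categories are sorted, then numbered from 0: 'cat' -> 0, 'dog' -> 1.
labels = toy["label"].astype("category")
codes = labels.cat.codes

print(list(labels.cat.categories))  # ['cat', 'dog']
print(codes.tolist())               # [0, 1, 0, 1]

# A single sigmoid output only makes sense for exactly two classes.
num_classes = codes.nunique()
assert num_classes == 2, f"expected 2 classes, found {num_classes}"
```

If the check fails on real data, the label column has more than two distinct values and a binary setup is not appropriate as written.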
Next, define the network architecture:
```python
# Define the network architecture
class Net(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 32)
        self.fc4 = nn.Linear(32, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = nn.functional.relu(self.fc3(x))
        # Sigmoid maps the single output to a probability in (0, 1)
        x = self.sigmoid(self.fc4(x))
        return x

# Create the network; the input size is the number of feature columns
model = Net(X_train.shape[1])
```
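This defines a network with three hidden layers and one output layer. The hidden layers have 128, 64, and 32 neurons respectively. The hidden layers use ReLU activations, and the output layer uses Sigmoid, so the network outputs one probability per sample.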
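A quick way to sanity-check this architecture is to push a random batch through it and inspect the output shape and parameter count. The sketch below rebuilds the same layer sizes with `nn.Sequential` for brevity; the feature count of 10 is an assumption chosen only for illustration.

```python
import torch
import torch.nn as nn

num_features = 10  # assumed feature count, for illustration only

# Same layer sizes as Net, written with nn.Sequential for brevity.
net = nn.Sequential(
    nn.Linear(num_features, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)

batch = torch.randn(4, num_features)  # 4 random samples
probs = net(batch)
print(probs.shape)  # torch.Size([4, 1])

# Weights plus biases: (10*128+128) + (128*64+64) + (64*32+32) + (32*1+1)
n_params = sum(p.numel() for p in net.parameters())
print(n_params)  # 11777
```

The output has shape `(batch_size, 1)`, which is why the targets need an extra dimension before being passed to the loss.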
Then define the loss function and the optimizer:
```python
# Define the loss function and optimizer
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
```
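Training uses the binary cross-entropy loss (`nn.BCELoss`, which expects probabilities in [0, 1]) together with the Adam optimizer.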
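For reference, binary cross-entropy is the batch mean of `-[y*log(p) + (1-y)*log(1-p)]`. The following sketch, using made-up probabilities and targets, computes it by hand and compares the result with `nn.BCELoss`.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and 0/1 targets, both of shape (N, 1).
probs = torch.tensor([[0.9], [0.2], [0.6]])
targets = torch.tensor([[1.0], [0.0], [0.0]])

# Mean over the batch of -[y*log(p) + (1-y)*log(1-p)].
manual = -(targets * torch.log(probs)
           + (1 - targets) * torch.log(1 - probs)).mean()
builtin = nn.BCELoss()(probs, targets)

print(torch.allclose(manual, builtin))  # True
```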
Next, train the model:
```python
# Train the model (full-batch: every step uses the whole training set)
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(X_train)
    # Targets need shape (N, 1) to match the model output
    loss = criterion(outputs, y_train.unsqueeze(1))
    loss.backward()
    optimizer.step()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, 100, loss.item()))
```
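This trains for 100 epochs. Each epoch performs a single update using all training samples at once (full-batch training), and the loss is printed every 10 epochs.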
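Full-batch training works when the data fits comfortably in memory. For larger datasets, a common alternative (not what the code in this answer does) is mini-batch training with `DataLoader`. The sketch below assumes synthetic data and a smaller stand-in network so that it runs on its own; the batch size of 32 and the 5 epochs are arbitrary choices.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for X_train / y_train: 200 samples, 5 features.
X = torch.randn(200, 5)
y = (X[:, 0] > 0).float()  # made-up binary target

# Smaller stand-in network with the same sigmoid output as Net.
model = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Shuffle each epoch and yield batches of 32 samples.
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

epoch_losses = []
for epoch in range(5):
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb.unsqueeze(1))
        loss.backward()
        optimizer.step()
        running += loss.item() * xb.size(0)  # weight by batch size
    epoch_losses.append(running / len(X))    # average loss per sample
    print(f"Epoch {epoch + 1}: loss {epoch_losses[-1]:.4f}")
```

Each epoch now performs several parameter updates instead of one, so fewer epochs are usually needed than in the full-batch loop.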
Finally, evaluate the model on the test set:
```python
# Evaluate the model on the test set
with torch.no_grad():
    outputs = model(X_test)
    # Threshold probabilities at 0.5; squeeze(1) turns (N, 1) into (N,)
    predicted = (outputs >= 0.5).squeeze(1).int()
    accuracy = (predicted == y_test.int()).sum().item() / y_test.shape[0]
print('Test accuracy: {:.2f}%'.format(accuracy * 100))
```
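This runs the model on the test set without tracking gradients, converts the predicted probabilities to class labels with a 0.5 threshold, and prints the test accuracy.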
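The thresholding and accuracy arithmetic can be checked by hand on a tiny example. The outputs and labels below are made up for illustration.

```python
import torch

# Made-up model outputs (probabilities), shape (N, 1), and true labels, shape (N,).
outputs = torch.tensor([[0.91], [0.12], [0.55], [0.40]])
y_true = torch.tensor([1.0, 0.0, 0.0, 0.0])

# Threshold at 0.5, then drop the trailing dimension so shapes match.
predicted = (outputs >= 0.5).squeeze(1).int()
correct = (predicted == y_true.int()).sum().item()
accuracy = correct / y_true.shape[0]

print(predicted.tolist())  # [1, 0, 1, 0]
print(accuracy)            # 0.75
```

Without the squeeze, comparing a `(N, 1)` tensor with a `(N,)` tensor would broadcast to `(N, N)` and give a wrong count.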
The complete code is as follows:
```python
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
# Load the dataset (assumes numeric feature columns plus a 'label' column)
data = pd.read_csv('data.csv')
# Encode the label column as integer category codes (0, 1, ...)
data['label'] = data['label'].astype('category').cat.codes
# Separate features and labels
X = data.drop('label', axis=1)
y = data['label']
# Split into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Convert to PyTorch tensors; BCELoss expects float targets
X_train = torch.tensor(X_train.values, dtype=torch.float32)
X_test = torch.tensor(X_test.values, dtype=torch.float32)
y_train = torch.tensor(y_train.values, dtype=torch.float32)
y_test = torch.tensor(y_test.values, dtype=torch.float32)

# Define the network architecture
class Net(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 32)
        self.fc4 = nn.Linear(32, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = nn.functional.relu(self.fc3(x))
        # Sigmoid maps the single output to a probability in (0, 1)
        x = self.sigmoid(self.fc4(x))
        return x

# Create the network; the input size is the number of feature columns
model = Net(X_train.shape[1])

# Define the loss function and optimizer
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the model (full-batch: every step uses the whole training set)
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(X_train)
    # Targets need shape (N, 1) to match the model output
    loss = criterion(outputs, y_train.unsqueeze(1))
    loss.backward()
    optimizer.step()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, 100, loss.item()))

# Evaluate the model on the test set
with torch.no_grad():
    outputs = model(X_test)
    # Threshold probabilities at 0.5; squeeze(1) turns (N, 1) into (N,)
    predicted = (outputs >= 0.5).squeeze(1).int()
    accuracy = (predicted == y_test.int()).sum().item() / y_test.shape[0]
print('Test accuracy: {:.2f}%'.format(accuracy * 100))
```
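The network architecture and hyperparameters (layer sizes, learning rate, number of epochs) should be adjusted and tuned for the specific dataset to get better classification results.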
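One adjustment that may be needed is support for more than two classes, since the sigmoid output with `nn.BCELoss` covers only the binary case. A minimal sketch of a multi-class variant follows; it is not part of the original answer, and the feature count, class count, and layer size are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

num_features, num_classes = 10, 3  # assumed sizes, for illustration only

# The output layer emits one raw score (logit) per class. No Sigmoid or Softmax
# is applied here because nn.CrossEntropyLoss applies log-softmax internally.
model = nn.Sequential(
    nn.Linear(num_features, 64), nn.ReLU(),
    nn.Linear(64, num_classes),
)
criterion = nn.CrossEntropyLoss()

X = torch.randn(8, num_features)         # random stand-in features
y = torch.randint(0, num_classes, (8,))  # class indices, dtype int64

logits = model(X)
loss = criterion(logits, y)
predicted = logits.argmax(dim=1)         # most likely class per sample

print(logits.shape, predicted.shape)  # torch.Size([8, 3]) torch.Size([8])
```

In this variant the label tensor would be created with `dtype=torch.long` rather than `torch.float32`, and it is passed to the loss without `unsqueeze(1)`.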