Code for binary classification on 20 input features with a neural network
The following PyTorch code implements a simple neural network that performs binary classification on samples with 20 input features:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate a synthetic dataset: 20 features, 2 classes
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Dataset wrapper
class MyDataset(Dataset):
    def __init__(self, x, y):
        super(MyDataset, self).__init__()
        self.x = torch.tensor(x, dtype=torch.float32)
        self.y = torch.tensor(y, dtype=torch.long)

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return len(self.x)

# Network with one hidden layer; returns raw logits
# (no Softmax here -- nn.CrossEntropyLoss expects unnormalized logits)
class Net(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

# Training loop
def train(model, criterion, optimizer, train_loader, num_epochs):
    model.train()
    for epoch in range(num_epochs):
        for i, (inputs, labels) in enumerate(train_loader):
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            if (i + 1) % 10 == 0:
                print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'.format(
                    epoch + 1, num_epochs, i + 1, len(train_loader), loss.item()))

# Prepare data
input_dim = X_train.shape[1]   # 20 features
hidden_dim = 10
output_dim = 2                 # binary classification
train_dataset = MyDataset(X_train, y_train)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

# Model, loss function, and optimizer
model = Net(input_dim, hidden_dim, output_dim)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Train
num_epochs = 10
train(model, criterion, optimizer, train_loader, num_epochs)

# Evaluate on the test set
model.eval()
with torch.no_grad():
    test_x = torch.tensor(X_test, dtype=torch.float32)
    logits = model(test_x)
    pred_y = torch.argmax(logits, dim=1)
accuracy = (pred_y == torch.tensor(y_test)).sum().item() / len(y_test)
print('Test accuracy: {:.4f}'.format(accuracy))
```
In the code above, scikit-learn's `make_classification` generates a synthetic dataset with 20 features and two classes; any tabular dataset with 20 feature columns and binary labels can be substituted. The model is a neural network with one hidden layer and one output layer, where the input dimension equals the number of features and the output dimension equals the number of classes. Note that the network returns raw logits rather than softmax probabilities, because `nn.CrossEntropyLoss` applies log-softmax internally; adding a `Softmax` layer before this loss would hurt training. Training uses cross-entropy loss with the Adam optimizer. Finally, the trained model predicts labels for the test set and we compute the accuracy.
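A common alternative formulation for binary classification is a single output unit trained with `nn.BCEWithLogitsLoss`, which folds the sigmoid into the loss for numerical stability. A minimal sketch, with illustrative layer sizes and random inputs standing in for real data:

```python
import torch
import torch.nn as nn

# Single-logit binary classifier: 20 features in, 1 raw logit out
model = nn.Sequential(
    nn.Linear(20, 10),
    nn.ReLU(),
    nn.Linear(10, 1),  # no sigmoid here; it is applied inside the loss
)
criterion = nn.BCEWithLogitsLoss()  # expects float targets of shape (N, 1)

x = torch.randn(4, 20)                        # a batch of 4 samples
target = torch.tensor([[0.], [1.], [1.], [0.]])
loss = criterion(model(x), target)
loss.backward()                               # gradients flow as usual

# At inference time, threshold the sigmoid probability at 0.5
pred = (torch.sigmoid(model(x)) > 0.5).long()
print(pred.shape)  # torch.Size([4, 1])
```

With this formulation the target tensor is float-typed (0.0 or 1.0) rather than the long-typed class indices that `nn.CrossEntropyLoss` requires; both approaches are standard for two-class problems.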