PyTorch Titanic
Date: 2023-11-09 14:03:21
Understood. I assume you are asking how to use PyTorch to train a survival classifier on the Titanic dataset. If so, you can follow these steps:
1. Import the necessary libraries and load the dataset
```python
import torch
import torch.nn as nn
import pandas as pd
from sklearn.model_selection import train_test_split
# Load the dataset
data = pd.read_csv('titanic.csv')
```
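Before preprocessing, it can help to check which columns actually contain missing values. The sketch below uses a tiny hand-made frame in place of `titanic.csv`; the rows and the name `sample` are illustrative assumptions, not real data.

```python
import pandas as pd

# Tiny hand-made stand-in for titanic.csv; the values are illustrative only.
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, None, 26.0, 35.0],
    "Embarked": ["S", "C", None, "S"],
})

# Count missing values per column to decide which columns need imputing.
missing_counts = sample.isnull().sum()
print(missing_counts)
```

On the real file, the same `isnull().sum()` call on `data` shows which columns need filling or dropping.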
2. Preprocess the data
```python
# Drop columns that are not used as features
data = data.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1)
# Fill missing values (assign the result back rather than using chained
# inplace calls, which newer pandas versions warn about)
data['Age'] = data['Age'].fillna(data['Age'].mean())
data['Embarked'] = data['Embarked'].fillna('S')
# Encode categorical variables as numbers
data['Sex'] = data['Sex'].map({'male': 0, 'female': 1})
data['Embarked'] = data['Embarked'].map({'S': 0, 'C': 1, 'Q': 2})
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    data.drop('Survived', axis=1), data['Survived'], test_size=0.2, random_state=42)
# Convert to PyTorch tensors
X_train = torch.tensor(X_train.values, dtype=torch.float32)
X_test = torch.tensor(X_test.values, dtype=torch.float32)
y_train = torch.tensor(y_train.values, dtype=torch.float32)
y_test = torch.tensor(y_test.values, dtype=torch.float32)
```
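The features end up on quite different scales (for example `Age` and `Fare` versus the 0/1 `Sex` code), and standardizing them often makes training more stable. This is an optional extra step rather than part of the steps above. A minimal sketch, assuming small made-up tensors in place of `X_train` and `X_test` (the `_demo` names and values are hypothetical):

```python
import torch

# Illustrative stand-ins for X_train / X_test (3 features instead of 7).
X_train_demo = torch.tensor([[3.0, 22.0, 7.25],
                             [1.0, 38.0, 71.28],
                             [3.0, 26.0, 7.92],
                             [1.0, 35.0, 53.10]])
X_test_demo = torch.tensor([[2.0, 30.0, 13.00]])

# Compute statistics on the training split only, so no test information leaks in.
mean = X_train_demo.mean(dim=0, keepdim=True)
std = X_train_demo.std(dim=0, keepdim=True).clamp(min=1e-8)  # guard against zero variance

# Apply the same training statistics to both splits.
X_train_scaled = (X_train_demo - mean) / std
X_test_scaled = (X_test_demo - mean) / std
```

The same two lines of arithmetic can be applied to the real `X_train` and `X_test` tensors before training.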
3. Define the model
```python
class TitanicModel(nn.Module):
    def __init__(self):
        super(TitanicModel, self).__init__()
        # 7 input features remain after dropping columns and the Survived label
        self.fc1 = nn.Linear(7, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 1)
        self.relu = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        out = self.relu(out)
        out = self.fc3(out)
        # Sigmoid maps the single output to a survival probability in [0, 1]
        out = self.sigmoid(out)
        return out

model = TitanicModel()
```
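A quick way to sanity-check the architecture before training is to push a random batch through it and inspect the output shape and value range. The sketch below rebuilds the same layer sizes with `nn.Sequential` so it runs on its own; `demo_model` and `dummy_batch` are hypothetical names.

```python
import torch
import torch.nn as nn

# Same layer sizes as TitanicModel, written with nn.Sequential for brevity.
demo_model = nn.Sequential(
    nn.Linear(7, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)

dummy_batch = torch.randn(5, 7)  # 5 fake passengers, 7 features each
with torch.no_grad():
    probs = demo_model(dummy_batch)

print(probs.shape)  # torch.Size([5, 1])
```

Note the trailing dimension of size 1: the output is `(N, 1)`, not `(N,)`, which matters when comparing against a 1-D label tensor.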
4. Train the model
```python
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
for epoch in range(1000):
    optimizer.zero_grad()
    y_pred = model(X_train)
    loss = criterion(y_pred.squeeze(), y_train)
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print('Epoch {}: train loss: {}'.format(epoch, loss.item()))
```
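The loop above runs full-batch gradient descent, which is fine for a dataset this small. If mini-batches are preferred, a `DataLoader` can be used instead. A minimal sketch on synthetic data, where `X_demo`, `y_demo`, the smaller network, and the batch size of 16 are all assumptions for illustration:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-ins for X_train / y_train (random values, not Titanic data).
X_demo = torch.randn(64, 7)
y_demo = (X_demo[:, 0] > 0).float()  # label depends on the first feature only

loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=16, shuffle=True)

demo_model = nn.Sequential(nn.Linear(7, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(demo_model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(20):
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        # squeeze(1) turns the (batch, 1) output into (batch,) to match yb
        loss = criterion(demo_model(xb).squeeze(1), yb)
        loss.backward()
        optimizer.step()
        running += loss.item() * len(xb)
    # Average loss per sample over the epoch
    epoch_losses.append(running / len(X_demo))
```

Swapping in the real tensors only requires replacing `X_demo` and `y_demo` with `X_train` and `y_train`.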
5. Evaluate the model
```python
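with torch.no_grad():
    # squeeze(1) turns the (N, 1) output into (N,) so it lines up with y_test;
    # without it the comparison below would broadcast to an (N, N) matrix
    y_pred = model(X_test).squeeze(1)
    y_pred_class = y_pred.round()
    accuracy = (y_pred_class == y_test).sum() / float(len(y_test))
    print('Test accuracy:', accuracy.item())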
```
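One thing to watch when computing accuracy: the model returns a tensor of shape `(N, 1)` while the labels have shape `(N,)`, and comparing the two directly broadcasts to an `(N, N)` grid instead of an element-wise match. A small sketch with made-up values (`y_prob` and `y_true` are hypothetical):

```python
import torch

# Made-up predictions and labels, purely for illustration.
y_prob = torch.tensor([[0.9], [0.2], [0.7], [0.4]])  # shape (4, 1), like the model output
y_true = torch.tensor([1.0, 0.0, 0.0, 0.0])          # shape (4,), like y_test

# Comparing without squeezing broadcasts (4, 1) against (4,) into a (4, 4) grid.
wrong = (y_prob.round() == y_true)
print(wrong.shape)  # torch.Size([4, 4])

# Squeezing the trailing dimension gives an element-wise comparison.
correct = (y_prob.squeeze(1).round() == y_true)
accuracy = correct.float().mean().item()
print(accuracy)  # 0.75
```

Squeezing the prediction (or reshaping the labels to `(N, 1)`) before the comparison keeps the accuracy figure meaningful.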