Provide the Python implementation code
Time: 2023-07-23 10:26:17
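Sure. Below is a simple implementation that uses a logistic regression model for prediction. Because the dataset is fairly simple, this baseline model already performs reasonably well. You can also try other classification models for the prediction.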
```python
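import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the datasets
train_df = pd.read_csv('train.csv')
test_df = pd.read_csv('test.csv')

# Keep PassengerId so each prediction can be matched to its passenger
test_ids = test_df['PassengerId']

# Preprocessing: drop columns that are not used as features
drop_cols = ['PassengerId', 'Name', 'Ticket', 'Cabin']
train_df = train_df.drop(columns=drop_cols)
test_df = test_df.drop(columns=drop_cols)

# Fill missing ages with the training-set mean so both sets are treated the same way
age_mean = train_df['Age'].mean()
train_df['Age'] = train_df['Age'].fillna(age_mean)
test_df['Age'] = test_df['Age'].fillna(age_mean)

# Fill missing ports of embarkation with 'S'
train_df['Embarked'] = train_df['Embarked'].fillna('S')
test_df['Embarked'] = test_df['Embarked'].fillna('S')

# One-hot encode the categorical columns
train_df = pd.get_dummies(train_df, columns=['Sex', 'Embarked'])
test_df = pd.get_dummies(test_df, columns=['Sex', 'Embarked'])

# Feature selection
features = ['Pclass', 'Age', 'SibSp', 'Parch', 'Sex_female', 'Sex_male', 'Embarked_C', 'Embarked_Q', 'Embarked_S']
X_train = train_df[features]
y_train = train_df['Survived']
# reindex guards against a dummy column that is absent from the test set
X_test = test_df.reindex(columns=features, fill_value=0)

# Model training (max_iter is raised so the solver converges on unscaled features)
lr = LogisticRegression(max_iter=1000)
lr.fit(X_train, y_train)

# Model evaluation: this is accuracy on the training data, so it is an optimistic estimate
y_pred = lr.predict(X_train)
train_acc = accuracy_score(y_train, y_pred)
print('Training accuracy:', train_acc)

# Prediction: write one row per test passenger, without the DataFrame index
submission = pd.DataFrame({'PassengerId': test_ids, 'Survived': lr.predict(X_test)})
submission.to_csv('submission.csv', index=False)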
```
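The script reports accuracy on the same rows the model was trained on, which tends to overstate how well it generalizes. A hold-out split gives a more honest estimate. The following is a minimal sketch, not part of the original answer: `holdout_accuracy` is a hypothetical helper name, the 80/20 split and `random_state=42` are arbitrary assumptions, and the demo table is made-up data that exists only so the sketch runs without `train.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def holdout_accuracy(X, y, test_size=0.2, random_state=42):
    """Fit on one part of the labeled data and score on the held-out rest."""
    # stratify=y keeps the class balance similar in both parts
    X_fit, X_val, y_fit, y_val = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    model = LogisticRegression(max_iter=1000)
    model.fit(X_fit, y_fit)
    return accuracy_score(y_val, model.predict(X_val))


# Made-up demo data (assumption: two numeric features are enough to illustrate)
rng = np.random.default_rng(0)
demo_X = pd.DataFrame({
    'Pclass': rng.integers(1, 4, size=40),
    'Age': rng.uniform(1, 80, size=40),
})
demo_y = pd.Series([0, 1] * 20, name='Survived')

print('Validation accuracy:', holdout_accuracy(demo_X, demo_y))
```

With the variables from the script, the call would be `holdout_accuracy(X_train, y_train)`.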
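Note that this is only a simple example for reference. In practice, you will need to adapt the data processing, feature selection, and model tuning to your specific situation.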
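One way to approach that tuning work is to wrap the preprocessing and the model in a single scikit-learn `Pipeline` and compare candidate models with cross-validation. The sketch below is an illustration under assumptions rather than a recommended setup: `build_pipeline` and `compare_models` are hypothetical helper names, `RandomForestClassifier` is just one possible alternative classifier, the column lists mirror the features used in the script, and the demo table is made-up data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

NUMERIC = ['Pclass', 'Age', 'SibSp', 'Parch']
CATEGORICAL = ['Sex', 'Embarked']


def build_pipeline(model):
    """Preprocessing plus model, so imputation is re-fit inside each CV fold."""
    categorical = Pipeline([
        # Mirrors the script: missing categories are filled with 'S'
        ('impute', SimpleImputer(strategy='constant', fill_value='S')),
        ('onehot', OneHotEncoder(handle_unknown='ignore')),
    ])
    preprocess = ColumnTransformer([
        ('num', SimpleImputer(strategy='mean'), NUMERIC),
        ('cat', categorical, CATEGORICAL),
    ])
    return Pipeline([('preprocess', preprocess), ('model', model)])


def compare_models(df, cv=5):
    """Return the mean cross-validated accuracy for each candidate model."""
    X, y = df[NUMERIC + CATEGORICAL], df['Survived']
    candidates = {
        'logistic_regression': LogisticRegression(max_iter=1000),
        'random_forest': RandomForestClassifier(n_estimators=50, random_state=0),
    }
    return {
        name: cross_val_score(build_pipeline(model), X, y, cv=cv, scoring='accuracy').mean()
        for name, model in candidates.items()
    }


# Made-up demo data so the sketch runs without train.csv
rng = np.random.default_rng(0)
n = 60
age = rng.uniform(1, 80, size=n)
age[::7] = np.nan  # simulate missing ages
embarked = rng.choice(['S', 'C', 'Q'], size=n).astype(object)
embarked[0] = np.nan  # simulate a missing port of embarkation
demo_df = pd.DataFrame({
    'Pclass': rng.integers(1, 4, size=n),
    'Age': age,
    'SibSp': rng.integers(0, 3, size=n),
    'Parch': rng.integers(0, 3, size=n),
    'Sex': rng.choice(['male', 'female'], size=n),
    'Embarked': embarked,
    'Survived': [0, 1] * (n // 2),
})

for name, score in compare_models(demo_df, cv=3).items():
    print(name, round(score, 3))
```

On the real data, the call would be `compare_models(pd.read_csv('train.csv'))`. Keeping the imputers inside the pipeline means the fill values are computed only from the training folds.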