Kaggle beginner project code
Date: 2023-10-19 07:09:20
Below is a simple code example for a Kaggle beginner project, Titanic survival prediction:
```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# Load the data
train_data = pd.read_csv('train.csv')
test_data = pd.read_csv('test.csv')
# Keep the test PassengerId values for the submission file before the column is dropped
test_ids = test_data['PassengerId']
# Feature engineering: encode Sex and Embarked as integers
train_data['Sex'] = train_data['Sex'].map({'female': 0, 'male': 1}).astype(int)
test_data['Sex'] = test_data['Sex'].map({'female': 0, 'male': 1}).astype(int)
train_data['Embarked'] = train_data['Embarked'].fillna('S')
train_data['Embarked'] = train_data['Embarked'].map({'S': 0, 'C': 1, 'Q': 2}).astype(int)
test_data['Embarked'] = test_data['Embarked'].fillna('S')
test_data['Embarked'] = test_data['Embarked'].map({'S': 0, 'C': 1, 'Q': 2}).astype(int)
# Fill missing ages with the median age of each dataset
train_data['Age'] = train_data['Age'].fillna(train_data['Age'].median())
test_data['Age'] = test_data['Age'].fillna(test_data['Age'].median())
# The test set has a missing Fare value; fill it so the model receives no NaN
test_data['Fare'] = test_data['Fare'].fillna(train_data['Fare'].median())
# Drop identifier and free-text columns that are not used as features
train_data = train_data.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1)
test_data = test_data.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1)
# Train the model
X_train = train_data.drop('Survived', axis=1)
y_train = train_data['Survived']
X_test = test_data
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1)
model.fit(X_train, y_train)
# Predict on the test set
y_pred = model.predict(X_test)
# Save the results in the submission format
output = pd.DataFrame({'PassengerId': test_ids, 'Survived': y_pred})
output.to_csv('submission.csv', index=False)
```
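The feature engineering step repeats each transformation once for the training set and once for the test set. One way to avoid the duplication is to wrap it in a single function, sketched below. The helper name `preprocess`, its parameters, and the tiny hand-made frames are assumptions for illustration only and are not part of the original script. Unlike the script above, this sketch imputes the test set with medians computed from the training set, which is a common design choice rather than a requirement.

```python
import pandas as pd

# Hypothetical helper (not in the original script): applies the same
# encoding and imputation to any Titanic-style frame.
def preprocess(df: pd.DataFrame, age_median: float, fare_median: float) -> pd.DataFrame:
    out = df.copy()
    # Encode categorical columns as integers, as in the script above
    out['Sex'] = out['Sex'].map({'female': 0, 'male': 1}).astype(int)
    out['Embarked'] = out['Embarked'].fillna('S').map({'S': 0, 'C': 1, 'Q': 2}).astype(int)
    # Impute numeric gaps with the statistics passed in by the caller
    out['Age'] = out['Age'].fillna(age_median)
    out['Fare'] = out['Fare'].fillna(fare_median)
    # Drop non-feature columns; errors='ignore' skips any that are absent
    return out.drop(columns=['PassengerId', 'Name', 'Ticket', 'Cabin'], errors='ignore')

# Made-up miniature data, only to show the call pattern
nan = float('nan')
train = pd.DataFrame({
    'PassengerId': [1, 2, 3],
    'Sex': ['male', 'female', 'female'],
    'Age': [22.0, nan, 30.0],
    'Fare': [7.25, 71.28, 8.05],
    'Embarked': ['S', 'C', None],
})
test = pd.DataFrame({
    'PassengerId': [4, 5],
    'Sex': ['female', 'male'],
    'Age': [nan, 40.0],
    'Fare': [nan, 13.0],
    'Embarked': ['Q', 'S'],
})

# Compute the medians once, on the training data, and reuse them for both frames
age_median = train['Age'].median()
fare_median = train['Fare'].median()
train_clean = preprocess(train, age_median, fare_median)
test_clean = preprocess(test, age_median, fare_median)
print(test_clean)
```

With the real `train.csv` and `test.csv`, the same two calls would replace the paired lines in the feature engineering step, provided `PassengerId` is saved from the test frame beforehand for the submission file.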