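SVM-Based Titanic Survival Prediction Code
Date: 2023-10-05 22:14:01
The following example code predicts Titanic passenger survival with a support vector machine (SVM).
First, import the required libraries, load the dataset, and prepare the features: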
```
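import pandas as pd
from sklearn import svm
from sklearn.model_selection import train_test_split
# Load the dataset
titanic_data = pd.read_csv('titanic.csv')
# Preprocessing: fill missing ages with the median (assigned back to the column,
# because chained fillna(..., inplace=True) is deprecated in recent pandas),
# drop the sparse Cabin column, then drop rows that still have missing values
titanic_data['Age'] = titanic_data['Age'].fillna(titanic_data['Age'].median())
titanic_data.drop('Cabin', axis=1, inplace=True)
titanic_data.dropna(inplace=True)
# Feature selection; get_dummies one-hot encodes the string columns (Sex, Embarked)
features = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Embarked']
X = pd.get_dummies(titanic_data[features])
y = titanic_data['Survived']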
```
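To make the preprocessing step concrete, here is a minimal sketch on a hypothetical three-row frame (the `toy` data is invented for illustration and is not taken from `titanic.csv`). It shows how median imputation, `dropna`, and `pd.get_dummies` change the table before it reaches the model.

```python
import pandas as pd

# Hypothetical toy rows that mimic some of the Titanic columns used above.
toy = pd.DataFrame({
    "Pclass": [3, 1, 2],
    "Sex": ["male", "female", "female"],
    "Age": [22.0, None, 30.0],
    "Embarked": ["S", "C", None],
})

# Fill the missing age with the median of the observed ages.
toy["Age"] = toy["Age"].fillna(toy["Age"].median())

# Drop rows that still contain a missing value (the row without Embarked).
toy = toy.dropna()

# One-hot encode the string columns; numeric columns pass through unchanged.
encoded = pd.get_dummies(toy)
print(encoded.columns.tolist())
print(encoded)
```

Each string column becomes one indicator column per category, which is why `X` in the main example has more columns than the `features` list.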
Next, split the dataset into training and test sets:
```
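# Split the dataset into training (80%) and test (20%) sets;
# a fixed random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)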
```
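Because survivors and non-survivors are not equally frequent, a purely random split can shift the class ratio between the two sets. The sketch below is an optional variation, not part of the original code: it uses the `stratify` parameter of `train_test_split` on hypothetical labels (`X_demo`, `y_demo`) to keep the class ratio the same in both sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 60 negatives and 40 positives.
y_demo = np.array([0] * 60 + [1] * 40)
X_demo = np.arange(100).reshape(-1, 1)

# stratify=y_demo preserves the 60/40 class ratio in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print(len(X_tr), len(X_te))
print(y_tr.mean(), y_te.mean())
```

In the Titanic code this would correspond to passing `stratify=y` in the split call.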
Then, train an SVM model and predict on the test set:
```
# Create a linear-kernel SVM (C controls regularization strength) and train it
model = svm.SVC(kernel='linear', C=0.1)
model.fit(X_train, y_train)
# Predict labels for the test set
y_pred = model.predict(X_test)
```
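SVMs are generally sensitive to feature scale, and the features here mix small integers such as `Pclass` with larger values such as `Age`. One common option, which the original code does not use, is to standardize the features inside a pipeline. The following is a minimal sketch on synthetic data (`X_syn`, `y_syn` come from `make_classification`, not from the Titanic file), keeping the same kernel and `C` as above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data with six features.
X_syn, y_syn = make_classification(n_samples=200, n_features=6, random_state=0)
X_syn[:, 0] *= 100.0  # exaggerate the scale of one feature

X_a, X_b, y_a, y_b = train_test_split(X_syn, y_syn, test_size=0.2, random_state=0)

# The scaler is fitted on the training data only, then applied before the SVM.
scaled_svm = make_pipeline(StandardScaler(), SVC(kernel="linear", C=0.1))
scaled_svm.fit(X_a, y_a)

pred = scaled_svm.predict(X_b)
score = scaled_svm.score(X_b, y_b)
print("Accuracy:", score)
```

Wrapping the scaler and the classifier in one pipeline avoids leaking test-set statistics into the scaling step.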
Finally, print the model's accuracy and confusion matrix:
```
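from sklearn.metrics import accuracy_score, confusion_matrix
# Print the accuracy (fraction of test samples classified correctly)
print('Accuracy:', accuracy_score(y_test, y_pred))
# Print the confusion matrix (rows: true labels, columns: predicted labels)
print('Confusion Matrix:\n', confusion_matrix(y_test, y_pred))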
```
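To show how the two outputs relate, the sketch below recomputes accuracy from the cells of a confusion matrix and also derives precision and recall. The label lists `y_true_demo` and `y_pred_demo` are hypothetical values chosen for illustration, not results from the Titanic model.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels for ten samples.
y_true_demo = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred_demo = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# For binary labels, ravel() yields the cells in the order tn, fp, fn, tp.
cm = confusion_matrix(y_true_demo, y_pred_demo)
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # correct predictions over all predictions
precision = tp / (tp + fp)        # how many predicted positives were right
recall = tp / (tp + fn)           # how many actual positives were found

print(cm)
print("accuracy:", accuracy, "precision:", precision, "recall:", recall)
print("matches accuracy_score:", accuracy == accuracy_score(y_true_demo, y_pred_demo))
```

Precision and recall are worth checking alongside accuracy when the two classes are not equally frequent.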
The complete code example is as follows:
```
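import pandas as pd
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
# Load the dataset
titanic_data = pd.read_csv('titanic.csv')
# Preprocessing: fill missing ages with the median (assigned back to the column,
# because chained fillna(..., inplace=True) is deprecated in recent pandas),
# drop the sparse Cabin column, then drop rows that still have missing values
titanic_data['Age'] = titanic_data['Age'].fillna(titanic_data['Age'].median())
titanic_data.drop('Cabin', axis=1, inplace=True)
titanic_data.dropna(inplace=True)
# Feature selection; get_dummies one-hot encodes the string columns (Sex, Embarked)
features = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Embarked']
X = pd.get_dummies(titanic_data[features])
y = titanic_data['Survived']
# Split the dataset into training (80%) and test (20%) sets;
# a fixed random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a linear-kernel SVM (C controls regularization strength) and train it
model = svm.SVC(kernel='linear', C=0.1)
model.fit(X_train, y_train)
# Predict labels for the test set
y_pred = model.predict(X_test)
# Print the accuracy (fraction of test samples classified correctly)
print('Accuracy:', accuracy_score(y_test, y_pred))
# Print the confusion matrix (rows: true labels, columns: predicted labels)
print('Confusion Matrix:\n', confusion_matrix(y_test, y_pred))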
```