Python code for airline customer churn prediction
Date: 2023-07-23 08:16:45
The following is a Python example of airline customer churn prediction using a logistic regression model:
```
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
# Load the data
data = pd.read_csv("airline.csv")
# Feature engineering: one-hot encode the categorical columns
features = ["age", "gender", "flight_frequency", "flight_class"]
X = pd.get_dummies(data[features])
y = data["churn"]
# Split into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build and train the model; max_iter is raised from the default of 100
# so the solver has room to converge on unscaled features
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# Predict on the test set and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print("Accuracy: ", accuracy)
print("Precision: ", precision)
print("Recall: ", recall)
```
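Here, "airline.csv" is a dataset containing airline customer information and churn status, and "age", "gender", "flight_frequency", and "flight_class" are the columns selected as features. pd.get_dummies one-hot encodes the categorical (string-valued) columns among them and leaves the numeric columns unchanged, which yields the model input matrix X; "churn" is the label y indicating whether a customer churned. train_test_split holds out 20% of the rows as a test set, LogisticRegression is fit on the training set, and accuracy_score, precision_score, and recall_score report the model's accuracy, precision, and recall on the test set. With their default settings, precision_score and recall_score expect binary labels and treat 1 as the positive class.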
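Since "airline.csv" is not included with the example, the same steps can be tried on made-up data. The following is a minimal sketch under stated assumptions: the helper make_synthetic_airline_data, the category values ("F"/"M", "economy"/"business"/"first"), and the rule tying churn to flight_frequency are all hypothetical and exist only to make the script runnable; stratify=y and zero_division=0 are optional additions, not part of the original code.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


def make_synthetic_airline_data(n_rows=500, seed=0):
    """Build a made-up stand-in for airline.csv (all values are synthetic)."""
    rng = np.random.default_rng(seed)
    data = pd.DataFrame({
        "age": rng.integers(18, 80, size=n_rows),
        "gender": rng.choice(["F", "M"], size=n_rows),
        "flight_frequency": rng.integers(0, 30, size=n_rows),
        "flight_class": rng.choice(["economy", "business", "first"], size=n_rows),
    })
    # Assumed rule, for illustration only: infrequent flyers churn more often.
    frequency = data["flight_frequency"].to_numpy()
    churn_probability = 1.0 / (1.0 + np.exp(0.3 * (frequency - 10)))
    data["churn"] = (rng.random(n_rows) < churn_probability).astype(int)
    return data


data = make_synthetic_airline_data()
features = ["age", "gender", "flight_frequency", "flight_class"]

# Only the string columns are expanded into indicator columns;
# age and flight_frequency are passed through unchanged.
X = pd.get_dummies(data[features])
y = data["churn"]
print(list(X.columns))

# stratify=y keeps the churn ratio similar in the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# zero_division=0 avoids a warning if the model predicts no churners at all.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Printing X.columns makes the effect of the encoding visible: "gender" and "flight_class" are replaced by indicator columns such as gender_F and flight_class_economy, while the numeric columns keep their names. The scores on this synthetic data say nothing about how the model would perform on real customer records.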