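Please write a piece of feature engineering code that includes 13 features and outputs the names of the useful features
Posted: 2023-06-14 19:04:35 | Views: 94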
Below is a simple feature engineering example with 13 features:
```python
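import pandas as pd

# Load the data
data = pd.read_csv('data.csv')

# Year used to derive age; set it to the year the data was collected
REFERENCE_YEAR = 2021

# Feature 1: age (numeric)
data['age'] = REFERENCE_YEAR - data['birth_year']
# Feature 2: gender (categorical); values not in the mapping become NaN
data['gender'] = data['gender'].map({'male': 0, 'female': 1})
# Feature 3: height in cm (numeric)
data['height'] = data['height_cm']
# Feature 4: weight in kg (numeric)
data['weight'] = data['weight_kg']
# Feature 5: BMI = weight (kg) / height (m) squared (numeric)
data['bmi'] = data['weight'] / (data['height'] / 100) ** 2
# Feature 6: systolic blood pressure in mmHg (numeric)
data['sbp'] = data['systolic_bp_mmhg']
# Feature 7: diastolic blood pressure in mmHg (numeric)
data['dbp'] = data['diastolic_bp_mmhg']
# Feature 8: fasting blood glucose in mg/dL (numeric)
data['glucose'] = data['fasting_glucose_mg_dl']
# Feature 9: smoking status (categorical)
data['smoking'] = data['smoking'].map({'never': 0, 'former': 1, 'current': 2})
# Feature 10: alcohol use (categorical)
data['alcohol'] = data['alcohol'].map({'never': 0, 'former': 1, 'current': 2})
# Feature 11: exercise frequency (ordinal categorical)
data['exercise'] = data['exercise'].map({
    'none': 0,
    '1-2 times per week': 1,
    '3-4 times per week': 2,
    '5 or more times per week': 3,
})
# Feature 12: sleep duration in hours (numeric)
data['sleep'] = data['sleep_hours']
# Feature 13: cardiovascular disease flag (categorical);
# astype(int) raises if the column contains missing values
data['cardio'] = data['cardio'].astype(int)

# Output the names of the useful features
# (if 'cardio' is the prediction target, leave it out of the model inputs)
useful_features = ['age', 'gender', 'height', 'weight', 'bmi', 'sbp', 'dbp',
                   'glucose', 'smoking', 'alcohol', 'exercise', 'sleep', 'cardio']
print('Useful features:', useful_features)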
```
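One thing to watch in the code above: `Series.map` with a dict returns NaN for any value that is not a key in the dict, so a typo or an unexpected category silently becomes a missing value. A minimal sketch of a stricter mapping follows; `map_strict` is a hypothetical helper written for illustration (it is not part of pandas), and the toy values are made up.

```python
import pandas as pd

def map_strict(series: pd.Series, mapping: dict) -> pd.Series:
    """Map categories to codes and raise if a non-missing value has no code.

    Hypothetical helper for illustration; not a pandas function.
    """
    mapped = series.map(mapping)
    # Values that were present in the input but turned into NaN after mapping
    unmapped = series[mapped.isna() & series.notna()].unique()
    if len(unmapped) > 0:
        raise ValueError(f"Unmapped categories: {sorted(unmapped)}")
    return mapped

# Toy data using the same smoking categories as the example above
toy = pd.Series(['never', 'former', 'current', 'never'])
codes = map_strict(toy, {'never': 0, 'former': 1, 'current': 2})
print(codes.tolist())
```

Values that are already missing in the input pass through as NaN; only values that are present but unknown trigger the error.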
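Note: the features above are based on simulated data and are for reference only. In real feature engineering, choose and process features according to the specifics of your dataset.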
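The original example lists every feature as useful. One simple way to make "useful" concrete is to keep only the features whose absolute Pearson correlation with the target exceeds a threshold. The sketch below is an assumption about how that could be done, not the method of the answer above: `rank_features` is a hypothetical helper, the 0.1 threshold is arbitrary, and the toy values are invented so that `sleep` is unrelated to `cardio` by construction.

```python
import pandas as pd

def rank_features(df: pd.DataFrame, target: str, threshold: float = 0.1) -> list:
    """Return names of features whose |Pearson r| with the target exceeds
    the threshold, strongest first. Hypothetical helper for illustration."""
    # Correlation of each feature column with the target column
    corr = df.drop(columns=[target]).corrwith(df[target]).abs()
    selected = corr[corr > threshold].sort_values(ascending=False)
    return selected.index.tolist()

# Toy data: cardio rises with sbp, while sleep has the same mean in both groups
toy = pd.DataFrame({
    'sbp':    [110, 115, 120, 125, 130, 135, 140, 145],
    'sleep':  [6, 8, 8, 6, 6, 8, 8, 6],
    'cardio': [0, 0, 0, 0, 1, 1, 1, 1],
})
useful = rank_features(toy, target='cardio')
print('Useful features:', useful)
```

Correlation only captures linear, one-feature-at-a-time relationships, so treat this as a first filter; model-based importance or domain knowledge may lead to a different list.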