Python code for feature selection with the NSGA-II algorithm
Date: 2023-10-27 22:03:29
NSGA-II (Non-dominated Sorting Genetic Algorithm II) is a genetic algorithm widely used for multi-objective optimization. Below is a Python example of feature selection based on NSGA-II:
```python
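import random

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from deap import base, creator, tools, algorithms

# Load the data: feature columns first, class label in the last column
data = np.loadtxt('data.csv', delimiter=',')
X = data[:, :-1]
y = data[:, -1]
n_features = X.shape[1]


# Fitness function for feature selection
def evaluate(individual):
    # Indices of the features switched on in this individual
    selected_features = [index for index, value in enumerate(individual) if value]
    # An empty subset cannot be trained on: give it the worst value on both objectives
    if not selected_features:
        return 0.0, n_features
    X_new = X[:, selected_features]
    # Split first so that the scaler is fitted on the training part only
    X_train, X_test, y_train, y_test = train_test_split(
        X_new, y, test_size=0.2, random_state=42)
    # Scale the features to [0, 1] using the range seen in the training data
    scaler = MinMaxScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)
    # Train the model and predict on the held-out part
    model = SVC()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Compute the accuracy
    accuracy = accuracy_score(y_test, y_pred)
    # Return the accuracy and the number of selected features
    return accuracy, len(selected_features)


# Individual and population definitions:
# maximize accuracy (weight 1.0), minimize the feature count (weight -1.0)
creator.create('FitnessMulti', base.Fitness, weights=(1.0, -1.0))
creator.create('Individual', list, fitness=creator.FitnessMulti)
toolbox = base.Toolbox()
toolbox.register('attr_bool', random.randint, 0, 1)
toolbox.register('individual', tools.initRepeat, creator.Individual, toolbox.attr_bool, n=n_features)
toolbox.register('population', tools.initRepeat, list, toolbox.individual)
toolbox.register('evaluate', evaluate)
toolbox.register('mate', tools.cxOnePoint)
toolbox.register('mutate', tools.mutFlipBit, indpb=0.05)
toolbox.register('select', tools.selNSGA2)

# Run the genetic algorithm
population_size = 100
num_generations = 50
population = toolbox.population(n=population_size)
# Evaluate the initial population so that parents can compete with offspring
for ind in population:
    ind.fitness.values = toolbox.evaluate(ind)

for generation in range(num_generations):
    # varAnd clones the parents, applies crossover and mutation,
    # and invalidates the fitness of every changed individual
    offspring = algorithms.varAnd(population, toolbox, cxpb=0.5, mutpb=0.1)
    invalid = [ind for ind in offspring if not ind.fitness.valid]
    fits = toolbox.map(toolbox.evaluate, invalid)
    for fit, ind in zip(fits, invalid):
        ind.fitness.values = fit
    # Select the next generation from parents plus offspring;
    # selecting k individuals out of only k offspring would discard nothing
    population = toolbox.select(population + offspring, k=population_size)

# Print the final result
# selBest ranks by accuracy first and breaks ties in favor of fewer features
best_individual = tools.selBest(population, k=1)[0]
selected_features = [index for index, value in enumerate(best_individual) if value]
print('Selected Features:', selected_features)
print('Number of Selected Features:', len(selected_features))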
```
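The code above performs feature selection with a genetic algorithm. It first loads the data and then defines a fitness function, `evaluate`, which takes the features chosen by an individual, preprocesses the data, trains and evaluates a model, and returns the accuracy together with the number of selected features.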
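Next, the code defines the individual and the population and registers the genetic operators. It then sets the parameters of the genetic algorithm, namely the population size and the number of generations, and runs the algorithm.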
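To make the registered operators less of a black box, here is a minimal pure-Python sketch of what one-point crossover and bit-flip mutation do to a feature mask. The function names `one_point_crossover` and `flip_bit_mutation` are hypothetical and written only for illustration; they are not DEAP's own implementations of `tools.cxOnePoint` and `tools.mutFlipBit`, although they are meant to follow the same idea.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit masks at a random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def flip_bit_mutation(mask, indpb, rng):
    """Flip each bit independently with probability indpb."""
    return [1 - bit if rng.random() < indpb else bit for bit in mask]


# Illustrative parents: one selects every feature, the other selects none
rng = random.Random(0)
a = [1, 1, 1, 1, 1, 1]
b = [0, 0, 0, 0, 0, 0]
child_a, child_b = one_point_crossover(a, b, rng)
mutant = flip_bit_mutation(child_a, indpb=0.05, rng=rng)
print(child_a, child_b, mutant)
```

With `indpb=0.05`, each feature has a 5% chance of being switched on or off per mutation, so most mutated individuals differ from their parent in only a few positions.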
Finally, the best individual is picked from the final population, and the indices and the number of its selected features are printed.
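A multi-objective run usually yields a set of trade-offs rather than one winner, so a single "best" individual is only one point on that set. The sketch below shows the underlying idea of Pareto dominance in plain Python, assuming accuracy is to be maximized and the feature count minimized. The `results` values are invented for illustration, and `dominates` and `pareto_front` are hypothetical helper names; in DEAP itself, `tools.sortNondominated` can be used to obtain the non-dominated front of a population.

```python
def dominates(a, b):
    """Return True if a is no worse than b on both objectives and better on one.

    Each argument is an (accuracy, n_features) pair: accuracy is maximized,
    n_features is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (accuracy, n_features) results, for illustration only
results = [(0.90, 12), (0.88, 5), (0.90, 8), (0.80, 5), (0.70, 2)]
front = pareto_front(results)
print(front)
```

Here `(0.90, 12)` drops out because `(0.90, 8)` reaches the same accuracy with fewer features, and `(0.80, 5)` drops out because `(0.88, 5)` is more accurate at the same size. The remaining points are all defensible choices, depending on how much accuracy one is willing to trade for a smaller feature set.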
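Note that running the code requires the dependencies `numpy`, `scikit-learn`, and `deap` to be installed (for example with `pip install numpy scikit-learn deap`). In addition, `data.csv` in the code must be replaced with the path to your actual data file.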
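The loading step reads a comma-separated file of numbers with no header row, treating every column except the last as a feature and the last column as the class label. For trying the script without real data, the following sketch writes a small synthetic file in that layout using only the standard library. The file name `sample_data.csv`, the sizes, and the labelling rule are all assumptions made for illustration.

```python
import csv
import random

rng = random.Random(0)
n_samples, n_features = 200, 10  # hypothetical sizes

rows = []
for _ in range(n_samples):
    features = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    # The label depends only on the first two features; the rest are noise
    label = 1 if features[0] + features[1] > 0 else 0
    rows.append(features + [label])

# No header row: one sample per line, label in the last column
with open('sample_data.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```

Pointing the loading step at `sample_data.csv` gives a quick sanity check: since only features 0 and 1 carry signal, a working run would be expected to favor small subsets containing them, though the exact result depends on the random seed.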