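Give me code for GA-based prediction on an Excel dataset, with a comparison plot
Time: 2024-02-06 13:09:12 Views: 24
Sure. Below is a simple Python example that uses an Excel dataset for GA (genetic algorithm) prediction and produces a comparison plot: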
```python
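import random

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from deap import algorithms, base, creator, tools
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Fix the seeds so the run is reproducible (DEAP's operators use `random`)
random.seed(0)
np.random.seed(0)

# Read the Excel dataset (reading .xlsx files requires the openpyxl package)
data = pd.read_excel('data.xlsx')
# Features are all columns but the last; the target is the last column
X = data.iloc[:, :-1].values.astype(float)
y = data.iloc[:, -1].values.astype(float)
# Hold out a test set, then split the rest into fitting and validation parts
# so that the GA never sees the test data while it searches
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_fit, X_val, y_fit, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)


def fit_predict(individual, X_tr, y_tr, X_eval):
    """Fit on transformed data; return predictions on the original scale, or None if invalid."""
    a, b, c = individual  # decode the individual into its three parameters
    if abs(b) < 1e-8:  # the inverse target transform would divide by zero
        return None
    with np.errstate(all='ignore'):
        X_tr_t, X_eval_t = X_tr ** a, X_eval ** a
    # Negative or zero inputs can become NaN/inf under a fractional or negative power
    if not (np.isfinite(X_tr_t).all() and np.isfinite(X_eval_t).all()):
        return None
    model = LinearRegression()
    model.fit(X_tr_t, y_tr * b + c)
    # Invert y' = y * b + c. Note: for ordinary least squares this affine rescaling
    # leaves the predictions unchanged, so `a` is the parameter that matters
    return (model.predict(X_eval_t) - c) / b


def fitness(individual):
    """Fitness is the R^2 score on the validation set (higher is better)."""
    y_pred = fit_predict(individual, X_fit, y_fit, X_val)
    if y_pred is None or not np.isfinite(y_pred).all():
        return (-1e9,)  # heavily penalize invalid individuals
    return (r2_score(y_val, y_pred),)


# GA parameters
n_population = 50
n_generation = 100
crossover_rate = 0.8
mutation_rate = 0.1

# GA setup: maximize a single objective over lists of three floats
creator.create('FitnessMax', base.Fitness, weights=(1.0,))
creator.create('Individual', list, fitness=creator.FitnessMax)
toolbox = base.Toolbox()
toolbox.register('attr_float', np.random.uniform, -1, 1)
toolbox.register('individual', tools.initRepeat, creator.Individual, toolbox.attr_float, n=3)
toolbox.register('population', tools.initRepeat, list, toolbox.individual)
toolbox.register('evaluate', fitness)
toolbox.register('mate', tools.cxBlend, alpha=0.5)
toolbox.register('mutate', tools.mutGaussian, mu=0, sigma=0.1, indpb=0.1)
toolbox.register('select', tools.selTournament, tournsize=3)

# Run the GA; eaSimple has no elitism, so a hall of fame keeps the best individual ever seen
population = toolbox.population(n=n_population)
hall_of_fame = tools.HallOfFame(1)
population, logbook = algorithms.eaSimple(
    population, toolbox, cxpb=crossover_rate, mutpb=mutation_rate,
    ngen=n_generation, halloffame=hall_of_fame, verbose=False)
best_individual = hall_of_fame[0]
print('Best individual:', best_individual)

# Refit on the whole training set with the best parameters and predict both sets
y_pred_train = fit_predict(best_individual, X_train, y_train, X_train)
y_pred_test = fit_predict(best_individual, X_train, y_train, X_test)
if y_pred_train is None or y_pred_test is None:
    raise ValueError('The best individual is not valid on the full data set')
print('Test R^2:', r2_score(y_test, y_pred_test))

# Comparison plot: actual vs. predicted values, sample by sample
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
panels = [('Training set', y_train, y_pred_train), ('Test set', y_test, y_pred_test)]
for ax, (name, actual, predicted) in zip(axes, panels):
    ax.plot(actual, 'o-', color='blue', label='Actual')
    ax.plot(predicted, 'x--', color='red', label='Predicted')
    ax.set_title(name)
    ax.set_xlabel('Sample index')
    ax.set_ylabel('Target value')
    ax.legend()
plt.tight_layout()
plt.show()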
```
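This code reads an Excel dataset named `data.xlsx` and splits it into the independent variables (every column except the last) and the dependent variable (the last column). It then splits the data into a training set and a test set and runs a GA to search for three parameters, `a`, `b` and `c`. These are not coefficients of the linear regression itself: `a` is a power applied to the inputs, while `b` and `c` scale and shift the target before the model is fitted. Finally, the best individual found by the GA is decoded into these parameters, the model predicts the training and test sets, and a comparison plot is drawn.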
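
To make the GA step less of a black box, here is a minimal pure-Python sketch of the loop that the answer delegates to DEAP: tournament selection, blend crossover and Gaussian mutation. It is an illustration under assumptions, not DEAP's actual implementation. The names `run_ga`, `tournament_select`, `blend_crossover`, `gaussian_mutate` and `toy_fitness` are hypothetical, and the toy objective stands in for the R^2-based fitness above.

```python
import random


def tournament_select(population, scores, k=3):
    """Return a copy of the fittest of k randomly drawn individuals."""
    contenders = random.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: scores[i])
    return list(population[winner])


def blend_crossover(p1, p2, alpha=0.5):
    """Blend two parents gene by gene; children may fall slightly outside the parents' range."""
    c1, c2 = [], []
    for x1, x2 in zip(p1, p2):
        gamma = (1 + 2 * alpha) * random.random() - alpha
        c1.append((1 - gamma) * x1 + gamma * x2)
        c2.append(gamma * x1 + (1 - gamma) * x2)
    return c1, c2


def gaussian_mutate(ind, sigma=0.1, indpb=0.1):
    """Add Gaussian noise to each gene independently with probability indpb."""
    return [g + random.gauss(0, sigma) if random.random() < indpb else g for g in ind]


def run_ga(fitness, n_genes=3, n_pop=50, n_gen=100, cxpb=0.8, mutpb=0.1, seed=0):
    """Maximize `fitness` over real-valued vectors; return (best_individual, best_score)."""
    random.seed(seed)
    population = [[random.uniform(-1, 1) for _ in range(n_genes)] for _ in range(n_pop)]
    best, best_score = None, float('-inf')
    for _ in range(n_gen + 1):
        scores = [fitness(ind) for ind in population]
        # Remember the best individual seen so far (the role of a hall of fame)
        top = max(range(n_pop), key=lambda i: scores[i])
        if scores[top] > best_score:
            best, best_score = list(population[top]), scores[top]
        # Selection: fill the next generation with tournament winners
        offspring = [tournament_select(population, scores) for _ in range(n_pop)]
        # Crossover: mate consecutive pairs with probability cxpb
        for i in range(1, n_pop, 2):
            if random.random() < cxpb:
                offspring[i - 1], offspring[i] = blend_crossover(offspring[i - 1], offspring[i])
        # Mutation: perturb each individual with probability mutpb
        population = [gaussian_mutate(ind) if random.random() < mutpb else ind
                      for ind in offspring]
    return best, best_score


def toy_fitness(ind):
    """Toy objective (an assumption for this sketch): maximum of 0 at (0.5, 0.5, 0.5)."""
    return -sum((g - 0.5) ** 2 for g in ind)


best, best_score = run_ga(toy_fitness)
print('Best individual:', best, 'fitness:', best_score)
```

Unlike DEAP, this sketch returns a plain float from the fitness function rather than a one-element tuple, and it tracks the best individual by hand. Swapping `toy_fitness` for a function that fits the regression and returns the validation R^2 would reproduce the structure of the answer's code.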