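Write Python code for water-quality COD (chemical oxygen demand) detection based on UV-Vis spectra. Read three Excel files containing standard solutions of 10 mg/L, 15 mg/L, and 20 mg/L, where the first column is wavelength and the second column is transmittance. Compute the first derivative of the data to obtain first-derivative spectra and plot them on the same line chart. Then compute the permutation entropy of the first-derivative spectra and use the entropy values to extract the characteristic wavelengths. Finally, build a PLS model from the feature values at the extracted wavelengths and the corresponding COD concentrations to obtain a COD prediction model.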
Time: 2023-10-05 11:05:23
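This is a fairly involved task that uses several Python libraries, so the implementation is explained step by step. Before starting, make sure the following libraries are installed:
- pandas
- numpy
- matplotlib
- scipy
- scikit-learn
- openpyxl (used by pandas to read .xlsx files)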
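First, read the three Excel files for the 10 mg/L, 15 mg/L, and 20 mg/L standard solutions. Assume the files are named "10mg.xlsx", "15mg.xlsx", and "20mg.xlsx", and that each has two columns: wavelength in the first and transmittance in the second. The code also assumes that all three files share the same wavelength axis, so the wavelengths are read only once.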
```python
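import pandas as pd
# Read the 10 mg/L standard solution
df1 = pd.read_excel('10mg.xlsx')
wavelength = df1.iloc[:, 0].to_numpy(dtype=float)      # wavelength
transmittance1 = df1.iloc[:, 1].to_numpy(dtype=float)  # transmittance
# Read the 15 mg/L standard solution
df2 = pd.read_excel('15mg.xlsx')
transmittance2 = df2.iloc[:, 1].to_numpy(dtype=float)  # transmittance
# Read the 20 mg/L standard solution
df3 = pd.read_excel('20mg.xlsx')
transmittance3 = df3.iloc[:, 1].to_numpy(dtype=float)  # transmittance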
```
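The code above reads the wavelength axis only from the first file, which is safe only if all three spectra were recorded on the same grid. The sketch below shows one way to check this and, if needed, resample a spectrum onto the reference axis by linear interpolation. The helper name `align_to_grid` and its `tol` parameter are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def align_to_grid(ref_wavelength, wavelength, values, tol=1e-6):
    """Return `values` sampled on `ref_wavelength`.

    If the two axes already match within `tol`, the values are returned
    unchanged; otherwise they are linearly interpolated onto the reference axis.
    """
    ref = np.asarray(ref_wavelength, dtype=float)
    wl = np.asarray(wavelength, dtype=float)
    vals = np.asarray(values, dtype=float)
    if ref.shape == wl.shape and np.allclose(ref, wl, atol=tol):
        return vals
    order = np.argsort(wl)  # np.interp needs an increasing x axis
    return np.interp(ref, wl[order], vals[order])

# Synthetic demonstration: a spectrum recorded on a 2 nm grid, resampled to 1 nm
ref_axis = np.arange(200.0, 211.0, 1.0)
coarse_axis = np.arange(200.0, 212.0, 2.0)
coarse_values = 0.5 * coarse_axis  # linear, so linear interpolation is exact
aligned = align_to_grid(ref_axis, coarse_axis, coarse_values)
```

With the files from this answer, the call would look like `transmittance2 = align_to_grid(wavelength, df2.iloc[:, 0], df2.iloc[:, 1])`.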
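Next, compute the first derivative of each spectrum with respect to wavelength and plot the three first-derivative spectra on the same line chart.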
```python
import numpy as np
import matplotlib.pyplot as plt

# First derivative dy/dx by finite differences
def derivative(x, y):
    # np.gradient(y, x) already accounts for the wavelength spacing,
    # so the result must not be divided by dx a second time
    return np.gradient(np.asarray(y, dtype=float), np.asarray(x, dtype=float))

# First-derivative spectrum
def derivative_spectrum(wavelength, transmittance):
    dydx = derivative(wavelength, transmittance)
    # Only the slope magnitude is kept; drop np.abs to keep the sign
    dydx_spectrum = np.abs(dydx)
    return dydx_spectrum

# First-derivative spectra of the three standard solutions
dydx_spectrum1 = derivative_spectrum(wavelength, transmittance1)
dydx_spectrum2 = derivative_spectrum(wavelength, transmittance2)
dydx_spectrum3 = derivative_spectrum(wavelength, transmittance3)

# Plot the three first-derivative spectra on one chart
plt.plot(wavelength, dydx_spectrum1, label='10mg/L')
plt.plot(wavelength, dydx_spectrum2, label='15mg/L')
plt.plot(wavelength, dydx_spectrum3, label='20mg/L')
plt.xlabel('Wavelength (nm)')
plt.ylabel('Derivative spectrum')
plt.legend()
plt.show()
```
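Finite-difference derivatives amplify measurement noise, so spectroscopy workflows often smooth and differentiate in one step with a Savitzky-Golay filter. The following is a minimal sketch of that alternative using `scipy.signal.savgol_filter`. The function name `savgol_first_derivative` and the default window length and polynomial order are assumptions to be tuned for real data, and the sketch assumes an evenly spaced wavelength axis.

```python
import numpy as np
from scipy.signal import savgol_filter

def savgol_first_derivative(wavelength, values, window_length=11, polyorder=2):
    """First derivative d(values)/d(wavelength) with Savitzky-Golay smoothing.

    Assumes an evenly spaced wavelength axis; window_length must be odd
    and larger than polyorder.
    """
    wl = np.asarray(wavelength, dtype=float)
    step = wl[1] - wl[0]
    if not np.allclose(np.diff(wl), step):
        raise ValueError("wavelength axis must be evenly spaced")
    return savgol_filter(np.asarray(values, dtype=float),
                         window_length, polyorder, deriv=1, delta=step)

# Synthetic check: for y = x^2 the derivative is 2x
demo_x = np.arange(200.0, 260.0, 0.5)
demo_dy = savgol_first_derivative(demo_x, demo_x ** 2)
```

A wider window smooths more strongly but also flattens narrow spectral features, so the window is a trade-off rather than a fixed choice.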
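The next step is to compute the permutation entropy of the first-derivative spectra and use the entropy values to extract the characteristic wavelength.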
```python
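import math

# Permutation entropy (Bandt-Pompe) of a 1-D sequence, normalized to [0, 1]
def permutation_entropy(spectrum, order=3):
    spectrum = np.asarray(spectrum, dtype=float)
    n = len(spectrum) - order + 1
    # Ordinal pattern (rank order) of each run of `order` consecutive points
    patterns = np.array([np.argsort(spectrum[i:i + order]) for i in range(n)])
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)) / math.log(math.factorial(order)))

# Entropy in a sliding window, giving one value per wavelength.
# The window width and the order are assumed values; tune them for your data.
def local_permutation_entropy(spectrum, window=31, order=3):
    half = window // 2
    padded = np.pad(np.asarray(spectrum, dtype=float), half, mode='reflect')
    return np.array([permutation_entropy(padded[i:i + window], order)
                     for i in range(len(spectrum))])

# Local permutation entropy of the three first-derivative spectra
pe1 = local_permutation_entropy(dydx_spectrum1)
pe2 = local_permutation_entropy(dydx_spectrum2)
pe3 = local_permutation_entropy(dydx_spectrum3)
# Extract the characteristic wavelength: the point with the lowest mean entropy,
# i.e. the most regular local structure (this selection rule is an assumption)
feature_index = int(np.argmin((pe1 + pe2 + pe3) / 3))
feature_wavelength = np.asarray(wavelength)[feature_index]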
```
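PLS is designed for several correlated predictors, so in practice more than one characteristic wavelength is usually wanted. If the permutation entropy is evaluated in a sliding window so that every wavelength gets its own value, several wavelengths can be picked at once. The sketch below selects the lowest-entropy points while keeping them a minimum distance apart. The helper `select_feature_indices`, the array name `pe_curve`, and the parameters `k` and `min_gap` are hypothetical, and "lowest entropy" as the criterion is an assumption.

```python
import numpy as np

def select_feature_indices(pe_curve, k=3, min_gap=5):
    """Pick up to k indices with the lowest entropy, at least min_gap points apart."""
    pe_curve = np.asarray(pe_curve, dtype=float)
    chosen = []
    for idx in np.argsort(pe_curve, kind="stable"):  # lowest entropy first
        if all(abs(int(idx) - c) >= min_gap for c in chosen):
            chosen.append(int(idx))
        if len(chosen) == k:
            break
    return sorted(chosen)

# Synthetic entropy curve with dips at indices 10, 12 and 40
demo_pe = np.ones(60)
demo_pe[[10, 12, 40]] = [0.2, 0.3, 0.4]
demo_idx = select_feature_indices(demo_pe, k=2, min_gap=5)
```

The spacing constraint keeps the selection from clustering on neighbouring points of the same spectral feature, which would add little new information to the model.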
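Finally, build a PLS model from the feature values at the extracted characteristic wavelength and the corresponding COD concentrations. Assume the calibration data is stored in "COD.xlsx", with the COD concentration in the first column and the feature value (the first-derivative spectrum value at the characteristic wavelength) in the second column. Note that the model must be given a feature value when predicting, not the wavelength itself.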
```python
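from sklearn.cross_decomposition import PLSRegression
# Read the COD calibration data
df_cod = pd.read_excel('COD.xlsx')
cod_concentration = df_cod.iloc[:, 0]  # COD concentration (mg/L)
feature_value = df_cod.iloc[:, 1]      # feature value at the characteristic wavelength
# Fit the PLS model
X = np.array(feature_value, dtype=float).reshape(-1, 1)
y = np.array(cod_concentration, dtype=float)
pls = PLSRegression(n_components=1)
pls.fit(X, y)
# Predict COD from a feature value, not from the wavelength itself.
# Here the 15 mg/L standard is used as a check sample.
idx = int(np.argmin(np.abs(np.asarray(wavelength, dtype=float) - feature_wavelength)))
new_value = np.asarray(dydx_spectrum2, dtype=float)[idx]
predict_value = pls.predict(np.array([[new_value]])).ravel()[0]
print('Predicted COD concentration:', predict_value)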
```
The complete code is as follows:
```python
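import math
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cross_decomposition import PLSRegression

# Read the 10 mg/L standard solution
df1 = pd.read_excel('10mg.xlsx')
wavelength = df1.iloc[:, 0].to_numpy(dtype=float)      # wavelength
transmittance1 = df1.iloc[:, 1].to_numpy(dtype=float)  # transmittance
# Read the 15 mg/L standard solution
df2 = pd.read_excel('15mg.xlsx')
transmittance2 = df2.iloc[:, 1].to_numpy(dtype=float)  # transmittance
# Read the 20 mg/L standard solution
df3 = pd.read_excel('20mg.xlsx')
transmittance3 = df3.iloc[:, 1].to_numpy(dtype=float)  # transmittance

# First derivative dy/dx by finite differences
def derivative(x, y):
    # np.gradient(y, x) already accounts for the wavelength spacing
    return np.gradient(np.asarray(y, dtype=float), np.asarray(x, dtype=float))

# First-derivative spectrum (slope magnitude; drop np.abs to keep the sign)
def derivative_spectrum(wavelength, transmittance):
    dydx = derivative(wavelength, transmittance)
    dydx_spectrum = np.abs(dydx)
    return dydx_spectrum

# First-derivative spectra of the three standard solutions
dydx_spectrum1 = derivative_spectrum(wavelength, transmittance1)
dydx_spectrum2 = derivative_spectrum(wavelength, transmittance2)
dydx_spectrum3 = derivative_spectrum(wavelength, transmittance3)

# Plot the three first-derivative spectra on one chart
plt.plot(wavelength, dydx_spectrum1, label='10mg/L')
plt.plot(wavelength, dydx_spectrum2, label='15mg/L')
plt.plot(wavelength, dydx_spectrum3, label='20mg/L')
plt.xlabel('Wavelength (nm)')
plt.ylabel('Derivative spectrum')
plt.legend()
plt.show()

# Permutation entropy (Bandt-Pompe) of a 1-D sequence, normalized to [0, 1]
def permutation_entropy(spectrum, order=3):
    spectrum = np.asarray(spectrum, dtype=float)
    n = len(spectrum) - order + 1
    # Ordinal pattern (rank order) of each run of `order` consecutive points
    patterns = np.array([np.argsort(spectrum[i:i + order]) for i in range(n)])
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)) / math.log(math.factorial(order)))

# Entropy in a sliding window, giving one value per wavelength.
# The window width and the order are assumed values; tune them for your data.
def local_permutation_entropy(spectrum, window=31, order=3):
    half = window // 2
    padded = np.pad(np.asarray(spectrum, dtype=float), half, mode='reflect')
    return np.array([permutation_entropy(padded[i:i + window], order)
                     for i in range(len(spectrum))])

# Local permutation entropy of the three first-derivative spectra
pe1 = local_permutation_entropy(dydx_spectrum1)
pe2 = local_permutation_entropy(dydx_spectrum2)
pe3 = local_permutation_entropy(dydx_spectrum3)
# Characteristic wavelength: lowest mean entropy (selection rule is an assumption)
feature_index = int(np.argmin((pe1 + pe2 + pe3) / 3))
feature_wavelength = wavelength[feature_index]

# Read the COD calibration data
df_cod = pd.read_excel('COD.xlsx')
cod_concentration = df_cod.iloc[:, 0]  # COD concentration (mg/L)
feature_value = df_cod.iloc[:, 1]      # feature value at the characteristic wavelength
# Fit the PLS model
X = np.array(feature_value, dtype=float).reshape(-1, 1)
y = np.array(cod_concentration, dtype=float)
pls = PLSRegression(n_components=1)
pls.fit(X, y)
# Predict COD from a feature value (here the 15 mg/L standard), not from the wavelength
new_value = dydx_spectrum2[feature_index]
predict_value = pls.predict(np.array([[new_value]])).ravel()[0]
print('Predicted COD concentration:', predict_value)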
```