Write an MFCC algorithm that converts its output feature values to integers
Posted: 2024-02-27 16:58:54 Views: 90
Below is a simple MFCC example implemented in Python that converts the output feature values to integers:
```python
import numpy as np
from scipy.fftpack import dct
from scipy.signal.windows import hamming

# Constants
num_mel_filters = 26
num_cepstral_coeffs = 13
frame_length = 0.025  # 25 ms
frame_step = 0.010  # 10 ms
sample_rate = 16000
lower_edge_frequency = 0
upper_edge_frequency = sample_rate / 2

# Precompute the Hamming window (one frame long)
hamming_window = hamming(int(frame_length * sample_rate))


def pre_emphasis(signal, alpha=0.97):
    # Pre-emphasis filter: y[n] = x[n] - alpha * x[n-1]
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])


def get_mel_filterbank(num_filters, fft_size, sample_rate, lower_freq, upper_freq):
    # Build triangular filters whose centers are evenly spaced on the mel scale
    mel_filterbank = np.zeros((num_filters, fft_size // 2 + 1))
    mel_low_freq = 1125 * np.log(1 + lower_freq / 700)
    mel_high_freq = 1125 * np.log(1 + upper_freq / 700)
    mel_freq_points = np.linspace(mel_low_freq, mel_high_freq, num_filters + 2)
    hz_freq_points = 700 * (np.exp(mel_freq_points / 1125) - 1)
    bin_freq_points = np.floor((fft_size + 1) * hz_freq_points / sample_rate)
    for i in range(num_filters):
        left = int(bin_freq_points[i])
        center = int(bin_freq_points[i + 1])
        right = int(bin_freq_points[i + 2])
        # Rising edge of the triangle
        for j in range(left, center):
            mel_filterbank[i, j] = (j - left) / (center - left)
        # Falling edge of the triangle
        for j in range(center, right):
            mel_filterbank[i, j] = (right - j) / (right - center)
    return mel_filterbank


def mfcc(signal):
    # Apply pre-emphasis (work in floating point regardless of the input dtype)
    signal = np.asarray(signal, dtype=np.float64)
    pre_emphasis_signal = pre_emphasis(signal)

    # Split the signal into overlapping frames, zero-padding the last one
    frame_length_sample = int(frame_length * sample_rate)
    frame_step_sample = int(frame_step * sample_rate)
    signal_length = len(pre_emphasis_signal)
    # Always produce at least one frame, even for inputs shorter than a frame
    num_frames = 1 + int(np.ceil(max(0, signal_length - frame_length_sample) / frame_step_sample))
    padsignal_length = (num_frames - 1) * frame_step_sample + frame_length_sample
    padsignal = np.zeros((padsignal_length,))
    padsignal[:signal_length] = pre_emphasis_signal
    # Row k holds the sample indices of frame k
    frame_starts = np.arange(num_frames) * frame_step_sample
    indices = frame_starts[:, None] + np.arange(frame_length_sample)[None, :]
    frames = padsignal[indices]

    # Apply the window
    windowed_frames = frames * hamming_window

    # Fourier transform
    NFFT = 512
    mag_frames = np.absolute(np.fft.rfft(windowed_frames, NFFT))  # Magnitude of the FFT
    pow_frames = (1.0 / NFFT) * (mag_frames ** 2)  # Power spectrum

    # Build the mel filterbank
    mel_filterbank = get_mel_filterbank(num_mel_filters, NFFT, sample_rate, lower_edge_frequency, upper_edge_frequency)

    # Apply the mel filterbank
    mel_filtered_frames = np.dot(pow_frames, mel_filterbank.T)

    # Take the logarithm; clamp zeros first so log10 never returns -inf
    mel_filtered_frames = np.maximum(mel_filtered_frames, np.finfo(float).eps)
    log_mel_filtered_frames = 20 * np.log10(mel_filtered_frames)

    # Apply the discrete cosine transform and keep coefficients 1..13 (coefficient 0 is dropped)
    dct_frames = dct(log_mel_filtered_frames, type=2, axis=1, norm='ortho')[:, 1:(num_cepstral_coeffs + 1)]

    # Convert the features to integers: scale, round to nearest, then cast
    dct_frames = np.rint(dct_frames * 10000).astype(np.int32)
    return dct_frames
```
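The algorithm first applies pre-emphasis to the signal, then splits it into frames and applies a Hamming window to each frame to reduce spectral leakage. Next, it takes the Fourier transform of every frame, builds the mel filterbank, and applies it to the power spectrum. It takes the logarithm of the filterbank energies and then applies the discrete cosine transform to obtain the MFCC coefficients. Finally, the feature values are multiplied by a large scale factor (for example 10000) and converted to integers.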
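The last step, scaling by a constant and casting to an integer type, is a form of fixed-point quantization, and it can be looked at in isolation. The following is a minimal sketch under stated assumptions, not part of the answer above: the helper names `quantize_features` and `dequantize_features` are hypothetical, and the input is random data standing in for an MFCC matrix. It rounds to the nearest integer (a plain `astype` cast truncates toward zero) and clips to the range of the target type so that large values cannot overflow.

```python
import numpy as np


def quantize_features(features, scale=10000, dtype=np.int32):
    """Scale float features, round to nearest, and clip to the integer range.

    Hypothetical helper for illustration; the scale of 10000 follows the text.
    """
    info = np.iinfo(dtype)
    scaled = np.rint(np.asarray(features, dtype=np.float64) * scale)
    return np.clip(scaled, info.min, info.max).astype(dtype)


def dequantize_features(quantized, scale=10000):
    """Recover approximate float features from the integer representation."""
    return np.asarray(quantized, dtype=np.float64) / scale


# Random stand-in for a (num_frames, 13) MFCC matrix; these are not real MFCCs.
rng = np.random.default_rng(0)
fake_mfcc = rng.normal(loc=0.0, scale=50.0, size=(4, 13))

q = quantize_features(fake_mfcc)
restored = dequantize_features(q)

# With round-to-nearest, the round-trip error is at most half a step: 0.5 / scale.
max_error = np.max(np.abs(restored - fake_mfcc))
print(q.dtype, q.shape, max_error)
```

The scale factor sets the precision: with 10000, four decimal places of each coefficient survive, and any feature whose magnitude exceeds roughly 214748 would saturate at the `int32` limits instead of wrapping around.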