print(get_mfcc_simplify(r"C:\Users\Administrator\Desktop\tensorflow-speech-commands\bed\00f0204f_nohash_0,wav").shape)

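get_mfcc_simplify extracts MFCC (Mel-frequency cepstral coefficient) features from an audio signal. It takes the path of an audio file and returns the MFCC features as an array; the .shape attribute then gives the dimensions of that array. So print(get_mfcc_simplify(r"C:\Users\Administrator\Desktop\tensorflow-speech-commands\bed\00f0204f_nohash_0,wav").shape) prints the shape of the MFCC features for that file. The exact result depends on how get_mfcc_simplify is implemented and on the length of the recording, neither of which is shown here. In general, MFCC features form a two-dimensional array, typically of shape (time steps, MFCC coefficients); some libraries, such as librosa, return the transpose, (MFCC coefficients, time steps). Note also that the path ends in ",wav" with a comma, which is probably a typo for ".wav".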
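Since the implementation of get_mfcc_simplify is not shown, the following is only a rough sketch of how the printed shape could be estimated without running the extractor. It assumes the function wraps either python_speech_features.mfcc or librosa.feature.mfcc with default settings; the helper names psf_mfcc_shape and librosa_mfcc_shape are hypothetical, and the frame-count formulas reflect the default framing of those two libraries.

```python
import math


def psf_mfcc_shape(num_samples, sample_rate, winlen=0.025, winstep=0.01, numcep=13):
    """Estimate the shape returned by python_speech_features.mfcc (assumed defaults)."""
    frame_len = int(round(winlen * sample_rate))    # window length in samples
    frame_step = int(round(winstep * sample_rate))  # hop between windows in samples
    if num_samples <= frame_len:
        num_frames = 1
    else:
        # The last partial frame is zero-padded, hence the ceiling.
        num_frames = 1 + math.ceil((num_samples - frame_len) / frame_step)
    return (num_frames, numcep)  # (time steps, coefficients)


def librosa_mfcc_shape(num_samples, hop_length=512, n_mfcc=20):
    """Estimate the shape returned by librosa.feature.mfcc (assumed defaults, centered frames)."""
    return (n_mfcc, 1 + num_samples // hop_length)  # (coefficients, time steps)


# A one-second clip at 16 kHz, as in the speech commands dataset.
print(psf_mfcc_shape(16000, 16000))   # (99, 13)
# librosa.load resamples to 22050 Hz by default, so one second is 22050 samples.
print(librosa_mfcc_shape(22050))      # (20, 44)
```

If the real output differs from both estimates, the function most likely uses other window, hop, or coefficient settings, or post-processes the features (for example by averaging over time).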

subtools/kaldi/steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.

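This log message comes from Kaldi's make_mfcc_pitch.sh script. It reports that no segments file exists in the data directory, so the script assumes that wav.scp is indexed by utterance. A segments file splits recordings into utterances: each line gives an utterance ID, a recording ID, and the start and end times of the segment in seconds. wav.scp maps each ID to an audio file path (or to a command that produces the audio). Without a segments file, the script treats every entry in wav.scp as one complete utterance. The message is informational, not an error.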
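The relationship between the two files can be illustrated with a small sketch. It assumes the standard Kaldi line formats ("<recording-id> <path-or-command>" for wav.scp and "<utterance-id> <recording-id> <start> <end>" for segments); the function name index_utterances and the sample IDs and paths are hypothetical, and this is not how the Kaldi script itself is implemented.

```python
def index_utterances(wav_scp_lines, segments_lines=None):
    """Map utterance IDs to (audio source, start, end).

    Without segments, every wav.scp entry is one whole utterance,
    so start and end are None.
    """
    recordings = {}
    for line in wav_scp_lines:
        line = line.strip()
        if not line:
            continue
        # The source may contain spaces (e.g. a piped command), so split once.
        rec_id, source = line.split(maxsplit=1)
        recordings[rec_id] = source

    if segments_lines is None:
        return {rec_id: (source, None, None) for rec_id, source in recordings.items()}

    utterances = {}
    for line in segments_lines:
        if not line.strip():
            continue
        utt_id, rec_id, start, end = line.split()
        utterances[utt_id] = (recordings[rec_id], float(start), float(end))
    return utterances


wav_scp = ["rec1 /data/rec1.wav"]
segments = ["rec1-a rec1 0.0 2.5", "rec1-b rec1 2.5 5.0"]

print(index_utterances(wav_scp))            # wav.scp indexed by utterance
print(index_utterances(wav_scp, segments))  # wav.scp indexed by recording
```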

copy-feats --compress=true --write-num-frames=ark,t:exp/features/mfcc/data_mfcc_23_pitch_seg/log/utt2num_frames.1 ark:- ark,scp:/work/VPR/subtools_1229/exp/features/mfcc/data_mfcc_23_pitch_seg/raw_mfcc_pitch_seg.1.ark,/work/VPR/subtools_1229/exp/features/mfcc/data_mfcc_23_pitch_seg/raw_mfcc_pitch_seg.1.scp
paste-feats --length-tolerance=2 'ark:compute-mfcc-feats --write-utt2dur=ark,t:exp/features/mfcc/data_mfcc_23_pitch_seg/log/utt2dur.1 --verbose=2 --config=subtools/conf/sre-mfcc-23.conf scp,p:exp/features/mfcc/data_mfcc_23_pitch_seg/log/wav_seg.1.scp ark:- |' 'ark,s,cs:compute-kaldi-pitch-feats --verbose=2 --config=subtools/conf/pitch.conf scp,p:exp/features/mfcc/data_mfcc_23_pitch_seg/log/wav_seg.1.scp ark:- | process-kaldi-pitch-feats ark:- ark:- |' ark:-
compute-mfcc-feats --write-utt2dur=ark,t:exp/features/mfcc/data_mfcc_23_pitch_seg/log/utt2dur.1 --verbose=2 --config=subtools/conf/sre-mfcc-23.conf scp,p:exp/features/mfcc/data_mfcc_23_pitch_seg/log/wav_seg.1.scp ark:-
VLOG[2] (compute-mfcc-feats[5.5]:main():compute-mfcc-feats.cc:182) Processed features for key 001_20230623160347_0319007398_mentianyu-1
compute-kaldi-pitch-feats --verbose=2 --config=subtools/conf/pitch.conf scp,p:exp/features/mfcc/data_mfcc_23_pitch_seg/log/wav_seg.1.scp ark:-
ERROR (compute-kaldi-pitch-feats[5.5]:main():compute-kaldi-pitch-feats.cc:88) Sample frequency mismatch: you specified 16000 but data has 8000 (use --sample-frequency option). Utterance is 001_20230623160347_0319007398_mentianyu-1

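These commands form a feature-extraction pipeline. compute-mfcc-feats computes the MFCC features, and compute-kaldi-pitch-feats followed by process-kaldi-pitch-feats computes the pitch features. paste-feats concatenates the two feature streams frame by frame, and copy-feats writes the combined, compressed features to the specified ark and scp files. The MFCC step processed the utterance, but compute-kaldi-pitch-feats failed with a sample frequency mismatch: the pitch configuration specifies 16000 Hz, while the audio is actually sampled at 8000 Hz. To fix this, either set --sample-frequency=8000 in subtools/conf/pitch.conf or resample the audio to 16000 Hz, and make sure the MFCC configuration (subtools/conf/sre-mfcc-23.conf) uses the same rate. The error was raised for the utterance "001_20230623160347_0319007398_mentianyu-1".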
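Before rerunning the script, it can help to confirm which files actually differ from the configured rate. The sketch below uses Python's standard wave module to read the sample rate from each file's header. It assumes the wav.scp entries are plain paths to uncompressed PCM WAV files, not piped commands, and the function names and example paths are hypothetical.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate stored in the header of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def find_rate_mismatches(wav_paths, expected_rate):
    """Return {path: actual_rate} for every file whose rate differs from expected_rate."""
    mismatches = {}
    for path in wav_paths:
        rate = wav_sample_rate(path)
        if rate != expected_rate:
            mismatches[path] = rate
    return mismatches


if __name__ == "__main__":
    # Hypothetical file list; replace with the paths listed in wav.scp.
    paths = ["example_a.wav", "example_b.wav"]
    try:
        print(find_rate_mismatches(paths, expected_rate=16000))
    except FileNotFoundError as err:
        print(f"Skipping check, file not found: {err.filename}")
```

If every file reports 8000 Hz, changing the configuration is probably the simpler fix; if only a few files differ, resampling those files may be preferable.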
