Digital MEMS Microphone Array Design and Speech Recognition Experiments

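"This resource is a master's thesis on the design and implementation of a digital microphone array and on speech recognition experiments, submitted by Erich Zwyssig to the University of Edinburgh on 21 August 2009. The thesis concentrates on a key component of the future meeting room, the speech recording device (the microphone array), with particular emphasis on portability and cost-effectiveness. The arrival of new digital MEMS (Micro Electro Mechanical System) microphones promises a breakthrough in this area. The thesis details the design, construction, testing and evaluation of the first successfully implemented digital MEMS microphone array and, by comparing it with an existing analogue microphone array using a state-of-the-art automatic speech recognition (ASR) system and adaptation algorithms, shows that its performance is comparable."

The topic of digital microphone arrays involves the following key points:

1. **Digital microphones**: Unlike a traditional analogue microphone, a digital microphone converts the sound signal directly into digital form. This reduces the noise and distortion that analogue signal processing stages can introduce and improves signal quality.

2. **PDM (pulse density modulation) and PCM (pulse code modulation)**: Both are digital audio encoding techniques. PDM is a simple binary modulation scheme that represents the amplitude of the analogue signal by varying the density of pulses over time. PCM is the more common, standard encoding, which samples the analogue signal and quantises it into discrete values. PDM is typically used in low-power applications such as miniature sensors, while PCM is used in a wider range of audio systems.

3. **MEMS microphones**: A Micro Electro Mechanical System microphone is a miniature microphone integrated on a semiconductor chip, offering small size, low power consumption and low cost. MEMS microphones play an important role in digital microphone arrays and are especially suited to settings that require portability and cost control, such as the future smart meeting room.

4. **Microphone array design**: The array layout, the spacing between microphones and the directivity design are key factors in improving sound source localisation and speech clarity. Optimising these parameters gives better spatial separation, reduces noise interference and improves speech recognition results.

5. **Speech recognition systems**: The ASR (Automatic Speech Recognition) system mentioned in the thesis is an automated technique for converting speech into text. Combined with adaptation algorithms, it allows meeting recordings to be analysed in real time or afterwards, improving meeting efficiency.

6. **Performance evaluation**: Comparing the word error rate (WER) of the digital MEMS microphone array with that of a traditional analogue array measures the difference in speech recognition accuracy between the two. A lower WER indicates better recognition performance.

The thesis explores the potential of digital microphone arrays in depth, particularly for portable and cost-effective applications, and provides a valuable reference for audio processing in future meeting rooms and similar environments.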
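The word error rate (WER) named in the summary is conventionally computed as the word-level edit distance between a reference transcript and the recogniser output, divided by the number of reference words. The following is a minimal sketch of that calculation, not the scoring tool used in the thesis; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Hypothetical helper: tokenises on whitespace, which is an assumption;
    real scoring tools also normalise case, punctuation and so on.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the WER can exceed 1.0 when the hypothesis is much longer than the reference.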

Introduction

Figure 1 Fields of Speech Recognition (with kind permission of [8])

Speech recognition problems which are considered as having well developed solutions are [77]:

• search, using the Viterbi algorithm

• acoustic modelling, using HMMs

• language modelling, using large corpora

• adaptation, using algorithms such as MLLR or MAP

• feature enhancement, using MFCCs or LPCs

• beamforming, using microphone arrays

While each of the above disciplines on its own is considered to be a completed task, the

combination of all of the above into a distant speech recognition (DSR) system does not yet produce acceptable performance. The

aim of the AMI/AMIDA projects is to develop the smart meeting room and the necessary

applications. The current implementation of the IMR built by the CSTR has known limitations.

These limitations include:

• lack of portability

• no use of cheap commodity hardware

The existing IMR (G3.07) is not portable and is built using specially designed equipment

housed in large racks.

The aim of this dissertation is to look at one device from the IMR, the microphone array. The

microphone array used by the CSTR is an array built of eight expensive analogue

[Figure 1 labels: language modelling; acoustic modelling; automatic speech recognition; adaptation; feature enhancement; beamforming; robust speech recognition; distant speech recognition]

Review

Building a digital MEMS microphone array is a multi-disciplinary task, as the name already

implies. The word digital implies computing and (digital) signal processing, (D)SP. MEMS

(Micro Electro Mechanical System) indicates a micro-scale system containing both electrical

and mechanical components, while a microphone is a device that converts an acoustic signal

into an electrical signal. When multiple (e.g. eight) microphones are built together, the result is called a

microphone array.
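One common way to exploit such an array is delay-and-sum beamforming: each channel is shifted to compensate for the extra travel time of the sound to that microphone, and the aligned channels are averaged. The sketch below is illustrative only and is not the beamformer used in the thesis. It assumes the per-microphone delays are already known and are whole numbers of samples; the name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average the result.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   non-negative arrival delay of the source at each microphone,
              in samples (assumed known; estimating them is a separate problem).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")

    # Only the samples available on every channel after shifting can be used.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        raise ValueError("channels are too short for the given delays")

    n_mics = len(channels)
    return [
        sum(ch[d + t] for ch, d in zip(channels, delays)) / n_mics
        for t in range(usable)
    ]
```

Averaging the aligned channels reinforces sound from the steered direction, while uncorrelated noise and sound from other directions add less coherently.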

This section first defines the terms analogue and digital. Next, reviews of MEMS microphones

and microphone arrays are presented. Finally, speech recognition including distant speech

recognition is reviewed.

Analogue vs. Digital

The real world is analogue. Why then are most devices digital? The answer is simple:

processing analogue signals is far more complex and difficult than working in the digital

domain. It was only the invention of digital logic that led to the enormous technological

progress of the last few decades.

The first problem a system designer using a digital core faces is how to communicate with the

real world, which is still analogue. Analogue to digital conversion of input signals (e.g. audio

signals) and the conversion from the digital to the analogue domain are therefore critical

system design factors.

The analogue to digital conversion of audio signals is a key issue for the digital microphone

array. While analogue microphones were invented more than a century ago, digital microphones

have only been around for about a decade and miniaturised MEMS digital microphones have

only been available recently [95] [127].

The application for which the digital microphone array presented in this dissertation has been

designed is speech recognition in meetings. Existing microphones and microphone arrays use

expensive analogue microphones and off-the-shelf converters, i.e. the conversion of the

acoustic signal into its digital representation is located several metres from the membrane which

captures the acoustic wave. The digital microphone (array) addresses this by putting the

conversion less than a millimetre away from the membrane [125]. This aims to simplify the

microphone array and reduce costs.
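Digital MEMS microphones typically deliver their output as a 1-bit pulse density modulated (PDM) stream, which the receiving logic must low-pass filter and decimate into PCM samples. The sketch below illustrates the principle with a first-order sigma-delta modulator and a plain block average. Both are simplifying assumptions: real microphones and decimators use higher-order modulators and proper decimation filters, and the function names are hypothetical.

```python
def pdm_encode(samples):
    """First-order sigma-delta modulator: samples in [-1, 1] -> list of 0/1 bits."""
    bits = []
    integrator = 0.0
    feedback = 0.0  # previous output mapped to -1.0 / +1.0 (0.0 before the first bit)
    for x in samples:
        # Accumulate the error between the input and the previous 1-bit output.
        integrator += x - feedback
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pdm_to_pcm(bits, decimation):
    """Average each block of `decimation` bits and map density [0, 1] to [-1, 1].

    A block average is the crudest possible low-pass filter; it is used here
    only to show that pulse density encodes amplitude.
    """
    pcm = []
    for start in range(0, len(bits) - decimation + 1, decimation):
        block = bits[start:start + decimation]
        density = sum(block) / decimation
        pcm.append(2.0 * density - 1.0)
    return pcm
```

For example, `pdm_to_pcm(pdm_encode([0.5] * 6400), 64)` yields 100 PCM samples close to 0.5, since about three quarters of the bits in the stream are ones.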

MEMS microphones

Twenty years of research and development were required for the first silicon (MEMS)

microphone to become commercially available ([62] in [63], [95]). In the first ten years, research

focused mainly on the sensor structure (piezoelectric vs. piezoresistive vs. capacitive) and the

amplifier that follows the acoustic sensor (see Scheeper et al. [63] (1994) for a review).

Manufacturing MEMS microphones involved many problems, e.g. poor uniformity of the

microphone sensitivity on the same wafer, sticking of the membrane, non-linear frequency

response and process choice (see Ning et al. [55] 1996). From the mid 1990s capacitive sensors

were the dominant choice (Pederson et al. [58] 1998). A further ten years of research were

necessary to produce MEMS microphones which would operate in a customer application

environment. Such requirements are, for example:

• support of SMD mounting (Brauer et al. [23] 2001)

• use of standard CMOS processes (Neumann and Gabriel [54] 2002)

• good SNR performance (Neumann and Gabriel [53] 2003)

• operation with standard supply voltages (Weigold et al. [71] 2006)

Achievements such as those listed above led to a breakthrough in MEMS microphone production and

usage today. The uptake of MEMS microphones by the mobile phone market, for example,

increased from annual sales worth $2 million in 2004 [78] to $140 million in 2006 and is

estimated at $922 million for 2011 [118].

Commercial interest in MEMS microphones has, as a consequence, increased significantly and

research and development have shifted away from universities to industry, leading to a

change in the type of publication from academic papers to patents. Current commercial interests

include, for example:

• manufacturing and yield improvement [82],

• packaging [81],

• sensitivity improvement [83], and

• performance improvement (using calibration schemes [84])

Currently over twenty companies offer digital MEMS microphones, with Akustica, Knowles

Acoustics, Sonion MEMS A/S, MEMS Technology Bhd and Wolfson Microelectronics leading

the field ([98], [102], [110], [113], [109]). Every single one of the companies mentioned above

claims to have the best performing MEMS microphone with key competitive features being
