Applying Deep Learning to the Recognition of Bengali Dialects


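"This thesis uses deep learning methods to predict the regional accents of Bengali. It was written by Md. Abu Ibrahim, Md. Nawaz-S-Salekeen Nayeem, Sadaf Al Arabi and others, and submitted to the Department of Computer Science and Engineering at BRAC University in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science."

Predicting the regional accents of Bengali with deep learning aims to understand and distinguish the local varieties of the language using modern machine learning techniques. Bengali is widely spoken, and its accents and dialects can differ markedly from region to region, which makes recognizing these differences a challenge. Deep learning, a machine learning technique loosely inspired by the neural networks of the human brain, has shown strong potential in speech recognition and natural language processing.

The authors may have used models such as deep neural networks (DNN), convolutional neural networks (CNN) or recurrent neural networks (RNN) to process the audio data. These models can learn and extract accent patterns from a large number of features. For training, they may have used a large set of labeled Bengali audio samples from speakers in different regions, so that the model learns the characteristics of each accent. In preprocessing, the audio files may be converted into spectrograms, a representation that turns a sound signal into an image that deep learning models can process easily. The model then learns patterns in the spectrogram, such as frequency changes and durations, that are associated with a particular accent.

Training typically relies on backpropagation together with an optimizer such as gradient descent or Adam to minimize the error between the predicted and the actual accent. Once trained, the model can classify new, unlabeled audio by regional accent. Performance may be evaluated with accuracy, precision, recall and F1 score, with cross-validation used to obtain a more reliable performance estimate and to detect overfitting.

This work matters for speech recognition systems, translation tools and research on cultural exchange. It can help speech recognition software adapt to a wider range of accents, and it gives linguists a tool for studying regional variation in a language. It may also inspire similar studies of other languages with many dialects.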
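To make the spectrogram step in the summary concrete, here is a minimal sketch of a magnitude spectrogram. It is an illustration under stated assumptions, not the authors' pipeline: the function name `spectrogram`, the frame length of 64 samples, the hop of 32 samples and the Hann window are all choices made for this example, and a naive DFT from the Python standard library stands in for the FFT routines that a real system would use.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames; each frame holds the magnitudes of
    DFT bins 0..frame_len//2 for one windowed slice of the signal.

    Hypothetical helper for illustration only (naive O(n^2) DFT).
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    frames = []
    # Slide over the signal in steps of `hop` samples.
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        # Only the non-negative frequencies are needed for a real signal.
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Usage: a pure tone that completes 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
frames = spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
print(len(frames), "frames, peak bins:", peak_bins)
```

Each row of the result is one time step and each column one frequency bin, which is the image-like layout a CNN would consume. Production code would normally apply a mel filter bank and a log scale on top of this.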


Chapter 1

Introduction

1.1 Thoughts behind the thesis

In recent years, we have seen drastic improvements in various speech recognition technologies. Smart speech-recognizing assistants such as Siri, Alexa, Google Assistant and Cortana are just one tap away from us. These systems work flawlessly in most cases. However, depending on the variation in a country's population, every language has many different dialects based on region. Even sophisticated and highly advanced systems like the aforementioned ones run into trouble when the speaker does not speak in the standard accent of that language [14]. To tackle this issue, quite a lot of research has already been done on English, Mandarin and a few other prominent languages. However, while we were searching for related papers in this field, we found barely any work on the Bengali language.

Worldwide, almost 210 million [15] people speak Bengali. Among them, 100 million are from Bangladesh. Bangladesh is divided into 8 divisions, and these divisions are divided into differing numbers of districts, 64 in total. People from these divisions speak in different dialects and accents. Furthermore, even within the same division, people's accents vary greatly from district to district. These different accents and dialects significantly affect the performance of any speech recognition system.

In this paper, we try to build a model by training it on various audio samples from a few divisions and districts of Bangladesh. Using the approaches discussed in this paper, it will be possible to distinguish the accent of a speaker. If separate ASR systems are created for each accent, then our models will be able to redirect the speaker's audio to the ASR model built for that speaker's accent. Using the predictions of our models, it might also be possible to build a single ASR system that detects the spoken words accurately regardless of accent.
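The routing idea described above, in which an accent classifier picks the accent-specific ASR system, can be sketched as a small dispatcher. Everything named here is hypothetical and chosen for illustration: `route_to_asr`, the callable signatures, the accent labels and the fallback recognizer are assumptions, since the text does not specify an interface.

```python
from typing import Callable, Dict, Sequence

Audio = Sequence[float]
Recognizer = Callable[[Audio], str]


def route_to_asr(
    audio: Audio,
    classify_accent: Callable[[Audio], str],
    asr_by_accent: Dict[str, Recognizer],
    fallback_asr: Recognizer,
) -> str:
    """Predict the accent, then transcribe with the matching ASR model.

    Falls back to a general-purpose recognizer when no accent-specific
    model is registered for the predicted label.
    """
    accent = classify_accent(audio)
    recognizer = asr_by_accent.get(accent, fallback_asr)
    return recognizer(audio)


# Usage with stand-in functions (no real models are involved).
def fake_classifier(audio: Audio) -> str:
    # Stand-in rule: pretend louder clips come from one region.
    return "noakhali" if max(audio) > 0.5 else "standard"


recognizers: Dict[str, Recognizer] = {
    "noakhali": lambda audio: "transcript from the Noakhali model",
    "standard": lambda audio: "transcript from the standard model",
}

print(route_to_asr([0.1, 0.9], fake_classifier, recognizers,
                   lambda audio: "transcript from the fallback model"))
```

Keeping the classifier and the recognizers as plain callables means the same dispatcher works whether the accent model is a CNN, an RNN or anything else that maps audio to a label.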

1.2 Research Problem

Our research focuses on detecting different Bengali accents. So, the primary research problem of our thesis is finding out exactly what makes each accent so different from the others. The local people of the Noakhali district speak in an accent noticeably different from the standard Bengali accent, which is known as Cholito Bhasa. Moreover, even though Old Dhaka lies within the Dhaka district, people from this area speak in a very different accent from people who live in the main city of Dhaka. Let's look at a table with a few examples of how the word 'Khabo', which translates to 'will eat', differs across a few regions of Bangladesh [13].

Figure 1.1: Phonetic variation of the same word across regions.

As we can see, the pronunciation of words can vary greatly depending on the region a speaker is from. These differences in accent can cause various problems and significantly reduce the accuracy of a speech recognition system. Our paper tries to tackle this issue for the Bengali language.

To implement an ASR system for a language, gender-dependent models are created. However, this does not solve the issue caused by different accents. In the year 2000, extensive experiments were done on the Microsoft Mandarin Speech Engine. It was found that tone-related information is the most important feature of a language across different accents. Later, in 2004, a paper [5] by Chao Huang, Tao Chen and Eric I-Chao Chang, based on the aforementioned experiments, reported that cross-accent models for an ASR system had 40 to 50 percent more errors than an accent-dependent model. A few important factors contribute to this rise in error rate. Modeling a speaker involves a large number of features, as there is a lot of variability in tone, pitch, etc. This results in a very complex model as the number of dimensions increases. A few tools, such as PCA (Principal Component Analysis) and ICA (Independent Component Analysis), can help lower the number of dimensions and reduce the complexity.

Quite a long time has passed since these papers were published, and today these problems are mostly solved for prominent languages like English and Mandarin. However, they still exist for the Bengali language.
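The dimensionality reduction mentioned above can be illustrated with a small PCA sketch. This is an assumption-laden example rather than anything taken from the cited papers: the helper names (`center`, `covariance`, `top_component`, `project`) and the toy data are hypothetical, and power iteration is used in place of a full eigendecomposition so the code needs only the standard library. It recovers just the first principal component.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_component(cov, iters=200):
    """Approximate the leading eigenvector by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Reduce each row to one coordinate along the component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Usage: two perfectly correlated toy features collapse to one dimension.
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(features)
component = top_component(covariance(centered))
print(component, project(centered, component))
```

With real speech features the same steps would be repeated for as many components as needed, typically through a linear algebra library rather than hand-written loops.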
