A Study and Comparison of Offline Handwritten Text Line Segmentation Algorithms


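"This article discusses algorithms for segmenting text lines in handwriting images. It stresses that text line segmentation is the foundation of handwritten text processing and has a decisive effect on handwriting identification, recognition and retrieval. Because offline handwriting has lost the writing order and other information, segmentation is more challenging. The article analyzes the complexity of the segmentation problem caused by different offline handwriting styles (such as skew, touching and overlapping text), compares related solutions from recent years, and finally raises some open problems in line segmentation and directions for future research."

Main text: Algorithms for segmenting text lines in handwriting images are an important research topic in computer vision and natural language processing, and play a vital role in handwritten text recognition (HWR), optical character recognition (OCR) and handwritten text retrieval. The accuracy of text line segmentation directly affects the accuracy and efficiency of subsequent processing, and the task is especially complex for offline handwriting images. Such images usually lack the stroke order and pen dynamics available during real-time writing, which makes the text lines inside the image hard to segment precisely. For example, the handwriting may be skewed, adjacent characters or words may touch, and some text may even overlap, all of which challenge automatic text line segmentation.

In recent years, researchers have proposed a range of methods for handwritten text line segmentation. These usually include image processing techniques, such as edge detection, threshold segmentation and connected component analysis, as well as machine learning and deep learning models. For example, the Canny edge detector can find salient boundaries in an image, while Otsu's binarization method helps separate foreground from background. Connected component analysis identifies individual characters or text lines by linking pixels of the same color or gray level. Other methods use shape analysis and template matching to identify and segment particular writing styles.

Machine learning models such as support vector machines (SVM) and random forests can learn handwriting features from training data for the segmentation task. More recently, with the development of deep learning, models such as convolutional neural networks (CNN) and recurrent neural networks (RNN) have shown strong performance in handwriting recognition and segmentation and can adapt to the complexity of many handwriting styles.

Despite the many existing methods, handwritten text line segmentation still has unsolved problems, for example how to handle heavily skewed text or very small character spacing, and how to segment overlapping text effectively. The robustness of existing algorithms on non-standard and unstructured handwriting also needs to improve. Future research may focus on more advanced deep learning models that better capture the diversity of handwriting, and on reinforcement learning or other adaptive methods to optimize the segmentation strategy. Combining several segmentation techniques so that their strengths complement each other is another possible way to improve overall performance.

Text line segmentation in handwriting images is a research area that is both challenging and full of opportunity. Through continued technical innovation and a deeper understanding of the characteristics of handwriting, further breakthroughs can be expected that advance handwritten text processing technology.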

The algorithms for segmentation of text-lines in handwriting images

Huo Liulei, Kamil Moydin, Abdusalam Dawut, Askar Hamdulla*

Institute of Information Science and Engineering, Xinjiang University, Urumqi 830046, China

*corresponding author’s email: askarhamdulla@sina.com

ABSTRACT—Text line segmentation is the basis of handwritten text image processing, and its accuracy plays a decisive role in handwriting identification, handwriting recognition, handwriting retrieval and other research fields: it directly affects the accuracy and efficiency of handwriting identification, character recognition and text retrieval. Because offline handwriting has lost the writing order and other information, offline handwriting images are more difficult to segment. This paper focuses on the complexity of the segmentation problem caused by the diversity of offline handwriting styles, such as skew, touching and overlapping text, and compares the related solutions proposed in recent years. Finally, some open problems in line segmentation research are put forward to help readers understand the field.

Keywords—Offline, Handwritten scripts, Text line, Segmentation

Ⅰ. INTRODUCTION

Text line segmentation is the first step in processing text information; subsequent research, such as word recognition, retrieval and even information extraction from historical documents, builds on it. Handwriting can take many different forms.

Printed text lines are very neatly distributed, so a printed text image can be segmented with the projection method. Handwritten text is not as simple: handwritten characters are more irregular, and the text layout is not regular. Figure 1 shows the results obtained in this paper by applying the projection method with thresholds of 30 and 75.

Figure 1: projection method with thresholds of 30 and 75

As Figure 1 shows, some of the smaller structures in the text, such as the dot above the character '代', can be lost when the threshold is set improperly. This kind of loss is a great obstacle to handwriting retrieval, especially for Uyghur script, which contains many small additional parts. As a result, handwriting is generally not segmented by projection alone.
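The effect described above can be reproduced on a toy example. The following is a minimal sketch, not the paper's implementation: it assumes a binary image stored as nested lists (1 = ink, 0 = background), and the function names `horizontal_projection` and `segment_lines`, the toy `page` and the threshold values are all hypothetical choices for illustration.

```python
def horizontal_projection(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(image, threshold):
    """Return (start_row, end_row) pairs, end exclusive, for each run of
    rows whose ink count is strictly greater than `threshold`."""
    profile = horizontal_projection(image)
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > threshold and start is None:
            start = y                      # a text line begins
        elif count <= threshold and start is not None:
            lines.append((start, y))       # the text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy 8x10 page (assumed data): a one-pixel dot in row 1 sits directly
# above a dense stroke (rows 2-3); a second line occupies rows 6-7.
page = [
    [0] * 10,
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    [0] * 10,
    [0] * 10,
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
]

low = segment_lines(page, threshold=0)   # keeps the dot row in line 1
high = segment_lines(page, threshold=2)  # drops the dot row
print(low, high)
```

With the lower threshold the first line starts at the dot's row; with the higher threshold the row containing only the dot falls below the cut and is excluded, which is the kind of loss of small structures discussed above.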

This article mainly reviews papers published from 2014 to 2018 and compares the advantages and disadvantages of their methods, to help readers understand each algorithm and the recent progress of line segmentation algorithms.

Ⅱ. CLASSIFICATION OF TEXT LINE SEGMENTATION

In 1982, K.Y. Wong, R.G. Casey and F.M. Wahl proposed the run-length smoothing algorithm (RLSA) [1]. At present, text line segmentation or extraction methods for handwritten text images fall into three main types: bottom-up, top-down and hybrid.
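To make the smoothing idea behind RLSA concrete, here is a minimal sketch of a horizontal run-length smoothing pass. It is a simplified variant written for illustration, not the formulation in [1]: it assumes a binary image as nested lists (1 = ink, 0 = background) and fills only background runs that are bounded by ink on both sides. The names `rlsa_row`, `rlsa_horizontal` and the parameter `max_gap` are hypothetical.

```python
def rlsa_row(row, max_gap):
    """Fill background runs (0) of length <= max_gap that lie between
    two ink pixels (1). Runs touching the row edges are left unchanged
    (a simplifying assumption of this sketch)."""
    out = list(row)
    last_ink = None
    for x, value in enumerate(row):
        if value != 1:
            continue
        if last_ink is not None:
            gap = x - last_ink - 1
            if 0 < gap <= max_gap:
                for k in range(last_ink + 1, x):
                    out[k] = 1             # smear the short gap
        last_ink = x
    return out


def rlsa_horizontal(image, max_gap):
    """Apply the row-wise smoothing to every row of a binary image."""
    return [rlsa_row(row, max_gap) for row in image]


row = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
smeared = rlsa_row(row, max_gap=3)
print(smeared)
```

In this example the two-pixel gap is filled and the five-pixel gap is kept, so nearby strokes merge into one block while distant ones stay apart. Choosing `max_gap` relative to the typical character or word spacing is what lets smearing-based methods group pixels into candidate text lines.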

A. Related bottom-up algorithms

Bottom-up methods segment the text image by grouping pixels into pixel blocks (characters) and then into text lines. Such methods mainly include spectral clustering [2], feature corner aggregation [3], the smearing effect [4], the Mumford-Shah model [5], minimum spanning tree clustering [6], convolutional neural networks [7], Markov decision processes [8] and so on.

Ayman Al-Dmour and Fares Fraij use the well-established horizontal projection method to segment handwritten Arabic text [9]. It works better on handwritten text images that are neatly written with large line spacing; the operation is simple and the running time is relatively short. Yi Xiaofang et al. proposed a Uyghur handwritten text image segmentation method based on connected domains [10]. This method first divides the connected domains into three categories according to their size, then applies an adaptive smearing algorithm followed by dilation. At this point the text line skeleton is basically formed. Connected domains of the third category are detected as touching characters and then processed.
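The first step of such a connected-domain approach, grouping components by size, can be sketched as follows. This is an illustration under stated assumptions, not the method of [10]: the binary image is a nested list (1 = ink), components use 8-connectivity, size is measured as pixel area, and the three classes are defined relative to the median area with hypothetical ratios `small_ratio` and `large_ratio`. Treating the "large" class as candidate touching characters is also an assumption of this sketch.

```python
from collections import deque
from statistics import median


def connected_components(image):
    """Return a list of components, each a list of (y, x) ink pixels,
    found by breadth-first search with 8-connectivity."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if image[y][x] != 1 or seen[y][x]:
                continue
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def classify_by_size(components, small_ratio=0.5, large_ratio=2.0):
    """Label each component 'small', 'normal' or 'large' by comparing its
    area with the median area (both ratios are hypothetical values)."""
    areas = [len(c) for c in components]
    ref = median(areas)
    labels = []
    for area in areas:
        if area < small_ratio * ref:
            labels.append("small")     # e.g. dots and other small marks
        elif area > large_ratio * ref:
            labels.append("large")     # candidate touching characters
        else:
            labels.append("normal")
    return labels


# Toy image (assumed data): one dot, two ordinary blobs, one wide blob.
image = [
    [0, 1] + [0] * 12,
    [0] * 14,
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
]
components = connected_components(image)
labels = classify_by_size(components)
print([len(c) for c in components], labels)
```

A real system would likely use bounding-box height and width rather than raw pixel area, and would derive the class boundaries from the statistics of the page; the median-based rule here is only a stand-in for that step.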

Alireza Alaei et al. proposed an unconstrained handwritten text line segmentation method [11], which estimates the line spacing from the statistics of the text lines and divides the text image into vertical parts accordingly. A smear based on the average width is then applied to the text image. After smearing, the smaller black boxes are removed. Then
