The algorithms for segmentation of text-lines in handwriting images
Huo Liulei, Kamil Moydin, Abdusalam Dawut, Askar Hamdulla*
Institute of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
*corresponding author’s email: askarhamdulla@sina.com
ABSTRACT—Text line segmentation from
handwriting images is the basis of handwritten
text image processing, and its accuracy
directly affects the accuracy and efficiency
of handwriting identification, handwriting
recognition, handwriting retrieval and other
research fields. Because offline handwriting
has lost the writing order and other
information, offline handwriting images are
more difficult to segment. This paper focuses
on the complexity of the segmentation problem
caused by the diversity of offline handwriting
styles, such as skew, touching and overlapping
text, and compares the related solutions
proposed in recent years. Finally, some open
problems in line segmentation research are
pointed out to help readers understand the
field.
Keywords—Offline, Handwritten scripts, Text
line, Segmentation
Ⅰ. INTRODUCTION
Text line segmentation is the first step in
processing text information. It precedes
subsequent tasks such as word recognition,
retrieval, and even information extraction from
historical documents. Handwriting can take many
different forms.
Printed text lines are distributed very neatly,
so a printed text image can be segmented with
the projection method. Handwritten text is not as
simple as printed text: handwritten characters
are more irregular, and the text layout is not
regular. Figure 1 shows the results obtained with
the projection method when the threshold is 30
and 75.
Figure 1: projection method with thresholds of 30 and 75
As Figure 1 shows, some of the smaller
structures in the text may be lost when the
threshold is set improperly, such as the dot
above the character '代'. Such losses are a
serious obstacle to handwriting retrieval,
especially for Uyghur script, which contains
many small additional parts. As a result,
handwriting is generally not segmented by
projection alone.
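
The threshold sensitivity described above can be illustrated with a minimal sketch of projection-based line segmentation. It is not the implementation used for Figure 1. It assumes a binary image stored as a list of rows (1 = ink), and it assumes the threshold is a minimum number of ink pixels per row. The function name `segment_lines_by_projection` and the toy page are hypothetical.

```python
def segment_lines_by_projection(image, threshold):
    """Return inclusive (start_row, end_row) pairs for each detected text line.

    image: list of rows, each a list of 0/1 values (1 = ink pixel).
    threshold: assumed meaning is the minimum ink count for a row to be text.
    """
    # Horizontal projection profile: number of ink pixels in every row.
    profile = [sum(row) for row in image]
    lines = []
    start = None
    for y, count in enumerate(profile):
        if count >= threshold and start is None:
            start = y                      # a text line begins
        elif count < threshold and start is not None:
            lines.append((start, y - 1))   # the text line ended on the previous row
            start = None
    if start is not None:                  # a line that runs to the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


# Toy page: two text lines; the second has a small dot-like stroke above it.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],   # small stroke, only one ink pixel in this row
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
low = segment_lines_by_projection(page, threshold=1)   # [(1, 2), (4, 6)]
high = segment_lines_by_projection(page, threshold=3)  # [(1, 2), (5, 6)]
```

With the higher threshold, row 4 no longer belongs to any line, so the small stroke is cut away from its text line. This is the kind of loss discussed above.
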
This paper mainly reviews articles published
from 2014 to 2018 and compares the advantages
and disadvantages of their methods, which helps
readers understand the strengths and weaknesses
of recent algorithms and the progress of text
line segmentation.
Ⅱ. CLASSIFICATION OF TEXT LINE
SEGMENTATION
In 1982, the run-length smoothing algorithm
(RLSA) was proposed by K.Y. Wong, R.G. Casey,
and F.M. Wahl [1]. At present, text line
segmentation or extraction methods for
handwritten text images are mainly divided into
the following three types: bottom-up, top-down,
and hybrid.
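
The text names RLSA without describing it. As a rough illustration, the sketch below applies the commonly described horizontal form: background runs no longer than a limit are filled with ink, so neighbouring characters merge into blocks. Filling only the gaps bounded by ink on both sides is an assumption. The names `rlsa_row`, `rlsa_horizontal`, and `max_gap` are hypothetical.

```python
def rlsa_row(row, max_gap):
    """Fill background runs of length <= max_gap that lie between two ink pixels."""
    out = list(row)
    last_ink = None                      # index of the most recent ink pixel
    for x, value in enumerate(row):
        if value == 1:
            if last_ink is not None:
                gap = x - last_ink - 1   # number of background pixels in between
                if 0 < gap <= max_gap:
                    for i in range(last_ink + 1, x):
                        out[i] = 1       # smear the short gap
            last_ink = x
    return out


def rlsa_horizontal(image, max_gap):
    """Apply the row-wise smearing to every row of a binary image (1 = ink)."""
    return [rlsa_row(row, max_gap) for row in image]


smeared = rlsa_row([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], max_gap=2)
# smeared == [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

The gap of two pixels is filled, while the gap of four pixels is left open. The choice of `max_gap` therefore decides whether words merge within a line or lines merge with each other.
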
A. Related bottom-up algorithms
Bottom-up methods segment the text image by
grouping pixels into pixel blocks (characters)
and then into text lines. Such methods mainly
include spectral clustering [2], feature corner
aggregation [3], smearing effect [4],
Mumford-Shah model [5], minimum spanning tree
clustering [6], convolutional neural network
[7], Markov decision process [8], and so on.
Ayman Al-Dmour and Fares Fraij use the
well-established horizontal projection method to
segment handwritten Arabic text [9]. It works
well for neatly written text images with large
line spacing, the operation is simple, and the
running time is relatively short. Yi Xiaofang et
al. proposed a Uyghur handwritten text image
segmentation method based on connected domains
[10].
This method first divides the connected domains
into three categories according to their size,
and then applies an adaptive smearing algorithm
followed by dilation. At this point, the text
line skeleton has basically been formed. Regions
in the third category of connected domains are
detected as touching characters and then
processed.
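
The first step of this approach, grouping connected domains by size, can be sketched as follows. This is not the authors' implementation. The source does not give the size measure or the thresholds, so the pixel count, 8-connectivity, and the parameters `small_max` and `large_min` are all assumptions. The category names are hypothetical.

```python
from collections import deque


def connected_components(image):
    """Return the components of a binary image as lists of (y, x) ink pixels.

    Uses breadth-first search with 8-connectivity (an assumption).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def classify_by_size(components, small_max, large_min):
    """Split components into three groups by pixel count (hypothetical thresholds)."""
    groups = {"small": [], "medium": [], "large": []}
    for pixels in components:
        size = len(pixels)
        if size <= small_max:
            groups["small"].append(pixels)    # e.g. dots and other small parts
        elif size >= large_min:
            groups["large"].append(pixels)    # candidates for touching characters
        else:
            groups["medium"].append(pixels)   # ordinary character bodies
    return groups


page = [
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
]
components = connected_components(page)
groups = classify_by_size(components, small_max=2, large_min=6)
```

On this toy page the three components have 1, 4, and 8 pixels, so each group receives one component. In practice the thresholds would have to be derived from the document itself, for example from the average component size.
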
Alireza Alaei et al. proposed an unconstrained
handwritten text line segmentation method [11],
which divides the text image into vertical parts
according to the line spacing, which is obtained
from statistics of the text lines. Smearing based
on the average width is applied to the text
image. After smearing, the smaller black blocks
are removed. Then