自然场景图像中鲁棒文本检测的图形分割算法

需积分: 10 23 浏览量更新于2024-09-10 收藏 1.92MB PDF 举报

"本文提出了一种在自然场景图像中检测文本的准确且鲁棒的方法，主要关注图形分割在文本检测中的应用。" 在计算机视觉领域，图形分割是一种关键的技术，它涉及将图像分解成多个有意义的部分或区域，以便进行进一步的分析和理解。在这个背景下，"图形分割"尤其在自然场景文本检测中扮演着重要角色。自然场景图像中的文本检测是一项挑战性的任务，因为文本可能以各种形状、大小和方向出现，并且常常与复杂的背景相互融合。该论文《Robust Text Detection in Natural Scene Images》由Xu-Cheng Yin等人发表在2014年的IEEETRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE上，提出了一种用于检测自然场景图像中文本的新方法。方法的核心是利用最大化稳定极值区域（MSERs）来提取可能的字符候选。MSERs是一种常见的图像分割技术，它能够有效地捕获图像中的不变特征，如文本中的单个字符。为了快速并有效地提取MSERs，作者设计了一种基于最小化正则化变差策略的剪枝算法。此算法有助于减少噪声和背景区域的影响，从而提高字符候选的准确性。接下来，通过单链聚类算法将字符候选人组合成可能的文本候选，其中距离权重和聚类阈值是通过一种新颖的自我训练距离度量学习算法自动学习的。为了进一步提升检测的准确性，论文采用了两种分类器。首先，一个字符分类器估计每个文本候选的非文本后验概率，高概率的非文本候选被剔除。然后，一个文本分类器用于最终识别剩余的文本候选，确保只有真正的文本被保留下来。这种两步验证策略提高了系统的鲁棒性，使其能够在复杂背景中准确地识别文本。该系统在ICDAR2011 Robust Reading Competition数据集上进行了评估，这是一个广泛使用的文本检测基准，证明了其在自然场景文本检测上的有效性和可靠性。通过结合图形分割技术、机器学习算法和深度学习方法，这种方法为文本检测提供了新的视角，对后续的图像分析任务如光学字符识别（OCR）和信息检索有着深远的影响。

972 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 36, NO. 5, MAY 2014

Fig. 1. Flowchart of the proposed system and results after each step of

the sample. Text candidates are labeled by blue bounding rectangles;

character candidates identiﬁed as characters are colored green, and

others red.

2) Text candidates construction. Distance weights and clus-

tering threshold are learned simultaneously using the pro-

posed metric learning algorithm; character candidates are

clustered into text candidates by the single-link clustering

algorithm using the learned parameters. More details are

presented in Section 3.2.

3) Text candidates elimination. The posterior probabilities

of text candidates corresponding to non-texts are estimated

using the character classiﬁer and text candidates with

high non-text probabilities are removed. More details are

presented in Section 3.3.

4) Text candidates classiﬁcation. Text candidates corre-

sponding to true texts are identiﬁed by the text classiﬁer.

An AdaBoost classiﬁer is trained to decide whether an text

candidate corresponding to the true text or not [28].

3.1 Character Candidates Extraction

3.1.1 Pruning Algorithm Overview

Repeating components is the major pitfall when the MSER

algorithm is applied as a character segmentation algorithm.

Considering the MSERs tree presented in Fig. 3(a), this ﬁg-

ure shows that fourteen MSERs are detected for the word

“PACT” but only four of them are really interested to us.

The hierarchical structure of MSERs is quite useful for

designing a pruning algorithm. As characters cannot “con-

tain” or be “contained” by other characters in real world,

it is safe to remove children once the parent is known

to be a character, and vice versa. If the MSERs tree is

pruned by applying this kind of parent-children elimination

operation recursively, we are still “safe” and all charac-

ters are preserved after the elimination. As an example,

Fig. 3(e) shows that a set of disconnected nodes contain-

ing all the desired characters can be extracted by applying

this algorithm to the MSERs tree in Fig. 3. However, it

can be computationally expensive to identify characters,

which usually entails the computations of complex features.

Fig. 2. Character correspondence in MSERs trees. (a) A MSERs tree

whose children corresponds to characters. (b) A MSERs tree whose

parent corresponds to character.

Fortunately, rather than identifying the character, we can

simply choose the one that is more likely to be characters

in a parent-children relationship. [21] claimed that such

pairwise relationships may not be sufﬁcient to eliminate

non-character MSERs, and pruning should exploit some

complicated higher-order properties of text. Alternatively,

our empirical study indicates that this probability can be

fast estimated using our regularized variation scheme with

reasonable accuracy. As there are different situations (one-

child and multiple-children) in MSERs trees, we design

two algorithms based on the parent-children elimination

operation, namely the linear reduction and tree accumulation

algorithm. The linear reduction algorithm is used to remove

line segments in the MSERs tree at ﬁrst and the accumu-

lation algorithm is then used to further remove repeated

characters.

3.1.2 Variation and Its Regularization

According to Matas et al.[16], an “extremal region” is a con-

nected component of an image whose pixels have either

higher or lower intensity than its outer boundary pixels.

Extremal regions of the whole image are extracted as a

rooted tree. An extremal region is in fact a set of pixels.

Its variation is deﬁned as follows. Let R

be an extremal

region, B(R

) = (R

, R

l+1

,...,R

l+

) ( is a parameter) be

the branch of the tree rooted at R

, the variation (instability)

of R

is deﬁned as

V(R

) =

l+

− R

, (1)

where |R| denotes the number of pixels in R. An extremal

region R

is a maximally stable extremal region if its varia-

tion is lower and more stable than its parent R

l−1

and child

l+1

[16]. Informally, a maximally stable extremal region is

an extremal region whose size remains virtually unchanged

over a range of intensity levels [20].

As MSERs with lower variations have sharper borders

and are more likely to be characters, one possible strat-

egy for the parent-children elimination operation is to select

parent or children based on who have the lowest variation.

However, this strategy alone will not work because MSERs

corresponding to characters may not necessarily have low-

est variations. Consider a very common situation depicted

in Fig. 2. The children of the MSERs tree in Fig. 2(a) cor-

respond to characters while the parent of the MSRE tree

in Fig. 2(b) corresponds to a character. The “minimizing

variation” strategy cannot deal with this situation because

either the parent or children may have the lowest varia-

tions. However, our experiment shows that this difﬁculty

can be easily overcome by variation regularization. The

basic idea is to penalize variations of MSERs with too large

or too small aspect ratios. Note that we are not requir-

ing characters to have the lowest variations globally, and

剩余13页未读，继续阅读

jixiansuper

粉丝: 0

自然场景图像中鲁棒文本检测的图形分割算法

图像分割算法的一些常见算法

计算机图形学分割算法源码

基于Matlab算法的零件图形高效分割技术,基于Matlab技术的零件图像高精度分割算法研究,基于matlab的零件图形分割 ,基于Matlab; 零件图形; 分割; 图形处理; 算法优化,基于Ma

FCM图形分割算法实现与应用

图形学中点分割算法 FLASH

案例13-直线段中点分割算法.rar_中点分割算法_割线法_投影透视算法_线段投影_计算机图形学

一种快速鲁棒核空间图形模糊聚类分割算法.pdf

基于图形处理器加速的医学图像分割算法研究.pdf

图像分割算法

中点分割算法

最新资源