Scene Text Detection via Edge Cue and Multi-Features
Youbao Tang, Xiangqian Wu
The School of Computer Science and Technology
Harbin Institute of Technology
Harbin, China
tangyoubao@hit.edu.cn, xqwu@hit.edu.cn
Abstract—Inspired by the fact that edges are an important cue to
distinguish texts from background, we propose a novel scene
text detection method via edge cue and multiple features, which
has two main parts, i.e. candidate character region (CCR)
extraction and region classification. For CCR extraction, the
edges are first extracted from the input image, which are then
broken and merged based on color features to form the final
edge image. For each edge connected component, a number of
image patches are extracted by translating and scaling its
bounding rectangle to generate the CCRs. For region
classification, the character regions are extracted from the
CCRs by using a region classification technique, which extracts
both hand-designed low-level features and deep convolutional
neural network based high-level features of the regions for
classification. The character regions are then merged to
form the candidate text regions, based on which the final text
regions are detected by using the region classification technique.
The proposed method is evaluated on the two latest ICDAR
benchmark datasets, and the experimental results demonstrate
that the proposed method outperforms the state-of-the-art
scene text detection approaches.
Keywords-scene text detection; candidate region extraction;
region classification; edge cue; multiple features
I. INTRODUCTION
Scene text detection aims to locate the position of texts in
different scenes, e.g. guideposts, store marks, and warning
signs, as shown in Fig. 1, which is one of the most important
steps for end-to-end scene text recognition. Effective scene
text detection can enhance the performance of numerous
multimedia applications, e.g. mobile visual search, content-
based image retrieval, and automatic sign translation.
Because of the unconstrained scene environments, e.g.
different text sizes, colors, and complex backgrounds, scene
text detection remains a challenging problem in the computer
vision community. Over the past few years, a large number of scene text
detection approaches [1-20] have been proposed, most of
which have been summarized by Ye and Doermann [21]. A
series of international scene text detection competitions has also
been successfully organized [22-24]. Generally, the existing
approaches can be roughly divided into two groups: sliding
window based approaches and connected component based
approaches. Here, we briefly summarize the previous scene
text detection approaches and then discuss the work most
related to ours in detail.
The sliding window based approaches [10-13] first slide a
large number of windows with different scales through all
possible positions of the image and then extract features to
classify the windowed regions as text or background. One
advantage of these approaches is that they keep almost all of
the true text regions. However, they generate numerous
candidate regions that must be classified in subsequent stages,
resulting in high computational cost.
The key factor determining detection performance is the
discriminability of the extracted features. Early approaches
extract hand-designed low-level features, e.g. HOG and SIFT
[11, 12]. To improve classification performance, some
researchers [10, 13] have recently used the convolutional neural
network (CNN) to learn deep high-level features and achieved
state-of-the-art results. The connected component based
approaches [1-9, 14, 15, 18, 20] first cluster the pixels into
larger connected components according to the pixels’
properties, e.g. intensity, color, and stroke width, and then
extract features from connected components for classification.
One advantage of these approaches is that they greatly reduce
the number of candidate regions, although some true character
regions may be lost. Almost all approaches of both kinds use
only hand-designed low-level features or CNN-based high-
level features for region classification.
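As a rough illustration of the sliding-window scheme summarized above, the following Python sketch enumerates candidate windows at several scales and passes each to a placeholder classifier. The scales, stride fraction, and classifier stub are illustrative assumptions, not the settings of any cited approach.

```python
def sliding_windows(img_h, img_w, scales=(32, 64, 128), stride_frac=0.5):
    """Enumerate candidate windows (x, y, w, h) at several scales,
    as in sliding-window text detectors. The scale set and stride
    fraction here are arbitrary illustrative choices."""
    boxes = []
    for s in scales:
        stride = max(1, int(s * stride_frac))
        for y in range(0, img_h - s + 1, stride):
            for x in range(0, img_w - s + 1, stride):
                boxes.append((x, y, s, s))
    return boxes

def classify_stub(box):
    # Placeholder for the feature-extraction + classification step
    # (HOG/SIFT or CNN features); a real system would score the
    # image patch inside `box` here.
    return 0.0

boxes = sliding_windows(128, 256)
scores = [classify_stub(b) for b in boxes]
```

Even on this small 128x256 image, the enumeration yields over a hundred candidate windows, which illustrates why these approaches incur high computational cost in the classification stage.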
Inspired by the fact that edges are one of the most important
cues to distinguish texts from background, this paper proposes
a novel scene text detection method via edge cue and multiple
features, which consists of two main stages, i.e. candidate
character region extraction and region classification. For the
stage of candidate character region extraction, the proposed
Figure 1. Detection results of the proposed method (indicated by blue
rectangles) on different scene images, which nearly match the ground truths
(indicated by red and green rectangles). (a) Guidepost. (b) Store mark. (c)
Warning sign.
2016 15th International Conference on Frontiers in Handwriting Recognition
2167-6445/16 $31.00 © 2016 IEEE
DOI 10.1109/ICFHR.2016.37