Shape Robust Text Detection with Progressive Scale Expansion Network



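"Shape Robust Text Detection with Progressive Scale Expansion Network" is a study of text detection in natural scenes. It targets two weaknesses of existing methods: handling diverse text shapes and separating densely packed text instances. Proposed by researchers from Nanjing University and Nanjing University of Science and Technology, it introduces the Progressive Scale Expansion Network (PSENet), a new detector based on pixel-level segmentation.

Natural scene text detection is difficult because foreground text and background objects are highly varied and text shapes change from instance to instance. Traditional detectors based on quadrangular bounding boxes struggle to localize arbitrary-shaped text, because an irregularly shaped instance cannot be tightly enclosed by a rectangle. Most detectors based on pixel-level segmentation, on the other hand, may fail to separate text instances that lie close together, which is especially visible in dense text scenes.

PSENet produces multiple predictions for each text instance. Each prediction corresponds to a "kernel" obtained by shrinking the original text instance to a different scale. A progressive expansion algorithm starts from the smallest kernels and grows them step by step until they reach the largest, complete text instance, which yields the final detection. The text contour is refined gradually during this expansion, which improves the precision and robustness of detection.

The multi-stage scale expansion strategy lets PSENet adapt to text of many shapes. It captures text boundaries more accurately and performs well on curved, irregular, or very small text. For closely adjacent text instances, the combination of multiple predictions and progressive expansion helps separate and detect each instance independently, which addresses the dense text detection problem.

Overall, the method improves natural scene text detection, particularly for complex shapes and dense text. It is expected to be useful in applications such as scene understanding, product identification, autonomous driving, and target geolocation.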


[Figure 2 diagram; labels: P, C, F, S_{n-1}, S_n, ..., Progressive Scale Expansion, R]

Figure 2: Illustration of our overall pipeline. The left part is implemented from FPN [16]. The right part denotes the feature fusion and the progressive scale expansion algorithm.

[19] utilized corner localization to find suitable irregular quadrangles for text instances. Detection approaches have thus evolved from horizontal rectangles to rotated rectangles and further to irregular quadrangles. However, besides quadrangles, text instances in natural scenes take many other shapes. Therefore, some studies began to explore curved text detection and obtained certain results. [18] tried to regress the relative positions of the points of a 14-sided polygon. [31] detected curved text by locating the two end points in a sliding line that slides both horizontally and vertically. A fused detector based on bounding box regression and semantic segmentation was proposed in [1]. However, since their current performance is not yet satisfactory, there is still considerable room for improvement in curved text detection, and detectors for arbitrary-shaped text still need more exploration.

3 Proposed Method

In this section, we first introduce the overall pipeline of the proposed Progressive Scale Expansion Network (PSENet). Next, we present the details of the progressive scale expansion algorithm and show how it can effectively distinguish adjacent text instances. We then introduce the label generation procedure and the design of the loss function. Finally, we describe the implementation details of PSENet.

3.1 Overall Pipeline

The overall pipeline of the proposed PSENet is illustrated in Fig. 2. Inspired by FPN [16], we concatenate low-level feature maps with high-level feature maps and thus obtain four concatenated feature maps. These maps are further fused into F to encode information with various receptive fields. Intuitively, such fusion is very likely to facilitate the generation of kernels with various scales. The feature map F is then projected into n branches to produce multiple segmentation results S_1, S_2, ..., S_n. Each S_i is one segmentation mask for all the text instances at a certain scale. The scales of the different segmentation masks are decided by hyper-parameters, which will be discussed in Sec. 3.3. Among these masks, S_1 gives the segmentation result for the text instances at the smallest scale (i.e., the minimal kernels) and S_n denotes the original segmentation mask (i.e., the maximal kernels). After obtaining these segmentation masks, we use the progressive scale expansion algorithm to gradually expand all the instances' kernels in S_1 to their complete shapes in S_n, and obtain the final detection results as R.
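The data flow of this paragraph (four maps at different resolutions, fusion into F, then n per-scale masks) can be illustrated with a toy pure-Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper text above: each map has a single channel, upsampling is nearest-neighbour with factors 1, 2, 4, and 8, each branch is a per-pixel linear projection followed by a sigmoid and a 0.5 threshold, and the names `upsample_nearest`, `fuse`, and `project` as well as all weights are hypothetical.

```python
import math

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D map stored as a list of rows."""
    return [[v for v in row for _ in range(factor)]
            for row in fmap for _ in range(factor)]

def fuse(maps_with_factors):
    """Upsample every map to the finest resolution and stack values per pixel."""
    upsampled = [upsample_nearest(m, f) for m, f in maps_with_factors]
    h, w = len(upsampled[0]), len(upsampled[0][0])
    return [[[m[y][x] for m in upsampled] for x in range(w)] for y in range(h)]

def project(fused, weights, biases, threshold=0.5):
    """One linear branch per scale: dot product, sigmoid, then binarisation."""
    masks = []
    for w_vec, bias in zip(weights, biases):
        mask = []
        for row in fused:
            out_row = []
            for pixel in row:
                logit = sum(wi * vi for wi, vi in zip(w_vec, pixel)) + bias
                prob = 1.0 / (1.0 + math.exp(-logit))
                out_row.append(int(prob > threshold))
            mask.append(out_row)
        masks.append(mask)
    return masks

# Hypothetical single-channel maps at 8x8, 4x4, 2x2, and 1x1 resolution.
p_a = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
p_b = [[1.0 if 1 <= x <= 2 else 0.0 for x in range(4)] for y in range(4)]
p_c = [[1.0, 0.0], [1.0, 1.0]]
p_d = [[0.5]]

fused = fuse([(p_a, 1), (p_b, 2), (p_c, 4), (p_d, 8)])

# Hypothetical branch parameters: identical weights, increasing biases,
# so the branch with the lowest bias yields the smallest mask.
weights = [[0.5, 1.0, 1.0, 0.5]] * 3
biases = [-2.0, -1.5, -1.0]
masks = project(fused, weights, biases)  # [S_1, S_2, S_3]
```

With these made-up parameters every pixel set in one mask is also set in the next, which mirrors the ordering from minimal kernels to the full mask. In a trained network that ordering comes from the supervision rather than from hand-picked biases.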

3.2 Progressive Scale Expansion Algorithm

As shown in Fig. 1 (c), it is hard for segmentation-based methods to separate text instances that are close to each other. To solve this problem, we propose the progressive scale expansion algorithm.
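A minimal sketch of such an expansion, written as a multi-source breadth-first search over binary masks, is given here for orientation. It is not the authors' implementation. It assumes the masks are ordered from the smallest kernels to the full mask, uses 4-connectivity, and assigns a pixel reachable from two kernels to whichever kernel reaches it first. The function names `label_components` and `progressive_scale_expansion` are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumption)

def label_components(mask):
    """Give each 4-connected foreground region of a binary mask its own label."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels

def progressive_scale_expansion(masks):
    """Grow the kernels found in masks[0] through each larger mask in turn."""
    labels = label_components(masks[0])
    h, w = len(labels), len(labels[0])
    for mask in masks[1:]:
        # Start the BFS from every pixel that already belongs to an instance.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = cy + dy, cx + dx
                if (0 <= ny < h and 0 <= nx < w
                        and mask[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[cy][cx]  # first kernel to arrive wins
                    queue.append((ny, nx))
    return labels

# Two kernels that are separate at the smallest scale but touch in the full mask.
s1 = [[0, 0, 1, 0, 0, 0, 1, 0, 0]]
s2 = [[0, 1, 1, 1, 0, 1, 1, 1, 0]]
s3 = [[1, 1, 1, 1, 1, 1, 1, 1, 1]]

result = progressive_scale_expansion([s1, s2, s3])
print(result)  # [[1, 1, 1, 1, 1, 2, 2, 2, 2]]
```

Labelling `s3` directly with `label_components` would merge everything into a single instance. Starting from the kernels in `s1` keeps the two instances apart even though they touch in the full mask.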

Fig. 3 gives an example that explains the procedure of the progressive scale expansion algorithm, whose central idea is borrowed from the Breadth-First-Search (BFS) algorithm. In the example, we have 3 segmentation results S = {S_1, S_2, S_3} (see Fig. 3 (a), (e), (f)). At first, based on the
