Large-Scale Image Retrieval: Maximally Visual-Homogeneous Region Detector


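"Maximally Visual-Homogeneous Region Detector for Large Scale Image Retrieval"

In computer vision, image retrieval is a key task, especially on large-scale datasets. Conventional local feature detectors such as SIFT, SURF, or MSER (Maximally Stable Extremal Regions) can run into trouble when extracting salient regions: in textured areas they tend to capture many repeated, unrepresentative regions, which can lead to false matches. To address this, the paper proposes a new detector, the Maximally Visual-Homogeneous Region (MVHR) detector, which aims to find more distinctive and representative local invariant regions and thereby improve the accuracy of large-scale image retrieval. The paper makes two main contributions:

1. **A novel sorting method**: Unlike the original MSER detector, which uses single-pixel intensity as the ranking unit, the MVHR detector sorts by visual homogeneity analysis on a local patch. This takes the visual similarity among neighboring pixels into account rather than relying on pixel intensity alone, so visually consistent regions are identified more reliably and false matches become less likely.
2. **A scale selection algorithm**: The authors observe that the observation scale is closely related to visual homogeneity. Based on this, they design a heuristic scale selection algorithm that chooses a proper scale according to how the visual homogeneity evaluation changes over a range of scales. This lets the detector capture highly repeatable regions at different scales while keeping detection precision.

Experiments show that the MVHR detector finds fewer but representative regions with high repeatability, while preserving precision that is competitive with state-of-the-art detectors for large-scale image retrieval. This suggests that the MVHR detector can help improve the efficiency and accuracy of image retrieval systems.

In summary, the work offers a new local feature detection strategy that, through an improved sorting mechanism and heuristic scale selection, reduces redundant repeated features. This has practical value for locating and retrieving target images quickly and accurately in very large image collections, particularly for systems that must process and analyze large numbers of images, such as search engines, surveillance systems, and image databases.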

Maximally Visual-Homogeneous Region Detector for Large Scale Image Retrieval

Gang Wang, Ke Gao, Jintao Li

Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS)

Institute of Computing Technology, CAS, Beijing 100190, China

{wanggang01, kegao, jtli}@ict.ac.cn

ABSTRACT

Conventional local detectors often extract numerous small, repeated regions in textured areas, which easily results in false matching. To find representative and distinctive local invariant regions, this paper proposes a Maximally Visual-Homogeneous Region (MVHR) detector. The main contributions are twofold: (1) Unlike the original MSER, which employs single-pixel intensity as the ranking unit, we propose a novel sorting method based on visual homogeneity analysis on a local patch. (2) Observing that the observation scale is closely related to visual homogeneity analysis, we develop a heuristic scale selection algorithm that chooses a proper scale according to the changes in the visual homogeneity evaluation over a range of scales. Experiments demonstrate that our detector finds fewer but representative regions with high repeatability, while still preserving competitive precision compared with state-of-the-art detectors for large-scale image retrieval.
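Contribution (2) chooses a scale from how the homogeneity evaluation changes across scales, but this excerpt does not give the actual rule. The following is therefore only an illustrative sketch under stated assumptions: the function name `select_scale`, the tolerance `tol`, and the "first scale at which the score stops changing" criterion are all hypothetical and are not taken from the paper.

```python
def select_scale(scales, scores, tol=0.05):
    """Pick a scale from per-scale homogeneity scores (hypothetical heuristic).

    scales: candidate window sizes, in increasing order.
    scores: one homogeneity score per scale (same length as scales).
    tol:    assumed threshold below which the score counts as "stable".

    Returns the first scale whose score differs from the next scale's score
    by at most tol; falls back to the largest scale if none qualifies.
    """
    if not scales or len(scales) != len(scores):
        raise ValueError("scales and scores must be non-empty and equal length")
    for i in range(len(scales) - 1):
        if abs(scores[i + 1] - scores[i]) <= tol:
            return scales[i]
    return scales[-1]


# Example: the score drops sharply from scale 3 to 5, then levels off.
chosen = select_scale([3, 5, 7, 9], [0.9, 0.6, 0.58, 0.57])
print(chosen)
```

Stopping at the first stable scale favors small windows, which keeps localization tight. A different heuristic, for example the scale with the largest change, would be an equally plausible reading of the abstract.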

Categories and Subject Descriptors

H.5.1 [Multimedia Information Systems]: Methodology

General Terms

Algorithms, Design, Performance, Theory.

Keywords

local feature detector, visual homogeneity, scale selection

1. INTRODUCTION

Using a local detector to extract regions of interest is an important statistical method for representing an image [10]. For robust image matching, a “qualified” local feature detector should be distinctive and repeatable under various image transformations.

ICMR’15, June 23–26, 2015, Shanghai, China.

http://dx.doi.org/10.1145/2671188.2749315.

(a) Harris, 406 (b) SIFT, 998

Figure 1: Output regions detected by local feature detectors and their corresponding numbers. The red contours denote regions of interest detected by the four methods. Intensity-based methods (Harris, SIFT, MSER) often split an integral object into “pieces”, while the proposed detector finds few but representative visual-homogeneous regions. The blue contour shows the exact boundary detected by our method.

MSER [6] is one of the most popular local feature detectors, with good performance [7]. In principle, the method searches for closed regions that achieve locally maximal stability over a range of gray values. In textured areas, however, it extracts many small and redundant patches. Figure 1 (c) shows an example detection result on a handbag, which is covered by many black and yellow blobs. A similar phenomenon occurs with Harris and SIFT, as in Figure 1 (a) and (b). These regions have similar appearances after normalization.
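The stability idea behind MSER can be illustrated with a small sketch. It grows the region around one seed pixel at every gray-level threshold and looks for the threshold at which the region's area changes least. This is a simplified, brute-force illustration rather than the efficient algorithm of [6]: the helper names `component_area` and `most_stable_threshold`, the parameter `delta`, and the use of the global rather than every local minimum are assumptions made here for brevity.

```python
from collections import deque

import numpy as np


def component_area(img, seed, t):
    """Area of the 4-connected region of pixels <= t that contains seed."""
    h, w = img.shape
    if img[seed] > t:
        return 0
    seen = np.zeros((h, w), dtype=bool)
    seen[seed] = True
    queue = deque([seed])
    area = 0
    while queue:
        y, x = queue.popleft()
        area += 1
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not seen[ny, nx] and img[ny, nx] <= t:
                seen[ny, nx] = True
                queue.append((ny, nx))
    return area


def most_stable_threshold(img, seed, delta=2, levels=range(256)):
    """Threshold minimising q(t) = (A(t+delta) - A(t-delta)) / A(t).

    A smaller q means the region barely grows as the threshold varies,
    i.e. it is more stable over that range of gray values.
    """
    areas = {t: component_area(img, seed, t) for t in levels}
    best_t, best_q = None, float("inf")
    for t in levels:
        lo, hi = t - delta, t + delta
        if lo not in areas or hi not in areas or areas[t] == 0:
            continue
        q = (areas[hi] - areas[lo]) / areas[t]
        if q < best_q:
            best_t, best_q = t, q
    return best_t, best_q


# Toy image: a dark 3x3 blob (value 10) on a bright background (value 200).
img = np.full((7, 7), 200, dtype=np.uint8)
img[2:5, 2:5] = 10
t, q = most_stable_threshold(img, (3, 3))
print(t, q, component_area(img, (3, 3), t))
```

On a texture made of many small blobs, each blob yields its own stable region under this criterion, which is the redundancy the paragraph above describes.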

To deal with this problem, our major contribution extends MSER with perceptual consistency awareness. Visual homogeneity is defined on a window of a certain size according to the spatial distribution of color classes. It serves as the basic visual unit in place of the gray value in MSER. As shown in Figure 1 (d), the proposed MVHR finds fewer but representative regions and resolves the problem described above.
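This excerpt does not give the exact homogeneity formula. One measure that fits the description "spatial distribution of color classes within a window" is a class-separation ratio in the spirit of the J criterion used in JSEG-style color-texture segmentation, sketched below. Treat it as an assumption, not as the paper's definition: the function name `window_j_value` is hypothetical, and the color quantization that produces the class labels is assumed to have been done beforehand.

```python
import numpy as np


def window_j_value(labels):
    """Spatial class-separation ratio for one window (assumed measure).

    labels: 2-D integer array, one color-class label per pixel.
    Returns J = (S_T - S_W) / S_W, where S_T is the total scatter of pixel
    positions and S_W is the scatter of positions around each class mean.
    J near 0: classes are spatially intermixed (window looks homogeneous).
    Larger J: classes occupy separate parts of the window.
    """
    labels = np.asarray(labels)
    if labels.ndim != 2:
        raise ValueError("labels must be a 2-D array")
    ys, xs = np.indices(labels.shape)
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    cls = labels.ravel()
    s_total = ((pos - pos.mean(axis=0)) ** 2).sum()
    s_within = 0.0
    for c in np.unique(cls):
        p = pos[cls == c]
        s_within += ((p - p.mean(axis=0)) ** 2).sum()
    if s_within == 0:
        # Every class sits at a single point: fully separated or trivial window.
        return float("inf") if s_total > 0 else 0.0
    return (s_total - s_within) / s_within


# A checkerboard texture: two classes evenly intermixed.
checker = np.indices((4, 4)).sum(axis=0) % 2
# A window straddling an edge: left half class 0, right half class 1.
split = np.zeros((4, 4), dtype=int)
split[:, 2:] = 1
print(window_j_value(checker), window_j_value(split))
```

Under this measure a regularly textured patch scores as homogeneous even though its pixel intensities vary, whereas a window crossing a boundary does not. That matches the motivation of treating a textured object as one region instead of many small blobs.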

Noticing that observation scale has a great influence on visual homogeneity evaluation, therefore our another
