Image Classification and Retrieval are ONE
Lingxi Xie (1), Richang Hong (2), Bo Zhang (3), and Qi Tian (4)
(1,3) LITS, TNLIST, Dept. of Computer Sci&Tech, Tsinghua University, Beijing 100084, China
(2) School of Computer and Information, Hefei University of Technology, Hefei 230009, China
(4) Department of Computer Science, University of Texas at San Antonio, TX 78249, USA
(1) 198808xc@gmail.com, (2) hongrc@hfut.edu.cn, (3) dcszb@mail.tsinghua.edu.cn, (4) qitian@cs.utsa.edu
ABSTRACT
In this paper, we demonstrate that the essentials of image classification and retrieval are the same, since both tasks can be tackled by measuring the similarity between images. To this end, we propose ONE (Online Nearest-neighbor Estimation), a unified algorithm for both image classification and retrieval. ONE is surprisingly simple, involving only manual object definition, regional description, and nearest-neighbor search. We take advantage of PCA and PQ approximation as well as GPU parallelization to scale our algorithm up to large-scale image search. Experimental results verify that ONE achieves state-of-the-art accuracy on a wide range of image classification and retrieval benchmarks.
Categories and Subject Descriptors
I.4.10 [Image Processing and Computer Vision]: Image Representation—Statistical; I.4.7 [Image Processing and Computer Vision]: Feature Measurement—Feature representation
General Terms
Algorithms, Experiments, Performance
Keywords
Image Classification, Image Retrieval, ONE, CNN
1. INTRODUCTION
Past decades have witnessed an impressive bloom of multimedia applications based on image understanding. For example, the number of categories in image classification has grown from a few to tens of thousands [13], and deep Convolutional Neural Networks (CNN) have proven effective for large-scale learning [25]. Meanwhile, image retrieval has evolved from toy programs to commercial search engines indexing billions of images, and new user intentions such as fine-grained concept search [62] have been proposed and realized in this research field.
Figure 1: An image retrieval example illustrating the intuition of ONE (best viewed in color PDF). On a query image, it is possible to find a number of semantic objects (here: natural scene, mountain, terrace). Searching for nearest neighbors with a single object might not capture the exact query intention, but fusing the per-object results yields satisfying results. A yellow circle with the word TP indicates a true-positive image. Images are collected from the Holiday dataset [20].
Both image classification and retrieval receive a query image at a time. Classification aims at determining the class or category of the query, for which a number of training samples are provided and an extra training process is often required. For retrieval, the goal is to rank a large number of candidates according to their relevance to the query, and the candidates are treated as independent units, i.e., without explicit relationships between them. Both image classification and retrieval can be tackled by the Bag-of-Visual-Words (BoVW) model. However, the ways of performing classification [10][26] and retrieval [46][38] are, most often, very different. Although all the above algorithms start from extracting patch or regional descriptors, the subsequent modules, including feature encoding, indexing/training, and online querying, are almost entirely different.
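To make the shared essence of the two tasks concrete, the following minimal sketch (an illustrative simplification, not the pipeline proposed in this paper; it assumes each image is represented by a single L2-normalized feature vector, and the function and variable names are hypothetical) shows how both classification and retrieval reduce to nearest-neighbor estimation over one similarity measure.

import numpy as np

def similarities(query, database):
    # Cosine similarity between one query descriptor and each candidate
    # descriptor (rows of `database`); all vectors are assumed L2-normalized.
    return database @ query

def retrieve(query, database):
    # Retrieval: rank all candidate images by similarity to the query,
    # most similar first.
    return np.argsort(similarities(query, database))[::-1]

def classify(query, train_feats, train_labels, k=5):
    # Classification: majority vote over the labels of the k nearest
    # training images, using the same similarity measure as retrieval.
    nearest = np.argsort(similarities(query, train_feats))[::-1][:k]
    votes = np.bincount(np.asarray(train_labels)[nearest])
    return int(np.argmax(votes))

Under this view, the only difference between the two tasks is what is done with the ranked list: retrieval returns it directly, while classification aggregates the labels of its top entries.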
In this paper, we suggest using only ONE (Online Nearest-neighbor Estimation) algorithm for both image classification and retrieval. This is achieved by computing the similarity between the query and each category or candidate image. Inspired by [4], we detect multiple object proposals on the query and each indexed image, and extract high-quality features on each object to provide a better image description. On