Random Subspace for Binary Codes Learning in Large
Scale Image Retrieval
Cong Leng, Jian Cheng, Hanqing Lu
National Laboratory of Pattern Recognition
Institute of Automation, Chinese Academy of Sciences
Beijing, China
{cong.leng, jcheng, luhq}@nlpr.ia.ac.cn
ABSTRACT
Due to the fast query speed and low storage cost, hashing-based
approximate nearest neighbor search methods have
attracted much attention recently. Many state-of-the-art
methods are based on eigenvalue decomposition. In these
approaches, the information captured by different dimensions
is unbalanced, and most of the information is generally
concentrated in the top eigenvectors. We demonstrate that this
leads to an unexpected phenomenon: longer hashing
codes do not necessarily yield better performance. In this
work, we introduce a random subspace strategy to address
this limitation. Each time, a small fraction of the whole feature
space is randomly sampled to train the hashing algorithm,
and only the top eigenvectors are kept to generate
one piece of short code. This process is repeated several
times, and the resulting pieces of short code
are concatenated into one long code. Theoretical
analysis and experiments on two benchmarks confirm the
effectiveness of the proposed strategy for hashing.
Categories and Subject Descriptors
H.3.3 [Information Systems]: Information Search and Retrieval
General Terms
Algorithms, Experimentation, Measurement
Keywords
Image Retrieval, Random Subspace, Binary Codes, Hamming Ranking
1. INTRODUCTION
With the rapid development of the Internet, large-scale visual
databases with high dimensionality are everywhere on the
Web. These huge databases pose significant challenges to visual
search, since exhaustive linear search is extremely time-
consuming. To address this issue, many hashing-based methods
for approximate nearest neighbor (ANN) search have
been proposed recently [6, 1, 7, 2, 3]. In these approaches,
hash functions are learned to map nearby points in the
original space to similar binary codes. Searching with binary
codes is very fast because the Hamming distance
between them can be computed efficiently on modern
CPUs.
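For intuition, the following minimal sketch (not code from this paper) shows why binary codes admit fast search: the Hamming distance reduces to an XOR followed by a population count over packed code words. The packing into uint64 words and the lookup-table popcount are illustrative assumptions.

import numpy as np

def hamming_distances(query_code, db_codes):
    # Hamming distance between one query code and a database of codes,
    # where codes are packed into uint64 words: XOR, then count set bits.
    xor = np.bitwise_xor(db_codes, query_code)           # (n, words)
    table = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)
    return table[xor.view(np.uint8)].sum(axis=1)         # popcount per row

# Toy usage: 10,000 database items with 64-bit codes.
rng = np.random.default_rng(0)
db = rng.integers(0, 2**63, size=(10000, 1), dtype=np.uint64)
q = db[42]                                               # query identical to item 42
print(hamming_distances(q, db)[42])                      # 0: identical codes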
As one main branch of existing hashing methods, eigenvalue
decomposition based approaches [8, 7, 2, 9] have
attracted much attention. Spectral Hashing (SH) [8] treats
hashing as a spectral embedding problem and
computes the bits by thresholding a subset of eigenvectors
of the graph Laplacian. Anchor Graph Hashing (AGH) [7]
follows the same idea as SH but utilizes an anchor graph to obtain
tractable low-rank adjacency matrices. PCAH [2] simply
generates linear hash functions from PCA projections,
i.e., the eigenvectors of the data covariance matrix.
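As a concrete illustration of the PCAH scheme just described, here is a minimal sketch (not the reference implementation of [2]): center the data, take the top eigenvectors of the covariance matrix as linear hash functions, and threshold the projections at zero.

import numpy as np

def train_pcah(X, num_bits):
    # X: (n, d) training matrix. Returns the data mean and a (d, num_bits)
    # projection matrix W holding the top eigenvectors of the covariance matrix.
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)                # ascending eigenvalues
    W = eigvecs[:, np.argsort(eigvals)[::-1][:num_bits]]  # top-k directions
    return mean, W

def encode_pcah(X, mean, W):
    # Binarize by thresholding the PCA projections at zero.
    return ((X - mean) @ W > 0).astype(np.uint8)          # (n, num_bits)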
For these eigenvalue decomposition based methods, the
variances of different projected dimensions typically differ,
and thus the information captured by different dimensions
is unbalanced. In general, the top eigenvectors carry
most of the information (variance), while the remaining ones are
usually less informative or even noisy. This results in
an unexpected phenomenon: longer hashing codes do
not necessarily yield better performance. As highlighted in
Figure 1, when the code length exceeds 8, increasing the number
of bits leads to poorer mean average precision (MAP)
performance on MNIST with both PCAH and AGH. Some
recent work such as Iterative Quantization (ITQ) [2] has
been proposed to overcome this problem, but there is still
no theoretical guarantee that longer codes will give better
results than shorter ones. Furthermore, the dimensionality
of visual data is generally very high, and this is
the main difficulty widely encountered in many eigenvalue
decomposition based methods, e.g., the time complexity of
PCA is $O(nd^2 + d^3)$, where $n$ is the size of the training set and
$d$ is the dimensionality of the data.
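To make this imbalance concrete, the short sketch below (illustrative only; the synthetic data and its decay factor are assumptions, standing in for real descriptors such as GIST) computes the fraction of variance each of the top PCA directions would contribute to a code: the first few directions dominate, so the bits generated from later eigenvectors add little signal.

import numpy as np

def variance_profile(X, num_bits):
    # Fraction of the top-num_bits variance carried by each PCA direction.
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]              # descending eigenvalues
    top = eigvals[:num_bits]
    return top / top.sum()

# Synthetic features with geometrically decaying per-dimension scale.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 64)) * (0.8 ** np.arange(64))
print(variance_profile(X, 16).round(3))                  # first few bits dominate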
In this work, we attempt to leverage the random subspace
strategy to address the problems mentioned above for
binary codes learning. We aim to concatenate many pieces
of short code generated with an eigenvalue decomposition
based method into one piece of long code, with the expectation
that the longer code will be “stronger”. However, it
is clear that if the many pieces of short code are identical,
the resulting long code will not capture more information
and will yield the same retrieval result as the short code.
Inspired by random decision forests [4], we adopt the ran-