Efficient Retrieval of Deformable Shape Classes
using Local Self-Similarities
Ken Chatfield
Dept. of Engineering Science
University of Oxford, UK
ken.chatfield@oriel.oxon.org
James Philbin
Dept. of Engineering Science
University of Oxford, UK
james@robots.ox.ac.uk
Andrew Zisserman
Dept. of Engineering Science
University of Oxford, UK
az@robots.ox.ac.uk
Abstract
We present an efficient object retrieval system based on
the identification of abstract deformable ‘shape’ classes
using the self-similarity descriptor of Shechtman and
Irani [13]. Given a user-specified query object, we retrieve
other images which share a common ‘shape’ even if their
appearance differs greatly in terms of colour, texture, edges
and other common photometric properties.
In order to use the self-similarity descriptor for efficient retrieval we make three contributions: (i) we sparsify the descriptor points by locating discriminative regions within each image, thus reducing the computational expense of shape matching; (ii) we extend [13] to enable matching despite changes in scale; and (iii) we show that vector quantizing the descriptor does not inhibit performance, thus providing the basis of a large-scale shape-based retrieval system using a bag-of-visual-words approach. Performance is demonstrated on the challenging ETHZ deformable shape dataset and a full episode from the television series Lost, and is shown to be superior to appearance-based approaches for matching non-rigid shape classes.
1. Introduction
We are interested in the rapid and accurate retrieval of objects based on their shape from large unordered collections of images and videos. Our aim is to accurately retrieve these objects despite deformations caused by intra-class variations or non-rigid materials. An example of the kinds of images we would like to handle is shown in figure 1. These images share almost none of the usual photometric properties such as colour, texture or edges, and yet clearly share similarities in shape, as defined by a common configuration of repeating pattern elements. The ability to match these generic shapes can be considered an important sub-task for object class recognition. Due to the lack of any shared appearance, descriptors such as SIFT [8], which use intensity gradients, are not appropriate, and indeed (as will be shown) often perform poorly in such cases.

Figure 1: Challenges of object class identification. Although all four images are of a heart, there is no obvious image property (e.g. texture, edges or colour) shared between them.
The problem of matching such shapes is addressed by the descriptor of Shechtman and Irani [13], which uses local self-similarity patterns extracted from the image as a descriptor (reviewed in section 3). However, their work concentrates on matching templates at similar scales and does not address the problems of false positive matches or retrieval in large datasets. The questions we investigate here are: (i) is the descriptor invariant to changes in scale, and (ii) can it be applied to efficient large-scale image retrieval [6, 9, 11, 15] by vector quantizing into visual words?
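To make the idea of a local self-similarity descriptor concrete, the following is a minimal sketch of the kind of computation involved: a small patch around a point is correlated (via sum-of-squared-differences) with every patch in a surrounding region, and the resulting correlation surface is pooled into a log-polar grid. All parameter values, the SSD normalisation, and the binning scheme here are illustrative simplifications, not the exact formulation of [13].

```python
import numpy as np

def self_similarity_descriptor(img, y, x, patch=5, region=41,
                               n_angles=20, n_radii=4):
    # Correlate the small patch centred at (y, x) with every patch in a
    # surrounding region, then bin the correlation surface log-polarly.
    pr, rr = patch // 2, region // 2
    centre = img[y - pr:y + pr + 1, x - pr:x + pr + 1].astype(float)

    # Sum-of-squared-differences surface over the surrounding region.
    ssd = np.empty((region, region))
    for dy in range(-rr, rr + 1):
        for dx in range(-rr, rr + 1):
            cand = img[y + dy - pr:y + dy + pr + 1,
                       x + dx - pr:x + dx + pr + 1].astype(float)
            ssd[dy + rr, dx + rr] = np.sum((centre - cand) ** 2)

    # Map SSD to a correlation surface in (0, 1]; a full implementation
    # would instead normalise by local contrast and an estimated noise level.
    corr = np.exp(-ssd / max(ssd.max(), 1e-8))

    # Log-polar binning: keep the maximal correlation in each bin, which
    # gives tolerance to small local deformations.
    ys, xs = np.mgrid[-rr:rr + 1, -rr:rr + 1]
    radius = np.hypot(ys, xs)
    angle = np.arctan2(ys, xs)
    r_bin = np.minimum((np.log1p(radius) / np.log1p(rr) * n_radii).astype(int),
                       n_radii - 1)
    a_bin = ((angle + np.pi) / (2 * np.pi) * n_angles).astype(int) % n_angles
    desc = np.zeros(n_angles * n_radii)
    valid = (radius > 0) & (radius <= rr)
    for b, v in zip((a_bin * n_radii + r_bin)[valid], corr[valid]):
        desc[b] = max(desc[b], v)
    return desc / (desc.max() + 1e-8)
```

Because the descriptor records where the image is similar to itself rather than what the image looks like, two objects with very different colour and texture can still produce similar descriptors.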
Here, we will show that both questions can be answered positively: in section 4 it is shown that self-similarity descriptors are largely unaffected by changes in scale and by a certain degree of deformation, and are sufficient to support multi-scale shape matching. Further, we show that intra-image matching can be used to obtain discriminative descriptors and reduce their density, and in section 5 it is shown that matching performance is not adversely affected by quantizing the descriptors into visual words.
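As background, quantizing descriptors into visual words reduces each image to a histogram of word counts, which is what makes large-scale retrieval tractable. The sketch below, with a toy hand-made vocabulary (a real system would learn it by clustering, e.g. with k-means), illustrates the mechanics; the names and data are hypothetical.

```python
import numpy as np

def quantize(descriptors, vocabulary):
    # Assign each descriptor to its nearest visual word (Euclidean).
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

def bow_histogram(descriptors, vocabulary):
    # L1-normalised bag-of-visual-words histogram for one image.
    words = quantize(descriptors, vocabulary)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy example: a 3-word vocabulary and descriptors from two "images".
rng = np.random.default_rng(0)
vocab = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
img_a = rng.normal([0, 0], 0.05, size=(10, 2))  # clusters near word 0
img_b = rng.normal([1, 0], 0.05, size=(10, 2))  # clusters near word 1
h_a = bow_histogram(img_a, vocab)
h_b = bow_histogram(img_b, vocab)
```

Once images are histograms over a shared vocabulary, retrieval reduces to comparing sparse vectors, for which standard inverted-index machinery applies.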
Global shape deformations are modelled using the Implicit Shape Model of Leibe et al. [7]. This model has been shown to be sufficiently invariant to the kinds of deformations that occur within disparate object classes such as cars, cows, horses and pedestrians [10, 14]. The vector quantization that we introduce lays the foundation for efficient retrieval under deformation and rendering changes, such as matching from an image to a line drawing.
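The core mechanism of an Implicit Shape Model is generalised Hough voting: each matched visual word casts votes for the object centre using the offsets it was observed with during training, and agreeing votes accumulate into a peak. The sketch below shows only this voting step on hand-made data; the full model of [7] uses probabilistic vote weights and continuous mode estimation, and all names here are illustrative.

```python
from collections import defaultdict

def ism_vote(detections, vote_table, bin_size=10):
    # Each detection is (word_id, (x, y)); the vote table maps a word to
    # the (offset, weight) pairs it was seen with during training. Votes
    # for the object centre are pooled in a coarse spatial grid and the
    # strongest cell wins.
    acc = defaultdict(float)
    for word, (x, y) in detections:
        for (dx, dy), weight in vote_table.get(word, []):
            cell = ((x + dx) // bin_size, (y + dy) // bin_size)
            acc[cell] += weight
    best = max(acc, key=acc.get)
    centre = (best[0] * bin_size + bin_size / 2,
              best[1] * bin_size + bin_size / 2)
    return centre, acc[best]

# Toy example: three words whose training offsets all point at (100, 100).
table = {0: [((20, 0), 1.0)], 1: [((-20, 0), 1.0)], 2: [((0, 20), 1.0)]}
dets = [(0, (80, 100)), (1, (120, 100)), (2, (100, 80))]
centre, score = ism_vote(dets, table)  # the three votes agree on one cell
```

Because each word votes independently, the scheme degrades gracefully when parts of the object deform or are occluded: missing votes lower the peak but do not move it.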