An Improved Spatial Pyramid Matching Method for Scene and Object Classification

Research paper


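"This research paper proposes a novel spatial pyramid matching method for improving the accuracy of scene and object classification, particularly on large datasets. The authors are a research team from the School of Automation Science and Electrical Engineering at Beijing University of Aeronautics and Astronautics. They address problems in the existing Spatial Pyramid Matching (SPM) method by replacing k-means clustering with an approximate nearest neighbor method, adjusting the codebook size according to the number and pixels of the images, computing sub-codebooks and using a threshold to eliminate codes that lie close to already registered codes, and finally rescaling the histogram features and classifying with a hierarchical strategy. The experimental results show that the method outperforms existing state-of-the-art methods."

The paper examines how to improve recognition accuracy in scene and object classification tasks. Traditional Spatial Pyramid Matching (SPM) has had some success on large-scale classification problems, but it has limitations. The authors respond with three improvements:

1. **Approximate nearest neighbor method**: Traditional SPM usually clusters features with k-means, which can be affected by the choice of initial centers. The paper instead uses an Approximate Nearest Neighbor (ANN) method, which handles high-dimensional data more effectively and reduces sensitivity to initial conditions while remaining efficient.
2. **Adaptive codebook size**: The codebook is the set of representative feature vectors (codes) used by SPM, and its size directly affects classification results. The authors compute sub-codebooks according to the number and pixel content of the images, build a separate sub-codebook for each category, and delete codes that lie too close to already registered codes, which reduces redundancy and improves classification accuracy.
3. **Histogram rescaling and hierarchical classification**: Traditional SPM usually uses histogram features at a fixed scale, whereas the new method rescales the feature histograms, which helps capture detail at different scales. The paper also adopts a hierarchical classification strategy that classifies scenes by stepwise refinement, improving precision and robustness.

The experimental results indicate that these improvements markedly raise scene and object classification performance, which supports the effectiveness of the new method. For practical applications in computer vision, image processing, and artificial intelligence, this refined spatial pyramid matching method has the potential to become a stronger tool, especially for large and complex image datasets.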


Novel Spatial Pyramid Matching for Scene and Object Classification

Kai Ding, Weihai Chen, Xingming Wu, Zhong Liu

School of Automation Science and Electrical Engineering,

Beijing University of Aeronautics and Astronautics

Beijing, P.R. China

838383_dingkai@163.com, whchenbuaa@126.com, wuxingming307@126.com, liuzhong@buaa.edu.cn

Abstract—It is difficult to classify object or scene images with high accuracy when the dataset is relatively large. Spatial Pyramid Matching (SPM) was proposed to deal with this problem, but it has some shortcomings. To improve SPM, we propose three modifications: first, use an approximate nearest neighbor method instead of k-means for clustering; second, regulate the size of the codebook according to the quantity and pixels of the images, by calculating a sub-codebook for every category and eliminating the codes that are nearer to the registered ones than a threshold; third, rescale the histogram features and classify the scene with a hierarchical strategy. Experiments show that our approach performs better than other state-of-the-art classification methods using just one matching kernel.

Keywords—Object Classification, Spatial Pyramid Matching, ANN clustering.
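The second modification in the abstract (one sub-codebook per category, with codes dropped when they lie nearer to an already registered code than a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `merge_sub_codebooks`, the use of Euclidean distance, the category ordering, and the toy 2-D vectors are all assumptions made for illustration, and the excerpt does not state how the threshold is chosen.

```python
import math


def merge_sub_codebooks(sub_codebooks, threshold):
    """Merge per-category sub-codebooks into one global codebook.

    A code is registered only if its Euclidean distance to every
    already registered code is at least `threshold` (assumed metric).
    """
    registered = []
    for category in sorted(sub_codebooks):  # fixed order for repeatability
        for code in sub_codebooks[category]:
            if all(math.dist(code, kept) >= threshold for kept in registered):
                registered.append(tuple(code))
    return registered


# Toy 2-D "descriptors"; real SIFT descriptors have 128 dimensions.
sub_codebooks = {
    "coast": [(0.0, 0.0), (1.0, 0.0)],
    "forest": [(0.05, 0.0), (0.0, 1.0)],
}
codebook = merge_sub_codebooks(sub_codebooks, threshold=0.1)
print(codebook)
```

In this toy run the "forest" code `(0.05, 0.0)` is discarded because it lies within the threshold of the registered "coast" code `(0.0, 0.0)`, while the distinct "forest" code `(0.0, 1.0)` is kept. Note that the result of such a greedy pass depends on the order in which categories are visited.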

I. INTRODUCTION

Scene and object classification is a high-level semantic analysis task in computer vision, and it remains a great challenge in the field, especially scene classification. Suppose we take a picture of a square: it may contain some people, some buildings, and some plants, but we simply call it a square. That is a ‘scene’. If we deal with a dataset containing several kinds of scenes and try to identify by machine learning which category a picture belongs to, that is ‘scene category classification’. With this method we can obtain an approximate prediction of the category of an image while ignoring many of its details. This is analogous to a person staring at a distant natural scene and trying to tell what he is looking at, which is where the inspiration for Gist, proposed by Torralba and Oliva [1], came from. Object and scene classification algorithms now play an increasingly important role in artificial intelligence systems such as autonomous mobile systems, cargo sorting, transportation monitoring, and household robotics; other applications can be found in augmented simulation and data compression techniques.

A wide range of algorithms have been proposed to tackle this problem. Space subdivision and histogram methods are representative approaches of the early phase. The features they adopted to identify objects were color, edges, and patches, which are sensitive to illumination, scaling, and affine distortion, so classification accuracy remained at a low level. Local descriptors with illumination or scale invariance were then proposed, such as Harris and SIFT points [9]. These features led to flourishing research on multi-image processing, and some notable advances emerged, for example the work of L. Fei-Fei and K. Grauman. L. Fei-Fei developed a bag-of-words (BoW) method for scene classification and object recognition; here ‘feature’ means dense SIFT, which performed better than the SIFT feature in her comparative evaluation [2]. Her work had a strong influence on subsequent studies, such as KNN-SVM by H. Zhang, Spatial Pyramid Matching (SPM) by S. Lazebnik, and Spatial Pyramid Kernels by A. Bosch [4][5][6]. K. Grauman proposed the Pyramid Matching method, which is also based on the SIFT feature and produces a histogram for each image using weighted intersection at multiple resolutions [3]. But neither of them took full advantage of position information; what they focus on is the probability that each matching feature appears in the unlabeled images. How important is position information in image classification? S. Lazebnik proposed the spatial pyramid matching algorithm, in which simple position information is preserved by creating ordered regular-grid feature vectors; experiments showed a remarkable increase in classification accuracy compared with methods without position information [5]. However, the role of position information is subtle: if too much is preserved, the desirable property of overlooking details disappears and accuracy may fall, so position information can only be an auxiliary factor here. Recently, multi-kernel matching algorithms that combine SIFT with Pyramid Histogram of Oriented Gradients and similar features have become popular, and their accuracy is higher [7]. In this paper, however, we discuss only image classification algorithms with a single kernel.
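To make the role of position information concrete, the following minimal sketch builds a spatial pyramid histogram over a regular grid and compares two images with histogram intersection. It follows the level weights of the original SPM formulation by Lazebnik et al. (1/2^L for level 0 and 1/2^(L-l+1) for level l), not necessarily the variant proposed in this paper. The function names, the normalized coordinates, and the two toy images are assumptions made for illustration.

```python
def spatial_pyramid_histogram(points, vocab_size, levels):
    """Build a weighted spatial pyramid histogram.

    points: list of (x, y, word), with x and y in [0, 1) and
            word an index in range(vocab_size).
    levels: L, the finest level; level l uses a 2**l x 2**l grid.
    """
    feature = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Level weights from the original SPM formulation.
        if level == 0:
            weight = 1 / 2 ** levels
        else:
            weight = 1 / 2 ** (levels - level + 1)
        hist = [0.0] * (cells * cells * vocab_size)
        for x, y, word in points:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hist[(row * cells + col) * vocab_size + word] += weight
        feature.extend(hist)
    return feature


def histogram_intersection(a, b):
    """Sum of element-wise minima of two equal-length histograms."""
    return sum(min(u, v) for u, v in zip(a, b))


# Two toy images with the same visual words in swapped positions.
image_a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
image_b = [(0.9, 0.9, 0), (0.1, 0.1, 1)]

pyr_a = spatial_pyramid_histogram(image_a, vocab_size=2, levels=1)
pyr_b = spatial_pyramid_histogram(image_b, vocab_size=2, levels=1)

print(histogram_intersection(pyr_a, pyr_a))  # image matched with itself
print(histogram_intersection(pyr_a, pyr_b))  # same words, different layout
```

A plain bag-of-words histogram would treat the two toy images as identical. With the grid level added, the swapped layout lowers the match score, because only the level-0 (whole image) bins still overlap.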

Because of its impressive performance, the spatial pyramid approach has been introduced into other image processing methods. But when we studied this algorithm, we found some shortcomings in its support for large datasets and in its image representation. First, it uses k-means clustering in the spatial pyramid, which can cluster an unevenly distributed dataset incorrectly. Second, it generates the codebook by processing all the image features together, so some minority centers may be absorbed, which can hinder recognition of the related scenes. Third, the resulting histograms should be preprocessed before training, because this can improve classification accuracy, but the authors did not mention it.
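As an illustration of the third point, histograms can be rescaled before they are passed to a classifier. The excerpt does not say which rescaling the authors use, so the sketch below assumes simple per-dimension min-max scaling to [0, 1], fitted on the training histograms only; the function names `fit_min_max` and `rescale` and the toy data are hypothetical.

```python
def fit_min_max(histograms):
    """Collect the per-dimension minimum and maximum over a training set."""
    dims = len(histograms[0])
    lows = [min(h[d] for h in histograms) for d in range(dims)]
    highs = [max(h[d] for h in histograms) for d in range(dims)]
    return lows, highs


def rescale(hist, lows, highs):
    """Map each bin to [0, 1] using the training-set range (assumed scheme)."""
    scaled = []
    for value, low, high in zip(hist, lows, highs):
        span = high - low
        # A constant dimension carries no information; map it to 0.
        scaled.append((value - low) / span if span > 0 else 0.0)
    return scaled


# Toy histograms whose bins live on very different scales.
train = [[2.0, 10.0, 0.0], [4.0, 30.0, 0.0], [6.0, 20.0, 0.0]]
lows, highs = fit_min_max(train)
train_scaled = [rescale(h, lows, highs) for h in train]
print(train_scaled)
```

The same `lows` and `highs` would be reused to rescale test histograms, so that training and test features share one scale. Test values outside the training range then fall outside [0, 1] unless they are clipped.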

In this paper we propose a novel spatial pyramid approach, using an approximate nearest neighbor method to cluster data
