An Efficient Place Recognition Method Based on FAST+BRIEF Features
This document, "Bags of Binary Words for Fast Place Recognition in Image Sequences", published in IEEE Transactions on Robotics, presents a novel method for visual place recognition. The short paper was written by Dorian Gálvez-López and Juan D. Tardós, both IEEE members. Its central idea is to use a bag of binary words built from FAST+BRIEF features to recognize places quickly and accurately, especially within image sequences.

Traditional visual localization for long-term autonomous navigation (e.g., SLAM) is challenging: when a scene is revisited after a long interval, standard matching algorithms often fail to recognize previously observed areas. The authors' solution is a vocabulary tree built over a binary descriptor space. The tree discretizes the features, which accelerates the search for correspondences during geometric verification and significantly speeds up recognition.

Notably, using exactly the same vocabulary and settings, the technique delivers highly competitive performance on very different datasets without producing false positives. The entire pipeline, including feature extraction, takes only 22 ms per frame on a sequence of 26300 images, an order of magnitude faster than previous methods.

Keywords: place recognition, bag-of-words model, SLAM (Simultaneous Localization and Mapping), computer vision. Beyond a practical localization technique, the work offers an important performance-optimization strategy for visual SLAM systems, of clear value to real-time, efficient robot navigation.
IEEE TRANSACTIONS ON ROBOTICS, VOL. , NO. , MONTH, YEAR. SHORT PAPER 1
Bags of Binary Words for Fast Place Recognition in
Image Sequences
Dorian Gálvez-López and Juan D. Tardós, Member, IEEE
Abstract—We propose a novel method for visual place recognition using
bag of words obtained from FAST+BRIEF features. For the first time, we
build a vocabulary tree that discretizes a binary descriptor space, and
use the tree to speed up correspondences for geometrical verification.
We present competitive results with no false positives in very different
datasets, using exactly the same vocabulary and settings. The whole
technique, including feature extraction, requires 22ms per frame in a
sequence with 26300 images, being one order of magnitude faster than
previous approaches.
Index Terms—Place Recognition, Bag of Words, SLAM, Computer
Vision.
I. INTRODUCTION
One of the most significant requirements for long-term visual
SLAM (Simultaneous Localization and Mapping) is robust place
recognition. After an exploratory period, when areas unobserved for a
long time are re-observed, standard matching algorithms fail. When
they are robustly detected, loop closures provide correct data
association to obtain consistent maps. The same methods used for loop
detection can be used for robot relocalization after track loss, due for
example to sudden motions, severe occlusions or motion blur. In [1]
we concluded that, for small environments, map-to-image methods
achieve good performance, but for large environments, image-to-image
(or appearance-based) methods such as FAB-MAP [2] scale better.
The basic technique consists in building a database from the images
collected online by the robot, so that the most similar one can be
retrieved when a new image is acquired. If they are similar enough,
a loop closure is detected.
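Concretely, this retrieval step compares images as sparse word-frequency vectors. A minimal sketch, using a normalized L1 similarity (a common scoring choice for this kind of bag-of-words retrieval; the three-word vocabulary and the toy vectors below are invented for illustration):

```python
import numpy as np

def l1_score(v1, v2):
    """Similarity in [0, 1] between two bag-of-words vectors, based on
    the L1 distance of their normalized forms (illustrative choice)."""
    a = v1 / np.abs(v1).sum()
    b = v2 / np.abs(v2).sum()
    return 1.0 - 0.5 * np.abs(a - b).sum()

# Hypothetical database of word-frequency vectors, one per past image
database = {0: np.array([1.0, 0.0, 2.0]), 1: np.array([0.0, 3.0, 1.0])}
query = np.array([1.0, 0.0, 1.0])

# Retrieve the most similar past image for the new one
best = max(database, key=lambda i: l1_score(query, database[i]))
```

A loop closure candidate would then be accepted only if the best score exceeds some threshold, which is what "similar enough" means above.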
In recent years, many algorithms that exploit this idea have
appeared [2]–[6], basing the image matching on comparing them as
numerical vectors in the bag-of-words space [7]. Bags of words result
in very effective and quick image matchers [8], but they are not a
perfect solution for closing loops, due mainly to perceptual aliasing
[6]. For this reason, a verification step is performed later by checking
the matching images to be geometrically consistent, requiring feature
correspondences. The bottleneck of the loop closure algorithms is
usually the extraction of features, which is around ten times more
expensive in computation cycles than the rest of the steps. This may
cause SLAM algorithms to run in two decoupled threads: one to
perform the main SLAM functionality, and the other just to detect
loop closures, as in [5].
In this paper, we present a novel algorithm to detect loops and
establish point correspondences between images in real time, with
a conventional CPU and a single camera. Our approach is based on
bag of words and geometrical check, with several important novelties
that make it much faster than current approaches. The main speed
improvement comes from the use of a slightly modified version of
the BRIEF descriptor [9] with FAST keypoints [10], as explained in
Section III. The BRIEF descriptor is a binary vector where each bit
is the result of an intensity comparison between a given pair of pixels
around the keypoint. Although BRIEF descriptors are hardly invariant
This research has been partly funded by the European Union under
project RoboEarth FP7-ICT-248942, the Dirección General de Investigación
of Spain under projects DPI2009-13710, DPI2009-07130 and the Ministerio
de Educación (scholarship FPU-AP2008-02272).
The authors are with the Instituto de Investigación en Ingeniería de Aragón
(I3A), Universidad de Zaragoza, María de Luna 1, 50018 Zaragoza, Spain.
{dorian, tardos}@unizar.es.
to scale and rotation, our experiments show that they are very robust
for loop closing with planar camera motions, the usual case in mobile
robotics, offering a good compromise between distinctiveness and
computation time.
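The bit-comparison idea behind the descriptor can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the random test pairs, the 31x31 window, and the absence of pre-smoothing are all simplifying assumptions.

```python
import numpy as np

def brief_descriptor(image, keypoint, pairs):
    """One bit per test: compare the intensity at two pixel offsets
    around the keypoint (toy version; no smoothing, random pairs)."""
    y, x = keypoint
    bits = [1 if image[y + dy1, x + dx1] < image[y + dy2, x + dx2] else 0
            for (dy1, dx1), (dy2, dx2) in pairs]
    return np.array(bits, dtype=np.uint8)

def hamming(d1, d2):
    """Binary descriptors are compared with the Hamming distance:
    the number of bits that differ."""
    return int(np.count_nonzero(d1 != d2))

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))   # hypothetical grayscale patch
# 256 random intensity tests inside a 31x31 window around the keypoint
pairs = [tuple(map(tuple, rng.integers(-15, 16, size=(2, 2))))
         for _ in range(256)]
descriptor = brief_descriptor(image, (32, 32), pairs)
```

The Hamming distance is what makes binary descriptors so cheap to match: on real hardware it reduces to an XOR and a population count, with no floating-point arithmetic.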
We introduce a bag of words that discretizes a binary space, and
augment it with a direct index, in addition to the usual inverse index,
as explained in Section IV. To the best of our knowledge, this is the
first time a binary vocabulary is used for loop detection. The inverse
index is used for fast retrieval of images potentially similar to a given
one. We show a novel use of the direct index to efficiently obtain
point correspondences between images, speeding up the geometrical
check during the loop verification.
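The role of the two indices can be sketched as follows. This is a hypothetical flat-vocabulary toy (the paper builds a hierarchical tree, and the exact structures here are assumptions for illustration): the inverse index answers "which images contain word w?", while the direct index restricts feature matching to features quantized to the same word, instead of all-against-all comparison.

```python
from collections import defaultdict

class ImageDatabase:
    def __init__(self):
        self.inverse_index = defaultdict(set)  # word id -> ids of images containing it
        self.direct_index = {}                 # image id -> {word id: feature indices}

    def add(self, image_id, words):
        """words: one word id per feature, in feature order."""
        per_word = defaultdict(list)
        for feat_idx, w in enumerate(words):
            per_word[w].append(feat_idx)
            self.inverse_index[w].add(image_id)
        self.direct_index[image_id] = dict(per_word)

    def candidates(self, words):
        """Inverse index: images sharing at least one word with the query."""
        out = set()
        for w in set(words):
            out |= self.inverse_index[w]
        return out

    def correspondences(self, id_a, id_b):
        """Direct index: only pair up features that fall in the same word."""
        a, b = self.direct_index[id_a], self.direct_index[id_b]
        pairs = []
        for w in a.keys() & b.keys():
            pairs += [(i, j) for i in a[w] for j in b[w]]
        return pairs

db = ImageDatabase()
db.add(0, [3, 5, 5, 9])
db.add(1, [5, 9, 2])
```

The candidate pairs returned by `correspondences` are what the geometrical check would then verify, which is where the speed-up over exhaustive descriptor matching comes from.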
The complete loop detection algorithm is detailed in Section V.
Similarly to our previous work [5,6], to decide that a loop has been
closed, we verify the temporal consistency of the image matches
obtained. One of the novelties in this paper is a technique to prevent
images collected in the same place from competing with one another when
the database is queried. We achieve this by grouping together, during
the matching, those images that depict the same place.
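One way to sketch that grouping idea: cluster database matches whose timestamps are close into groups, score each group as a whole, and keep only the best one, so a single well-matched place produces a single candidate. The interval threshold and the sum-of-scores rule below are illustrative assumptions, not the paper's exact criteria.

```python
def group_matches(matches, max_gap):
    """Group matches (timestamp, score) whose timestamps are within
    max_gap of their neighbor, so images of one place do not compete."""
    islands = []
    for t, score in sorted(matches):
        if islands and t - islands[-1][-1][0] <= max_gap:
            islands[-1].append((t, score))
        else:
            islands.append([(t, score)])
    return islands

def best_island(islands):
    """Score each group by the sum of its members' scores and keep
    the best group (an assumed aggregation rule)."""
    return max(islands, key=lambda isl: sum(s for _, s in isl))
```

With this scheme, several slightly different views of the same corridor reinforce one candidate instead of splitting the vote among themselves.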
Section VI contains the experimental evaluation of our work, including
a detailed analysis of the relative merits of the different parts
of our algorithm. We present comparisons between the effectiveness
of BRIEF and two versions of SURF features [11], the descriptor
most used for loop closing. We also analyze the performance of the
temporal and geometrical consistency tests for loop verification. We
finally present the results achieved by our technique after evaluating
it in five public datasets with 0.7–4 Km long trajectories. We
demonstrate that we can run the whole loop detection procedure,
including the feature extraction, in 52 ms in a sequence with 26300 images (22 ms on
average), outperforming previous techniques by more than one order
of magnitude.
A preliminary version of this work was presented in [12]. In the
current paper we enhance the direct index technique and extend the
experimental evaluation of our approach. We also report results in
new datasets and make a comparison with the state-of-the-art FAB-
MAP 2.0 algorithm [13].
II. RELATED WORK
Place recognition based on appearance has obtained great attention
in the robotics community because of the excellent results achieved
[4,5,13,14]. An example of this is the FAB-MAP system [13], which
detects loops with an omnidirectional camera, obtaining a recall of
48.4% and 3.1%, with no false positives, in trajectories 70 Km
and 1000 Km in length. FAB-MAP represents images with a bag
of words, and uses a Chow Liu tree to learn offline the words’
co-visibility probability. FAB-MAP has become the gold standard
regarding loop detection, but its robustness decreases when the
images depict very similar structures for a long time, which can
be the case when using frontal cameras [5]. In the work of Angeli
et al. [4], two visual vocabularies (for appearance and color) are
created online in an incremental fashion. The two bag-of-words
representations are used together as input of a Bayesian filter that
estimates the matching probability between two images, taking into
account the matching probability of previous cases. In contrast to
these probabilistic approaches, we rely on a temporal consistency
check to consider previous matches and enhance the reliability of
the detections. This technique has proven successful in our previous
works [5,6]. Our work also differs from the ones above in that we
use a bag of binary words for the first time, as well as propose a
technique to prevent images collected close in time and depicting the
same place from competing with each other during the matching, so
that we can work at a higher frequency.