Object Recognition and Full Pose Registration from a Single Image for
Robotic Manipulation
Alvaro Collet Dmitry Berenson Siddhartha S. Srinivasa Dave Ferguson
Abstract— Robust perception is a vital capability for robotic manipulation in unstructured scenes. In this context, full pose estimation of relevant objects in a scene is a critical step towards the introduction of robots into household environments. In this paper, we present an approach for building metric 3D models of objects using local descriptors from several images. Each model is optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object. Given a new test image, we match the local descriptors to our stored models online, using a novel combination of the RANSAC and Mean Shift algorithms to register multiple instances of each object. A robust initialization step allows for arbitrary rotation, translation and scaling of objects in the test images. The resulting system provides markerless 6-DOF pose estimation for complex objects in cluttered scenes. We provide experimental results demonstrating orientation and translation accuracy, as well as a physical implementation in which the pose output is used by an autonomous robot to grasp objects in highly cluttered scenes.
I. INTRODUCTION
Autonomous robots operating in human environments
present some extremely challenging research topics in path
planning and dynamic perception, among others. Whether it
is in the workplace or in a household, a common characteristic is the lack of static surroundings: people walk around,
tables and chairs are moved, objects are left in different
places. In order to successfully navigate in, and interact
with, such an environment, accurate and robust dynamic
perception is a must. In particular, an object recognition
system that provides accurate 6-DOF pose is very important
for performing complex manipulation tasks.
The object recognition and registration system we propose handles arbitrarily complex non-planar objects, is fully
automatic and based on natural (marker-free) features of
a single image. It is robust to outliers, partial occlusions,
changes in illumination, scale and rotation. It is able to detect
multiple objects and multiple instances of the same object
in a single image, and provide accurate pose estimation
for every instance. Using a calibrated camera, it is able to
localize each object in the robot’s coordinate frame to enable
on-line manipulation, as shown in Fig. 1.
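As a minimal illustration of this last step (not code from the paper), the sketch below composes a hypothetical object pose estimated in the camera frame with a known camera-to-robot extrinsic calibration to express the object in the robot's coordinate frame. All names and numeric values here are assumptions for illustration only.

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical extrinsic calibration: pose of the camera in the robot frame.
T_robot_camera = to_homogeneous(np.eye(3), np.array([0.1, 0.0, 1.2]))

# Hypothetical output of the recognition system: object pose in the camera frame.
T_camera_object = to_homogeneous(np.eye(3), np.array([0.0, 0.0, 0.5]))

# Composing the two expresses the object in the robot's coordinate frame,
# which is the quantity a manipulation planner consumes.
T_robot_object = T_robot_camera @ T_camera_object
print(T_robot_object[:3, 3])  # object position in the robot frame
```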
A. Collet and D. Berenson are with The Robotics Institute, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, USA. {acollet, dberenson}@cs.cmu.edu

S. Srinivasa and D. Ferguson are with Intel Research Pittsburgh, 4720 Forbes Ave., Suite 410, Pittsburgh, PA 15213, USA. {siddhartha.srinivasa, dave.ferguson}@intel.com

Fig. 1. Object grasping in a cluttered scene through pose estimation performed with a single image. (top left) Scene observed by the robot's camera, used for object recognition and pose estimation; coordinate frames show the pose of each object. (top right) Virtual environment reconstructed after running the pose estimation algorithm; each object is represented with simple geometry. (bottom) Our robot platform in the process of grasping an object, using only the pose information from this algorithm.

Our system takes the core algorithm of Gordon and Lowe [1] and extends it with a model alignment step that enables accurate localization (Section III-B), an automatic initialization step for pose registration, and the combination of RANSAC [2] with Mean Shift [3] clustering to greatly improve the efficiency of recognizing multiple instances of the same object. All of these contributions make the algorithm suitable for robotic manipulation of objects in cluttered scenes, using only a single input image.
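The details of this combination appear later in the paper; the sketch below conveys only the general idea, under our own assumptions: 2D match locations are grouped with Mean Shift (here scikit-learn's implementation), and a RANSAC pose solver (OpenCV's solvePnPRansac, standing in for the paper's pose estimator) is run independently inside each cluster, so each instance of the object yields its own 6-DOF pose. The bandwidth and minimum-match thresholds are illustrative, not the authors' values.

```python
import numpy as np
import cv2
from sklearn.cluster import MeanShift

def register_instances(pts_3d, pts_2d, K, bandwidth=60.0):
    """Sketch: Mean Shift groups matches by image location, then RANSAC
    estimates one 6-DOF pose per cluster (one per object instance).

    pts_3d: Nx3 model points matched against the test image.
    pts_2d: Nx2 corresponding image locations of those matches.
    K:      3x3 camera intrinsics matrix.
    """
    labels = MeanShift(bandwidth=bandwidth).fit_predict(pts_2d)
    poses = []
    for label in np.unique(labels):
        idx = np.where(labels == label)[0]
        if len(idx) < 6:  # too few matches to attempt a pose
            continue
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(
            pts_3d[idx].astype(np.float64),
            pts_2d[idx].astype(np.float64),
            K, None)
        if ok and inliers is not None and len(inliers) >= 6:
            poses.append((rvec, tvec))  # one pose hypothesis per instance
    return poses
```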
To accomplish these goals, the system we propose uses SIFT [4] to extract local descriptors from natural features. As in [1], the system is separated into an off-line object modelling stage and an on-line recognition and registration stage. In the modelling stage, a sequence of images of an object is taken from different viewpoints using a camera with no pose information. The object is then segmented in each training image, either manually or automatically. Next, SIFT features are extracted for each image and matched across the entire sequence.
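As a rough sketch of this extraction-and-matching step (our own stand-in, not the authors' implementation), the snippet below extracts SIFT features with OpenCV and matches consecutive training images using Lowe's ratio test. Matching only consecutive pairs and the 0.8 ratio threshold are simplifying assumptions.

```python
import cv2

def match_training_sequence(images, ratio=0.8):
    """Sketch: SIFT features per training image, matched between
    consecutive pairs with Lowe's ratio test."""
    sift = cv2.SIFT_create()
    feats = [sift.detectAndCompute(img, None) for img in images]  # (keypoints, descriptors)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pair_matches = []
    for (kp_a, des_a), (kp_b, des_b) in zip(feats, feats[1:]):
        good = []
        for pair in matcher.knnMatch(des_a, des_b, k=2):
            # Ratio test: keep only distinctive correspondences.
            if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
                good.append((kp_a[pair[0].queryIdx].pt, kp_b[pair[0].trainIdx].pt))
        pair_matches.append(good)
    return pair_matches
```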
Using a structure-from-motion bundle adjustment algorithm [5] described in