利用Kinect打造的3D物体类别数据集

需积分: 10 10 浏览量更新于2024-09-12 收藏 1.75MB PDF 举报

"a category-level 3-D object dataset：putting the kinect to work" 这篇论文介绍了一个基于Kinect传感器的大型3D物体类别级别数据集，旨在解决当前3D物体检测数据集在场景、类别、实例和视角多样性上的不足。微软的Kinect深度传感器的普及引发了对具有挑战性的3D物体检测数据集的需求，而现有的数据集无法满足这一需求。作者Allison Janoch等人来自加州大学伯克利分校和马克斯普朗克信息研究所，他们在论文中强调了Kinect设备在获取高质量3D数据方面的潜力，并创建了一个包含真实家庭和办公环境中的彩色和深度图像对的数据集。该数据集目前涵盖超过50个类别，并通过众包方式持续增加图像数量，增加了数据的多样性和广泛性。在介绍部分，作者指出最近3D感知技术的复兴，特别是低成本但高精度的深度传感器如Kinect的广泛应用，这为3D对象识别和理解提供了新的可能性。然而，现有的3D数据集在场景、物体类别、实例和观察角度的多样性上存在局限，限制了这些技术的发展。因此，他们构建的这个数据集旨在填补这一空白。为了评估数据集的有效性，作者设立了基于PASCAL VOC风格的物体检测任务的基线性能。同时，他们还探讨了如何利用估计的物体世界尺寸来提升检测效果，这是利用3D信息的一个重要方面。论文中提出的这两种方法可能对提高3D物体检测的准确性具有积极影响。此外，该数据集和相关的注释可以在http://www.kinectdata.com上公开获取，为研究者和开发者提供了一个宝贵的资源，以推动3D物体检测和理解领域的进步。通过这个数据集，研究者可以训练和测试他们的算法，探索更复杂的3D视觉任务，如语义分割、6D姿态估计和3D重建等。这个数据集的发布不仅促进了3D计算机视觉技术的研究，也推动了Kinect传感器在实际应用中的潜力挖掘，为未来智能家居、自动驾驶、机器人导航等领域提供了强有力的数据支持。

A Category-Level 3-D Object Dataset: Putting the Kinect to Work

Allison Janoch, Sergey Karayev, Yangqing Jia, Jonathan T. Barron, Mario Fritz, Kate Saenko, Trevor Darrell

UC Berkeley and Max-Plank-Institute for Informatics

{allie, sergeyk, jiayq, barron, saenko, trevor}@eecs.berkeley.edu, mfritz@mpi-inf.mpg.de

Abstract

Recent proliferation of a cheap but quality depth sen-

sor, the Microsoft Kinect, has brought the need for a chal-

lenging category-level 3D object detection dataset to the

fore. We review current 3D datasets and ﬁnd them lack-

ing in variation of scenes, categories, instances, and view-

points. Here we present our dataset of color and depth

image pairs, gathered in real domestic and ofﬁce environ-

ments. It currently includes over 50 classes, with more

images added continuously by a crowd-sourced collection

effort. We establish baseline performance in a PASCAL

VOC-style detection task, and suggest two ways that in-

ferred world size of the object may be used to improve de-

tection. The dataset and annotations can be downloaded at

http://www.kinectdata.com.

1. Introduction

Recently, there has been a resurgence of interest in avail-

able 3-D sensing techniques due to advances in active depth

sensing, including techniques based on LIDAR, time-of-

ﬂight (Canesta), and projected texture stereo (PR2). The

Primesense sensor used on the Microsoft Kinect gaming

interface offers a particularly attractive set of capabilities,

and is quite likely the most common depth sensor available

worldwide due to its rapid market acceptance (8 million

Kinects were sold in just the ﬁrst two months).

While there is a large literature on instance recogni-

tion using 3-D scans in the computer vision and robotics

literatures, there are surprisingly few existing datasets for

category-level 3-D recognition, or for recognition in clut-

tered indoor scenes, despite the obvious importance of this

application to both communities. As reviewed below, pub-

lished 3-D datasets have been limited to instance tasks, or

to a very small numbers of categories. We have collected

and describe here the initial bulk of the Berkeley 3-D Ob-

ject dataset (B3DO), an ongoing collection effort using the

Kinect sensor in domestic environments. The dataset al-

ready has an order of magnitude more variation than previ-

Figure 1. Two scenes typical of our dataset.

ously published datasets. The latest version of the dataset is

available at http://www.kinectdata.com

As with existing 2-D challenge datasets, our dataset has

considerable variation in pose and object size. An impor-

tant observation our dataset enables is that the actual world

size distribution of objects has less variance than the image-

projected, apparent size distribution. We report the statistics

of these and other quantities for categories in our dataset.

A key question is what value does depth data offer for

category level recognition? It is conventional wisdom that

ideal 3-D observations provide strong shape cues for recog-

nition, but in practice even the cleanest 3-D scans may re-

veal less about an object than available 2-D intensity data.

Numerous schemes for deﬁning 3-D features analogous to

popular 2-D features for category-level recognition have

been proposed and can perform in uncluttered domains. We

evaluate the application of HOG descriptors on 3D data and

evaluate the beneﬁt of such a scheme on our dataset. We

also use our observation about world size distribution to

place a size prior on detections, and ﬁnd that it improves

detections as evaluated by average precision, and provides

a potential beneﬁt for detection efﬁciency.

下载后可阅读完整内容，剩余6页未读，立即下载

petalcs

粉丝: 2
资源: 8

利用Kinect打造的3D物体类别数据集

YOLO-People-Detection-Dataset-1: 7000+图像行人目标检测数据集

Fire-Detection-Image-Dataset：丰富场景下的火灾图像识别研究

YOLO-crosswalk-dataset-2.zip: 8000张行人道斑马线目标检测数据集

Object-Recognition-on-aYahoo-dataset:雅虎图像数据集上的对象识别算法

utl-python-panda-dataframe-to-sas-dataset:熊猫数据框到SAS数据集

Recommendation-System---Book-Crossing-Dataset:基于基于用户和基于项目的协作过滤方法的建书推荐系统

95-Cloud-An-Extension-to-38-Cloud-Dataset:卫星图像中云的二进制分割的巨大数据集

AM代码MATLAB--Gaussian-Process-Regression-GPR-for-UCI-dataset:高斯过程回归(GPR)

babel-plugin-styled-components-dataset：将自定义数据集添加到styled-components以进行更好的调试和测试

Complete-Blood-Cell-Count-Dataset:全血细胞计数（CBC）数据集包含总共360幅红细胞（RBC），白细胞（WBC）和带有注释的血小板的血液涂片图像

最新资源