Vision meets Robotics: The KITTI Dataset
Andreas Geiger, Philip Lenz, Christoph Stiller and Raquel Urtasun
Abstract—We present a novel dataset captured from a VW
station wagon for use in mobile robotics and autonomous driving
research. In total, we recorded 6 hours of traffic scenarios at
10-100 Hz using a variety of sensor modalities such as high-
resolution color and grayscale stereo cameras, a Velodyne 3D
laser scanner and a high-precision GPS/IMU inertial navigation
system. The scenarios are diverse, capturing real-world traffic
situations, and range from freeways through rural areas to inner-
city scenes with many static and dynamic objects. Our data is
calibrated, synchronized and timestamped, and we provide the
rectified and raw image sequences. Our dataset also contains
object labels in the form of 3D tracklets and we provide online
benchmarks for stereo, optical flow, object detection and other
tasks. This paper describes our recording platform, the data
format and the utilities that we provide.
Index Terms—dataset, autonomous driving, mobile robotics,
field robotics, computer vision, cameras, laser, GPS, benchmarks,
stereo, optical flow, SLAM, object detection, tracking, KITTI
I. INTRODUCTION
The KITTI dataset has been recorded from a moving plat-
form (Fig. 1) while driving in and around Karlsruhe, Germany
(Fig. 2). It includes camera images, laser scans, high-precision
GPS measurements and IMU accelerations from a combined
GPS/IMU system. The main purpose of this dataset is to
push forward the development of computer vision and robotic
algorithms targeting autonomous driving [1]–[7]. While our
introductory paper [8] mainly focuses on the benchmarks,
their creation and use for evaluating state-of-the-art computer
vision methods, here we complement this information by
providing technical details on the raw data itself. We give
precise instructions on how to access the data and comment
on sensor limitations and common pitfalls. The dataset can
be downloaded from http://www.cvlibs.net/datasets/kitti. For
a review on related work, we refer the reader to [8].
II. SENSOR SETUP
Our sensor setup is illustrated in Fig. 3:
• 2 × Point Grey Flea 2 grayscale cameras (FL2-14S3M-C),
1.4 Megapixels, 1/2” Sony ICX267 CCD, global shutter
• 2 × Point Grey Flea 2 color cameras (FL2-14S3C-C), 1.4
Megapixels, 1/2” Sony ICX267 CCD, global shutter
• 4 × Edmund Optics lenses, 4 mm, opening angle ∼ 90°,
vertical opening angle of region of interest (ROI) ∼ 35°
• 1 × Velodyne HDL-64E rotating 3D laser scanner, 10 Hz,
64 beams, 0.09° angular resolution, 2 cm distance accuracy,
collecting ∼ 1.3 million points/second, field of view: 360°
horizontal, 26.8° vertical, range: 120 m
• 1 × OXTS RT3003 inertial and GPS navigation system,
6 axis, 100 Hz, L1/L2 RTK, resolution: 0.02 m / 0.1°

Fig. 1. Recording Platform. Our VW Passat station wagon is equipped
with four video cameras (two color and two grayscale cameras), a rotating
3D laser scanner and a combined GPS/IMU inertial navigation system.

A. Geiger, P. Lenz and C. Stiller are with the Department of Measurement
and Control Systems, Karlsruhe Institute of Technology, Germany. Email:
{geiger,lenz,stiller}@kit.edu
R. Urtasun is with the Toyota Technological Institute at Chicago, USA.
Email: rurtasun@ttic.edu
Note that the color cameras lose effective resolution due to the
Bayer pattern interpolation process and are less sensitive to
light than the grayscale cameras. This is why we use two stereo
camera rigs, one grayscale and one color. The baseline of
both stereo camera rigs is approximately 54 cm. The trunk
of our vehicle houses a PC with two six-core Intel XEON
X5650 processors and shock-absorbed RAID 5 hard disk
storage with a capacity of 4 Terabytes. Our computer runs
Ubuntu Linux (64 bit) and a real-time database [9] to store
the incoming data streams.
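
For readers working with the recorded streams, the following minimal
Python sketch reads a single raw Velodyne scan; the flat
(x, y, z, reflectance) float32 layout and the .bin extension are
assumptions based on the publicly released raw data files, not
specifications made in this section.

    import numpy as np

    def load_velodyne_scan(path):
        # Read one raw Velodyne scan into an (N, 4) float32 array of
        # (x, y, z, reflectance) points. The flat little-endian
        # float32 layout is an assumption based on the released
        # .bin files.
        return np.fromfile(path, dtype=np.float32).reshape(-1, 4)

    # Hypothetical usage on a file from a downloaded raw sequence:
    # points = load_velodyne_scan("velodyne_points/data/0000000000.bin")
    # print(points.shape)  # (N, 4)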
III. DATASET
The raw data described in this paper can be accessed from
http://www.cvlibs.net/datasets/kitti and contains ∼ 25% of our
overall recordings. This is because we have primarily put data
with 3D tracklet annotations online, though we
will make more data available upon request. Furthermore, we
have removed all sequences which are part of our benchmark
test sets. The raw data set is divided into the categories ’Road’,
’City’, ’Residential’, ’Campus’ and ’Person’. Example frames
are illustrated in Fig. 5. For each sequence, we provide the raw
data, object annotations in the form of 3D bounding box tracklets
and a calibration file, as illustrated in Fig. 4. Our recordings
took place during daytime on the 26th, 28th, 29th and 30th of
September and on the 3rd of October 2011. The total size
of the provided data is 180 GB.
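
To make the per-sequence package concrete, the following minimal
Python sketch parses a calibration file and enumerates the image
frames of a raw sequence. The ’key: value’ calibration format, the
image_00 directory name and the file names in the usage comments are
assumptions based on the publicly released raw data layout, not
definitions made in this paper.

    from pathlib import Path
    import numpy as np

    def read_calib_file(path):
        # Parse a calibration file of 'key: v1 v2 ...' lines into a
        # dict. Numeric values become float arrays; anything else
        # (e.g. a date string) is kept verbatim. The format is an
        # assumption based on the released calibration files.
        calib = {}
        for line in Path(path).read_text().splitlines():
            if ":" not in line:
                continue
            key, values = line.split(":", 1)
            try:
                calib[key.strip()] = np.array(
                    [float(v) for v in values.split()])
            except ValueError:
                calib[key.strip()] = values.strip()
        return calib

    def list_camera_frames(sequence_dir, camera="image_00"):
        # Enumerate the PNG frames of one camera stream in a raw
        # sequence (in the released data, image_00/image_01 are the
        # grayscale and image_02/image_03 the color cameras).
        return sorted((Path(sequence_dir) / camera / "data").glob("*.png"))

    # Hypothetical usage on a downloaded sequence:
    # calib = read_calib_file("2011_09_26/calib_cam_to_cam.txt")
    # frames = list_camera_frames("2011_09_26/2011_09_26_drive_0001_sync")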