Fast and Accurate Calibration of a Kinect Sensor
Carolina Raposo
Institute of Systems and Robotics
Dept. of Electrical and Comp. Eng.
University of Coimbra
3030-290 Coimbra, Portugal
carolinasraposo@gmail.com
João Pedro Barreto
Institute of Systems and Robotics
Dept. of Electrical and Comp. Eng.
University of Coimbra
3030-290 Coimbra, Portugal
jpbar@isr.uc.pt
Urbano Nunes
Institute of Systems and Robotics
Dept. of Electrical and Comp. Eng.
University of Coimbra
3030-290 Coimbra, Portugal
urbano@isr.uc.pt
Abstract—The article describes a new algorithm for calibrat-
ing a Kinect sensor that achieves high accuracy using only 6
to 10 image-disparity pairs of a planar checkerboard pattern.
The method estimates the projection parameters for both color
and depth cameras, the relative pose between them, and the
function that converts Kinect disparity units (kdu) into metric
depth. We build on the recent work of Herrera et al. [8], which
uses a large number of input frames and multiple iterative
minimization steps for obtaining very accurate calibration
results. We propose several modifications to this estimation
pipeline that dramatically improve stability, usability, and
runtime. The modifications are: (i) initializing the relative
pose using a new minimal, optimal solution for registering
3D planes across different reference frames; (ii) including a
metric constraint during the iterative refinement to avoid
drift in the disparity-to-depth conversion; and (iii) estimating
the parameters of the depth distortion model in an open-loop
post-processing step. Comparative experiments show that our
pipeline can achieve a calibration accuracy similar to [8] while
using less than 1/6 of the input frames and running in 1/30
of the time.
Keywords-Kinect; Camera Calibration; RGB-Depth Camera Pair
I. INTRODUCTION
Nowadays, the joint information provided by cameras
and depth sensors has applications in areas including scene
reconstruction, indoor mapping, and mobile robotics. The
Kinect is a camera pair capable of providing such informa-
tion. Its depth sensor consists of a projector that emits a dot
pattern which is detected by an infrared (IR) camera. The
Kinect has been used for multiple purposes including 3D
modeling of indoor environments [6], and Structure from
Motion [12]. Most of these applications require the camera
pair to be calibrated both intrinsically and extrinsically. The
intrinsic calibration consists in determining the parameters
that enable converting raw measurement units into metric units.
The extrinsic calibration consists in locating the sensors in a
common coordinate frame, so that they can function as a whole.
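As an illustration of what the intrinsic and extrinsic parameters enable, the sketch below maps a depth-image pixel into the color image: disparity is converted to metric depth, the pixel is back-projected with the depth-camera intrinsics, moved to the color-camera frame with the relative pose, and re-projected. All numeric values (disparity-to-depth coefficients, intrinsic matrices, relative pose) are hypothetical placeholders, not calibration results from the paper.

```python
import numpy as np

# Hypothetical calibration values for illustration only; a real device
# uses the parameters estimated by a calibration pipeline such as this one.
c0, c1 = 3.33, -0.0029                     # disparity-to-depth coefficients (assumed)
K_depth = np.array([[570.0, 0.0, 320.0],
                    [0.0, 570.0, 240.0],
                    [0.0,   0.0,   1.0]])  # depth-camera intrinsics (assumed)
K_color = np.array([[520.0, 0.0, 310.0],
                    [0.0, 520.0, 245.0],
                    [0.0,   0.0,   1.0]])  # color-camera intrinsics (assumed)
R = np.eye(3)                              # relative rotation (assumed identity)
t = np.array([0.025, 0.0, 0.0])            # relative translation in meters (assumed)

def disparity_to_depth(d):
    """Convert Kinect disparity units (kdu) into metric depth (meters)."""
    return 1.0 / (c0 + c1 * d)

def depth_pixel_to_color_pixel(u, v, d):
    """Map a depth-image pixel (u, v) with disparity d into the color image."""
    z = disparity_to_depth(d)
    # Back-project the pixel into a 3D point in the depth-camera frame.
    p_depth = z * np.linalg.inv(K_depth) @ np.array([u, v, 1.0])
    # Express the point in the color-camera frame using the relative pose.
    p_color = R @ p_depth + t
    # Project into the color image.
    uv = K_color @ p_color
    return uv[:2] / uv[2]

print(depth_pixel_to_color_pixel(320, 240, 500))
```

Any error in the intrinsics, the relative pose, or the disparity-to-depth coefficients shifts the re-projected pixel, which is why the calibration accuracy of each stage matters.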
The literature on color camera calibration is vast, with
methods that use a planar checkerboard pattern being
especially popular because they are stable, accurate, and
the calibration rig is easy to build [15], [1]. For depth
sensors, calibration methods depend on the technology used,
whether they are time-of-flight (ToF) cameras, laser range
scanners, or structured light scanners. Methods that use
color discontinuities [4] or planar surfaces [14], [12], [7],
[8], [13] have been developed.
In this work, we build on the recent work of Herrera
et al. [8], which uses image-disparity map pairs of planes
to accurately calibrate a Kinect device. They use tens of
images to estimate the intrinsic parameters of the color and
depth cameras, as well as their relative pose. The method
relies on multiple iterative optimization steps that take min-
utes to complete. We propose several modifications to this
calibration pipeline that improve stability, and dramatically
decrease the number of input images and runtime. The
experiments show that our method is able to accomplish
similar accuracy to [8], using as few as 6-10 images, as
opposed to 60 images, and running in 20-30 sec, instead of
15 min.
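The first modification listed in the abstract initializes the relative pose by registering 3D planes across reference frames. The following is an illustrative least-squares sketch of such a registration, not the paper's minimal, optimal solver: each plane is assumed given as a unit normal n and offset d with n·x = d, so that under x' = Rx + t normals map as n' = Rn and offsets as d' = d + n'·t.

```python
import numpy as np

def register_planes(planes_a, planes_b):
    """Estimate the pose (R, t) relating two sets of corresponding planes.

    Each plane is a pair (n, d) with unit normal n and offset d such
    that n . x = d. Requires at least 3 planes with non-coplanar normals.
    Illustrative least-squares sketch, not a minimal solver.
    """
    Na = np.array([p[0] for p in planes_a])   # stacked normals, frame A
    Nb = np.array([p[0] for p in planes_b])   # stacked normals, frame B
    da = np.array([p[1] for p in planes_a])   # offsets, frame A
    db = np.array([p[1] for p in planes_b])   # offsets, frame B
    # Rotation from normal correspondences (Kabsch / orthogonal Procrustes),
    # with a determinant correction to guarantee a proper rotation.
    U, _, Vt = np.linalg.svd(Nb.T @ Na)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    R = U @ D @ Vt
    # Translation from the offset differences: (R n_i) . t = d'_i - d_i.
    t, *_ = np.linalg.lstsq(Na @ R.T, db - da, rcond=None)
    return R, t
```

With noise-free planes this recovers the pose exactly; with noisy detections it provides the kind of initial estimate that an iterative refinement can then improve.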
A. Related work
Kinect is a device for the consumer market of games
and entertainment. The intrinsic parameters of both depth
and color cameras, as well as their relative pose, are pre-
calibrated at the factory and recorded in the firmware. Average
values for these parameters are known by the community and
commonly used in robotic applications [3]. However, it is
well known that these parameters vary from device to device,
and that the factory presets are not accurate enough for
many applications [6], [12]. This justifies the development
of calibration methods for the Kinect, or of methods to refine
and improve the accuracy of the factory presets.
Some authors independently calibrate the intrinsics of
the depth sensor and the color camera, and then register both
in a common reference frame [11], [13]. As pointed out by
Herrera et al. [8], the depth and color cameras should be
calibrated jointly, both because errors in the color camera
calibration propagate to the depth camera, and because joint
estimation uses all the available information.
Depth sensors may exhibit depth distortions that decrease
their accuracy. This is the case of the Kinect device,
which exhibits radially symmetric distortions [12] that
are not corrected in the manufacturer’s calibration. Herrera
2013 International Conference on 3D Vision
978-0-7695-5067-1/13 $26.00 © 2013 IEEE
DOI 10.1109/3DV.2013.52