Structure-from-Motion Revisited
Johannes L. Schönberger¹,²,∗, Jan-Michael Frahm¹
¹University of North Carolina at Chapel Hill
²Eidgenössische Technische Hochschule Zürich
jsch@inf.ethz.ch, jmf@cs.unc.edu
Abstract
Incremental Structure-from-Motion is a prevalent strat-
egy for 3D reconstruction from unordered image collec-
tions. While incremental reconstruction systems have
tremendously advanced in all regards, robustness, accu-
racy, completeness, and scalability remain the key problems
towards building a truly general-purpose pipeline. We pro-
pose a new SfM technique that improves upon the state of
the art to make a further step towards this ultimate goal.
The full reconstruction pipeline is released to the public as
an open-source implementation.
1. Introduction
Structure-from-Motion (SfM) from unordered images has seen
tremendous evolution over the years. The early self-calibrating
metric reconstruction systems [42, 6, 19, 16, 46] served as the
foundation for the first systems on unordered Internet photo
collections [47, 53] and urban scenes [45]. Inspired by these works,
increasingly large-scale reconstruction systems have been developed
for hundreds of thousands [1], millions [20, 62, 51, 50], and
recently a hundred million Internet photos [30]. A variety of SfM
strategies have been proposed, including incremental [53, 1, 20, 62],
hierarchical [23], and global approaches [14, 61, 56]. Arguably,
incremental SfM is the most popular strategy for reconstruction of
unordered photo collections.
Despite its widespread use, a truly general-purpose SfM system
has not yet been designed. While the
existing systems have advanced the state of the art tremen-
dously, robustness, accuracy, completeness, and scalability
remain the key problems in incremental SfM that prevent its
use as a general-purpose method. In this paper, we propose
a new SfM algorithm to approach this ultimate goal. The
new method is evaluated on a variety of challenging datasets
and the code is contributed to the research community as an
open-source implementation named COLMAP available at
https://github.com/colmap/colmap.
∗This work was done at the University of North Carolina at Chapel Hill.
Figure 1. Result of Rome with 21K registered out of 75K images.
2. Review of Structure-from-Motion
SfM is the process of reconstructing 3D structure from
its projections into a series of images taken from different
viewpoints. Incremental SfM (denoted as SfM in this paper)
is a sequential processing pipeline with an iterative reconstruction
component (Fig. 2). It commonly starts with feature extraction and
matching, followed by geometric verification. The resulting scene
graph serves as the foundation
for the reconstruction stage, which seeds the model with
a carefully selected two-view reconstruction, before incre-
mentally registering new images, triangulating scene points,
filtering outliers, and refining the reconstruction using bun-
dle adjustment (BA). The following sections elaborate on
this process, define the notation used throughout the paper,
and introduce related work.
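The control flow of the iterative reconstruction stage can be sketched as follows. This is only a minimal illustration of the ordering described above (register, triangulate, filter, refine), not the COLMAP implementation: the function name, its parameters, and the callables passed in are hypothetical placeholders for the actual geometric routines.

```python
from typing import Callable, Optional, Set, Tuple


def incremental_reconstruction(
    num_images: int,
    initial_pair: Tuple[int, int],
    pick_next_image: Callable[[Set[int]], Optional[int]],
    register: Callable[[int], bool],
    triangulate: Callable[[int], None],
    filter_outliers: Callable[[], None],
    bundle_adjust: Callable[[], None],
) -> Set[int]:
    """Return the set of registered image indices.

    All callables are hypothetical stand-ins for the real geometric steps.
    """
    # Seed the model with the selected two-view reconstruction.
    registered = set(initial_pair)
    failed: Set[int] = set()

    while len(registered) + len(failed) < num_images:
        # Ask for a candidate that has not been tried yet.
        candidate = pick_next_image(registered | failed)
        if candidate is None:
            break
        if register(candidate):
            registered.add(candidate)
            triangulate(candidate)   # add new scene points
            filter_outliers()        # drop bad observations
            bundle_adjust()          # refine the reconstruction (BA)
        else:
            failed.add(candidate)
    return registered
```

In this sketch an image that fails registration is never retried, which is a simplification chosen to keep the loop short.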
2.1. Correspondence Search
The first stage is correspondence search, which finds scene overlap
in the input images $\mathcal{I} = \{I_i \mid i = 1 \dots N_I\}$
and identifies projections of the same points in overlapping
images. The output is a set of geometrically verified image
pairs $\bar{\mathcal{C}}$ and a graph of image projections for each point.
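One way to picture the second output is to group pairwise feature matches into per-point lists of observations. The sketch below does this with a union-find structure. It is an illustrative assumption rather than a procedure stated in the text: the function name `build_tracks`, the `(image index, feature index)` encoding, and the input dictionary layout are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical encoding of one image projection: (image index i, feature index j).
Observation = Tuple[int, int]


def build_tracks(
    verified_matches: Dict[Tuple[int, int], List[Tuple[int, int]]]
) -> List[List[Observation]]:
    """Merge pairwise matches into groups of observations of the same point.

    verified_matches maps an image pair (a, b) to a list of matched
    feature index pairs (feature in a, feature in b).
    """
    parent: Dict[Observation, Observation] = {}

    def find(obs: Observation) -> Observation:
        parent.setdefault(obs, obs)
        root = obs
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[obs] != root:
            parent[obs], obs = root, parent[obs]
        return root

    for (img_a, img_b), matches in verified_matches.items():
        for feat_a, feat_b in matches:
            root_a = find((img_a, feat_a))
            root_b = find((img_b, feat_b))
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for obs in list(parent):
        groups[find(obs)].append(obs)
    # Sort for a deterministic result.
    return sorted(sorted(track) for track in groups.values())
```

For example, matches between images (0, 1) and (1, 2) that share a feature in image 1 end up in one group spanning all three images. A practical system would additionally need to handle groups that contain two different features from the same image, which this sketch ignores.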
Feature Extraction. For each image $I_i$, SfM detects sets
$\mathcal{F}_i = \{(x_j, f_j) \mid j = 1 \dots N_{F_i}\}$ of local
features at location $x_j \in \mathbb{R}^2$ represented by an
appearance descriptor $f_j$. The features should be invariant under
radiometric and geometric changes so that SfM can uniquely recognize
them in multiple images [41]. SIFT [39], its derivatives [59], and
more recently learned features [9] are the gold standard in terms of
robustness. Alternatively, binary features provide better efficiency
at the cost of reduced robustness [29].
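The notation above can be mirrored in a small data structure, and the efficiency argument for binary features can be made concrete by comparing the two usual descriptor distances. The following is a minimal sketch under stated assumptions: the `Feature` class and both function names are hypothetical, real-valued descriptors are compared with the Euclidean distance, and binary descriptors are assumed to be packed into Python integers and compared with the Hamming distance.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Feature:
    """One element (x_j, f_j) of a per-image feature set F_i."""
    x: Tuple[float, float]  # location x_j in R^2
    f: Sequence[float]      # real-valued appearance descriptor f_j


def l2_distance(f_a: Sequence[float], f_b: Sequence[float]) -> float:
    """Euclidean distance between two real-valued descriptors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(f_a, f_b)))


def hamming_distance(bits_a: int, bits_b: int) -> int:
    """Number of differing bits between two packed binary descriptors."""
    # XOR marks the differing bits; counting them needs no
    # floating-point arithmetic.
    return bin(bits_a ^ bits_b).count("1")
```

The Hamming distance reduces to an XOR and a bit count, which suggests why binary descriptors are cheaper to compare than real-valued ones.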