the image intensities in its region against either an idealized template or another image of
the feature, using an appropriate geometric deformation model, etc. For example, suppose
that the intensity matching model is f(u) = 1/2 ∫ ρ(δI(u)²) where the integration is
over some image patch, δI is the current intensity prediction error, u parametrizes the local
geometry (patch translation & warping), and ρ(·) is some intensity error robustifier. Then
the cost gradient in terms of u is g_u = df/du = ∫ ρ′ δI (dI/du). Similarly, the cost Hessian in
u in a Gauss-Newton approximation is H_u = d²f/du² ≈ ∫ ρ′ (dI/du)⊤ (dI/du). In a feature based
model, we express u = u(x) as a function of the bundle parameters, so if J_u = du/dx we have
a corresponding cost gradient and Hessian contribution g_x = g_u J_u and H_x = J_u⊤ H_u J_u.
In other words, the intensity matching model is locally equivalent to a quadratic feature
matching one on the ‘features’ u(x), with effective weight (inverse covariance) matrix
W_u = H_u. All image feature error models in vision are ultimately based on such an
underlying intensity matching model. As feature covariances are a function of intensity
gradients
∫ ρ′ (dI/du)⊤ (dI/du), they can be both highly variable between features (depending
on how much local gradient there is), and highly anisotropic (depending on how directional
the gradients are). E.g., for points along a 1D intensity edge, the uncertainty is large in the
along edge direction and small in the across edge one.
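To make the chain rule above concrete, the following sketch (Python/NumPy; the synthetic patch, the trivial robustifier ρ(s) = s, and the Jacobian J_u are illustrative assumptions, not taken from the text) accumulates g_u and the Gauss-Newton H_u over a patch containing a single vertical intensity edge, then propagates them to bundle parameters via g_x = g_u J_u and H_x = J_u⊤ H_u J_u. The resulting H_u (equivalently the effective weight W_u) is strongly anisotropic: large across the edge, nearly zero along it.

    import numpy as np

    # Illustrative intensity matching cost f(u) = 1/2 * sum rho(dI(u)^2) over a patch,
    # with u a 2D patch translation and rho(s) = s (plain least squares, so rho' = 1).
    def intensity_grad_hessian(dI, dI_du):
        # dI:    (N,) intensity prediction errors (the delta-I residuals) at N patch pixels
        # dI_du: (N, 2) intensity gradients dI/du w.r.t. the 2 patch parameters
        rho_prime = np.ones(len(dI))
        g_u = (rho_prime * dI) @ dI_du                    # sum rho' * dI * (dI/du)
        H_u = dI_du.T @ (rho_prime[:, None] * dI_du)      # Gauss-Newton Hessian
        return g_u, H_u

    # Synthetic patch containing a vertical intensity edge: the image gradient points
    # purely along x, so the y-column of dI/du is zero.
    rng = np.random.default_rng(0)
    dI = 0.1 * rng.standard_normal(100)
    dI_du = np.zeros((100, 2))
    dI_du[:, 0] = 50.0
    g_u, H_u = intensity_grad_hessian(dI, dI_du)
    print(H_u)   # effective weight W_u = H_u: large in x (across edge), ~0 in y (along edge)

    # Chain rule to bundle parameters x: with J_u = du/dx (hypothetical 2x6 Jacobian here),
    # the contributions are g_x = g_u J_u and H_x = J_u^T H_u J_u.
    J_u = rng.standard_normal((2, 6))
    g_x = g_u @ J_u
    H_x = J_u.T @ H_u @ J_u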
3.5 Implicit Models
Sometimes observations are most naturally expressed in terms of an implicit observation-
constraining model h(x, z)=0, rather than an explicit observation-predicting one z =
z(x). (The associated image error still has the form f(z̄ − z)). For example, if the model
is a 3D curve and we observe points on it (the noisy images of 3D points that may lie
anywhere along the 3D curve), we can predict the whole image curve, but not the exact
position of each observation along it. We only have the constraint that the noiseless image
of the observed point would lie on the noiseless image of the curve, if we knew these. There
are basically two ways to handle implicit models: nuisance parameters and reduction.
Nuisance parameters: In this approach, the model is made explicit by adding additional
‘nuisance’ parameters representing something equivalent to model-consistent estimates
of the unknown noise free observations, i.e. to z̄ with h(x, z̄)=0. The most direct way
to do this is to include the entire parameter vector z̄ as nuisance parameters, so that we
have to solve a constrained optimization problem on the extended parameter space (x, z̄),
minimizing f(z̄ − z) over (x, z̄) subject to h(x, z̄)=0. This is a sparse constrained
problem, which can be solved efficiently using sparse matrix techniques (§6.3). In fact,
for image observations, the subproblems in z̄ (optimizing f(z̄ − z) over z̄ for fixed z
and x) are small and for typical f rather simple. So in spite of the extra parameters z̄,
optimizing this model is not significantly more expensive than optimizing an explicit one
z = z(x) [14, 13, 105, 106]. For example, when estimating matching constraints between
image pairs or triplets [60, 62], instead of using an explicit 3D representation, pairs or
triplets of corresponding image points can be used as features z_i, subject to the epipolar
or trifocal geometry contained in x [105, 106].
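As a small illustration of why the subproblems in z̄ are cheap (a sketch only, with a hypothetical fundamental matrix and correspondence, not the implementation of [105, 106]): for a single epipolar constraint h(x, z̄) = x̄2⊤ F x̄1 = 0 with f(z̄ − z) = ‖z̄ − z‖²/2, linearizing h about z and solving with a Lagrange multiplier gives a one-line, Sampson-style correction of the point pair.

    import numpy as np

    def epipolar_nuisance_update(z, F):
        # One first-order step of the small subproblem in z_bar: minimize
        # ||z_bar - z||^2 subject to h = x2_bar^T F x1_bar = 0, for a single
        # correspondence z = (u1, v1, u2, v2). Linearizing h about z and solving
        # with a Lagrange multiplier gives z_bar ~= z - h * J^T / (J J^T), J = dh/dz.
        u1, v1, u2, v2 = z
        x1 = np.array([u1, v1, 1.0])
        x2 = np.array([u2, v2, 1.0])
        h = x2 @ F @ x1                                   # constraint residual
        Fx1, Ftx2 = F @ x1, F.T @ x2                      # epipolar lines
        J = np.array([Ftx2[0], Ftx2[1], Fx1[0], Fx1[1]])  # dh/d(u1, v1, u2, v2)
        return z - h * J / (J @ J)

    # Hypothetical geometry (pure translation along x, so epipolar lines are
    # horizontal) and a correspondence whose rows disagree slightly.
    F = np.array([[0.0, 0.0, 0.0],
                  [0.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0]])
    z = np.array([10.0, 5.0, 14.0, 5.5])
    z_bar = epipolar_nuisance_update(z, F)   # rows pulled together onto the constraint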
However, if a smaller nuisance parameter vector than z̄ can be found, it is wise to use
it. In the case of a curve, it suffices to include just one nuisance parameter per observation,
saying where along the curve the corresponding noise free observation is predicted to
lie. This model exactly satisfies the constraints, so it converts the implicit model to an
unconstrained explicit one z = z(x, λ), where λ are the along-curve nuisance parameters.
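A minimal sketch of this reduced parametrization (the circle 'curve', the synthetic data, and the use of scipy.optimize.least_squares are assumptions for illustration, standing in for the image of a 3D curve): each observation z_i gets one along-curve parameter λ_i, and the whole problem becomes an ordinary unconstrained least-squares fit over (x, λ).

    import numpy as np
    from scipy.optimize import least_squares

    # Hypothetical 'curve': a circle with parameters x = (cx, cy, r), so the explicit
    # model is z(x, lam) = (cx + r*cos(lam), cy + r*sin(lam)), one lam_i per observation.
    def residuals(params, z_obs):
        cx, cy, r = params[:3]
        lam = params[3:]                               # along-curve nuisance parameters
        z_pred = np.stack([cx + r * np.cos(lam), cy + r * np.sin(lam)], axis=1)
        return (z_pred - z_obs).ravel()                # plain unconstrained residuals

    # Noisy observations of points lying somewhere along a circle of radius 2 at (1, -1).
    rng = np.random.default_rng(1)
    true_lam = np.linspace(0.0, 1.5 * np.pi, 20)
    z_obs = np.stack([1.0 + 2.0 * np.cos(true_lam), -1.0 + 2.0 * np.sin(true_lam)], axis=1)
    z_obs += 0.05 * rng.standard_normal(z_obs.shape)

    # Initialize (x, lam) crudely and solve the joint unconstrained problem.
    c0 = z_obs.mean(axis=0)
    lam0 = np.arctan2(z_obs[:, 1] - c0[1], z_obs[:, 0] - c0[0])
    r0 = np.linalg.norm(z_obs - c0, axis=1).mean()
    x0 = np.concatenate([[c0[0], c0[1], r0], lam0])
    fit = least_squares(residuals, x0, args=(z_obs,))
    cx, cy, r = fit.x[:3]                              # recovered curve parameters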