
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

Cesar Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, José Neira, Ian D. Reid, John J. Leonard

C. Cadena is with the Autonomous Systems Lab, ETH Zurich, Switzerland. e-mail: cesarc@ethz.ch
L. Carlone is with the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, USA. e-mail: lcarlone@mit.edu
H. Carrillo is with the Escuela de Ciencias Exactas e Ingeniería, Universidad Sergio Arboleda, Colombia. e-mail: henry.carrillo@usa.edu.co
Y. Latif and I.D. Reid are with the Australian Center for Robotic Vision, University of Adelaide, Australia. e-mail: yasir.latif@adelaide.edu.au, ian.reid@adelaide.edu.au
J. Neira is with the Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza, Spain. e-mail: jneira@unizar.es
D. Scaramuzza is with the Robotics and Perception Group, University of Zurich, Switzerland. e-mail: sdavide@ifi.uzh.ch
J.J. Leonard is with the Marine Robotics Group, Massachusetts Institute of Technology, USA. e-mail: jleonard@mit.edu

Abstract—Simultaneous Localization and Mapping (SLAM) consists in the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?

Index Terms—Robots, SLAM, localization, mapping.

I. INTRODUCTION

SLAM consists in the simultaneous estimation of the state of a robot equipped with on-board sensors, and the construction of a model (the map) of the environment that the sensors are perceiving. In simple instances, the robot state is described by its pose (position and orientation), although other quantities may be included in the state, such as robot velocity, sensor biases, and calibration parameters. The map, on the other hand, is a representation of aspects of interest (e.g., position of landmarks, obstacles) describing the environment in which the robot operates.

The need to build a map of the environment is twofold. First, the map is often required to support other tasks; for instance, a map can inform path planning or provide an intuitive visualization for a human operator. Second, the map allows limiting the error committed in estimating the state of the robot. In the absence of a map, dead-reckoning would quickly drift over time; on the other hand, given a map, the robot can "reset" its localization error by re-visiting known areas (the so-called loop closure). Therefore, SLAM finds applications in all scenarios in which a prior map is not available and needs to be built.

In some robotics applications a map of the environment is known a priori. For instance, a robot operating on a factory floor can be provided with a manually-built map of artificial beacons in the environment. Another example is the case in which the robot has access to GPS data (the GPS satellites can be considered as moving beacons at known locations). The popularity of the SLAM problem is connected with the emergence of indoor applications of mobile robotics. Indoor operation rules out the use of GPS to bound the localization error; furthermore, SLAM provides an appealing alternative to user-built maps, showing that robot operation is possible in the absence of an ad-hoc localization infrastructure.

A thorough historical review of the first 20 years of the SLAM problem is given by Durrant-Whyte and Bailey in two surveys [14, 94]. These mainly cover what we call the classical age (1986-2004); the classical age saw the introduction of the main probabilistic formulations for SLAM, including approaches based on Extended Kalman Filters, Rao-Blackwellised Particle Filters, and maximum likelihood estimation; moreover, it delineated the basic challenges connected to efficiency and robust data association. Two other excellent references describing the three main SLAM formulations of the classical age are the chapter of Thrun and Leonard [299, chapter 37] and the book of Thrun, Burgard, and Fox [298]. The subsequent period is what we call the algorithmic-analysis age (2004-2015), and is partially covered by Dissanayake et al. in [89]. The algorithmic-analysis period saw the study of fundamental properties of SLAM, including observability, convergence, and consistency. In this period, the key role of sparsity towards efficient SLAM solvers was also understood, and the main open-source SLAM libraries were developed.

We review the main SLAM surveys to date in Table I, observing that most recent surveys only cover specific aspects or sub-fields of SLAM. The popularity of SLAM in the last 30 years is not surprising if one thinks about the manifold aspects that SLAM involves. At the lower level (called the front-end in Section II) SLAM naturally intersects other research fields such as computer vision and signal processing; at the higher level (that we later call the back-end), SLAM is an appealing mix of geometry, graph theory, optimization, and probabilistic estimation. Finally, a SLAM expert has to deal with practical aspects ranging from sensor modeling to system integration.

The present paper gives a broad overview of the current state of SLAM, and offers the perspective of part of the community on the open problems and future directions for SLAM research. The paper summarizes the outcome of the workshop "The Problem of Mobile Sensors: Setting future goals and indicators of progress for SLAM" [40], held during the Robotics: Science and Systems (RSS) conference (Rome, July 2015). Our main focus is on metric and semantic SLAM, and we refer the reader to the recent survey of Lowry et al. [198] for a more comprehensive coverage of vision-based topological SLAM and place recognition.

Before delving into the paper, we provide our take on two questions that often animate discussions during robotics conferences.

Do autonomous robots really need SLAM? Answering this question requires understanding what makes SLAM unique. SLAM aims at building a globally consistent representation of the environment, leveraging both ego-motion measurements and loop closures. The keyword here is "loop closure": if we sacrifice loop closures, SLAM reduces to odometry. In early applications, odometry was obtained by integrating wheel encoders. The pose estimate obtained from wheel odometry quickly drifts, making the estimate unusable after a few meters [162, Ch. 6]; this was one of the main thrusts behind the development of SLAM: the observation of external landmarks is useful to reduce the trajectory drift and possibly correct it [227]. However, more recent odometry algorithms are based on visual and inertial information, and have very small drift (< 0.5% of the trajectory length [109]). Hence the question becomes legitimate: do we really need SLAM? Our answer is three-fold.

First of all, we observe that the SLAM research done over the last decade was the one producing the visual-inertial odometry algorithms that currently represent the state of the art [109, 201]; in this sense visual-inertial navigation (VIN) is SLAM: VIN can be considered a reduced SLAM system, in which the loop closure (or place recognition) module is disabled. More generally, SLAM led to the study of sensor fusion under more challenging setups (i.e., no GPS, low-quality sensors) than previously considered in other literature (e.g., inertial navigation in aerospace engineering).

The second answer regards loop closures. A robot performing odometry and neglecting loop closures interprets the world as an "infinite corridor" (Fig. 1a) in which the robot keeps exploring new areas indefinitely. A loop closure event informs the robot that this "corridor" keeps intersecting itself (Fig. 1b). The advantage of loop closure now becomes clear: by finding loop closures, the robot understands the real topology of the environment, and is able to find shortcuts between locations (e.g., points B and C in the map). Therefore, if getting the right topology of the environment is one of the merits of SLAM, why not simply drop the metric information and just do place recognition? The answer is simple: the metric information makes place recognition much simpler and more robust; the metric reconstruction informs the robot about loop closure opportunities and allows discarding spurious loop closures [187, 295]. Therefore, while SLAM might be redundant in principle (an oracle place recognition module would suffice for topological mapping), SLAM offers a natural defense against wrong data association and perceptual aliasing, where similar-looking scenes, corresponding to distinct locations in the environment, would deceive place recognition. In this sense, the SLAM map provides a way to predict and validate future measurements: we believe that this mechanism is key to robust operation.

TABLE I: Surveying the surveys

Year  Topic                                          Reference
2006  Probabilistic approaches and data association  Durrant-Whyte and Bailey [14, 94]
2008  Filtering approaches                           Aulinas et al. [12]
2008  Visual SLAM                                    Neira et al. (special issue) [220]
2011  SLAM back-end                                  Grisetti et al. [129]
2011  Observability, consistency and convergence     Dissanayake et al. [89]
2012  Visual odometry                                Scaramuzza and Fraundorfer [115, 274]
2016  Multi-robot SLAM                               Saeedi et al. [271]
2016  Visual place recognition                       Lowry et al. [198]

The third answer is that SLAM is needed since many applications implicitly or explicitly do require a globally consistent map. For instance, in many military and civil applications the goal of the robot is to explore an environment and report a map to the human operator. Another example is the case in which the robot has to perform structural inspection (of a building, bridge, etc.); also in this case a globally consistent 3D reconstruction is a requirement for successful operation.

One can identify tasks for which different flavors of SLAM formulations are more suitable than others. Therefore, when a roboticist has to design a SLAM system, he/she is faced with multiple design choices. For instance, a topological map can be used to analyze reachability of a given place, but it is not suitable for motion planning and low-level control; a locally-consistent metric map is well-suited for obstacle avoidance and local interactions with the environment, but it may sacrifice accuracy; a globally-consistent metric map allows the robot to perform global path planning, but it may be computationally demanding to compute and maintain. A more general way to choose the most appropriate SLAM system is to think about SLAM as a mechanism to compute a sufficient statistic that summarizes all past observations of the robot, and in this sense, which information to retain in this compressed representation is deeply task-dependent.

Fig. 1: Left: map built from odometry. The map is homotopic to a long corridor that goes from the starting position A to the final position B. Points that are close in reality (e.g., B and C) may be arbitrarily far in the odometric map. Right: map built from SLAM. By leveraging loop closures, SLAM estimates the actual topology of the environment, and "discovers" shortcuts in the map.

Is SLAM solved? This is another question that is often asked within the robotics community, cf. [118]. The difficulty of answering this question lies in the question itself: SLAM is such a broad topic that the question is well posed only for a given robot/environment/performance combination. In particular, one can evaluate the maturity of the SLAM problem once the following aspects are specified:

• robot: type of motion (e.g., dynamics, maximum speed), available sensors (e.g., resolution, sampling rate), available computational resources;
• environment: planar or three-dimensional, presence of natural or artificial landmarks, amount of dynamic elements, amount of symmetry and risk of perceptual aliasing. Note that many of these aspects actually depend on the sensor-environment pair: for instance, two rooms may look identical for a 2D laser scanner (perceptual aliasing), while a camera may discern them from appearance cues;
• performance requirements: desired accuracy in the estimation of the state of the robot, accuracy and type of representation of the environment (e.g., landmark-based or dense), success rate (percentage of tests in which the accuracy bounds are met), estimation latency, maximum operation time, maximum size of the mapped area.

For instance, mapping a 2D indoor environment with a robot equipped with wheel encoders and a laser scanner, with sufficient accuracy (< 10 cm) and sufficient robustness (say, low failure rate), can be considered largely solved (an example of an industrial system performing SLAM is the Kuka Navigation Solution [182]). Similarly, vision-based SLAM with slowly-moving robots (e.g., Mars rovers [203], domestic robots [148]), and visual-inertial odometry [126] can be considered mature research fields. On the other hand, other robot/environment/performance combinations still deserve a large amount of fundamental research. Current SLAM algorithms can be easily induced to fail when either the motion of the robot or the environment are too challenging (e.g., fast robot dynamics, highly dynamic environments); similarly, SLAM algorithms are often unable to face strict performance requirements, e.g., high-rate estimation for fast closed-loop control. This survey will provide a comprehensive overview of these open problems, among others.

We conclude this section with some broader considerations about the future of SLAM. We argue that we are entering a third era for SLAM, the robust-perception age, which is characterized by the following key requirements:

1) robust performance: the SLAM system operates with low failure rate for an extended period of time in a broad set of environments; the system includes fail-safe mechanisms and has self-tuning capabilities¹ in that it can adapt the selection of the system parameters to the scenario;
2) high-level understanding: the SLAM system goes beyond basic geometry reconstruction to obtain a high-level understanding of the environment (e.g., semantics, affordances, high-level geometry, physics);
3) resource awareness: the SLAM system is tailored to the available sensing and computational resources, and provides means to adjust the computation load depending on the available resources;
4) task-driven inference: the SLAM system produces adaptive map representations, whose complexity can change depending on the task that the robot has to perform.

¹ The SLAM community has been largely affected by the "curse of manual tuning", in that satisfactory operation is enabled by expert tuning of the system parameters (e.g., stopping conditions, thresholds for outlier rejection).

Paper organization. The paper starts by presenting a standard formulation and architecture for SLAM (Section II). Section III tackles robustness in life-long SLAM. Section IV deals with scalability. Section V discusses how to represent the geometry of the environment. Section VI extends the question of the environment representation to the modeling of semantic information. Section VII provides an overview of the current accomplishments on the theoretical aspects of SLAM. Section VIII broadens the discussion and reviews the active SLAM problem, in which decision making is used to improve the quality of the SLAM results. Section IX provides an overview of recent trends in SLAM, including the use of unconventional sensors. Section X provides final remarks. Throughout the paper, we provide many pointers to related work outside the robotics community. Despite its unique traits, SLAM is related to problems addressed in computer vision, computer graphics, and control theory, and cross-fertilization among these fields is a necessary condition to enable fast progress.

For the non-expert reader, we recommend reading Durrant-Whyte and Bailey's SLAM tutorials [14, 94] before delving into this position paper. More experienced researchers can jump directly to the section of interest, where they will find a self-contained overview of the state of the art and open problems.

II. ANATOMY OF A MODERN SLAM SYSTEM

The architecture of a SLAM system includes two main components: the front-end and the back-end. The front-end abstracts sensor data into models that are amenable for estimation, while the back-end performs inference on the abstracted data produced by the front-end. This architecture is summarized in Fig. 2. We review both components, starting from the back-end.

Fig. 2: Front-end and back-end in a typical SLAM system.

Maximum-a-posteriori (MAP) estimation and the SLAM back-end. The current de-facto standard formulation of SLAM has its origins in the seminal paper of Lu and Milios [199], followed by the work of Gutmann and Konolige [133]. Since then, numerous approaches have improved the efficiency and robustness of the optimization underlying the problem [88, 108, 132, 159, 160, 235, 301]. All these approaches formulate SLAM as a maximum-a-posteriori estimation problem, and often use the formalism of factor graphs [180] to reason about the interdependence among variables.

Assume that we want to estimate an unknown variable $\mathcal{X}$; as mentioned before, in SLAM the variable $\mathcal{X}$ typically includes the trajectory of the robot (as a discrete set of poses) and the position of landmarks in the environment. We are given a set of measurements $\mathcal{Z} = \{z_k : k = 1, \dots, m\}$ such that each measurement can be expressed as a function of $\mathcal{X}$, i.e., $z_k = h_k(\mathcal{X}_k) + \epsilon_k$, where $\mathcal{X}_k \subseteq \mathcal{X}$ is a subset of the variables, $h_k(\cdot)$ is a known function (the measurement or observation model), and $\epsilon_k$ is random measurement noise.
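To make the notation concrete, here is a minimal sketch of a measurement model (our own illustrative example, not from the paper): a range-bearing observation $h_k$ of a 2D landmark from a robot pose, where $\mathcal{X}_k$ is the (pose, landmark) pair.

```python
import numpy as np

def h_range_bearing(pose, landmark):
    """Range-bearing observation model h_k with X_k = (pose, landmark).

    pose: (x, y, theta) of the robot; landmark: (lx, ly).
    Returns the predicted measurement h_k(X_k), without noise.
    """
    x, y, theta = pose
    dx, dy = landmark[0] - x, landmark[1] - y
    rng = np.hypot(dx, dy)                # distance to the landmark
    bearing = np.arctan2(dy, dx) - theta  # angle in the robot frame
    bearing = np.arctan2(np.sin(bearing), np.cos(bearing))  # wrap to (-pi, pi]
    return np.array([rng, bearing])

# a noisy measurement would then be z_k = h_range_bearing(pose, lm) + eps_k
```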

In MAP estimation, we estimate $\mathcal{X}$ by computing the assignment of variables $\mathcal{X}^\star$ that attains the maximum of the posterior $p(\mathcal{X}|\mathcal{Z})$ (the belief over $\mathcal{X}$ given the measurements):

$$\mathcal{X}^\star \doteq \operatorname*{argmax}_{\mathcal{X}}\; p(\mathcal{X}|\mathcal{Z}) = \operatorname*{argmax}_{\mathcal{X}}\; p(\mathcal{Z}|\mathcal{X})\, p(\mathcal{X}) \qquad (1)$$

where the equality follows from the Bayes theorem. In (1), $p(\mathcal{Z}|\mathcal{X})$ is the likelihood of the measurements $\mathcal{Z}$ given the assignment $\mathcal{X}$, and $p(\mathcal{X})$ is a prior probability over $\mathcal{X}$. The prior probability includes any prior knowledge about $\mathcal{X}$; in case no prior knowledge is available, $p(\mathcal{X})$ becomes a constant (uniform distribution) which is inconsequential and can be dropped from the optimization. In that case MAP estimation reduces to maximum likelihood estimation.

Assuming that the measurements $\mathcal{Z}$ are independent (i.e., the noise terms affecting the measurements are not correlated), problem (1) factorizes into:

$$\mathcal{X}^\star = \operatorname*{argmax}_{\mathcal{X}}\; p(\mathcal{X}) \prod_{k=1}^{m} p(z_k|\mathcal{X}) = \operatorname*{argmax}_{\mathcal{X}}\; p(\mathcal{X}) \prod_{k=1}^{m} p(z_k|\mathcal{X}_k) \qquad (2)$$

where, on the right-hand side, we noticed that $z_k$ only depends on the subset of variables in $\mathcal{X}_k$.

Problem (2) can be interpreted in terms of inference over a factor graph [180]. The variables correspond to nodes in the factor graph. The terms $p(z_k|\mathcal{X}_k)$ and the prior $p(\mathcal{X})$ are called factors, and they encode probabilistic constraints over a subset of nodes. A factor graph is a graphical model that encodes the dependence between the $k$-th factor (and its measurement $z_k$) and the corresponding variables $\mathcal{X}_k$. A first advantage of the factor graph interpretation is that it enables an insightful visualization of the problem. Fig. 3 shows an example of a factor graph underlying a simple SLAM problem (more details in Example 1 below). The figure shows the variables, namely, the robot poses, the landmark positions, and the camera calibration parameters, and the factors imposing constraints among these variables. A second advantage is generality: a factor graph can model complex inference problems with heterogeneous variables and factors, and arbitrary interconnections; the connectivity of the factor graph in turn influences the sparsity of the resulting SLAM problem, as discussed below.

Fig. 3: SLAM as a factor graph: blue circles denote robot poses at consecutive time steps ($x_1, x_2, \dots$), green circles denote landmark positions ($l_1, l_2, \dots$), and the red circle denotes the variable associated with the intrinsic calibration parameters ($K$). Factors are shown as black dots: the label "u" marks factors corresponding to odometry constraints, "v" marks factors corresponding to camera observations, "c" denotes loop closures, and "p" denotes prior factors.
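The factor graph abstraction maps naturally to code. The following is a minimal illustrative sketch (class and field names are ours, not those of any SLAM library): each factor stores the keys of the variables $\mathcal{X}_k$ it touches, its measurement $z_k$, its model $h_k$, and its information matrix $\Omega_k$.

```python
import numpy as np

class Factor:
    """A factor encoding p(z_k | X_k) for a subset X_k of the variables."""
    def __init__(self, keys, z, h, omega):
        self.keys = keys      # keys of the variables forming X_k
        self.z = z            # measurement z_k
        self.h = h            # observation model h_k
        self.omega = omega    # information matrix Omega_k

    def residual(self, values):
        """h_k(X_k) - z_k evaluated at the current variable estimates."""
        return self.h(*[values[k] for k in self.keys]) - self.z

class FactorGraph:
    """Variables are nodes; factors connect the subsets they constrain."""
    def __init__(self):
        self.factors = []

    def add(self, factor):
        self.factors.append(factor)

    def cost(self, values):
        """Negative log-posterior up to a constant:
        the sum of ||h_k(X_k) - z_k||^2_{Omega_k} over all factors."""
        return sum(float(f.residual(values) @ f.omega @ f.residual(values))
                   for f in self.factors)
```

Note that in this view a prior $p(\mathcal{X})$ is simply another factor attached to a subset of the variables, which is why the sum in (5) below starts at $k = 0$.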

In order to write (2) in a more explicit form, assume that the measurement noise $\epsilon_k$ is a zero-mean Gaussian noise with information matrix $\Omega_k$ (the inverse of the covariance matrix). Then, the measurement likelihood in (2) becomes:

$$p(z_k|\mathcal{X}_k) \propto \exp\!\left(-\frac{1}{2}\, \| h_k(\mathcal{X}_k) - z_k \|^2_{\Omega_k}\right) \qquad (3)$$

where we use the notation $\|e\|^2_{\Omega} = e^{\mathsf{T}} \Omega e$. Similarly, assume that the prior can be written as:

$$p(\mathcal{X}) \propto \exp\!\left(-\frac{1}{2}\, \| h_0(\mathcal{X}) - z_0 \|^2_{\Omega_0}\right) \qquad (4)$$

for some given function $h_0(\cdot)$, prior mean $z_0$, and information matrix $\Omega_0$.
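As a quick numerical illustration of the norm used in (3) and (4) (our snippet, with made-up numbers): for a diagonal $\Omega$, the Mahalanobis norm $\|e\|^2_{\Omega}$ is just the sum of squared standardized residuals.

```python
import numpy as np

e = np.array([0.3, -0.1])        # residual h_k(X_k) - z_k
sigma = np.array([0.1, 0.05])    # per-component standard deviations
Omega = np.diag(1.0 / sigma**2)  # information matrix (inverse covariance)

mahalanobis_sq = e @ Omega @ e             # ||e||^2_Omega
standardized = np.sum((e / sigma) ** 2)    # same value in the diagonal case
assert np.isclose(mahalanobis_sq, standardized)  # both equal 13.0

neg_log_lik = 0.5 * mahalanobis_sq  # -log p(z_k | X_k), up to a constant
```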

Since maximizing the posterior is the same as minimizing the negative log-posterior, the MAP estimate in (2) becomes:

$$\mathcal{X}^\star = \operatorname*{argmin}_{\mathcal{X}} \; -\!\log\!\left( p(\mathcal{X}) \prod_{k=1}^{m} p(z_k|\mathcal{X}_k) \right) = \operatorname*{argmin}_{\mathcal{X}} \sum_{k=0}^{m} \| h_k(\mathcal{X}_k) - z_k \|^2_{\Omega_k} \qquad (5)$$

which is a nonlinear least squares problem since, in most problems of interest in robotics, $h_k(\cdot)$ is a nonlinear function. Note that the formulation (5) follows from the assumption of Normally distributed noise. Other assumptions for the noise distribution lead to different cost functions; for instance, if the noise follows a Laplace distribution, the squared $\ell_2$-norm in (5) is replaced by the $\ell_1$-norm. To increase resilience to outliers, it is also common to substitute the squared $\ell_2$-norm in (5) with robust loss functions (e.g., Huber or Tukey loss) [146].
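As a concrete illustration of the last point (a sketch of ours, not the paper's), the Huber loss grows quadratically for small residuals and only linearly for large ones, which limits the influence of outliers compared with the squared $\ell_2$ cost in (5):

```python
import numpy as np

def huber(e, delta=1.0):
    """Huber loss: quadratic for |e| <= delta, linear beyond."""
    a = np.abs(e)
    return np.where(a <= delta, 0.5 * e**2, delta * (a - 0.5 * delta))

residuals = np.array([0.1, 0.5, 5.0])  # the last residual is an outlier
print(0.5 * residuals**2)  # squared loss: [ 0.005  0.125  12.5 ]
print(huber(residuals))    # Huber loss:   [ 0.005  0.125   4.5 ]
```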

The computer vision expert may notice a resemblance between problem (5) and bundle adjustment (BA) in Structure from Motion [305]; both (5) and BA indeed stem from a maximum-a-posteriori formulation. However, two key features make SLAM unique. First, the factors in (5) are not constrained to model projective geometry as in BA, but include a broad variety of sensor models, e.g., inertial sensors, wheel encoders, GPS, to mention a few. For instance, in laser-based mapping, the factors usually constrain relative poses corresponding to different viewpoints, while in direct methods for visual SLAM, the factors penalize differences in pixel intensities across different views of the same portion of the scene. The second difference with respect to BA is that problem (5) needs to be solved incrementally: new measurements are made available at each time step as the robot moves.

The minimization problem (5) is commonly solved via successive linearizations, e.g., the Gauss-Newton (GN) or the Levenberg-Marquardt methods (more modern approaches, based on convex relaxations and Lagrangian duality, are reviewed in Section VII). The Gauss-Newton method proceeds iteratively, starting from a given initial guess $\hat{\mathcal{X}}$. At each iteration, GN approximates the minimization (5) as

$$\delta^\star_{\mathcal{X}} = \operatorname*{argmin}_{\delta_{\mathcal{X}}} \sum_{k=0}^{m} \| A_k \delta_{\mathcal{X}} - b_k \|^2_{\Omega_k} = \operatorname*{argmin}_{\delta_{\mathcal{X}}} \| A\, \delta_{\mathcal{X}} - b \|^2_{\Omega} \qquad (6)$$

where $\delta_{\mathcal{X}}$ is a small "correction" with respect to the linearization point $\hat{\mathcal{X}}$, $A_k \doteq \frac{\partial h_k(\mathcal{X})}{\partial \mathcal{X}}$ is the Jacobian of the measurement function $h_k(\cdot)$ with respect to $\mathcal{X}$, and $b_k \doteq z_k - h_k(\hat{\mathcal{X}}_k)$ is the residual error at $\hat{\mathcal{X}}$; on the right-hand side of (6), $A$ (resp. $b$) is obtained by stacking the matrices $A_k$ (resp. the vectors $b_k$), and $\Omega$ is a block-diagonal matrix including the measurement information matrices $\Omega_k$ as diagonal blocks.

The optimal correction $\delta^\star_{\mathcal{X}}$ which minimizes (6) can be computed in closed form as:

$$\delta^\star_{\mathcal{X}} = (A^{\mathsf{T}} \Omega A)^{-1} A^{\mathsf{T}} \Omega\, b \qquad (7)$$

and, at each iteration, the linearization point is updated via $\hat{\mathcal{X}} \leftarrow \hat{\mathcal{X}} + \delta^\star_{\mathcal{X}}$. The matrix $(A^{\mathsf{T}} \Omega A)$ is usually referred to as the (approximate) Hessian. Note that this matrix is indeed only an approximation of the Hessian of the cost function (5); moreover, its invertibility is connected to the observability properties of the underlying estimation problem.
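Equations (6) and (7) translate almost line-for-line into code. Below is a minimal dense sketch of ours for intuition only; production solvers exploit the sparsity of $A$ and factorize the approximate Hessian rather than solving the normal equations densely (see below).

```python
import numpy as np
from scipy.linalg import block_diag

def gauss_newton(x0, factors, iters=20, tol=1e-9):
    """Dense Gauss-Newton over factors = [(h, z, Omega, jac), ...],
    where h(x) predicts z and jac(x) is the Jacobian of h at x."""
    x = x0.astype(float).copy()
    for _ in range(iters):
        A = np.vstack([jac(x) for h, z, Omega, jac in factors])         # stacked A_k
        b = np.concatenate([z - h(x) for h, z, Omega, jac in factors])  # residuals b_k
        W = block_diag(*[Omega for h, z, Omega, jac in factors])        # stacked Omega
        dx = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)                  # eq. (7)
        x += dx                                           # update linearization point
        if np.linalg.norm(dx) < tol:
            break
    return x

# usage: estimate x in R^2 from two linear observations z = H x + noise
H1, H2 = np.array([[1.0, 0.0]]), np.array([[1.0, 1.0]])
factors = [
    (lambda x: H1 @ x, np.array([1.0]), np.eye(1), lambda x: H1),
    (lambda x: H2 @ x, np.array([3.0]), np.eye(1), lambda x: H2),
]
print(gauss_newton(np.zeros(2), factors))  # -> approximately [1. 2.]
```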

So far we assumed that $\mathcal{X}$ belongs to a vector space (for this reason the sum $\hat{\mathcal{X}} + \delta^\star_{\mathcal{X}}$ is well defined). When $\mathcal{X}$ includes variables belonging to a smooth manifold (e.g., rotations), the structure of the GN method remains unaltered, but the usual sum (e.g., $\hat{\mathcal{X}} + \delta^\star_{\mathcal{X}}$) is substituted with a suitable mapping, called a retraction [1]. In the robotics literature, the retraction is often denoted with the operator $\oplus$, and maps the "small correction" $\delta^\star_{\mathcal{X}}$, defined in the tangent space of the manifold at $\hat{\mathcal{X}}$, to an element of the manifold, i.e., the linearization point is updated as $\hat{\mathcal{X}} \leftarrow \hat{\mathcal{X}} \oplus \delta^\star_{\mathcal{X}}$. We also note that the derivation presented so far can be extended to the case in which the noise is not additive, as long as the noise can be written as an explicit function of the state and the measurements.
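For intuition, here is a small sketch of ours of one possible $\oplus$ for 2D poses: the correction is applied in the tangent space, and the rotation component is re-wrapped so the result stays on the manifold. (This is one simple choice of retraction among several used in practice.)

```python
import numpy as np

def wrap(theta):
    """Wrap an angle to (-pi, pi]."""
    return np.arctan2(np.sin(theta), np.cos(theta))

def oplus(pose, delta):
    """A simple retraction for 2D poses: add the tangent-space correction,
    then re-normalize the angle. pose = (x, y, theta), delta = (dx, dy, dtheta)."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    return np.array([x + dx, y + dy, wrap(theta + dtheta)])

pose = np.array([1.0, 2.0, 3.1])
print(oplus(pose, np.array([0.0, 0.0, 0.1])))  # theta wraps past pi: ~ -3.083
```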

The key insight behind modern SLAM solvers is that the Jacobian matrix $A$ appearing in (7) is sparse, and its sparsity structure is dictated by the topology of the underlying factor graph. This enables the use of fast linear solvers to compute $\delta^\star_{\mathcal{X}}$ [159, 160, 183, 252]. Moreover, it allows designing incremental (or online) solvers, which update the estimate of $\mathcal{X}$ as new observations are acquired [159, 160, 252]. Current SLAM libraries (e.g., GTSAM [86], g2o [183], Ceres [269], iSAM [160], and SLAM++ [252]) are able to solve problems with tens of thousands of variables in a few seconds. The hands-on tutorials [86, 129] provide excellent introductions to two of the most popular SLAM libraries; each library also includes a set of examples showcasing real SLAM problems.

The SLAM formulation described so far is commonly referred to as maximum-a-posteriori estimation, factor graph optimization, graph-SLAM, full smoothing, or smoothing and mapping (SAM). A popular variation of this framework is pose graph optimization, in which the variables to be estimated are poses sampled along the trajectory of the robot, and each factor imposes a constraint on a pair of poses.
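To make the pose graph variation concrete, the following minimal sketch uses the Python bindings of GTSAM [86] (assuming a GTSAM 4.x installation; the specific poses and noise values are made up for illustration). Four 2D poses around a square are constrained by a prior factor, three odometry factors, and one loop closure:

```python
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))

# a prior factor ("p") anchors the first pose; "u" factors encode odometry
graph.add(gtsam.PriorFactorPose2(1, gtsam.Pose2(0, 0, 0), prior_noise))
step = gtsam.Pose2(2, 0, np.pi / 2)  # move forward 2 m, turn left 90 degrees
graph.add(gtsam.BetweenFactorPose2(1, 2, step, odom_noise))
graph.add(gtsam.BetweenFactorPose2(2, 3, step, odom_noise))
graph.add(gtsam.BetweenFactorPose2(3, 4, step, odom_noise))
# a loop closure factor ("c"): pose 4 re-observes pose 1
graph.add(gtsam.BetweenFactorPose2(4, 1, step, odom_noise))

# deliberately perturbed initial guess, corrected by the optimizer
initial = gtsam.Values()
initial.insert(1, gtsam.Pose2(0.1, -0.1, 0.05))
initial.insert(2, gtsam.Pose2(2.2, 0.1, 1.5))
initial.insert(3, gtsam.Pose2(1.9, 2.1, 3.2))
initial.insert(4, gtsam.Pose2(-0.1, 1.9, -1.6))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
for k in range(1, 5):
    print(result.atPose2(k))  # poses snap back to the square
```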

MAP estimation has been proven to be more accurate and efficient than the original approaches for SLAM based on nonlinear filtering. We refer the reader to the surveys [14, 94] for an overview of filtering approaches, and to [291] for a comparison between filtering and smoothing. However, we remark that some SLAM systems based on the EKF have also been demonstrated to attain state-of-the-art performance. Excellent examples of EKF-based SLAM systems include the Multi-State Constraint Kalman Filter of Mourikis and Roumeliotis [214], and the VIN systems of Kottas et al. [176] and Hesch et al. [139]. Not surprisingly, the performance mismatch between filtering and MAP estimation gets smaller when the linearization point for the EKF is accurate (as happens in visual-inertial navigation problems), when using sliding-window filters, and when potential sources of inconsistency in the EKF are taken care of [139, 143, 176].

As discussed in the next section, MAP estimation is usually performed on a pre-processed version of the sensor data. In this regard, it is often referred to as the SLAM back-end.

Sensor-dependent SLAM front-end. In practical robotics applications, it might be hard to write the sensor measurements directly as an analytic function of the state, as required in MAP estimation. For instance, if the raw sensor data is an image, it might be hard to express the intensity of each pixel as a function of the SLAM state; the same difficulty arises with simpler sensors (e.g., a laser with a single beam). In both cases the issue is connected with the fact that we are not able to design a sufficiently general, yet tractable, representation of the environment; even in the presence of such a general representation, it would be hard to write an analytic function that connects the measurements to the parameters of such a representation.

For this reason, before the SLAM back-end, it is common to have a module, the front-end, that extracts relevant features from the sensor data. For instance, in vision-based SLAM, the front-end extracts the pixel locations of a few distinguishable points in the environment; pixel observations of these points are now easy to model within the back-end (see Example 1).
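As a concrete illustration (our sketch, using OpenCV rather than any specific system from the paper; the image filenames are placeholders), a typical visual front-end detects keypoints, describes them, and matches them across frames; the matched pixel locations become the measurements handed to the back-end:

```python
import cv2

# detect and describe a few distinguishable points in two frames
orb = cv2.ORB_create(nfeatures=500)
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # placeholder filenames
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# match descriptors across frames: these pixel correspondences
# become the measurements z_k handed to the back-end
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
pixels = [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches[:100]]
```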

The front-end is also in charge of associating each measurement to a specific landmark (say, a 3D point) in the environment: this is the so-called data association. More abstractly, the data association module associates each measurement $z_k$ with a subset of unknown variables $\mathcal{X}_k$ such that $z_k = h_k(\mathcal{X}_k) + \epsilon_k$.
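A common baseline for this module (a sketch of ours, not an algorithm prescribed by the paper) is nearest-neighbor association with a chi-square gate: $z_k$ is assigned to the landmark whose predicted measurement is closest in Mahalanobis distance, and a new landmark is declared when nothing falls within the gate:

```python
import numpy as np

def associate(z, predictions, S_inv, gate=9.21):
    """Nearest-neighbor data association with a chi-square gate.

    z: measurement (dim 2); predictions[j]: predicted measurement for
    landmark j; S_inv[j]: inverse innovation covariance for landmark j;
    gate: chi-square threshold (9.21 ~ 99% quantile for 2 DOF).
    Returns the index of the matched landmark, or None (new landmark).
    """
    best_j, best_d2 = None, gate
    for j, (z_hat, Sj_inv) in enumerate(zip(predictions, S_inv)):
        nu = z - z_hat                 # innovation
        d2 = float(nu @ Sj_inv @ nu)   # squared Mahalanobis distance
        if d2 < best_d2:
            best_j, best_d2 = j, d2
    return best_j
```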
