RANSAC Algorithm Explained, with MATLAB Toolbox Usage

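Updated 2024-07-29 · PDF, 1.21 MB

This document gives a detailed explanation of the RANSAC (Random Sample Consensus) algorithm and shows how to use it through a Matlab toolbox. The author, Marco Zuliani, provides a guide with worked examples that runs on both Matlab and Octave. It covers the basic concepts behind RANSAC, the problem of parameter estimation in the presence of outliers, and how the RANSAC algorithm works.

RANSAC (Random Sample Consensus) is a method for estimating model parameters from noisy data, and it is particularly suited to data containing a large fraction of outliers. In computer vision, image processing, and geometric modeling it is widely used for tasks such as line detection, plane estimation, and feature matching.

1. Introduction

RANSAC estimates model parameters from a dataset contaminated by noise and outliers. It repeatedly draws random subsets of the data, looking for a simple model that best explains most of the data. The core idea is that part of the data, the inliers, is consistent with a particular model, while the remaining points, the outliers, are not. RANSAC can still be applied when outliers make up more than 50% of the dataset.

2. Parameter Estimation and the Outlier Problem

2.1 2D Line Estimation

For intuition, the document uses 2D line estimation as its running example. In this example the line is estimated by minimizing the distances from the inlier points to the fitted line. Maximum Likelihood Estimation (MLE) is commonly used to solve for model parameters, but when outliers are present the estimate can be pulled away by them and become inaccurate.

2.2 Outliers, Bias, and Breakdown Point

2.2.1 Outliers: data points that do not follow the pattern of the majority of the data. They may be caused by measurement errors, data contamination, or special conditions in the target environment.

2.2.2 Bias: when outliers are present, the parameter estimate can become biased, that is, it deviates from the true value.

2.2.3 Breakdown point: the critical proportion of outliers at which the model estimate fails completely.

2.3 Breakdown Point of the 2D Linear Least Squares Estimator

The breakdown point is the largest proportion of outliers that the linear least squares estimator can tolerate; beyond that proportion the estimator no longer yields a valid model.

3. Random Sample Consensus (RANSAC)

The RANSAC workflow consists of the following steps:

1. Select a random minimal subset of the data (for a line, 2 points) and fit the model to it.
2. Compute the residuals of the remaining data points with respect to the model, and mark the points whose residual is below a threshold as inliers.
3. If the number of inliers exceeds a preset minimum, re-estimate the model from those inliers.
4. Repeat steps 1-3 until the iteration limit is reached or the best model is found.
5. Take as the final model the one supported by the most inliers.

The Matlab toolbox provides functions and tools for applying RANSAC, for example to estimate lines, planes, or other geometric structures. The document may also include example code for Matlab and Octave to help readers understand and implement the algorithm.

RANSAC is an effective tool for dealing with outliers: by iterating and keeping the best model, it can extract useful information from cluttered data. In practice, setting parameters such as the number of iterations, the threshold, and the minimum number of inliers correctly is essential for obtaining a high-quality model.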


of a model using datasets containing more than 50% of outliers.

Despite many modifications, the RANSAC algorithm is essentially composed of two steps that are repeated in an iterative fashion (hypothesize-and-test framework):

• Hypothesize. First minimal sample sets (MSSs) are randomly selected from the input dataset and the model parameters are computed using only the elements of the MSS. The cardinality of the MSS is the smallest sufficient to determine the model parameters (as opposed to other approaches, such as least squares, where the parameters are estimated using all the data available, possibly with appropriate weights).

• Test. In the second step RANSAC checks which elements of the entire dataset are consistent with the model instantiated with the parameters estimated in the first step. The set of such elements is called consensus set (CS).

RANSAC terminates when the probability of finding a better ranked CS drops below a certain threshold. In the original formulation the ranking of the CS was its cardinality (i.e. CSs that contain more elements are ranked better than CSs that contain fewer elements).
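The two steps above can be sketched for 2D line fitting as follows. This is a minimal illustration, not the toolbox implementation: the function names, the synthetic data, and the threshold `delta=0.1` are all assumptions made for the example. The stopping rule is replaced by a fixed iteration budget from a commonly used bound: if a fraction w of the data are inliers, one MSS of cardinality k contains only inliers with probability of roughly w^k, so T iterations all miss with probability (1 − w^k)^T, and requiring this to be at most 1 − p gives T ≥ log(1 − p) / log(1 − w^k). The inlier ratio and the confidence p are inputs chosen by the user here, not values taken from the text.

```python
import math
import random


def required_iterations(inlier_ratio, mss_size, confidence):
    """Smallest T with (1 - inlier_ratio**mss_size)**T <= 1 - confidence."""
    p_good = inlier_ratio ** mss_size  # chance that one MSS holds only inliers
    if p_good >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good))


def line_through(p, q):
    """Hypothesize: line a*x + b*y + c = 0 (a^2 + b^2 = 1) through two distinct points."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    return a / norm, b / norm, -(a * p[0] + b * p[1]) / norm


def consensus_set(points, line, delta):
    """Test: all data whose distance to the line is at most delta."""
    a, b, c = line
    return [d for d in points if abs(a * d[0] + b * d[1] + c) <= delta]


def ransac_line(points, delta, iterations, rng):
    """Keep the hypothesis whose consensus set has the largest cardinality."""
    best_line, best_cs = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)  # MSS: two distinct data points
        line = line_through(p, q)
        cs = consensus_set(points, line, delta)
        if len(cs) > len(best_cs):  # rank consensus sets by cardinality
            best_line, best_cs = line, cs
    return best_line, best_cs


# Synthetic data (assumed for illustration): 20 points exactly on y = 2x + 1
# and 10 points far away from that line.
inliers = [(float(i), 2.0 * i + 1.0) for i in range(20)]
outliers = [(float(i), 3.0 * i + (11.0 if i % 2 else -14.0)) for i in range(10)]
data = inliers + outliers

budget = required_iterations(inlier_ratio=0.5, mss_size=2, confidence=0.999)
line, cs = ransac_line(data, delta=0.1, iterations=budget, rng=random.Random(0))
slope, intercept = -line[0] / line[1], -line[2] / line[1]
```

The sketch stops at the best-ranked consensus set. It does not re-estimate the line from that set, and it assumes all data points are distinct.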

3.2 Preliminaries

To facilitate the discussion that follows, it is convenient to introduce a suitable formalism to describe the steps for the estimation of the model parameters and for the construction of the CS. As usual we will denote vectors with boldface letters, and the superscript $(h)$ will indicate the $h$-th iteration. The symbol $\hat{x}$ indicates the estimated value of the quantity $x$. The input dataset, which is composed of $N$ elements, is indicated by $D = \{\mathbf{d}_1, \ldots, \mathbf{d}_N\}$, and we will indicate a MSS with the letter $s$. Let $\boldsymbol{\theta}(\{\mathbf{d}_1, \ldots, \mathbf{d}_h\})$ be the parameter vector estimated using the set of data $\{\mathbf{d}_1, \ldots, \mathbf{d}_h\}$, where $h \geq k$ and $k$ is the cardinality of the MSS. (Suppose we want to estimate a line: in this case the cardinality of the MSS is 2, since at least two distinct points are needed to uniquely define a line.) The model space $M$ is defined as:

$$M(\boldsymbol{\theta}) \stackrel{\text{def}}{=} \left\{ \mathbf{d} \in \mathbb{R}^n : f_M(\mathbf{d}; \boldsymbol{\theta}) = 0 \right\}$$

where $\boldsymbol{\theta}$ is a parameter vector and $f_M$ is a smooth function whose zero level set contains all the points that fit the model $M$ instantiated with the parameter vector $\boldsymbol{\theta}$. We define the error associated with the datum $\mathbf{d}$ with respect to the model space as the distance from $\mathbf{d}$ to $M(\boldsymbol{\theta})$:

$$e_M(\mathbf{d}, \boldsymbol{\theta}) \stackrel{\text{def}}{=} \min_{\mathbf{d}' \in M(\boldsymbol{\theta})} \operatorname{dist}(\mathbf{d}, \mathbf{d}')$$

where $\operatorname{dist}(\cdot, \cdot)$ is an appropriate distance function. Using this error metric, we define the CS as:

$$S(\boldsymbol{\theta}) \stackrel{\text{def}}{=} \left\{ \mathbf{d} \in D : e_M(\mathbf{d}; \boldsymbol{\theta}) \leq \delta \right\} \tag{3.1}$$

where $\delta$ is a threshold that can either be inferred from the nature of the problem or, under certain hypotheses, estimated automatically [WS04] (see Figure 3.1 for a pictorial representation of the previous definitions). In the former case, if we want to relate the value of $\delta$ to the statistics of the noise that affects the data and the distance function is the Euclidean norm, we can write:

$$e_M(\mathbf{d}, \boldsymbol{\theta}) = \min_{\mathbf{d}' \in M(\boldsymbol{\theta})} \sqrt{\sum_{i=1}^{n} (d_i - d'_i)^2} = \sqrt{\sum_{i=1}^{n} (d_i - d^*_i)^2}$$

where $\mathbf{d}^*$ is the orthogonal projection of $\mathbf{d}$ onto the model space $M(\boldsymbol{\theta})$.

Figure 3.1: This figure pictorially displays the model space $M$ as a green surface (the locus for which $f_M(\mathbf{d}; \boldsymbol{\theta}) = 0$). The yellow surfaces represent the boundaries for a datum to be considered an inlier (imagine that the distance function is the Euclidean distance, hence the smallest distance between any two points on the yellow surface and green surface is $\delta$). Note that the structure of the green surface is defined both by the model $M$ and by the parameter vector $\boldsymbol{\theta}$. The inliers, represented as blue dots, lie in between the two yellow "crusts".

Now suppose that the datum $\mathbf{d}$ is affected by Gaussian noise $\boldsymbol{\eta} \sim N(\mathbf{0}, \sigma^2 I)$, so that $\boldsymbol{\eta} = \mathbf{d} - \mathbf{d}^*$. Our goal is to calculate the value of $\delta$ that bounds, with a given probability $P_{\text{inlier}}$, the error generated by a true inlier contaminated with Gaussian noise. More formally, we want to find the value $\delta$ such that:

$$P\left[ e_M(\mathbf{d}, \boldsymbol{\theta}) \leq \delta \right] = P_{\text{inlier}} \tag{3.2}$$

Following [HZ03], p. 118, we can write the following chain of equations:

$$P\left[ e_M(\mathbf{d}, \boldsymbol{\theta}) \leq \delta \right] = P\left[ \sum_{i=1}^{n} \eta_i^2 \leq \delta^2 \right] = P\left[ \sum_{i=1}^{n} \left( \frac{\eta_i}{\sigma} \right)^2 \leq \frac{\delta^2}{\sigma^2} \right]$$

and since $\eta_i / \sigma \sim N(0, 1)$, the random variable $\sum_{i=1}^{n} (\eta_i / \sigma)^2$ has a $\chi^2_n$ distribution. Hence:

$$\delta = \sigma \sqrt{F^{-1}_{\chi^2_n}(P_{\text{inlier}})} \tag{3.3}$$

where $F^{-1}_{\chi^2_n}$ is the inverse cumulative distribution function associated with a $\chi^2_n$ random variable. Figure 3.2(a) displays the function $F^{-1}_{\chi^2_n}$ for different values of $n$. Note that when $P_{\text{inlier}}$ tends to one (i.e. we want to pick an error threshold such that all the inliers will be considered) the value of $F^{-1}_{\chi^2_n}$ diverges to infinity. Values of $P_{\text{inlier}}$ close to one will return a large threshold, with the risk of including some outliers as well. On the other hand, too small values of $P_{\text{inlier}}$ will generate a value for $\delta$ which is too tight, and possibly some inliers will be discarded.
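As a rough numerical illustration of the threshold formula (3.3), the sketch below evaluates the inverse chi-square CDF with a power series and bisection, so that it needs only the standard library; in practice a statistics routine such as `scipy.stats.chi2.ppf` would normally be used instead. The function names, the noise level `sigma = 0.5`, and the choice `P_inlier = 0.95` are assumptions made for this example. It ends with a small simulation that checks what fraction of 2D Gaussian noise vectors fall within the computed threshold.

```python
import math
import random


def chi2_cdf(x, dof):
    """CDF of a chi-square variable with `dof` degrees of freedom.

    Uses the power series of the regularized lower incomplete gamma
    function P(dof / 2, x / 2); adequate for moderate x.
    """
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_inv_cdf(p, dof):
    """Inverse CDF by bisection; p must lie in [0, 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1): the inverse CDF diverges as p -> 1")
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, dof) < p:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inlier_threshold(sigma, dof, p_inlier):
    """delta = sigma * sqrt(F^{-1}(p_inlier)), as in equation (3.3)."""
    return sigma * math.sqrt(chi2_inv_cdf(p_inlier, dof))


# Assumed values for illustration: n = 2, sigma = 0.5, P_inlier = 0.95.
sigma, n, p_inlier = 0.5, 2, 0.95
delta = inlier_threshold(sigma, n, p_inlier)

# Empirical check: fraction of simulated noise vectors with norm <= delta.
rng = random.Random(1)
trials = 20000
hits = sum(
    math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) <= delta
    for _ in range(trials)
)
fraction_within = hits / trials
```

For n = 2 the inverse CDF has the closed form −2 ln(1 − P_inlier), which gives a convenient way to check the numerical routine. The simulated fraction should come out close to the chosen P_inlier.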

3.3 RANSAC Overview

A pictorial representation of the RANSAC fundamental iteration together with the notation just introduced is shown in Figure 3.3. As mentioned before, the RANSAC algorithm is composed of two steps that are repeated in an iterative fashion (hypothesize-and-test framework). First a MSS $s^{(h)}$ is selected from the input dataset and the model parameters $\boldsymbol{\theta}^{(h)}$ are computed using only the elements of the selected MSS. Then, in the second step, RANSAC checks which elements in the dataset $D$ are consistent with the model instantiated with the estimated parameters and, if it is the case, it updates the current best CS $S^*$ (which, in the original Fischler and Bolles
