spanning tree of R. Considering edges in non-decreasing
order by weight, each step of the algorithm merges
components R_1 and R_2 connected by the current edge if
the edge weight is less than:

min(Int(R_1) + τ(R_1), Int(R_2) + τ(R_2))    (1)

where τ(R) = k/|R| and k is a scale parameter that can be
used to set a preference for component size.
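As a concrete illustration of criterion (1), the merging step can be sketched with a union-find structure that tracks each component's size and internal difference Int(R); the graph construction, edge weights, and function names below are our own, not the reference implementation:

```python
class DisjointSet:
    """Union-find tracking, per component root, its size and its
    internal difference Int(R) (largest edge weight merged so far)."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

def segment(n, edges, k):
    """Graph-based merging in the style of criterion (1).
    edges: list of (weight, i, j) over nodes 0..n-1; k: scale parameter."""
    ds = DisjointSet(n)
    for w, i, j in sorted(edges):  # non-decreasing order by weight
        a, b = ds.find(i), ds.find(j)
        if a == b:
            continue
        # merge iff w < min(Int(R1) + k/|R1|, Int(R2) + k/|R2|)
        if w < min(ds.internal[a] + k / ds.size[a],
                   ds.internal[b] + k / ds.size[b]):
            ds.parent[b] = a
            ds.size[a] += ds.size[b]
            ds.internal[a] = w  # w is the largest weight merged so far
    return ds
```

Larger k biases the test toward merging, hence toward larger components.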
The Mean Shift algorithm [34] offers an alternative
clustering framework. Here, pixels are represented in
the joint spatial-range domain by concatenating their
spatial coordinates and color values into a single vector.
Applying mean shift filtering in this domain yields a
convergence point for each pixel. Regions are formed by
grouping together all pixels whose convergence points
are closer than h_s in the spatial domain and h_r in the
range domain, where h_s and h_r are the respective bandwidth
parameters. Additional merging can also be performed
to enforce a constraint on minimum region area.
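A minimal sketch of this joint-domain procedure, using a flat kernel and brute-force neighborhood queries (the function name and toy data are illustrative, not the cited implementation):

```python
import numpy as np

def mean_shift_labels(points, hs, hr, n_iter=30):
    """Flat-kernel mean shift in the joint spatial-range domain.
    points: (N, d) array; columns 0-1 are spatial coordinates,
    the remaining columns are range (e.g. color) values."""
    pts = points.astype(float)
    modes = pts.copy()
    for _ in range(n_iter):
        for i in range(len(modes)):
            ds = np.linalg.norm(pts[:, :2] - modes[i, :2], axis=1)
            dr = np.linalg.norm(pts[:, 2:] - modes[i, 2:], axis=1)
            win = (ds < hs) & (dr < hr)
            modes[i] = pts[win].mean(axis=0)  # shift toward window mean
    # group pixels whose convergence points are closer than hs and hr
    labels = np.full(len(pts), -1, dtype=int)
    cur = 0
    for i in range(len(pts)):
        if labels[i] >= 0:
            continue
        ds = np.linalg.norm(modes[:, :2] - modes[i, :2], axis=1)
        dr = np.linalg.norm(modes[:, 2:] - modes[i, 2:], axis=1)
        labels[(ds < hs) & (dr < hr) & (labels < 0)] = cur
        cur += 1
    return labels
```

A production implementation would add the minimum-region-area merging step and faster neighbor search.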
Spectral graph theory [48], and in particular the Nor-
malized Cuts criterion [45], [46], provides a way of
integrating global image information into the grouping
process. In this framework, given an affinity matrix W
whose entries encode the similarity between pixels, one
defines the diagonal matrix D_ii = Σ_j W_ij and solves for the
generalized eigenvectors of the linear system:

(D − W)v = λDv    (2)
Traditionally, after this step, K-means clustering is
applied to obtain a segmentation into regions. This ap-
proach often breaks uniform regions where the eigenvec-
tors have smooth gradients. One solution is to reweight
the affinity matrix [47]; others have proposed alternative
graph partitioning formulations [49], [50], [51].
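A small dense sketch of the spectral step in (2), for illustration only (practical systems use sparse solvers and W is built from image affinities):

```python
import numpy as np
from scipy.linalg import eigh

def ncut_embedding(W, n_vec=1):
    """Generalized eigenvectors of (D - W)v = lambda * D v, eq. (2).
    Returns the n_vec eigenvectors after the trivial constant one;
    clustering (e.g. K-means) on the rows of this embedding then
    yields a segmentation."""
    D = np.diag(W.sum(axis=1))           # D_ii = sum_j W_ij
    _, vecs = eigh(D - W, D)             # ascending eigenvalues
    return vecs[:, 1:1 + n_vec]          # skip the constant eigenvector
```

On a toy affinity matrix with two weakly linked cliques, the second eigenvector separates the cliques by sign, which is exactly what the subsequent K-means exploits.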
A recent variant of Normalized Cuts for image seg-
mentation is the Multiscale Normalized Cuts (NCuts)
approach of Cour et al. [33]. The fact that W must
be sparse, in order to avoid a prohibitively expensive
computation, limits the naive implementation to using
only local pixel affinities. Cour et al. solve this limitation
by computing sparse affinity matrices at multiple scales,
setting up cross-scale constraints, and deriving a new
eigenproblem for this constrained multiscale cut.
Sharon et al. [52] propose an alternative to improve
the computational efficiency of Normalized Cuts. This
approach, inspired by algebraic multigrid, iteratively
coarsens the original graph by selecting a subset of nodes
such that each variable on the fine level is strongly
coupled to one on the coarse level. The same merging
strategy is adopted in [31], where the strong coupling of
a subset S of the graph nodes V is formalized as:

(Σ_{j∈S} p_ij) / (Σ_{j∈V} p_ij) > ψ  ∀i ∈ V − S    (3)

where ψ is a constant and p_ij is the probability of merging
i and j, estimated from brightness and texture similarity.
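One way to read criterion (3) is as a greedy coarse-node selection rule: sweep the fine nodes and promote any node that is not yet strongly coupled to the current coarse set. This toy sketch (the probability matrix and threshold are hypothetical) captures that reading:

```python
import numpy as np

def select_coarse_nodes(P, psi):
    """Greedily build a coarse set S so that every node left on the
    fine level satisfies the strong-coupling test of eq. (3):
    sum_{j in S} p_ij / sum_{j in V} p_ij > psi."""
    totals = P.sum(axis=1)  # sum over j in V of p_ij
    coarse = []
    for i in range(len(P)):
        coupled = bool(coarse) and P[i, coarse].sum() / totals[i] > psi
        if not coupled:
            coarse.append(i)  # i must itself join the coarse level
    return coarse
```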
Many approaches to image segmentation fall into a
different category than those covered so far, relying on
the formulation of the problem in a variational frame-
work. An example is the model proposed by Mumford
and Shah [53], where the segmentation of an observed
image u_0 is given by the minimization of the functional:

F(u, C) = ∫_Ω (u − u_0)^2 dx + μ ∫_{Ω\C} |∇u|^2 dx + ν|C|    (4)
where u is piecewise smooth in Ω\C and μ, ν are weight-
ing parameters. Theoretical properties of this model can
be found in, e.g. [53], [54]. Several algorithms have been
developed to minimize the energy (4) or its simplified
version, where u is piecewise constant in Ω\C. Koepfler
et al. [55] proposed a region merging method for this
purpose. Chan and Vese [56], [57] follow a different
approach, expressing (4) in the level set formalism of
Osher and Sethian [58], [59]. Bertelli et al. [30] extend
this approach to more general cost functions based on
pairwise pixel similarities. Recently, Pock et al. [60] pro-
posed to solve a convex relaxation of (4), thus obtaining
robustness to initialization. Donoser et al. [29] subdivide
the problem into several figure/ground segmentations,
each initialized using low-level saliency and solved by
minimizing an energy based on Total Variation.
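The piecewise-constant simplification of (4) is easy to state concretely: each region is summarized by its mean, and |C| penalizes boundary length. A toy energy evaluation for a two-phase labeling, with |C| approximated by counting 4-neighbor label changes (one common discretization, assumed here):

```python
import numpy as np

def piecewise_constant_energy(u0, mask, nu):
    """Two-phase piecewise-constant energy (simplified eq. 4):
    squared deviation from each region's mean, plus nu * |C| with
    the curve length |C| approximated by the number of label
    changes between 4-connected neighbors."""
    c1 = u0[mask].mean() if mask.any() else 0.0
    c2 = u0[~mask].mean() if (~mask).any() else 0.0
    fidelity = ((u0[mask] - c1) ** 2).sum() + ((u0[~mask] - c2) ** 2).sum()
    length = (mask[1:, :] != mask[:-1, :]).sum() \
           + (mask[:, 1:] != mask[:, :-1]).sum()
    return fidelity + nu * length
```

Minimizers trade data fidelity against boundary length, so on a clean two-region image the true partition scores lower than a degenerate one.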
2.3 Benchmarks
Though much of the extensive literature on contour
detection predates its development, the BSDS [2] has
since found wide acceptance as a benchmark for this task
[23], [24], [25], [26], [27], [28], [35], [61]. The standard for
evaluating segmentation algorithms is less clear.
One option is to regard the segment boundaries
as contours and evaluate them as such. However, a
methodology that directly measures the quality of the
segments is also desirable. Some types of errors, e.g. a
missing pixel in the boundary between two regions, may
not be reflected in the boundary benchmark, but can
have substantial consequences for segmentation quality,
e.g. incorrectly merging large regions. One might argue
that the boundary benchmark favors contour detectors
over segmentation methods, since the former are not
burdened with the constraint of producing closed curves.
We therefore also consider various region-based metrics.
2.3.1 Variation of Information
The Variation of Information metric was introduced for
the purpose of clustering comparison [6]. It measures the
distance between two segmentations in terms of their
average conditional entropy, given by:

VI(S, S′) = H(S) + H(S′) − 2I(S, S′)    (5)

where H and I represent respectively the entropies of and
mutual information between two clusterings of data S
and S′. In our case, these clusterings are test and ground-
truth segmentations. Although VI possesses some inter-
esting theoretical properties [6], its perceptual meaning
and applicability in the presence of several ground-truth
segmentations remains unclear.
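Given two label maps over the same pixels, eq. (5) can be computed directly from the marginal and joint label distributions; a sketch:

```python
import numpy as np

def variation_of_information(s1, s2):
    """VI(S, S') = H(S) + H(S') - 2 I(S, S'), eq. (5), from two
    flat label arrays of equal length (one label per pixel)."""
    s1, s2 = np.ravel(s1), np.ravel(s2)
    n = s1.size

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / n
        return -(p * np.log2(p)).sum()

    # mutual information from the joint label histogram
    joint = {}
    for a, b in zip(s1, s2):
        joint[(a, b)] = joint.get((a, b), 0) + 1
    p1 = {a: np.mean(s1 == a) for a in np.unique(s1)}
    p2 = {b: np.mean(s2 == b) for b in np.unique(s2)}
    mi = sum(c / n * np.log2((c / n) / (p1[a] * p2[b]))
             for (a, b), c in joint.items())
    return entropy(s1) + entropy(s2) - 2 * mi
```

VI is zero exactly when the two segmentations are identical up to a relabeling, and grows as they diverge.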
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.