Interactive Image Segmentation: A Region Merging Method from the Hong Kong University of Science and Technology

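This article discusses research on image segmentation from the Hong Kong University of Science and Technology, in particular a semi-automatic interactive image segmentation method. The method centers on maximal similarity based region merging and aims to make image segmentation for computer vision and object recognition more efficient and more accurate. Fully automatic segmentation of natural images is usually very challenging, so an interactive scheme that needs only a few simple user operations is a practical alternative. The researchers, Jifeng Ning, Lei Zhang, David Zhang and colleagues, come from the Biometrics Research Center of the Hong Kong Polytechnic University, the State Key Laboratory of Integrated Services Networks at Xidian University in Xi'an, and the College of Information Engineering at Northwest A&F University. Their method is based on the maximal similarity principle, builds on the mean shift algorithm, and lets the user guide the segmentation by marking a few key points or region boundaries. The key steps described are:

1. **Region identification and initialization**: the system first uses the initial information supplied by the user to identify candidate regions, or regions of interest, in the image.
2. **Maximal similarity search**: the algorithm then looks among adjacent regions for the pair with the highest similarity; such regions may share common features or color distributions.
3. **Region merging**: when a region pair satisfies the maximal similarity rule, the two regions are merged into a new, larger region and the segmentation of the whole image is updated.
4. **Iterative refinement**: the process is iterative and continues until no new merging occurs, yielding a segmentation that meets the user's needs.
5. **Efficiency**: the researchers also stress the efficiency of the algorithm, so that it can respond quickly to user interaction.
6. **Applications**: because it is interactive and accurate, the method has broad potential applications, including but not limited to medical image analysis, video surveillance, autonomous driving and computer-aided design.

In summary, this work from the Hong Kong University of Science and Technology gives the image segmentation field a practical and flexible tool that lets users obtain a precise partition of image regions with minimal intervention, which matters for the practical use of computer vision.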

J. Ning et al. / Pattern Recognition 43 (2010) 445 -- 456 447

The RGB/Bhattacharyya descriptor is a very simple yet efficient way to represent the regions and measure their similarity. It has been successfully used to measure the similarity between the target model and the candidate model in the popular kernel-based object tracking method [27]. However, it should be stressed that other color spaces, such as the HSI color space, and other distance measures, such as the Euclidean distance between histogram vectors, can also be adopted in the proposed region merging scheme. In Section 3.3, we present examples using the HSI color space and the Euclidean distance, respectively. The results are similar to those obtained with the RGB/Bhattacharyya descriptor.
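
As a rough illustration of such a descriptor, the sketch below builds a normalized RGB histogram for a region and compares two regions with the Bhattacharyya coefficient. The 16 levels per channel, the function names and the pixel-list representation of a region are assumptions made for this example, not details taken from the passage above.

```python
import math
from collections import Counter

def color_histogram(pixels, bins_per_channel=16):
    """Normalized RGB histogram of an iterable of (r, g, b) pixels in 0..255.

    bins_per_channel is an assumed quantization level for this sketch.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_id: n / total for bin_id, n in counts.items()}

def bhattacharyya(hist_p, hist_q):
    """Bhattacharyya coefficient: sum over shared bins of sqrt(p_u * q_u)."""
    return sum(math.sqrt(p * hist_q[u]) for u, p in hist_p.items() if u in hist_q)

# Toy regions: one dark, one bright, one that mixes both.
dark = [(10, 10, 10)] * 4
bright = [(200, 200, 200)] * 4
mixed = dark[:2] + bright[:2]
same = bhattacharyya(color_histogram(dark), color_histogram(dark))
disjoint = bhattacharyya(color_histogram(dark), color_histogram(bright))
partial = bhattacharyya(color_histogram(dark), color_histogram(mixed))
```

The coefficient is largest for regions with identical color distributions and drops to zero when the histograms share no bins, which is what makes it usable as a similarity score between regions.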

2.2. Object and background marking

In interactive image segmentation, the users need to specify the object and background conceptually. Similar to [10,13,17], the users can input interactive information by drawing markers, which could be lines, curves and strokes on the image. The regions that have pixels inside the object markers are called object marker regions, while the regions that have pixels inside the background markers are called background marker regions. Fig. 1b shows examples of the object and background markers drawn with simple lines. We use green markers to mark the object and blue markers to mark the background. Please note that usually only a small portion of the object regions and background regions will be marked by the user. In fact, the fewer inputs the algorithm requires from the users, the more convenient and robust it is.

After object marking, each region will be labeled as one of three kinds of regions: the marker object region, the marker background region and the non-marker region. To completely extract the object contour, we need to automatically assign each non-marker region a correct label of either object region or background region. For the convenience of the following development, we denote by $M_O$ and $M_B$ the sets of marker object regions and marker background regions, respectively, and denote by $N$ the set of non-marker regions.
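
The three-way labeling can be sketched as follows, assuming the initial segmentation is available as a mapping from pixel coordinates to region ids and the user strokes as lists of pixel coordinates. These data structures and the function name are hypothetical, and the sketch does not resolve the case of a region touched by both kinds of marker, which the text does not discuss.

```python
def classify_regions(region_of_pixel, object_marker_pixels, background_marker_pixels):
    """Split region ids into marker object, marker background and non-marker sets.

    region_of_pixel: dict mapping (row, col) -> region id (assumed representation).
    A region becomes a marker region if any of its pixels lies under a stroke.
    """
    marker_object = {region_of_pixel[p] for p in object_marker_pixels}
    marker_background = {region_of_pixel[p] for p in background_marker_pixels}
    non_marker = set(region_of_pixel.values()) - marker_object - marker_background
    return marker_object, marker_background, non_marker

# Toy 2x2 image with three regions; one object stroke and one background stroke.
region_of_pixel = {(0, 0): 1, (0, 1): 1, (1, 0): 2, (1, 1): 3}
m_obj, m_bg, n_set = classify_regions(region_of_pixel, [(0, 0)], [(1, 1)])
```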

2.3. Maximal similarity based merging rule

After object/background marking, it is still a challenging problem to accurately extract the object contour from the background because only a small portion of the object/background features are indicated by the user. The conventional region merging methods merge two adjacent regions whose similarity is above a preset threshold [14, Chapter 6.3]. These methods have difficulties in adaptive threshold selection. A big threshold will lead to incomplete merging of the regions belonging to the object, while a small threshold can easily cause over-merging, i.e. some object regions are merged into the background. Moreover, it is difficult to judge when the region merging process should stop.

Object and background markers provide some key features of object and background, respectively. Similar to graph cut and marker based watershed [4], where the marker is the seed and starting point of the algorithm, the proposed region merging method also starts from the initial marker regions and all the non-marker regions will be gradually labeled as either object region or background region. The lazy snapping cutout method proposed in [17], which combines graph cut with watershed based initial segmentation, is actually a region merging method. It is controlled by a max-flow algorithm [11]. In this paper, we present an adaptive maximal similarity based merging mechanism to identify all the non-marker regions under the guidance of object and background markers.

Let $Q$ be an adjacent region of $R$ and denote by $\bar{S}_Q = \{S_i^Q\}_{i=1,2,\ldots,q}$ the set of $Q$'s adjacent regions. The similarities between $Q$ and all its adjacent regions, i.e. $\rho(Q, S_i^Q)$, $i = 1,2,\ldots,q$, are calculated. Obviously, $R$ is a member of $\bar{S}_Q$. If the similarity between $R$ and $Q$ is the maximal one among all the similarities $\rho(Q, S_i^Q)$, we will merge $R$ and $Q$. The following merging rule is defined:

Merge $R$ and $Q$ if

$$\rho(R, Q) = \max_{i=1,2,\ldots,q} \rho(Q, S_i^Q) \tag{2}$$

The merging rule (2) is very simple but it establishes the basis of the proposed region merging process. One important advantage of (2) is that it avoids presetting a similarity threshold for merging control. Although “max” is an operator that is sensitive to outliers, we empirically found that it works well in our algorithm. This is mainly because the histogram is a global descriptor of the local region and it is robust to noise and small variations. Meanwhile, the Bhattacharyya coefficient is the inner product of the two square-rooted histogram vectors and it is also robust to noise and variations.
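
A minimal sketch of the check behind rule (2) is given below. It assumes the region adjacency is stored as a dict of sets and that a symmetric similarity function is supplied by the caller; the function name, the region names and the toy scores are hypothetical.

```python
def satisfies_merging_rule(r, q, adjacency, similarity):
    """Rule (2): merge R and Q when the similarity between R and Q is the
    largest similarity between Q and any of Q's adjacent regions."""
    if r not in adjacency[q]:
        return False  # only adjacent regions are merge candidates
    best = max(similarity(q, s) for s in adjacency[q])
    return similarity(q, r) == best

# Toy example: Q has two neighbours, R and S, with made-up similarity scores.
scores = {frozenset({"Q", "R"}): 0.9, frozenset({"Q", "S"}): 0.4}
adjacency = {"Q": {"R", "S"}}

def toy_similarity(a, b):
    return scores[frozenset({a, b})]

merge_r = satisfies_merging_rule("R", "Q", adjacency, toy_similarity)
merge_s = satisfies_merging_rule("S", "Q", adjacency, toy_similarity)
```

Note that no threshold appears anywhere in the check: whether a pair merges depends only on how the candidate ranks among the neighbours of Q.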

The marker regions cover only a small part of the object and background. Those object regions that are not marked by the user, i.e. the non-marker object regions, should be identified and not be merged with the background. Since they are from the same object, the non-marker object regions will usually have higher similarity with the marker object regions than with the background regions. Therefore, in the automatic region merging process, the non-marker object regions are likely to be identified as object.

2.4. The merging process

The whole maximal similarity based region merging (MSRM) process can be divided into two stages, which are repeatedly executed until no new merging occurs. Our strategy is to merge as many background regions as possible while keeping object regions from being merged. Once we merge all the background regions, it is equivalent to extracting the desired object. Some two-step strategies have been used in [22,23] for image pyramid construction. Different from [22,23], the proposed strategy aims for image segmentation and it is guided by the markers input by users.

In the first stage, we try to merge marker background regions with their adjacent regions. For each region $B \in M_B$, we form the set of its adjacent regions $\bar{S}_B = \{A_i\}_{i=1,2,\ldots,r}$. Then for each $A_i$ with $A_i \notin M_B$, we form its set of adjacent regions $\bar{S}_{A_i} = \{S_j^{A_i}\}_{j=1,2,\ldots,k}$. It is obvious that $B \in \bar{S}_{A_i}$. The similarity between $A_i$ and each element in $\bar{S}_{A_i}$, i.e. $\rho(A_i, S_j^{A_i})$, is calculated. If $B$ and $A_i$ satisfy the rule (2), i.e.

$$\rho(A_i, B) = \max_{j=1,2,\ldots,k} \rho(A_i, S_j^{A_i}) \tag{3}$$

then $B$ and $A_i$ are merged into one region and the new region will have the same label as region $B$:

$$B = B \cup A_i \tag{4}$$

Otherwise, $B$ and $A_i$ will not merge.

The above procedure is iteratively implemented. Note that in each iteration, the sets $M_B$ and $N$ will be updated. Specifically, $M_B$ expands and $N$ shrinks. The iteration stops when none of the marker background regions in $M_B$ can find a new region to merge with.
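
The first stage can be sketched as a loop over the marker background regions, under several assumptions: regions are identified by ids, each region carries a histogram of bin counts stored in a `Counter`, adjacency is a dict of sets, and a merged region keeps the id of the marker background region that absorbed it. Restricting merge candidates to non-marker regions is this sketch's reading of the procedure, and all names are hypothetical.

```python
import math
from collections import Counter

def rho(h1, h2):
    """Bhattacharyya coefficient between two bin-count histograms."""
    n1, n2 = sum(h1.values()), sum(h2.values())
    return sum(math.sqrt((c / n1) * (h2[u] / n2)) for u, c in h1.items() if u in h2)

def merge_into(b, a, hist, adj):
    """B = B ∪ A: region a is absorbed by b, which keeps its id and label."""
    hist[b] = hist[b] + hist.pop(a)
    for s in adj.pop(a):
        adj[s].discard(a)
        if s != b:
            adj[s].add(b)
            adj[b].add(s)

def stage_one(m_bg, non_marker, hist, adj):
    """Merge non-marker regions into marker background regions until no
    marker background region finds a new merging partner."""
    changed = True
    while changed:
        changed = False
        for b in m_bg:
            for a in list(adj[b]):
                if a not in non_marker:
                    continue  # assumed: only non-marker regions are candidates
                best = max(rho(hist[a], hist[s]) for s in adj[a])
                if rho(hist[a], hist[b]) == best:  # rule (3)
                    merge_into(b, a, hist, adj)
                    non_marker.discard(a)
                    changed = True

# Toy chain of regions 0-1-2-3: 0 is a background marker, 3 an object marker.
hist = {0: Counter(x=10), 1: Counter(x=8, y=2), 2: Counter(x=1, y=9), 3: Counter(y=10)}
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
m_bg, non_marker = {0}, {1, 2}
stage_one(m_bg, non_marker, hist, adj)
```

In this toy run, region 1 is absorbed by the background marker because the marker is its most similar neighbour, while region 2 stays in the non-marker set because it is more similar to the object marker region.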

After the region merging of this stage, some non-marker background regions will be merged with the corresponding background markers. However, there are still non-marker background regions which cannot be merged because they have higher similarity scores with each other than with the marker background regions. Fig. 2a shows that after the first stage merging, many regions belonging to the background (leaves, branches, etc.) are merged but there are still some non-marker background regions left.

To complete the task of target object extraction, in the second stage we will focus on the non-marker regions in $N$ remaining from the first stage. Part of $N$ belongs to the background, while part of
