Learning color and locality cues for moving object detection and segmentation
Feng Liu and Michael Gleicher
Department of Computer Sciences, University of Wisconsin-Madison
1210 West Dayton Street, Madison, WI, 53706
{fliu|gleicher}@cs.wisc.edu
Abstract
This paper presents an algorithm for automatically de-
tecting and segmenting a moving object from a monocular
video. Detecting and segmenting a moving object from a
video with limited object motion is challenging. Since exist-
ing automatic algorithms rely on motion to detect the mov-
ing object, they cannot work well when the object motion is
sparse and insufficient. In this paper, we present an unsu-
pervised algorithm to learn object color and locality cues
from the sparse motion information. We first detect key
frames with reliable motion cues and then estimate mov-
ing sub-objects based on these motion cues using a Markov
Random Field (MRF) framework. From these sub-objects,
we learn an appearance model as a color Gaussian Mixture
Model. To avoid the false classification of background pix-
els with similar color to the moving objects, the locations
of these sub-objects are propagated to neighboring frames
as locality cues. Finally, robust moving object segmenta-
tion is achieved by combining these learned color and lo-
cality cues with motion cues in an MRF framework. Experi-
ments on videos with a variety of object and camera motion
demonstrate the effectiveness of this algorithm.
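As a concrete illustration of the appearance model named above (this sketch is ours, not the authors' implementation), a color Gaussian Mixture Model can be fit to pixels of the detected moving sub-objects with a few EM iterations; the component count, the diagonal covariances, and all function and variable names here are our own assumptions:

```python
import numpy as np

def fit_gmm(X, k=3, iters=20, seed=0):
    """Fit a diagonal-covariance color GMM to samples X (N x 3) via EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)].astype(float)   # init means from data
    var = np.ones((k, d)) * (X.var(axis=0) + 1e-3)          # init variances
    pi = np.full(k, 1.0 / k)                                # mixing weights
    for _ in range(iters):
        # E-step: per-component log densities -> responsibilities (N x k)
        log_p = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                         + np.log(2 * np.pi * var)).sum(-1) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0) + 1e-9
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ (X ** 2)) / nk[:, None] - mu ** 2 + 1e-6
    return pi, mu, var

def gmm_loglik(X, pi, mu, var):
    """Per-sample log-likelihood under the fitted mixture (log-sum-exp)."""
    log_p = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                     + np.log(2 * np.pi * var)).sum(-1) + np.log(pi))
    m = log_p.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))).ravel()
```

A model fit to foreground pixels then scores new pixels: colors resembling the learned object receive higher log-likelihood than background colors, which is the cue combined with motion and locality in the segmentation step.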
1. Introduction
Automatically detecting and segmenting a moving ob-
ject from a monocular video is useful in many applications
like video editing, video summarization, video coding, vi-
sual surveillance, human computer interaction, etc. Many
methods have been presented (cf. [21, 9, 3, 24, 23]). Many
of them aim at a robust algorithm for extracting a moving
object from a video with rich object and camera motion.
However, extracting a moving object from a video with little
object and camera motion is also challenging. Most previ-
ous automatic methods rely on object and/or camera mo-
tion to detect the moving object. Small motion of the object
and/or camera does not provide sufficient information for
these methods.
For example, most existing methods use motion to detect
moving objects. They assume that if a compact region moves
differently from the global background motion, it most
likely belongs to a moving object. Motion-based methods
[8, 12, 21, 9, 3] usually take the detected moving pixels as
seeds, and cluster pixels into layers with consistent motions
(and consistent color and depth). When motion information
is sparse and incomplete, they cannot work robustly. For
example, Figure 1 shows an example where a boy sits on
the floor and moves only in a few frames. Even in these
frames, he moves only a part of his body. Methods using
object motion information can detect only part of
the object. For example, if we segment the object
in a popular Markov Random Field (MRF) framework, as
described in § 2.3, only the moving part of the boy’s body
is detected in frames where the part moves, and no mean-
ingful region is found in other frames, as shown in Figure 1
(b) and (c). This example shows that using object motion
alone to infer moving objects is insufficient. Similarly, in
this example, since the camera barely moves, it is also dif-
ficult for a structure from motion (SFM) algorithm as used
in methods like [24] to obtain useful depth information to
infer the moving object.
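To make the failure mode above concrete, the MRF labeling referenced here can be sketched in a toy form (our illustration, not the paper's solver): per-pixel unary costs from motion evidence, a Potts smoothness term between neighbors, and Iterated Conditional Modes (ICM) as a simple stand-in for the exact graph-cut optimization such frameworks typically use. All names and parameters are our assumptions:

```python
import numpy as np

def icm_segment(unary_fg, unary_bg, smooth=0.5, iters=5):
    """Binary MRF labeling (1 = foreground) by Iterated Conditional Modes.

    unary_fg / unary_bg: H x W costs for labeling each pixel fg / bg;
    smooth: Potts penalty paid for each 4-neighbor with a different label.
    ICM only finds a local optimum, but suffices to illustrate the model.
    """
    labels = (unary_fg < unary_bg).astype(int)  # greedy init from unaries
    H, W = labels.shape
    for _ in range(iters):
        for y in range(H):
            for x in range(W):
                # current labels of the 4-connected neighbors
                nb = [labels[yy, xx] for yy, xx in
                      ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                      if 0 <= yy < H and 0 <= xx < W]
                n_fg = sum(nb)
                n_bg = len(nb) - n_fg
                cost_fg = unary_fg[y, x] + smooth * n_bg
                cost_bg = unary_bg[y, x] + smooth * n_fg
                labels[y, x] = int(cost_fg < cost_bg)
    return labels
```

With motion magnitude as the only unary cue, a coherently moving region survives the smoothness term while an isolated noisy pixel is smoothed away; but where the object does not move at all, the unaries give no foreground evidence, which is exactly why motion alone misses the static parts of the boy in Figure 1.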
Impressive results have been reported recently for bi-
layer video segmentation in the scenario of video chat-
ting [4, 23]. These algorithms can robustly segment a major
foreground object from a video with a dynamic background;
however, they are not suitable for videos with complex cam-
era motions.
Instead of building a moving object model, some other
methods build a background model to detect and segment a
moving object (cf. [5, 10, 17, 15, 18, 22]). These methods
work well for videos with static cameras. When videos have
complex camera motions, the background model is hard to
build.
This paper presents a solution that learns a moving object
model by collecting the sparse and insufficient motion in-
formation throughout the video. Specifically, we present
an unsupervised algorithm to learn the color and locality
cues of the moving object. We first detect key frames that
contain motion cues that can reliably indicate at least some