where s, k and t correspond to scaling, skew and translation applied in the direction of the u coordinate. In order to calculate these parameters, the images corresponding to the 3D patch are divided into l rows of pixels. When considering a single row of pixels, the skew and translation act together to produce a single horizontal offset, denoted as o, since the v coordinate of each pixel in the row is the same.
The normalized cross-correlation is computed between each pair of rows at various scale and offset values, and those corresponding to the best match are recorded. The scale s is calculated as the median of the scale values found for each row of the given patch. Any values deviating significantly from the median are deemed outliers and discarded. Using the offsets from all rows, the skew and translation parameters k and t are calculated by solving a linear system
$$
\begin{bmatrix} k \\ t \end{bmatrix} =
\begin{bmatrix} v_1 & 1 \\ \vdots & \vdots \\ v_l & 1 \end{bmatrix}^{-1}
\begin{bmatrix} o_1 \\ \vdots \\ o_l \end{bmatrix},
\qquad (11)
$$
where $v_l$ is the $v$ coordinate of the pixels in the $l$-th row. In order to improve the robustness further, the rows with the greatest residuals are removed from the system (11), and k, t are recalculated. This process is repeated until convergence. Now H can be calculated from $H_R$ taking into account the rectifying transformations
$$
H = R'^{-1} H_R R. \qquad (12)
$$
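As an illustration of this per-patch estimation, the sketch below fits k and t from the per-row offsets by least squares, iteratively discarding the rows with the largest residuals, and then composes H from the recovered parameters. This is a minimal NumPy sketch; the function names, the residual threshold, and the particular form assumed for $H_R$ (scale, skew and translation acting only on the u coordinate) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def estimate_row_transform(offsets, v_coords, n_iter=5, drop_frac=0.1):
    """Robustly fit skew k and translation t from per-row horizontal
    offsets o_l observed at row coordinates v_l, as in Eq. (11).
    Rows with the largest residuals are removed and the fit repeated."""
    o = np.asarray(offsets, dtype=float)
    v = np.asarray(v_coords, dtype=float)
    keep = np.ones(len(o), dtype=bool)
    for _ in range(n_iter):
        A = np.stack([v[keep], np.ones(keep.sum())], axis=1)  # rows [v_l, 1]
        (k, t), *_ = np.linalg.lstsq(A, o[keep], rcond=None)
        residuals = np.abs(A @ np.array([k, t]) - o[keep])
        thresh = np.quantile(residuals, 1.0 - drop_frac)
        worst = residuals > thresh
        if not worst.any():
            break  # converged: no remaining row exceeds the residual threshold
        idx = np.flatnonzero(keep)
        keep[idx[worst]] = False
    return k, t

def compose_homography(s, k, t, R, R_prime):
    """Combine scale, skew and translation acting on the u coordinate into
    one plausible form of H_R, then undo the rectifying transformations
    as in Eq. (12): H = R'^{-1} H_R R."""
    H_R = np.array([[s,   k,   t  ],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
    return np.linalg.inv(R_prime) @ H_R @ R
```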
Certain patches and their corresponding projections may not provide reliable matching and are not considered by the surface rectification mechanism described above. For the offsets to be found uniquely, the line-by-line matching based on cross-correlation requires significant detail in the quad regions of the images corresponding to the patch projection. The observed detail can be unreliable in patches with significant deviation from planarity which are viewed from oblique angles. In order to assess the appropriateness of using a certain image I for correcting a given 3D scene patch, we calculate a confidence score $\chi$ as a function depending on the angle between the pair of images and on the distance to the scene
$$
\chi\!\left( \vec{N} \cdot \frac{y - c}{\lVert y - c \rVert},\;
\lVert y - c \rVert,\;
\arccos \frac{(y - c) \cdot (y' - c)}{\lVert y - c \rVert \, \lVert y' - c \rVert} \right),
\qquad (13)
$$
where $\vec{N}$ is the surface normal of the 3D patch and c is the location from the 3D scene (usually an RBF center) viewed from the image location y. The first term in $\chi(\cdot)$ represents the cosine of the angle between the surface normal $\vec{N}$ and the viewing vector from y. The third term in $\chi(\cdot)$ enforces a minimum baseline angle formed by the two viewing directions from locations y and y′ to the center of the patch c for the images {I, I′}. Image pairs {I, I′} which are located too close to each other will not provide enough disparity to extract reliable information. By using the conditions from $\chi(\cdot)$, we assess which images are suitable for correcting the disparity for a specific 3D patch from the scene S.
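A minimal sketch of how such a suitability test could be implemented is given below; the threshold values and function names are hypothetical, and only the three quantities entering $\chi(\cdot)$ in (13) follow the text.

```python
import numpy as np

def patch_view_confidence(N, c, y, y_prime):
    """Compute the three quantities entering the confidence score chi of
    Eq. (13) for a 3D patch with surface normal N centred at c, viewed
    from camera locations y and y'."""
    d = y - c
    d_prime = y_prime - c
    # cosine of the angle between the surface normal and the viewing vector
    cos_view = np.dot(N, d / np.linalg.norm(d))
    # distance from the camera location to the patch centre
    dist = np.linalg.norm(d)
    # baseline angle between the two viewing directions towards c
    cos_base = np.dot(d, d_prime) / (np.linalg.norm(d) * np.linalg.norm(d_prime))
    baseline_angle = np.arccos(np.clip(cos_base, -1.0, 1.0))
    return cos_view, dist, baseline_angle

def is_pair_suitable(N, c, y, y_prime,
                     min_cos=0.3, max_dist=10.0, min_baseline=np.deg2rad(5)):
    """Hypothetical thresholds: accept the image pair only when the patch is
    seen frontally enough, from close enough, and with a sufficient baseline."""
    cos_view, dist, baseline = patch_view_confidence(N, c, y, y_prime)
    return cos_view > min_cos and dist < max_dist and baseline > min_baseline
```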
For a 3D patch corresponding to the RBF center $c_j$, let us consider a set of image pairs $\{I, I'\}_i$ for $i = 1, \ldots, K$ which fulfil the above conditions. Each of the images I and I′ is characterized by the projection matrices P and P′; we estimate $H_{R,i}$ and $H_i$ using (12), and calculate the corresponding displacement vector $v_i$ from (6). Consequently, we estimate the correct location of the plane $\psi_i$, which should contain the basis function center, in order to fulfil the consistency between the pair of images $\{I, I'\}_i$ and their corresponding 3D patch, using (8)
$$
\psi_i = \begin{bmatrix} P \\ 0 \; 0 \; 0 \; 1 \end{bmatrix}^{T}
\begin{bmatrix} v_i \\ 1 \end{bmatrix}. \qquad (14)
$$
The location of the basis function center is updated as
$$
\hat{c}_j = c_j + \frac{1}{K} \sum_{i=1}^{K}
\frac{\vec{N}_i \left( c_j^{T} \psi_i \right)}{\vec{N}_i^{T} \vec{N}_i}
\qquad (15)
$$
for $j = 1, \ldots, l$, with l the number of RBF centers to be updated, while $\vec{N}_i$ is the surface normal to the plane $\psi_i$, and the jth basis function center $c_j$ is updated to $\hat{c}_j$ by being constrained to lie on each of the planes $\psi_i$. This corresponds to the rectification due to the disparity identified in each image pair $\{I, I'\}_i$ for $i = 1, \ldots, K$. After updating the RBF centers we recalculate the output weights $w_i$, $i = 1, \ldots, M$, by solving (2). In order to avoid singularity when solving (2), if multiple basis function centers occur in the immediate neighborhood of each other, only one is preserved while the others are removed.
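The update of (15), together with the pruning of coincident centers, could be sketched as follows; the plane representation $\psi_i = (a, b, c, d)$ with normal $\vec{N}_i = (a, b, c)$, the function names, and the minimum-distance threshold are assumptions made for illustration.

```python
import numpy as np

def update_rbf_center(c_j, planes):
    """Update a basis-function centre c_j towards the planes psi_i estimated
    from each image pair, following Eq. (15):
        c_j_hat = c_j + (1/K) * sum_i N_i * (c_j~^T psi_i) / (N_i^T N_i),
    where c_j~ is c_j in homogeneous coordinates and the sign convention of
    psi_i follows its construction in Eq. (14)."""
    c_h = np.append(c_j, 1.0)        # homogeneous coordinates of the centre
    correction = np.zeros(3)
    for psi in planes:
        N = psi[:3]                  # plane normal
        correction += N * (c_h @ psi) / (N @ N)
    return c_j + correction / len(planes)

def prune_close_centers(centers, min_dist=1e-3):
    """Keep only one centre from any group of centres closer than min_dist,
    to avoid a singular system when re-solving Eq. (2)."""
    kept = []
    for c in centers:
        if all(np.linalg.norm(c - k) >= min_dist for k in kept):
            kept.append(c)
    return np.array(kept)
```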
4. Scene correction using shape-from-contours
In the following we assume that we have a 3D scene reconstructed from a set of images as described in Section 2. The 3D scene correction using image disparities, as described in Section 3, relies on the existence of textured areas in the given set of images. Large uniformly colored regions may not provide suitable matches for estimating image disparities between pairs of images. However, such image regions can be easily segmented, providing reliable object contours. In the following we propose using the contours of segmented objects for correcting the 3D scene.
4.1. Detecting disparities in object contours
Let us assume that the scene contains at least two distinct objects $\{A, B\} \in S$. The background is assumed to be a distinct object, part of the scene as well. We consider that each object outline from the 3D scene is projected onto contours in the input images, denoted as $\{a_i, b_i\} \in I_i$, $i = 1, \ldots, n$, where
$$
a_i = P_i A, \qquad b_i = P_i B, \qquad (16)
$$
where $P_i$ represents the projection matrix from the 3D scene to
the ith image. In some of the images one or both objects can be
occluded and their contours may not be everywhere visible. The
assumption in the following is that the objects from the 3D scene
are inconsistent with the given set of images I due to errors in
their shape estimation.
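For reference, projecting a sampled 3D outline into image i with a 3x4 projection matrix, as in (16), amounts to the following; the function name and the dense point sampling of the outline are illustrative.

```python
import numpy as np

def project_outline(P_i, X):
    """Project 3D outline points X (N x 3) of an object into image i using
    the 3x4 projection matrix P_i, as in Eq. (16), a_i = P_i A."""
    X_h = np.hstack([X, np.ones((X.shape[0], 1))])   # homogeneous 3D points
    x_h = (P_i @ X_h.T).T                            # homogeneous image points
    return x_h[:, :2] / x_h[:, 2:3]                  # back to pixel coordinates
```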
In this paper we consider segmentation for defining the contours of objects, such as $\{A, B\}$ in the 3D scene and $\{a_i, b_i\}$ in the image $I_i$ from the set $\mathcal{I}$. In the case of 3D objects we assume
that we have an initial scene as provided by the initialization described in the previous sections. Segmenting the given 3D scene is rather straightforward since the RBF surface delimits each object from the surrounding area, except for the case when two objects are in contact with each other. 3D objects are characterized by an additional location feature when compared to their corresponding projections into images, and are easier to segment even when they are only roughly modeled. A simple compactness criterion, or a clustering algorithm considering the three location and the corresponding three color components, can be used for
segmenting 3D objects. Let us denote by $P(x \in A \mid z_{3D})$ and $P(x \in B \mid z_{3D})$ the probability of segmenting the objects A and B in 3D, where $z_{3D}$ represents the feature vectors characterizing the set of locations from the 3D scene.
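A simple unsupervised realisation of this 3D segmentation is sketched below using k-means on the concatenated location and color features; the number of objects, the feature normalization, and the use of scikit-learn are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_3d_points(xyz, rgb, n_objects=3):
    """Cluster scene points into objects using a 6-D feature vector of three
    location and three colour components, one simple realisation of the
    unsupervised 3D segmentation described above."""
    # scale each feature group to comparable ranges before clustering
    feats = np.hstack([
        (xyz - xyz.mean(0)) / (xyz.std(0) + 1e-9),
        (rgb - rgb.mean(0)) / (rgb.std(0) + 1e-9),
    ])
    return KMeans(n_clusters=n_objects, n_init=10).fit_predict(feats)
```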
We consider both unsupervised and supervised image segmentations for extracting object contours from images. The unsupervised
segmentation corresponds to clustering in the feature space [31,32].
In the case of supervised classification, the image segmentation is