Image Similarity Comparison Using the Earth Mover's Distance

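
"Applications of the Earth Mover's Distance (EMD) in computing image similarity"

The Earth Mover's Distance (EMD) measures the difference between two distributions and is widely used in computer vision and image retrieval. The concept was first introduced by Peleg, Werman, and Rom to solve specific vision problems. Its core idea is to compute the minimum cost of transforming one distribution into the other, where the transformation is viewed as a "transportation" process.

In image retrieval, the EMD is combined with a representation scheme based on vector quantization (VQ) to form a framework for comparing images. Vector quantization maps high-dimensional data, such as image features, onto a finite set of representative vectors, so that an image can be described by a set of centroids (codebook vectors). The EMD then evaluates how similar the distributions of these vectors are for two images, which often reflects perceived similarity better than other traditional measures.

Computing the EMD relies on the transportation problem from linear optimization, a well-studied problem for which efficient algorithms exist. The process can be pictured as moving "earth" between the two distributions so that the total amount of work is minimized. In the image context, the "earth" can stand for pixels or feature points, and the "transport" is the equivalent of redistributing those elements from one image to the other.

The strength of the EMD is that it takes the overall structure of the distributions into account rather than comparing entries one by one. For example, two images whose content is shifted in some regions but whose overall shape and structure are similar can still be recognized as similar. Depending on the features used, this can help when images are deformed, rotated, or scaled.

The EMD also has limitations. Because it requires solving an optimization problem, its computational cost is high, and it may be unsuitable for real-time or large-scale applications. It can also be sensitive to noise and to changes in local detail, which may lead to undesired results. To address this, researchers have proposed strategies such as sampling, approximation algorithms, and combination with other similarity measures to improve efficiency and robustness.

In summary, the Earth Mover's Distance offers a useful perspective on image similarity in image retrieval and computer vision, particularly where geometric change and structural similarity matter. In practice, computational cost must be weighed against accuracy for each application.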

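The transportation problem mentioned in the summary can be written out explicitly. The block below is a sketch of the usual formulation, not a quotation from this excerpt: the symbols $w_{p_i}$ and $w_{q_j}$ (weights of the two distributions), $d_{ij}$ (ground distance), and $f_{ij}$ (flow) are notational assumptions introduced here for illustration.

```latex
% Find a flow F = [f_{ij}] that minimizes the total work
\min_{F} \; \sum_{i}\sum_{j} f_{ij}\, d_{ij}
\quad \text{subject to} \quad
\begin{aligned}
  & f_{ij} \ge 0, \\
  & \textstyle\sum_{j} f_{ij} \le w_{p_i}, \\
  & \textstyle\sum_{i} f_{ij} \le w_{q_j}, \\
  & \textstyle\sum_{i}\sum_{j} f_{ij}
      = \min\Bigl(\sum_{i} w_{p_i},\; \sum_{j} w_{q_j}\Bigr).
\end{aligned}

% The EMD is the optimal work normalized by the total flow
\mathrm{EMD}(P, Q) = \frac{\sum_{i}\sum_{j} f_{ij}\, d_{ij}}{\sum_{i}\sum_{j} f_{ij}}
```

The first constraint keeps the flow one-directional, the next two cap how much each cluster can send or receive, and the last forces as much mass as possible to move. The inequality constraints are what allow partial matches when the two distributions carry different total mass.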
The Earth Mover’s Distance 103

Quadratic-form distance: this distance was suggested in Niblack et al. (1993) for color-based retrieval:

$$d_A(H, K) = \sqrt{(\mathbf{h} - \mathbf{k})^T A\, (\mathbf{h} - \mathbf{k})},$$

where $\mathbf{h}$ and $\mathbf{k}$ are vectors that list all the entries in $H$ and $K$. Cross-bin information is incorporated via a similarity matrix $A = [a_{ij}]$, where $a_{ij}$ denotes the similarity between bins $i$ and $j$. Here $i$ and $j$ are sequential (scalar) indices into the bins.

For our experiments, we followed the recommendation in Niblack et al. (1993) and used $a_{ij} = 1 - d_{ij}/d_{\max}$, where $d_{ij}$ is the ground distance between bins $i$ and $j$ of the histogram, and $d_{\max} = \max_{i,j}(d_{ij})$. Although in general the quadratic form is not a metric, it can be shown that with this choice of $A$ the quadratic form is indeed a metric.
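The quadratic-form distance with the similarity matrix described above can be sketched in a few lines of plain Python. This is an illustration only: the function names are hypothetical, and the ground distance $|i - j|$ between bin indices is an assumption made for a one-dimensional example (any ground distance matrix can be passed in).

```python
import math


def index_ground_distance(num_bins):
    """Hypothetical ground distance: |i - j| between bin indices (1-D assumption)."""
    return [[abs(i - j) for j in range(num_bins)] for i in range(num_bins)]


def quadratic_form_distance(h, k, ground):
    """sqrt((h - k)^T A (h - k)) with a_ij = 1 - d_ij / d_max.

    Assumes at least two bins with a nonzero maximum ground distance.
    """
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    d_max = max(max(row) for row in ground)
    diff = [hi - ki for hi, ki in zip(h, k)]
    total = 0.0
    for i, di in enumerate(diff):
        for j, dj in enumerate(diff):
            a_ij = 1.0 - ground[i][j] / d_max  # similarity between bins i and j
            total += di * a_ij * dj
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))


ground = index_ground_distance(4)
near = quadratic_form_distance([1, 0, 0, 0], [0, 1, 0, 0], ground)
far = quadratic_form_distance([1, 0, 0, 0], [0, 0, 0, 1], ground)
print(near, far)
```

In this toy case the distance grows when the mass sits in a more distant bin, which is the cross-bin behavior the similarity matrix is meant to provide.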

The quadratic-form distance does not enforce a one-to-one correspondence between mass elements in the two histograms: the same mass in a given bin of the first histogram is simultaneously made to correspond to masses contained in different bins of the other histogram. This is illustrated in Fig. 1(b), where the quadratic-form distance between the two histograms on the left is larger than the distance between the two histograms on the right. Again, this is clearly at odds with perceptual dissimilarity. The desired distance here should be based on the correspondences shown in part (d) of the figure.

Similar conclusions were obtained in Stricker and Orengo (1995), where it was shown that using the quadratic-form distance in image retrieval results in false positives, because it tends to overestimate the mutual similarity of color distributions without a pronounced mode.

Match distance:

$$d_M(H, K) = \sum_i |\hat{h}_i - \hat{k}_i|,$$

where $\hat{h}_i = \sum_{j \le i} h_j$ is the cumulative histogram of $\{h_i\}$, and similarly for $\{k_i\}$.

The match distance (Shen and Wong, 1983; Werman et al., 1985) between two one-dimensional histograms is defined as the $L_1$ distance between their corresponding cumulative histograms. For one-dimensional histograms with equal areas, this distance is a special case of the EMD, which we present in Section 4, with the important differences that the match distance cannot handle partial matches or other ground distances. The match distance does not extend to higher dimensions because the relation $j \le i$ is not a total ordering in more than one dimension, and the resulting arbitrariness causes problems.
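As a small illustration of the definition above, the match distance can be computed by accumulating both histograms and summing the absolute differences. The function name is hypothetical, and the example histograms are made up for the demonstration.

```python
from itertools import accumulate


def match_distance(h, k):
    """L1 distance between the cumulative histograms of h and k (1-D only)."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    h_cum = accumulate(h)  # running sums: h_1, h_1 + h_2, ...
    k_cum = accumulate(k)
    return sum(abs(a - b) for a, b in zip(h_cum, k_cum))


# One unit of mass moved by one bin, then by three bins.
print(match_distance([1, 0, 0, 0], [0, 1, 0, 0]))  # 1
print(match_distance([1, 0, 0, 0], [0, 0, 0, 1]))  # 3
```

For these equal-area histograms the result equals the number of bins the unit of mass has to travel, which is consistent with the statement that the match distance is a special case of the EMD in one dimension.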

Kolmogorov-Smirnov distance:

$$d_{KS}(H, K) = \max_i \left( |\hat{h}_i - \hat{k}_i| \right).$$

Again, $\hat{h}_i$ and $\hat{k}_i$ are cumulative histograms. The Kolmogorov-Smirnov distance is a common statistical measure for unbinned distributions. Similarly to the match distance, it is defined only for one dimension.
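The Kolmogorov-Smirnov distance differs from the match distance only in replacing the sum with a maximum. A minimal sketch, with a hypothetical function name and made-up histograms:

```python
from itertools import accumulate


def ks_distance(h, k):
    """Largest absolute gap between the cumulative histograms of h and k."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    return max(abs(a - b) for a, b in zip(accumulate(h), accumulate(k)))


# The gap is 1 whether the unit of mass moves by one bin or by three.
print(ks_distance([1, 0, 0, 0], [0, 1, 0, 0]))  # 1
print(ks_distance([1, 0, 0, 0], [0, 0, 0, 1]))  # 1
```

In this toy case the value does not change with how far the mass is moved, because only the single largest cumulative gap is kept.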

2.3. Parameter-Based Dissimilarity Measures

These methods first compute a small set of parameters from the histograms, either explicitly or implicitly, and then compare these parameters. For instance, in Stricker and Orengo (1995) the distance between distributions is computed as the sum of the weighted distances of the distributions' first three moments. In Das et al. (1997), only the peaks of color histograms are used for color image retrieval. In Liu and Picard (1996), textures are compared based on measures of their periodicity, directionality, and randomness, while in Manjunath and Ma (1996) texture distances are defined by comparing their means and standard deviations in a weighted-$L_1$ sense.
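A parameter-based comparison in the spirit of the moment-based approach can be sketched as follows. This is an assumption-laden illustration rather than the method of any cited paper: the choice of mean, standard deviation, and signed cube root of the third central moment as "the first three moments", the absolute-difference comparison, the unit weights, and both function names are all hypothetical.

```python
import math


def first_three_moments(samples):
    """Mean, standard deviation, and signed cube root of the third central moment."""
    n = len(samples)
    mean = sum(samples) / n
    second = sum((x - mean) ** 2 for x in samples) / n
    third = sum((x - mean) ** 3 for x in samples) / n
    std = math.sqrt(second)
    # Cube root that preserves the sign of the third central moment.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew


def moment_distance(a, b, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of absolute differences between the moments of a and b."""
    moments_a = first_three_moments(a)
    moments_b = first_three_moments(b)
    return sum(w * abs(x - y) for w, x, y in zip(weights, moments_a, moments_b))


values = [1, 2, 3, 6]
shifted = [x + 2 for x in values]  # same shape, mean moved by 2
print(moment_distance(values, shifted))
```

Because only three numbers per distribution are compared, the cost is independent of the number of histogram bins, which is the appeal of parameter-based measures.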

Additional dissimilarity measures for image retrieval are evaluated and compared in Smith (1997) and Puzicha et al. (1997).

3. Histograms vs Signatures

In Section 2 we defined a histogram as deriving from a fixed partitioning of the domain of a distribution. Of course, even if bin sizes are fixed, they can be different in different parts of the underlying feature space. Even so, however, for some images often only a small fraction of the bins contain significant information, while most others are hardly populated. A finely quantized histogram is highly inefficient in this case. On the other hand, for images that contain a large amount of information, a coarsely quantized histogram would be inadequate. In brief, because histograms are fixed-size structures, they cannot achieve a good balance between expressiveness and efficiency.
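The trade-off described above can be made concrete with a toy example. The sketch below is an illustration under stated assumptions: the data are made-up 8-bit gray levels forming two tight groups, and the helper names are hypothetical.

```python
def quantize(values, num_bins, levels=256):
    """Histogram of integer values in [0, levels) using equal-width bins."""
    counts = [0] * num_bins
    for v in values:
        counts[v * num_bins // levels] += 1  # integer arithmetic avoids rounding issues
    return counts


def populated_fraction(counts):
    """Fraction of bins that contain at least one sample."""
    return sum(1 for c in counts if c > 0) / len(counts)


# Made-up image data: two tight groups of gray levels.
gray_levels = [20, 21, 22, 23, 200, 201, 202, 203]

fine = quantize(gray_levels, 256)
coarse = quantize(gray_levels, 4)

print(populated_fraction(fine))  # 0.03125, so most of the 256 bins are empty
print(coarse)                    # [4, 0, 0, 4], so detail within each group is lost
```

The fine histogram spends almost all of its storage on empty bins, while the coarse one can no longer distinguish values inside a group, so neither fixed bin size suits this data well.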

A signature $\{s_j = (\mathbf{m}_j, w_{\mathbf{m}_j})\}$, on the other hand, represents a set of feature clusters. Each cluster is
