Application of a Morphologically Dilated CNN to Hyperspectral Image Classification


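This paper examines the use of a morphologically dilated convolutional neural network (MDCNN) for hyperspectral image classification. As remote sensing technology advances, hyperspectral images play a growing role in the accurate classification of land-cover features and have become an important research area. Traditional classification methods already give good results, but deep learning models such as convolutional neural networks (CNNs) have attracted wide attention for their strong feature extraction ability. A conventional CNN applied to hyperspectral images can suffer from limited resolution and loss of detail, particularly for small targets or low-contrast regions.

To address these problems, the paper introduces morphological dilation, a method from mathematical morphology that widens the neighbourhood considered during filtering and so increases sensitivity to image detail and edges. Morphological dilation effectively enlarges the field of view of the convolution kernel and improves the representation of image features. In the MDCNN model, the authors design convolution layers that incorporate morphological dilation, which lets the network recognize complex textures and structures better while keeping computation efficient. The model first converts the hyperspectral image into a binary image through a binarization step, then extracts features with the morphologically dilated convolution layers. These features are passed to a deep network for training in order to classify the different land-cover types accurately. The keywords are mathematical morphology, hyperspectral image (HSI), dilated convolution, binarization, and convolutional neural network.

The results indicate that MDCNN has a clear advantage on hyperspectral image classification: it improves classification accuracy and may also reduce the dependence on large amounts of labelled data, giving the remote sensing field a useful tool. In summary, the paper studies how to integrate morphological dilation into a convolutional neural network to improve the processing and classification of hyperspectral images, which has practical value for remote sensing and for understanding environmental change on the Earth's surface. In future work the approach may extend to related areas such as environmental monitoring and urban planning.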

V. Kumar, R.S. Singh and Y. Dua Signal Processing: Image Communication 101 (2022) 116549

discriminative information in the shallow layers is transferred via these connections to aid reconstruction and classification tasks. In [48], the author proposes a two-branch spectral–spatial attention network for hyperspectral image classification, with one branch dedicated to spectral attention and the other to spatial attention. Each convolutional layer includes attention modules, which make the CNN prioritize discriminative channels and spatial positions while suppressing irrelevant ones. Furthermore, the results of the two branches are fused in the classification phase using an adaptively weighted summation method.

Based on the core problems in the literature mentioned above, and as an extension of the work of Roy et al. [37], we propose a novel morphologically dilated convolutional network (MDCNN). MDCNN uses both 3D and 2D convolution layers, morphological feature maps, and both standard and dilated convolution. In MDCNN, the principal component analysis (PCA) algorithm is used to transform the high-dimensional input data into low-dimensional data in order to minimize computational cost. CNNs have a flaw, namely inaccurate boundary localization, which results in partial object shapes. As a result, information obtained by another type of spatial extractor can improve deep network feature representation [49]. Different mathematical morphology operations are therefore applied to the new low-dimensional input data to extract discriminant spatial feature maps. These morphological feature maps are then concatenated with the low-dimensional input data. Next, input patches are extracted around each pixel and sent to the network. Standard 3D convolution and Dilated-3D convolution are both applied to extract spectral–spatial features at the same time, after which Dilated-2D convolution is used to extract discriminant spatial features. Dilated convolution is applied in this paper by substituting the convolution layers of a traditional CNN with dilated convolution layers, which expands the receptive field without adding parameters and thus improves network performance without increasing network complexity [50–52]. The dilation layer does not itself reduce the number of parameters, but it reduces the size of the output feature map, which leads to an overall reduction in the number of parameters. The spectral–spatial attributes are then passed to fully connected layers to extract abstract high-level features.
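The trade-off between receptive field, parameter count and output size can be checked with a little arithmetic. The sketch below is not taken from the paper: the function names, the 25 × 25 patch, the 3 × 3 kernel and the channel counts are illustrative assumptions. It applies the standard size formulas for an unpadded convolution to show that dilation leaves the weight count of a layer unchanged while widening the span of the kernel and shrinking the output map.

```python
def effective_kernel(k: int, d: int) -> int:
    """Span (in pixels) covered along one axis by a k-tap kernel with dilation d."""
    return k + (k - 1) * (d - 1)


def conv_output_size(n: int, k: int, d: int = 1, stride: int = 1, pad: int = 0) -> int:
    """Output length along one axis; pad=0 corresponds to an unpadded convolution."""
    return (n + 2 * pad - effective_kernel(k, d)) // stride + 1


def conv_weight_count(k: int, c_in: int, c_out: int, ndim: int = 2) -> int:
    """Weights in one convolution layer, bias excluded. Dilation does not appear here."""
    return (k ** ndim) * c_in * c_out


# Hypothetical sizes chosen only for illustration.
n, k, c_in, c_out = 25, 3, 16, 32
for d in (1, 2):
    print(
        f"dilation={d}: span={effective_kernel(k, d)}, "
        f"output={conv_output_size(n, k, d)}, "
        f"weights={conv_weight_count(k, c_in, c_out)}"
    )
```

With these assumed sizes, dilation 2 widens the kernel span from 3 to 5 pixels and shrinks the output from 23 to 21 per axis, with the same 4608 weights. A smaller output map means fewer inputs to any fully connected layer that follows, which is one plausible reading of the overall parameter reduction described in the text.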

This paper's primary contributions can be summarized as follows:

1. Mathematical morphological operations are applied to the input hyperspectral data to extract spatial feature maps as output.

2. This output is concatenated with the input and fed into the neural network to reduce the workload of the CNN and provide better spatial features.

3. The neural network contains both 3D convolutional layers and a 2D convolutional layer. We use a mix of traditional and dilated convolution to enlarge the receptive field, reduce trainable parameters, and reduce overfitting, which in turn reduces the overall complexity of the model. The output feature map of a dilated convolution layer is slightly smaller than that of a traditional convolution layer, which reduces the overall number of trainable parameters.

4. The model simultaneously extracts discriminative spectral–spatial attributes to achieve high classification accuracy by exploiting the spectral–spatial relationship.

5. Experiments were performed on three publicly available datasets to compare the performance of the MDCNN model with other state-of-the-art methods.

The rest of this paper is structured as follows. Section 2 gives a detailed overview of the proposed MDCNN structure. Section 3 discusses the experimental evidence, setup, findings, and interpretation. Finally, some conclusions are drawn in Section 4.

2. Problem formulation

HSI data is also known as a hypercube. This hypercube can be represented as $\mathbf{I} \in \mathbf{R}^{H \times W \times C}$, where $\mathbf{I}$ is the original HSI input, $H$ is the height, $W$ is the width of the input, and $C$ is the total number of spectral bands. HSI provides a tremendous amount of information through its large number of spectral bands, but the high dimensionality increases the computational burden. We therefore use PCA, which reduces the high-dimensional data to low-dimensional data with minimum loss of useful information. The reduced data cube after applying PCA is $\mathbf{I_P} \in \mathbf{R}^{H \times W \times P}$, where $P$ is the number of principal components. The binarization operation is applied to the low-dimensional data and gives an output $\mathbf{I_B} \in \mathbf{R}^{H \times W \times K}$; only the $K$ initial bands are selected for the binarization process. Three mathematical morphological operations are applied to the binary data cube $\mathbf{I_B}$, each giving an output of size $H \times W \times K$. The data cubes $\mathbf{I_P}$, $I_{Erosion}$, $I_{Dilation}$ and $I_{Gradient}$ are concatenated to form an output $\mathbf{I_C} \in \mathbf{R}^{H \times W \times B}$, where $B = P + 3K$. Input patches $\mathbf{I_{Patch}} \in \mathbf{R}^{S \times S \times B}$ are extracted from $\mathbf{I_C}$ and fed to the CNN for deep spectral–spatial feature extraction and classification.
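The shape bookkeeping in this section can be illustrated with a small NumPy sketch. It is not the authors' code: the 3 × 3 square structuring element, the zero padding at the image border, the function names and the toy sizes (H = W = 6, P = 4, K = 2) are all assumptions made for illustration. It applies erosion, dilation and morphological gradient (dilation minus erosion) to each band of a binary cube and concatenates the results with the PCA cube, giving B = P + 3K bands.

```python
import numpy as np


def binary_erosion(img: np.ndarray) -> np.ndarray:
    """Erode a 2D 0/1 uint8 image with a 3x3 square SE (outside pixels count as 0)."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="constant", constant_values=0)
    out = np.ones_like(img)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]  # 1 only if every neighbour is 1
    return out


def binary_dilation(img: np.ndarray) -> np.ndarray:
    """Dilate a 2D 0/1 uint8 image with a 3x3 square SE."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="constant", constant_values=0)
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]  # 1 if any neighbour is 1
    return out


def morphological_stack(ip: np.ndarray, ib: np.ndarray) -> np.ndarray:
    """Concatenate I_P (H, W, P) with erosion, dilation and gradient of I_B (H, W, K)."""
    k = ib.shape[2]
    ero = np.stack([binary_erosion(ib[:, :, b]) for b in range(k)], axis=2)
    dil = np.stack([binary_dilation(ib[:, :, b]) for b in range(k)], axis=2)
    grad = dil - ero  # never negative, because erosion <= dilation pixel-wise
    parts = [ip] + [m.astype(ip.dtype) for m in (ero, dil, grad)]
    return np.concatenate(parts, axis=2)


# Toy data: a random stand-in for the PCA cube and a binary cube holding a 4x4 square.
ip = np.random.default_rng(0).normal(size=(6, 6, 4))
ib = np.zeros((6, 6, 2), dtype=np.uint8)
ib[1:5, 1:5, :] = 1
ic = morphological_stack(ip, ib)
print(ic.shape)  # (6, 6, 10), i.e. B = P + 3K = 4 + 3 * 2
```

In this toy case erosion shrinks the 4 × 4 square to 2 × 2, dilation fills the whole 6 × 6 band, and the gradient marks the ring between the two.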

3. Proposed framework

Fig. 1 describes the workflow of the proposed MDCNN model, which includes the following steps: (a) dimension reduction of the hyperspectral cube in the spectral direction using principal component analysis (PCA), (b) binarization of the low-dimension hyperspectral cube, (c) extraction of morphological feature maps from the binarized cube, (d) concatenation of the low-dimension hyperspectral cube and the morphological feature maps (Erosion, Closing, and Gradient), (e) extraction of patches for the input of the convolutional neural network, (f) deep spectral–spatial feature extraction using standard 3D convolution and Dilated-3D convolution, (g) discriminative spatial feature extraction using Dilated-2D convolution, and (h) prediction of the classification map using a softmax classifier.
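Steps (a) and (e) of this workflow are straightforward to sketch in NumPy. The following is a minimal illustration under stated assumptions rather than the paper's implementation: PCA is computed with a plain SVD of the centred pixel matrix, borders are zero-padded so that every pixel gets a patch, the patch size is assumed odd, and the function names and toy sizes are hypothetical.

```python
import numpy as np


def pca_reduce(cube: np.ndarray, p: int) -> np.ndarray:
    """Project an (H, W, C) cube onto its first p principal components -> (H, W, p)."""
    h, w, c = cube.shape
    flat = cube.reshape(-1, c).astype(np.float64)  # one row per pixel (astype copies)
    flat -= flat.mean(axis=0)                      # centre each spectral band
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[:p].T).reshape(h, w, p)


def extract_patches(cube: np.ndarray, s: int) -> np.ndarray:
    """Return one (s, s, B) patch centred on every pixel -> (H * W, s, s, B)."""
    if s % 2 == 0:
        raise ValueError("patch size is assumed odd so that it has a centre pixel")
    h, w, b = cube.shape
    m = s // 2
    padded = np.pad(cube, ((m, m), (m, m), (0, 0)), mode="constant")
    patches = np.empty((h * w, s, s, b), dtype=cube.dtype)
    for i in range(h):
        for j in range(w):
            patches[i * w + j] = padded[i:i + s, j:j + s, :]
    return patches


# Toy cube: 8 x 7 pixels with 20 spectral bands, reduced to 3 components.
cube = np.random.default_rng(1).normal(size=(8, 7, 20))
reduced = pca_reduce(cube, p=3)
patches = extract_patches(reduced, s=5)
print(reduced.shape, patches.shape)
```

Materializing every patch costs H × W × S × S × B values of memory, so for a full scene a generator or a strided view may be preferable; the explicit loop is kept here for readability.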

3.1. Binarization process

Binarization is the process of converting the values of an input vector to 0 or 1. For HSI data, the pixel values are first transformed into the range 0 to 255. The expression for the conversion is as follows:

$$\Psi_{i,j} = \frac{255 \times \left(Ip_{i,j} - \min(Ip)\right)}{Ip_{i,j}} \qquad (1)$$

where $Ip$ is the image and $Ip_{i,j}$ is the pixel value at position $(i, j)$. After the range conversion, a threshold ($\Theta$) is selected using the following expression:

$$\Theta = \frac{\sum_{i=0}^{H-1} \sum_{j=0}^{W-1} \Psi_{i,j}}{H \times W \times K} \qquad (2)$$

where $H$ and $W$ are the height and width of the input cube, and $K$ is the number of bands selected for binarization. Now, the value of $\Psi_{i,j}$ will be one if it is greater than or equal to $\Theta$, otherwise zero. The new image formed after thresholding is given by:

$$BI_{i,j} = \begin{cases} 1, & \text{if } \Psi_{i,j} \geq \Theta \\ 0, & \text{if } \Psi_{i,j} < \Theta \end{cases} \qquad (3)$$
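Equations (1) to (3) translate into a few lines of NumPy. The sketch below follows the equations as they appear in the text and fills the unspecified details with labelled assumptions: the minimum in Eq. (1) is taken over all K selected bands together, the sum in Eq. (2) also runs over those K bands (which is why the denominator contains K), and a pixel whose value is exactly zero is mapped to 0 to avoid dividing by zero. The function name and the toy input are hypothetical.

```python
import numpy as np


def binarize(ip: np.ndarray, k: int) -> np.ndarray:
    """Binarize the first k bands of an (H, W, P) cube following Eqs. (1)-(3)."""
    x = ip[:, :, :k].astype(np.float64)
    h, w, _ = x.shape

    # Eq. (1): range conversion, with the pixel value itself as the denominator.
    # Assumption: the minimum is global over the k selected bands, and pixels
    # equal to 0 are mapped to 0 instead of dividing by zero.
    numerator = 255.0 * (x - x.min())
    psi = np.divide(numerator, x, out=np.zeros_like(x), where=x != 0)

    # Eq. (2): one threshold for the cube (assumption: the sum includes all k bands).
    theta = psi.sum() / (h * w * k)

    # Eq. (3): 1 where psi >= theta, 0 elsewhere.
    return (psi >= theta).astype(np.uint8)


# Toy input: a single 2 x 2 band with positive values.
ip = np.array([[1.0, 2.0], [4.0, 8.0]]).reshape(2, 2, 1)
bi = binarize(ip, k=1)
print(bi[:, :, 0])
```

For this input the converted values are 0, 127.5, 191.25 and 223.125, the threshold is 135.46875, and only the two largest pixels are set to one.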

3.2. Mathematical morphology

The high spatial resolution of hyperspectral images results in very few mixed pixels and provides clear boundaries between different objects in land-cover data. Discriminative spatial features such as morphological features can therefore provide results with better accuracy. Serra [53] first introduced morphological analysis in 1982 and used structuring elements (SEs) to collect information on the shape, boundary, and skeleton of an image. Acquiring a morphological feature map is a three-step process: (a) selection of the structuring element, (b) conversion of the image into a binary image, since only through a binary image can the structuring element follow the
