916 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 11, NO. 5, MAY 2014
Region-of-Interest Extraction Based on Frequency
Domain Analysis and Salient Region Detection
for Remote Sensing Image
Libao Zhang and Kaina Yang
Abstract—Traditional approaches for detecting visually salient
regions or targets in remote sensing images are inaccurate and
prohibitively computationally complex. In this letter, a fast,
efficient region-of-interest extraction method based on frequency
domain analysis and salient region detection (FDA-SRD) is
proposed. First, the HSI transform is used to convert the remote
sensing image from RGB space to HSI space. Second, a
frequency domain analysis strategy based on the quaternion Fourier
transform is employed to rapidly generate the saliency map.
Finally, the salient regions are described by an adaptive threshold
segmentation algorithm based on Gaussian pyramids. Compared
with existing models, the new algorithm is computationally more
efficient and provides more visually accurate detection results.
Index Terms—Frequency domain analysis (FDA), quaternion
Fourier transform, region of interest (ROI), remote sensing image
processing.
I. INTRODUCTION
Region-of-interest (ROI) detection technology, which is
represented by the visual attention mechanism, has been
introduced into the remote sensing image analysis field, and
it has become an important technical approach for reducing
the time required and improving analysis accuracy in mass-data
image processing [1]. Once a potential ROI is provided, the viewer
can search for specific objects in the region. The computing
resources can thus be reasonably allocated to enhance the operating
efficiency of an image processing system.
A region that draws attention is defined as a focus of attention
(FOA), which is considered an ROI or a target. Several
computational models have been developed to simulate the human
visual system (HVS) [1]–[3]. Itti [2] constructed a model using
a biologically plausible architecture, which was proposed by
Koch and Ullman [3] and is the basis for visual attention.
Dai et al. [4] presented a method that incorporates visual
attention into satellite image classification. A faster, more
efficient ROI detection algorithm based on an adaptive spatial subsampling
visual attention model was proposed by Zhang et al. [5]. The
above models attempt to simulate the visual attention
mechanism based on the biological structure of the HVS.
Manuscript received May 6, 2013; revised July 17, 2013 and August 17,
2013; accepted September 5, 2013. This work was supported in part by the
National Natural Science Foundation of China under Grant 61071103 and
Fundamental Research Funds for the Central Universities under Grant 2012LYB50.
The authors are with the College of Information Science and Technology,
Beijing Normal University, Beijing 100875, China (e-mail:
libaozhang@bnu.edu.cn).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LGRS.2013.2281827
In addition to the above biological models, certain other
methods have been proposed. Achanta et al. [6] presented a
frequency-tuned approach that computes saliency from low-level
color and luminance features and generates full-resolution
saliency maps. By analyzing the log-spectrum of an input image,
Hou et al. [7] extracted the spectral residual of the image in
the spectral domain and proposed a fast method for constructing
the corresponding saliency map in the spatial domain. A bottom-up
visual saliency model, graph-based visual saliency (GBVS), was
proposed by Harel et al. [8]. This method applies ideas from
graph theory to form activation maps from raw features and to
concentrate mass on those maps. Visual saliency models have
also been applied to video compression [9].
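As an illustration of the frequency-domain family of methods mentioned above, the spectral residual idea of [7] can be sketched as follows. This is a minimal sketch, not the implementation of [7] and not the FDA-SRD method of this letter: the function names, the 3 x 3 averaging window, the Gaussian width, and the zero-padded borders of the blur are all assumptions chosen for brevity.

```python
import numpy as np

def box_filter(a, size=3):
    """Local mean with edge replication (assumed window size)."""
    pad = size // 2
    padded = np.pad(a, pad, mode="edge")
    out = np.zeros(a.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (size * size)

def gaussian_blur(a, sigma=2.5):
    """Separable Gaussian blur; borders are implicitly zero-padded."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    blur_1d = lambda v: np.convolve(v, kernel, mode="same")
    return np.apply_along_axis(blur_1d, 1, np.apply_along_axis(blur_1d, 0, a))

def spectral_residual_saliency(gray, avg_size=3, sigma=2.5):
    """Saliency map from the residual of the log-amplitude spectrum."""
    spectrum = np.fft.fft2(np.asarray(gray, dtype=float))
    log_amplitude = np.log(np.abs(spectrum) + 1e-8)  # epsilon avoids log(0)
    phase = np.angle(spectrum)
    # Residual: log spectrum minus its local average.
    residual = log_amplitude - box_filter(log_amplitude, avg_size)
    # Back to the spatial domain, keeping the original phase.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = gaussian_blur(saliency, sigma)
    # Rescale to [0, 1] for display or thresholding.
    return (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-12)
```

The input is assumed to be a single-channel image that is larger than the blur kernel in both dimensions; a heavily downsampled copy of the image is typically sufficient for this kind of analysis.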
Remote sensing images contain large amounts of data. The
biological models simulate the HVS well, but they often
incur prohibitive computational complexity and do not consider
frequency-domain characteristics. Further, human visual
attention does not necessarily reflect the regions of actual
concern in a remote sensing image. Methods from other disciplines
compute ROIs quickly, but they consider only the features of
the image itself, which easily causes false or missed detections.
To overcome the weaknesses of the existing visual attention
models and make them more suitable for processing remote
sensing images, we focus on two aspects: accuracy and low
computational cost. The salient regions should be both detected
and well described. Thus, we propose an FDA-SRD model to
improve computational efficiency and accuracy in ROI detection
for remote sensing images. After the HSI transform, a novel
frequency domain strategy based on the quaternion Fourier
transform is used to generate a saliency map, which is
time-saving and efficient. In addition, an adaptive
threshold segmentation algorithm based on Gaussian pyramids
is employed to obtain more accurate shape information of ROIs.
Experimental results show that the proposed model is
time-efficient and accurate.
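To make the preprocessing step concrete, the RGB-to-HSI conversion for a single pixel can be sketched as follows. The sketch assumes the common geometric formulation of HSI (intensity as the channel mean, saturation from the minimum channel, hue from an arccosine); the exact definition used by the FDA-SRD model may differ, and the function name and the conventions for black and gray pixels are hypothetical.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (each channel in [0, 1]) to (H, S, I).

    H is returned in radians in [0, 2*pi); S and I lie in [0, 1].
    Assumed conventions: black maps to (0, 0, 0), and gray pixels,
    whose hue is undefined, are assigned H = 0.
    """
    i = (r + g + b) / 3.0
    if i == 0.0:
        return 0.0, 0.0, 0.0
    s = 1.0 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        h = 0.0  # r == g == b: hue undefined, set to 0 by convention
    else:
        # Clamp to guard against rounding slightly outside [-1, 1].
        theta = math.acos(max(-1.0, min(1.0, num / den)))
        h = theta if b <= g else 2.0 * math.pi - theta
    return h, s, i
```

For example, under these conventions pure red maps to a hue of 0 and pure green to a hue of 2*pi/3, both with full saturation and an intensity of 1/3.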
The remainder of this letter is organized as follows. The
FDA-SRD method is illustrated in Section II. Section III
focuses on the research findings, while Section IV provides
conclusions.
II. FDA-SRD METHOD
In the FDA-SRD model, the input image is subsampled by
a factor of 2 twice to reduce the amount of data and is
preprocessed using the HSI transform. A novel frequency domain
1545-598X © 2013 IEEE