TABLE II: Summary of state-of-the-art methods. See Section II for a more detailed description.
Methods Year&Venue Network architecture Reference manner Supervision form Learning paradigm Supervision level
Fu et al. [50] 2015 EAAI Basic Patch-based Fully-Sup. STL Instance level
Wang et al. [49] 2015 ACMMM Basic Patch-based Fully-Sup. STL Instance level
Cross scene [51] 2015 CVPR Basic Patch-based Fully-Sup. MTL Instance level
MCNN [1] 2016 CVPR Multi-column Whole image-based Fully-Sup. STL Instance level
Crowdnet [3] 2016 ACMMM Multi-column Patch-based Fully-Sup. STL Instance level
CNN-Boosting [15] 2016 ECCV Basic Patch-based Fully-Sup. STL Instance level
Hydra-CNN [2] 2016 ECCV Multi-column Patch-based Fully-Sup. MTL Instance level
Shang et al. [69] 2016 ECCV Multi-column Whole image-based Fully-Sup. STL Instance level
CMTL [55] 2017 AVSS Multi-column Whole image-based Fully-Sup. MTL Instance level
Switching CNN [5] 2017 CVPR Multi-column Patch-based Fully-Sup. MTL Instance level
CP-CNN [6] 2017 ICCV Multi-column Whole image-based Fully-Sup. MTL Instance level
D-ConvNet [70] 2018 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
CSRNet [12] 2018 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
DRSAN [71] 2018 IJCAI Multi-column Whole image-based Fully-Sup. STL Instance level
DecideNet [7] 2018 CVPR Multi-column Patch-based Fully-Sup. MTL Instance level
SaCNN [9] 2018 WACV Single-column Whole image-based Fully-Sup. MTL Instance level
SACNN [11] 2018 ECCV Single-column Patch-based Fully-Sup. MTL Instance level
IG-CNN [72] 2018 CVPR Multi-column Patch-based Fully-Sup. MTL Instance level
ic-CNN [73] 2018 ECCV Multi-column Whole image-based Fully-Sup. MTL Instance level
ACSCP [74] 2018 CVPR Multi-column Patch-based Fully-Sup. MTL Instance level
NetVLAD [75] 2018 TII Single-column Whole image-based Fully-Sup. MTL Instance level
CL [76] 2018 ECCV Single-column Patch-based Fully-Sup. MTL Instance level
L2R [77] 2018 CVPR Basic Whole image-based Self-Sup. MTL –
GAN-MTR [78] 2018 WACV Basic Whole image-based Semi-Sup. MTL –
PaDNet [79] 2019 TIP Single-column Patch-based Fully-Sup. STL Instance level
ASD [80] 2019 ICASSP Multi-column Whole image-based Fully-Sup. MTL Instance level
SPN [81] 2019 WACV Single-column Whole image-based Fully-Sup. STL Instance level
SR-GAN [82] 2019 CVIU Basic Whole image-based Semi-Sup. MTL –
ADCrowdnet [83] 2019 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
SAAN [8] 2019 WACV Multi-column Whole image-based Fully-Sup. MTL Instance level
SAA-Net [13] 2019 CVPR Single-column Whole image-based Fully-Sup. MTL Instance level
SFCN†² [84] 2019 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
SE Cycle GAN [84] 2019 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
PACNN [85] 2019 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
CAN&ECAN [86] 2019 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
CFF [87] 2019 ICCV Single-column Whole image-based Fully-Sup. MTL Instance level
PCC Net [88] 2019 TCSVT Multi-column Whole image-based Fully-Sup. MTL Instance level
SFANet [89] 2019 CVPR Single-column Whole image-based Fully-Sup. MTL Instance level
W-Net [90] 2019 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
SL2R [91] 2019 CVPR Basic Whole image-based Self-Sup. MTL –
TEDnet [92] 2019 CVPR Single-column Whole image-based Fully-Sup. STL Instance level
RReg [93] 2019 CVPR Multi-column Whole image-based Fully-Sup. STL Instance level
RAZNet [94] 2019 CVPR Multi-column Whole image-based Fully-Sup. MTL Instance level
AT-CNN [95] 2019 CVPR Single-column Whole image-based Fully-Sup. MTL Instance level
GWTA-CCNN [96] 2019 AAAI Single-column Patch-based Un-Sup. STL –
HA-CCN [97] 2019 TIP Single-column Whole image-based Fully-Sup./Weak-Sup. STL Instance/Image level
L2SM [98] 2019 ICCV Single-column Patch-based Fully-Sup. STL Instance level
RANet [99] 2019 ICCV Multi-column Whole image-based Fully-Sup. STL Instance level
McML [100] 2019 ACMMM Multi-column Whole image-based Fully-Sup. STL Instance level
ILC [101] 2019 CVPR Multi-column Whole image-based Fully-Sup. MTL Image level
classifier is also trained alternately with the regressors to
select the best one for density estimation.
• CP-CNN [6] is a contextual pyramid CNN that combines
global and local contextual information to generate high-
quality density maps. Moreover, adversarial learning [103]
is utilized to fuse the features from different levels.
• TDF-CNN [104] delivers top-down information to the
bottom-up network to amend the density estimation.
• DRSAN [71] handles the issues of scale variation and
rotation variation by taking advantage of the Spatial Transformer
Network (STN) [105].
• SAAN [8] is similar in idea to MoC-CNN [106]
and CP-CNN [6], but utilizes a visual attention mechanism to
automatically select the appropriate scale at both the global
image level and the local image patch level.
• RANet [99] provides local self-attention (LSA) and global
self-attention (GSA) to capture short-range and long-range
interdependence information, respectively. Furthermore, a relation
module is introduced to merge LSA and GSA to obtain more
informative aggregated feature representations.
• McML [100] incorporates a statistical network into the
multi-column network to estimate the mutual information
between different columns. The proposed mutual learning scheme
optimizes each column alternately while keeping the
other columns fixed on each mini-batch of training data.
• DADNet [107] uses dilated convolutions with different dilation
rates as the front-end to capture more contextual information, and
adaptive deformable convolution as the back-end to locate the
positions of the objects accurately.
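To illustrate why dilated convolutions with different dilation rates, as used in the front-end described for DADNet [107], capture more context, the following minimal sketch applies a 1D dilated convolution in plain Python. It is an illustrative toy and not the implementation of any method listed above: the function names `dilated_conv1d` and `receptive_field`, the 3-tap kernel, and the toy signal are all assumptions made for this example.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """Valid (unpadded) 1D dilated convolution in cross-correlation form.

    Kernel taps are placed `dilation` positions apart, so the same number
    of weights covers a wider stretch of the input.
    """
    span = receptive_field(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        value = sum(w * signal[start + i * dilation]
                    for i, w in enumerate(kernel))
        outputs.append(value)
    return outputs


if __name__ == "__main__":
    signal = list(range(10))   # toy 1D "feature map": 0, 1, ..., 9
    kernel = [1, 1, 1]         # hypothetical 3-tap kernel (assumption)
    for rate in (1, 2, 3):
        print(rate,
              receptive_field(len(kernel), rate),
              dilated_conv1d(signal, kernel, rate))
```

With three weights, the receptive field grows from 3 to 5 to 7 input positions as the dilation rate goes from 1 to 3, which is the sense in which a larger rate sees more context at no extra parameter cost.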
Although these multi-column networks have achieved great
progress, they still suffer from several significant
disadvantages, which Li et al. [12] demonstrated
experimentally. First of all, it is difficult to train
the multi-column networks since it requires more time and