
Received July 13, 2021, accepted July 22, 2021, date of publication July 26, 2021, date of current version August 3, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3100414
ICIoU: Improved Loss Based on Complete Intersection Over Union for Bounding Box Regression
XUFEI WANG¹,² AND JEONGYOUNG SONG² (Member, IEEE)
¹Key Laboratory of Industrial Automation, School of Mechanical Engineering, Shaanxi University of Technology, Hanzhong 723000, China
²Department of Computer Engineering, Pai Chai University, Daejeon 35345, South Korea
Corresponding author: Jeongyoung Song (jysong@pcu.ac.kr)
This work was supported in part by the Shaanxi Provincial Key Laboratory of Industrial Automation Research Program under Grant
18JS020.
ABSTRACT Object detectors based on convolutional neural networks (CNNs) are widely used in
the field of computer vision because of their simplicity and efficiency. The average accuracy of the
detection results of a CNN model is greatly affected by the loss function, and the precision of the
localization term in the loss function is the main factor affecting the result. Based on the complete
intersection over union (CIoU) loss function, an improved penalty function is proposed to improve
localization accuracy. Specifically, the algorithm considers the match between the predicted bounding box
and the ground truth more comprehensively, using the proportional relationship between the aspect ratios
of the two bounding boxes. When the two bounding boxes have the same aspect ratio, the factors through
which the prediction box influences localization accuracy are also considered. In this way, the penalty
function is strengthened and the localization accuracy of the network model is improved. This loss
function is called Improved CIoU (ICIoU). Experiments on the Udacity, PASCAL VOC, and MS COCO datasets
with the one-stage object detector YOLOv4 demonstrate the effectiveness of ICIoU in improving the
localization accuracy of network models. Compared with CIoU, the proposed ICIoU improved average
precision (AP) by 0.57% and AP75 by 0.12% on Udacity, AP by 0.26% and AP75 by 1.28% on PASCAL VOC,
and AP by 0.06% and AP75 by 0.65% on MS COCO.
INDEX TERMS Bounding box regression, localization accuracy, loss function, object detection.
The associate editor coordinating the review of this manuscript and
approving it for publication was Sudipta Roy.
I. INTRODUCTION
Object detection is one of the key problems in computer
vision. In recent years, convolutional neural networks (CNNs)
have been increasingly applied in the field of computer
vision [1]–[11]. When CNNs are used for object detection,
a loss function is indispensable, whether the task is posed
as regression or classification. A loss function measures
the degree of inconsistency between the value predicted by
a model and the true value. The main task of model training
is to use an optimization method to find the model parameters
that minimize the loss function. Because the loss function
determines which model parameters count as optimal, it
affects the performance of every object detector.
The loss function generally consists of
bounding box regression and classification terms. The loss
calculation for bounding box regression is a key step in
object localization, multiobject detection, target tracking,
and instance-level segmentation. In multiobject detection,
deep CNNs outperform traditional region proposal methods in
predicting the bounding boxes of candidate objects. These
networks include one-stage object detectors, such as the YOLO
series [3]–[6] and the single shot multibox detector (SSD) [9];
two-stage object detectors, such as the regions with CNN
features (R-CNN) series [11]–[14]; and even multistage object
detectors, such as Cascade R-CNN [15]. In these networks,
intersection over union (IoU) loss has become the most popular
evaluation measure for bounding box regression, compared
with focal loss and $L_n$-norm (e.g., $L_1$, $L_2$) loss [16], [17].
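As a minimal sketch of the two kinds of regression loss contrasted above, the following Python assumes axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates, takes the IoU loss as 1 - IoU, and takes the $L_n$ loss as the sum of |difference|^n over the four coordinates. The box format, these conventions, and the function names are illustrative assumptions, not definitions taken from this paper.

```python
from typing import Sequence

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Sequence[float]


def iou(pred: Box, target: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(pred[0], target[0])
    iy1 = max(pred[1], target[1])
    ix2 = min(pred[2], target[2])
    iy2 = min(pred[3], target[3])
    # Clamp at zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    union = area_pred + area_target - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """IoU loss under the assumed convention: 1 - IoU."""
    return 1.0 - iou(pred, target)


def ln_loss(pred: Box, target: Box, n: int = 2) -> float:
    """Assumed L_n loss: sum of |p - t|^n over the four coordinates."""
    return sum(abs(p - t) ** n for p, t in zip(pred, target))
```

For example, `iou_loss((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 6/7: the two boxes overlap in a unit square and their union has area 7. Under the same assumptions, `ln_loss` with `n=1` gives 4 for this pair, because each of the four coordinates differs by 1.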
However, the IoU algorithm cannot detect the bounding box
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 9, 2021