FogGuard: guarding YOLO against fog using perceptual loss
Soheil Gharatappeh, Sepideh Neshatfar, Salimeh Yasaei Sekeh¹ and Vikas Dhiman²
Abstract— In this paper, we present a novel fog-aware object
detection network called FogGuard, designed to address the
challenges posed by foggy weather conditions. Autonomous
driving systems heavily rely on accurate object detection algo-
rithms, but adverse weather conditions can significantly impact
the reliability of deep neural networks (DNNs).
Existing approaches fall into two main categories: 1) image-enhancement methods such as IA-YOLO, and 2) domain-adaptation based approaches. Image-enhancement techniques attempt to generate a fog-free image. However, retrieving a fog-free image
from a foggy image is a much harder problem than detecting
objects in a foggy image. Domain-adaptation based approaches,
on the other hand, do not make use of labelled datasets in the
target domain. Both categories of approaches are attempting
to solve a harder version of the problem. Our approach instead builds on fine-tuning an existing detector on labelled clear-weather data augmented with synthetic fog.
Our framework is specifically designed to compensate for foggy conditions present in the scene, ensuring robust performance even under heavy fog. We adopt YOLOv3 as the baseline object detection algorithm and introduce a novel Teacher-Student Perceptual loss to achieve high-accuracy object detection in foggy images.
Through extensive evaluations on common datasets such as
PASCAL VOC and RTTS, we demonstrate the improvement
in performance achieved by our network. In particular, FogGuard achieves 69.43% mAP, compared to 57.78% for
YOLOv3 on the RTTS dataset.
Furthermore, we show that while our training method
increases training time, it does not introduce any additional
overhead during inference compared to the regular YOLO
network.
I. INTRODUCTION
Adverse weather conditions such as rain, snow, and fog
present risks for driving. One such risk is reduced visibility,
which, in autonomous driving, impairs object detection. This
is highly dangerous; objects that are not spotted cannot
be avoided, while objects that are inaccurately localized or
classified can cause the vehicle to respond by swerving or
“phantom braking” [1]. In this work, we focus on improving
object detection in foggy weather.
Specifically, we aim to improve object detection accuracy using
only cameras. Not all autonomous vehicles have multiple
sensor types, but cameras are present on virtually all of
them [2], [3]. This makes our research widely applicable,
including to vehicles that have additional sensor types;
camera-based object detection can always be combined with
other systems to improve overall accuracy via multi-sensor
fusion [4]. Other research has explored the use of fog-specific supplemental sensors, such as the novel millimeter-wave radar [5], [6].

¹ School of Computing and Information Science, University of Maine, Orono, ME, United States, soheil.gharatappeh@maine.edu
² Department of Electrical and Computer Engineering, University of Maine, Orono, ME, United States
This material is based upon work supported by the National Science Foundation under Grant No. 2218063.
The image processing community has explored the prob-
lems of dehazing, defoggification, and image-enhancement
before the success of deep learning based approaches [7]–
[10]. Bringing image processing based approaches into the
learning domain, IA-YOLO [11] combines an image pro-
cessing module with a learning pipeline to infer a de-fogged
image before feeding it into a regular object detector like
YOLO [12]. We posit that inferring a de-fogged image is
a much harder problem than detecting objects in a foggy
image. Clearly, detecting and classifying a bounding box
in a foggy image as an object class, for example a car, is a
much easier problem than recreating every pixel of that car.
Additionally, dehazing-based approaches often suffer from
significant computational overhead in order to achieve better
image quality.
To improve object detection in a foggy image, we modify
the training process of a YOLO-v3 [12] network to be robust
to foggy images. Our modified training process contributes
two novel ideas: 1) generalization of perceptual loss [13] to
Teacher-student perceptual loss (Section IV-A) and 2) data-
augmentation with depth-aware realistic fog (Section IV-B).
We use perceptual loss based on the intuition that the semantic information in a foggy image is the same as that of the corresponding clear image. We therefore seek to minimize the perceptual loss between a clear image and the foggified version of that image. Data augmentation is necessary because foggy object detection datasets like RTTS [14] (∼3K images) are much smaller than clear-image datasets like PASCAL VOC [15] (∼16K images) and MS-COCO [16] (∼116K images). Our ablation studies demonstrate the utility of each of our contributions in improving the accuracy of object detection in the presence of fog.
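A minimal sketch of how these two ideas could fit together during training, assuming a PyTorch implementation in which the YOLOv3 backbone exposes intermediate feature maps; the function names, the `backbone_features` hook, and the fog parameters below are illustrative assumptions, not the paper's exact implementation (see Sections IV-A and IV-B for the actual method):

```python
import torch
import torch.nn.functional as F

def add_synthetic_fog(clear, depth, beta=0.08, airlight=0.9):
    """Depth-aware fog via the standard atmospheric scattering model
    I = J * t + A * (1 - t), with transmission t = exp(-beta * depth).
    `clear` is (B, 3, H, W); `depth` is a per-pixel depth map (B, H, W).
    `beta` and `airlight` are assumed, tunable constants."""
    t = torch.exp(-beta * depth).unsqueeze(1)          # (B, 1, H, W)
    return clear * t + airlight * (1.0 - t)

def perceptual_loss(student_feats, teacher_feats):
    """L2 distance between student features (foggy input) and frozen
    teacher features (clear input), summed over the chosen layers."""
    return sum(F.mse_loss(s, t) for s, t in zip(student_feats, teacher_feats))

def training_step(student, teacher, clear_imgs, depth_maps, targets,
                  detection_loss, w=1.0):
    """One hypothetical training step: YOLO detection loss on the foggified
    image plus the teacher-student perceptual loss against the clear image."""
    foggy = add_synthetic_fog(clear_imgs, depth_maps)
    s_feats, s_preds = student.backbone_features(foggy)     # assumed hook
    with torch.no_grad():                                    # teacher stays frozen
        t_feats, _ = teacher.backbone_features(clear_imgs)
    return detection_loss(s_preds, targets) + w * perceptual_loss(s_feats, t_feats)
```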
We evaluate and compare our proposed method on the RTTS dataset against state-of-the-art approaches such as IA-YOLO [11], DE-YOLO [17] and SSD-Entropy [4]. We find that our approach is more accurate than IA-YOLO by 11.64% and than [4] by 14.27%, while being faster by a factor of 5.
II. RELATED WORK
Vanilla object detection algorithms [12], [18], [19] are
often insufficient in adverse weather conditions such as fog,
rain, snow, and low-light scenarios. To address such problems,
the literature can be grouped into four main categories:
1) analytical image processing techniques, 2) learning-based
approaches, 3) domain adaptation and 4) learning-based
image-enhancement techniques.