Natural-Logarithm-Rectified Activation
Function in Convolutional Neural Networks
YANG LIU, JIANPENG ZHANG, CHAO GAO, JINGHUA QU, AND LIXIN JI
National Digital Switching System Engineering and Technological R&D Center, Zhengzhou 450002, China
Corresponding author: Jianpeng Zhang (zjp@ndsc.com.cn)
This work was supported by the National Natural Science Foundation of China for Innovative Research Groups under
Grant 61521003, and the National Natural Science Foundation of China under Grant 61601513 and Grant 61803384.
ABSTRACT Activation functions play a key role in providing remarkable performance in deep neural
networks, and the rectified linear unit (ReLU) is one of the most widely used activation functions. Various
new activation functions and improvements to ReLU have been proposed, but each carries performance
drawbacks. In this paper, we propose an improved activation function, which we name the natural-
logarithm-rectified linear unit (NLReLU). This activation function uses the parametric natural logarithmic
transform to improve ReLU and is simply defined as f(x) = ln(β · max(0, x) + 1.0). NLReLU not only
retains the sparse activation characteristic of ReLU, but it also alleviates the “dying ReLU” and vanishing
gradient problems to some extent. It also reduces the bias shift effect and heteroscedasticity of neuron data
distributions among network layers in order to accelerate the learning process. The proposed method was
verified across ten convolutional neural networks of different depths on two benchmark datasets.
Experiments illustrate that convolutional neural networks with NLReLU exhibit higher accuracy than those
with ReLU, and that NLReLU is comparable to other well-known activation functions. NLReLU provides
0.16% and 2.04% higher classification accuracy on average compared to ReLU when used in shallow
convolutional neural networks on the MNIST and CIFAR-10 datasets, respectively. The accuracy of deep convolutional neural networks with NLReLU is 1.35% higher on average on the CIFAR-10 dataset.
INDEX TERMS Convolutional neural networks, activation function, rectified linear unit
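As a concrete illustration of the definition given in the abstract, the following is a minimal NumPy sketch of NLReLU. The parameter name beta and its default value of 1.0 are illustrative assumptions made here, not details of the authors' reference implementation.

```python
import numpy as np

def nlrelu(x, beta=1.0):
    """Natural-logarithm-rectified linear unit (sketch).

    Applies ReLU, then a parametric natural-logarithm transform:
    f(x) = ln(beta * max(0, x) + 1.0).
    """
    return np.log(beta * np.maximum(0.0, x) + 1.0)

# Example: negative inputs map to 0, positive inputs are log-compressed.
x = np.array([-2.0, 0.0, 1.0, 10.0])
print(nlrelu(x))  # approximately [0.0, 0.0, 0.693, 2.398]
```

Because the transform is monotonic and maps 0 to 0, the sparse activation behavior of ReLU is preserved while large positive activations are compressed.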
I. INTRODUCTION
Activation functions have a crucial impact on the
performance and capabilities of neural networks, and they are an essential research topic in deep learning. Activation
functions are generally monotonic and nonlinear. The
introduction of the activation function in neural networks
allows neural networks to apply nonlinear transforms to input
data such that many complex problems can be resolved.
However, as the number of network layers increases, serious problems caused by the activation function begin to appear, such as vanishing or exploding gradients during backpropagation and shifts of the data distribution among layers after activation, which make the training data more difficult to learn. Therefore, a suitable activation function is
critical to building a neural network.
The rectified linear unit (ReLU) [1], [2] has been one of the most widely used activation functions in neural networks in recent
years. In the majority of popular convolutional neural
networks, e.g., VGG nets [3], residual networks (ResNets)
[4], [5], and dense convolutional networks (DenseNets) [6],
ReLU consistently provides desirable results. Meanwhile, the
research community is still devoted to developing new
activation functions (e.g., Gaussian error linear unit (GELU)
[7], scaled exponential linear unit (SELU) [8], and Swish [9])
and improving ReLU (e.g., leaky ReLU (LReLU) [10],
parametric ReLU (PReLU) [11], exponential linear unit
(ELU) [12], and concatenated ReLU (CReLU) [13]) to
obtain more robust activation functions. However, most new activation functions do not generalize well across models and tasks. Xu et al. [14] and Ramachandran et al. [9] systematically investigated the performance of different rectified activation functions in convolutional neural networks and reported inconsistent performance improvements across different models and datasets.
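For reference, the standard definitions of ReLU and two of the variants cited above are sketched below. The slope alpha = 0.01 for LReLU and alpha = 1.0 for ELU are common defaults assumed here for illustration, not values prescribed by the cited works; PReLU differs from LReLU only in that alpha is learned during training.

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x)
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # LReLU: small non-zero slope alpha for negative inputs
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # ELU: smooth exponential saturation toward -alpha for negative inputs
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
```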