Natural-Logarithm-Rectified Activation
Function in Convolutional Neural Networks
YANG LIU, JIANPENG ZHANG, CHAO GAO, JINGHUA QU, AND LIXIN JI
National Digital Switching System Engineering and Technological R&D Center, Zhengzhou 450002, China
Corresponding author: Jianpeng Zhang (zjp@ndsc.com.cn)
This work was supported by the National Natural Science Foundation of China for Innovative Research Groups under
Grant 61521003, and the National Natural Science Foundation of China under Grant 61601513 and Grant 61803384.
ABSTRACT Activation functions play a key role in providing remarkable performance in deep neural
networks, and the rectified linear unit (ReLU) is one of the most widely used activation functions. Various
new activation functions and improvements to ReLU have been proposed, but each carries performance
drawbacks. In this paper, we propose an improved activation function, which we name the natural-
logarithm-rectified linear unit (NLReLU). This activation function uses the parametric natural logarithmic
transform to improve ReLU and is simply defined as f(x) = ln(β · max(0, x) + 1.0). NLReLU not only
retains the sparse activation characteristic of ReLU, but it also alleviates the “dying ReLU” and vanishing
gradient problems to some extent. It also reduces the bias shift effect and heteroscedasticity of neuron data
distributions among network layers in order to accelerate the learning process. The proposed method was
verified across ten convolutional neural networks of different depths on two benchmark datasets.
Experiments illustrate that convolutional neural networks with NLReLU exhibit higher accuracy than those
with ReLU, and that NLReLU is comparable to other well-known activation functions. NLReLU provides
0.16% and 2.04% higher classification accuracy on average compared to ReLU when used in shallow
convolutional neural networks on the MNIST and CIFAR-10 datasets, respectively. The accuracy of deep convolutional neural networks with NLReLU is 1.35% higher on average on the CIFAR-10 dataset.
INDEX TERMS Convolutional neural networks, activation function, rectified linear unit
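As a concrete illustration of the definition given in the abstract, the following is a minimal NumPy sketch of NLReLU. The parameter name beta and its default value of 1.0 are illustrative assumptions made here, not details of the authors' reference implementation.

```python
import numpy as np

def nlrelu(x, beta=1.0):
    """Natural-logarithm-rectified linear unit (sketch).

    Applies ReLU, then a parametric natural-logarithm transform:
    f(x) = ln(beta * max(0, x) + 1.0).
    """
    return np.log(beta * np.maximum(0.0, x) + 1.0)

# Example: negative inputs map to 0, positive inputs are log-compressed.
x = np.array([-2.0, 0.0, 1.0, 10.0])
print(nlrelu(x))  # approximately [0.0, 0.0, 0.693, 2.398]
```

Because the transform is monotonic and maps 0 to 0, the sparse activation behavior of ReLU is preserved while large positive activations are compressed.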
I. INTRODUCTION
Activation functions have a crucial impact on the
performance and capabilities of neural networks, and they are an essential research topic in deep learning. Activation
functions are generally monotonic and nonlinear. The
introduction of the activation function in neural networks
allows neural networks to apply nonlinear transforms to input
data such that many complex problems can be resolved.
However, as the number of network layers increases, serious problems caused by the activation function begin to appear, such as vanishing or exploding gradients during backpropagation and shifts of the data distribution among layers after activation, which make the training data more difficult to learn. Therefore, a suitable activation function is
critical to building a neural network.
The rectified linear unit (ReLU) [1], [2] has been one of the most widely used activation functions in neural networks in recent
years. In the majority of popular convolutional neural
networks, e.g., VGG nets [3], residual networks (ResNets)
[4], [5], and dense convolutional networks (DenseNets) [6],
ReLU consistently provides desirable results. Meanwhile, the
research community is still devoted to developing new
activation functions (e.g., Gaussian error linear unit (GELU)
[7], scaled exponential linear unit (SELU) [8], and Swish [9])
and improving ReLU (e.g., leaky ReLU (LReLU) [10],
parametric ReLU (PReLU) [11], exponential linear unit
(ELU) [12], and concatenated ReLU (CReLU) [13]) to
obtain more robust activation functions. However, most new activation functions do not generalize well across models and tasks. Xu et al. [14] and Ramachandran et al. [9] systematically investigated the performance of different rectified activation functions in convolutional neural networks and reported inconsistent performance improvements across different models and datasets.
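For reference, the standard definitions of ReLU and two of the variants cited above are sketched below. The slope alpha = 0.01 for LReLU and alpha = 1.0 for ELU are common defaults assumed here for illustration, not values prescribed by the cited works; PReLU differs from LReLU only in that alpha is learned during training.

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x)
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # LReLU: small non-zero slope alpha for negative inputs
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # ELU: smooth exponential saturation toward -alpha for negative inputs
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
```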