Convergence of gradient method with penalty for Ridge Polynomial neural network
Xin Yu*, Qingfeng Chen
School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
* Corresponding author. E-mail addresses: yuxin21@126.com (X. Yu), qingfeng@gxu.edu.cn (Q. Chen).
Article info
Article history:
Received 2 December 2011
Received in revised form
24 April 2012
Accepted 28 May 2012
Communicated by R.W. Newcomb
Available online 3 July 2012
Keywords:
Ridge Polynomial neural network
Gradient algorithm
Monotonicity
Convergence
Abstract
In this paper, a penalty term is added to the conventional error function to improve the generalization of the Ridge Polynomial neural network. In order to choose appropriate learning parameters, we propose a monotonicity theorem and two convergence theorems, including a weak convergence theorem and a strong convergence theorem, for the synchronous gradient method with penalty for this neural network. The experimental results on a function approximation problem illustrate that the above theoretical results are valid.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction
The pi–sigma ($\prod$–$\sum$) neural network (PSNN) [1] is a type of high-order feedforward neural network, which uses the product of the sums of the input units. This network has a powerful function approximation ability while avoiding the combinatorial explosion of the higher-order terms. However, it is not a universal approximator. The Ridge Polynomial neural network (RPNN) [2] is a generalization of the PSNN and uses a number of PSNNs as its basic building blocks. It maintains the fast learning ability of the PSNN and can uniformly approximate any continuous function on a compact set in $\mathbb{R}^d$ [3]. Thus, the RPNN has been widely used in various fields [4–6].
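As a rough sketch of this building-block structure (the precise notation is fixed in Section 2; the symbols $N$ and $p_i$ used here are ours, not the paper's), an RPNN with $N$ blocks combines Pi–Sigma units of increasing degree, e.g.
\[
  f(\mathbf{x}) = \sigma\Big(\sum_{i=1}^{N} p_i(\mathbf{x})\Big), \qquad
  p_i(\mathbf{x}) = \prod_{j=1}^{i} \langle \mathbf{w}_{ij}, \mathbf{x}\rangle ,
\]
so that the first block is linear in the inputs, the second quadratic, and so on, which is what underlies the uniform approximation property stated in [3].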
The synchronous gradient method is a commonly used training algorithm for higher-order neural networks, and convergence theorems of gradient methods for various neural networks have been proposed [7–13]. The study of the convergence of a training method is important for choosing appropriate training parameters, e.g., the initial weights and the learning rate, so as to perform effective network training. Moreover, the generalization ability of neural networks is a critical issue in designing a training method. To improve the generalization of neural networks, one widely used scheme is to add a penalty term proportional to the magnitude of the network weights to the conventional error function [10–13]. Thus, this article aims to study the convergence of the synchronous gradient method with penalty for the RPNN. We first prove that the error sequence decreases monotonically and that the algorithm is weakly convergent during the training procedure, provided that certain assumptions (cf. Assumptions A1–A3) are satisfied; we then prove that if the error function has finitely many stable points, the weakly convergent algorithm is also strongly convergent. Finally, the training method is applied to a function approximation problem to illustrate our theoretical findings.
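For concreteness, the penalized objective referred to above takes the familiar weight-decay form (a generic sketch only; the exact RPNN error term and the notation for the penalty coefficient and learning rate are fixed later in the paper, so the symbols $E$, $\lambda$ and $\eta$ here are ours):
\[
  E_\lambda(\mathbf{w}) = E(\mathbf{w}) + \lambda \|\mathbf{w}\|^{2},
\]
where $E(\mathbf{w})$ is the conventional output error and $\lambda > 0$ weights the penalty, and the gradient method updates the weights by $\mathbf{w}^{m+1} = \mathbf{w}^{m} - \eta \nabla E_\lambda(\mathbf{w}^{m})$ with learning rate $\eta > 0$.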
2. Preliminary
Since the Ridge Polynomial neural network is built mainly from Pi–Sigma neural networks, the structure of the Pi–Sigma neural network is introduced below.
2.1. Pi–sigma neural network
Fig. 1 shows a Pi–Sigma neural network with a single output. Let $n+1$, $k$ and $1$ be the numbers of input units, summing units and output units of the PSNN, respectively. Denote by $\mathbf{w}_j = (w_{j0}, w_{j1}, \ldots, w_{jn})^{T} \in \mathbb{R}^{n+1}$ the weight vector from the summing unit $j$ to the input units, and write $\mathbf{w} = (\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k)^{T} \in \mathbb{R}^{(n+1)k}$. Note that the weights from the summing units to the product unit are fixed to 1. Let $\mathbf{x} = (x_0, x_1, \ldots, x_n)^{T} \in \mathbb{R}^{n+1}$ be the $(n+1)$-dimensional input vector, where $x_0 = 1$ corresponds to an adjustable threshold. Further, let $y$ denote the output of the product unit, given by
\[
  y = \sigma\!\left(\prod_{j=1}^{k}\left(\sum_{i=0}^{n} w_{ji}\,x_i\right)\right)
    = \sigma\!\left(\prod_{j=1}^{k}\langle \mathbf{w}_j, \mathbf{x}\rangle\right)
  \qquad (1)
\]
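As a numerical illustration of Eq. (1), the following minimal sketch computes the PSNN forward pass; the logistic sigmoid is used for $\sigma$ purely for illustration, and the function and variable names are ours, not the paper's.

import numpy as np

def psnn_output(w, x):
    """Forward pass of a Pi-Sigma neural network, cf. Eq. (1).

    w : array of shape (k, n+1); row j is the weight vector w_j from
        summing unit j to the inputs (w_{j0} acts as the threshold).
    x : array of shape (n+1,), the input vector with x[0] = 1.
    """
    sums = w @ x                            # inner products <w_j, x>, j = 1, ..., k
    product = np.prod(sums)                 # product unit; its incoming weights are fixed to 1
    return 1.0 / (1.0 + np.exp(-product))   # sigma: logistic sigmoid (illustrative assumption)

# Example: n = 2 inputs plus the bias input x_0 = 1, and k = 2 summing units.
w = np.array([[0.1, 0.5, -0.3],
              [0.2, -0.1, 0.4]])
x = np.array([1.0, 0.7, -1.2])
print(psnn_output(w, x))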