Please translate the following passage into academic English: The optimizer chosen is Adadelta, an extended version improved from the Adagrad algorithm. Compared with Adagrad, Adadelta no longer accumulates all past gradients, but instead adjusts the learning rate according to a moving window of gradient updates, making it more robust [33]. The main parameter settings of the Adadelta algorithm include: a learning rate of 1.0, a decay rate of 0.95 for the moving average of squared gradients, a fuzz factor of 1×10⁻⁶, and a learning-rate decay of 0 after each parameter update.
The optimizer used is Adadelta, an extended and improved version of the Adagrad algorithm. Unlike Adagrad, Adadelta does not accumulate all past gradients; instead, it adapts the learning rate based on a moving window of gradient updates, which makes it more robust [33]. The main parameter settings of the Adadelta algorithm are as follows: a learning rate of 1.0, a decay rate of 0.95 for the moving average of squared gradients, a fuzz factor (epsilon) of 1×10⁻⁶, and a learning-rate decay of 0 after each parameter update.
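For concreteness, these settings map directly onto an optimizer constructor. The sketch below assumes the Keras API in TensorFlow; the passage does not name the framework, so `tf.keras`, the placeholder model, and the loss function are illustrative assumptions rather than the original setup:

```python
# A minimal sketch of the Adadelta configuration described above,
# assuming the tf.keras API (the framework is not named in the source).
import tensorflow as tf

optimizer = tf.keras.optimizers.Adadelta(
    learning_rate=1.0,  # learning rate stated in the passage
    rho=0.95,           # decay rate of the moving average of squared gradients
    epsilon=1e-6,       # fuzz factor for numerical stability
)
# A learning-rate decay of 0 after each update corresponds to the default
# behaviour here: no additional decay schedule is applied.

# Hypothetical usage: compiling an arbitrary model with this optimizer.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=optimizer, loss="mse")
```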