
A New Conjugate Gradient Method with Smoothing $L_{1/2}$ Regularization Based on a Modified Secant Equation for Training Neural Networks


Neural Processing Letters, Volume 48 (2) – Nov 21, 2017



Publisher
Springer Journals
Copyright
Copyright © 2017 by Springer Science+Business Media, LLC
Subject
Computer Science; Artificial Intelligence (incl. Robotics); Complex Systems; Computational Intelligence
ISSN
1370-4621
eISSN
1573-773X
DOI
10.1007/s11063-017-9737-9

Abstract

This paper proposes a new conjugate gradient method with smoothing $L_{1/2}$ regularization, based on a modified secant equation, for training neural networks; a descent search direction is generated by selecting an adaptive learning rate that satisfies the strong Wolfe conditions. Two adaptive parameters are introduced so that the new training method possesses both the quasi-Newton property and the sufficient descent property. Numerical experiments on five benchmark classification problems from the UCI repository show that, compared with other conjugate gradient training algorithms, the new algorithm has roughly the same or better learning capacity, and significantly better generalization capacity and network sparsity. Under mild assumptions, a global convergence result for the proposed training method is also proved.
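
The abstract names the ingredients of the method (a smoothed $L_{1/2}$ penalty, a conjugate search direction, and a learning rate chosen via the strong Wolfe conditions) without giving formulas. The Python sketch below is only a rough illustration of how such pieces typically fit together: the surrogate penalty $(w^2+\mu^2)^{1/4}$, the Polak-Ribière $\beta$, and the Armijo-only backtracking line search are stand-in assumptions, not the paper's modified secant-based update, its two adaptive parameters, or its strong-Wolfe search.

    import numpy as np

    def l12_smooth(w, mu=1e-2):
        # Smooth surrogate of the L_{1/2} penalty: (w^2 + mu^2)^(1/4) -> |w|^(1/2) as mu -> 0.
        # (Illustrative choice; the paper's own smoothing function is not reproduced here.)
        return np.sum((w**2 + mu**2) ** 0.25)

    def l12_smooth_grad(w, mu=1e-2):
        return 0.5 * w * (w**2 + mu**2) ** (-0.75)

    def train_cg(f, grad, w0, lam=1e-3, mu=1e-2, iters=200, c1=1e-4):
        # Minimize f(w) + lam * smoothed-L_{1/2}(w) with a conjugate gradient iteration.
        obj  = lambda w: f(w) + lam * l12_smooth(w, mu)
        gobj = lambda w: grad(w) + lam * l12_smooth_grad(w, mu)
        w = w0.copy()
        g = gobj(w)
        d = -g
        for _ in range(iters):
            # Backtracking (Armijo) line search; the paper instead selects the
            # learning rate via the strong Wolfe conditions.
            t, gd = 1.0, g @ d
            while obj(w + t * d) > obj(w) + c1 * t * gd and t > 1e-12:
                t *= 0.5
            w_new = w + t * d
            g_new = gobj(w_new)
            # Polak-Ribiere(+) beta as a stand-in for the paper's secant-based,
            # adaptively parameterized beta.
            beta = max(0.0, g_new @ (g_new - g) / (g @ g + 1e-16))
            d = -g_new + beta * d
            if g_new @ d >= 0:  # safeguard: keep a descent direction
                d = -g_new
            w, g = w_new, g_new
        return w

    # Toy usage: a sparse logistic model stands in for a neural network here.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = (X[:, 0] - X[:, 1] > 0).astype(float)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    loss = lambda w: -np.mean(y * np.log(sig(X @ w) + 1e-12) + (1 - y) * np.log(1 - sig(X @ w) + 1e-12))
    gloss = lambda w: X.T @ (sig(X @ w) - y) / len(y)
    print(np.round(train_cg(loss, gloss, np.zeros(10), lam=1e-2), 3))

Replacing the stand-in $\beta$ with a formula derived from a modified secant equation, and the backtracking search with a strong-Wolfe line search, would bring this sketch closer to the scheme the abstract describes.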

Journal

Neural Processing Letters, Springer Journals

Published: Nov 21, 2017
