On the optimization of deep networks: Implicit acceleration by overparameterization

Arora, Sanjeev; Cohen, N; Hazan, Elad

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr13b1z

Full metadata record

DC Field	Value	Language
dc.contributor.author	Arora, Sanjeev	-
dc.contributor.author	Cohen, N	-
dc.contributor.author	Hazan, Elad	-
dc.date.accessioned	2019-08-29T17:04:54Z	-
dc.date.available	2019-08-29T17:04:54Z	-
dc.date.issued	2018	en_US
dc.identifier.citation	Arora, S, Cohen, N, Hazan, E. (2018). On the optimization of deep networks: Implicit acceleration by overparameterization. 1 (372 - 389)	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr13b1z	-
dc.description.abstract	Conventional wisdom in deep learning states that increasing depth improves expressiveness but complicates optimization. This paper suggests that, sometimes, increasing depth can speed up optimization. The effect of depth on optimization is decoupled from expressiveness by focusing on settings where additional layers amount to overparameterization - linear neural networks, a well-studied model. Theoretical analysis, as well as experiments, show that here depth acts as a preconditioner which may accelerate convergence. Even on simple convex problems such as linear regression with ℓ_p loss, p > 2, gradient descent can benefit from transitioning to a non-convex overparameterized objective, more than it would from some common acceleration schemes. We also prove that it is mathematically impossible to obtain the acceleration effect of overparameterization via gradients of any regularizer.	en_US
dc.format.extent	372 - 389	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	35th International Conference on Machine Learning	en_US
dc.rights	Author's manuscript	en_US
dc.title	On the optimization of deep networks: Implicit acceleration by overparameterization	en_US
dc.type	Conference Article	en_US
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding	en_US

Files in This Item:

File	Description	Size	Format
On the optimization of deep networks Implicit acceleration by overparameterization.pdf		659.95 kB	Adobe PDF
