Variance-reduced and projection-free stochastic optimization
Author(s): Hazan, Elad; Luo, H
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1hd43
Abstract: | The Frank-Wolfe optimization algorithm has recently regained popularity for machine learning applications due to its projection-free property and its ability to handle structured constraints. However, in the stochastic learning setting, it is still relatively understudied compared to its gradient descent counterpart. In this work, leveraging a recent variance reduction technique, we propose two stochastic Frank-Wolfe variants which substantially improve previous results in terms of the number of stochastic gradient evaluations needed to achieve 1 − ϵ accuracy. For example, we improve from O(1/ϵ) to O(ln(1/ϵ)) if the objective function is smooth and strongly convex, and from O(1/ϵ²) to O(1/ϵ^1.5) if the objective function is smooth and Lipschitz. The theoretical improvement is also observed in experiments on real-world datasets for a multiclass classification application. |
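To illustrate the two ingredients the abstract combines — a projection-free Frank-Wolfe step and an SVRG-style variance-reduced gradient estimate — here is a minimal sketch on a hypothetical least-squares problem over the probability simplex. This is not the authors' algorithm as stated in the paper (step sizes, batch schedules, and guarantees differ); the objective, oracle, and parameter choices below are illustrative assumptions only.

```python
import numpy as np

# Hypothetical objective f(x) = (1/2n) * ||Ax - b||^2, minimized over the
# probability simplex -- a structured constraint set whose linear
# minimization oracle is trivial (pick one vertex), which is what makes
# Frank-Wolfe projection-free here.
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(x, i):
    """Stochastic gradient from the single sample i."""
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    """Exact gradient, computed only at snapshot points."""
    return A.T @ (A @ x - b) / n

def lmo_simplex(g):
    """Linear minimization oracle over the simplex: argmin_v <g, v>."""
    v = np.zeros_like(g)
    v[np.argmin(g)] = 1.0
    return v

def svrg_frank_wolfe(epochs=20, inner=50, batch=10):
    x = np.ones(d) / d                          # start at the simplex center
    for s in range(epochs):
        snapshot, mu = x.copy(), full_grad(x)   # full gradient at the snapshot
        for t in range(inner):
            idx = rng.integers(0, n, size=batch)
            # SVRG-style variance-reduced gradient estimate:
            # minibatch correction around the snapshot's full gradient.
            g = sum(grad_i(x, i) - grad_i(snapshot, i) for i in idx) / batch + mu
            v = lmo_simplex(g)                  # projection-free direction
            gamma = 2.0 / (s * inner + t + 2)   # standard decaying FW step size
            x = (1 - gamma) * x + gamma * v     # convex combination stays feasible
    return x

x_star = svrg_frank_wolfe()
```

Note that each inner step costs only a minibatch of stochastic gradients plus one cheap linear oracle call, with the full gradient evaluated once per epoch — this is the gradient-evaluation budget the abstract's complexity bounds refer to.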
Publication Date: | 2016 |
Electronic Publication Date: | 2016 |
Citation: | Hazan, E., Luo, H. (2016). Variance-reduced and projection-free stochastic optimization. 33rd International Conference on Machine Learning, ICML 2016, 3, 1926 - 1936. |
Pages: | 1926 - 1936 |
Type of Material: | Conference Article |
Journal/Proceeding Title: | 33rd International Conference on Machine Learning, ICML 2016 |
Version: | Author's manuscript |
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.