Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
Author(s): Zhang, Yi; Plevrakis, Orestis; Du, Simon S.; Li, Xingguo; Song, Zhao; Arora, Sanjeev
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1nz7s
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhang, Yi | - |
dc.contributor.author | Plevrakis, Orestis | - |
dc.contributor.author | Du, Simon S | - |
dc.contributor.author | Li, Xingguo | - |
dc.contributor.author | Song, Zhao | - |
dc.contributor.author | Arora, Sanjeev | - |
dc.date.accessioned | 2021-10-08T19:50:47Z | - |
dc.date.available | 2021-10-08T19:50:47Z | - |
dc.date.issued | 2020 | en_US |
dc.identifier.citation | Zhang, Yi, Orestis Plevrakis, Simon S. Du, Xingguo Li, Zhao Song, and Sanjeev Arora. "Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality." Advances in Neural Information Processing Systems 33 (2020). | en_US |
dc.identifier.issn | 1049-5258 | - |
dc.identifier.uri | https://proceedings.neurips.cc/paper/2020/file/0740bb92e583cd2b88ec7c59f985cb41-Paper.pdf | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1nz7s | - |
dc.description.abstract | Adversarial training is a popular method to give neural nets robustness against adversarial perturbations. In practice, adversarial training leads to low robust training loss. However, a rigorous explanation for why this happens under natural conditions is still missing. Recently, a convergence theory of standard (non-adversarial) supervised training was developed by various groups for *very overparametrized* nets. It is unclear how to extend these results to adversarial training because of the min-max objective. Recently, a first step in this direction was made by Gao et al. using tools from online learning, but their result requires the width of the net to be *exponential* in the input dimension d and relies on an unnatural activation function. Our work proves convergence to low robust training loss for *polynomial* width instead of exponential, under natural assumptions and with ReLU activations. A key element of our proof is showing that ReLU networks near initialization can approximate the step function, which may be of independent interest. | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | Advances in Neural Information Processing Systems | en_US |
dc.rights | Final published version. Article is made available in OAR by the publisher's permission or policy. | en_US |
dc.title | Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality | en_US |
dc.type | Conference Article | en_US |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |
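The abstract above refers to the min-max objective of adversarial training: an inner maximization of the loss over bounded input perturbations and an outer minimization of the resulting robust loss over the network weights. Below is a minimal PyTorch sketch of that loop; the two-layer ReLU net, the synthetic data, and all hyperparameters are illustrative assumptions, not the specific construction analyzed in the paper.

```python
# Minimal sketch of adversarial training's min-max objective:
# inner loop: projected gradient ascent over an l_inf-bounded perturbation delta
# outer loop: gradient descent on the robust loss over the network weights
import torch
import torch.nn as nn
import torch.nn.functional as F

d, width, n = 10, 256, 128               # input dimension, hidden width, samples (assumed)
net = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

X = torch.randn(n, d)                    # synthetic inputs
y = torch.randint(0, 2, (n, 1)).float()  # synthetic binary labels

eps, alpha, pgd_steps = 0.1, 0.02, 5     # perturbation budget and inner-loop settings (assumed)

for epoch in range(20):
    # Inner maximization: find delta within the l_inf ball of radius eps that
    # approximately maximizes the training loss.
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(pgd_steps):
        loss = F.binary_cross_entropy_with_logits(net(X + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)

    # Outer minimization: standard gradient step on the robust loss.
    opt.zero_grad()
    robust_loss = F.binary_cross_entropy_with_logits(net(X + delta.detach()), y)
    robust_loss.backward()
    opt.step()
```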
Files in This Item:
File | Description | Size | Format
---|---|---|---
OverparametrizedAdversarial.pdf | | 384.87 kB | Adobe PDF
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.