
The Rate of Convergence of AdaBoost

Author(s): Mukherjee, Indraneel; Rudin, Cynthia; Schapire, Robert E.

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1n24p
Abstract: The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration $t$, the exponential loss of AdaBoost's computed parameter vector will be at most $\epsilon$ more than that of any parameter vector of $\ell_1$-norm bounded by $B$ in a number of rounds that is at most a polynomial in $B$ and $1/\epsilon$. We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within $C/\epsilon$ iterations, AdaBoost achieves a value of the exponential loss that is at most $\epsilon$ more than the best possible value, where $C$ depends on the dataset. We show that this dependence of the rate on $\epsilon$ is optimal up to constant factors, i.e., at least $\Omega(1/\epsilon)$ rounds are necessary to achieve within $\epsilon$ of the optimal exponential loss.
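The results above concern AdaBoost viewed as greedy coordinate descent on the exponential loss $\frac{1}{m}\sum_{i=1}^m \exp(-y_i F(x_i))$ of the combined hypothesis $F$. The following is a minimal illustrative sketch (not code from the article) of that view, assuming a finite pool of weak hypotheses supplied as a $\pm 1$ prediction matrix; it tracks the exponential loss after each round, the quantity whose convergence rate the paper analyzes.

```python
# Minimal sketch: AdaBoost as coordinate descent on the exponential loss.
# The hypothesis pool H (shape: examples x hypotheses) and labels y are
# assumed inputs for illustration; they are not part of the article.
import numpy as np

def adaboost_exp_loss(H, y, rounds=100):
    """Run AdaBoost for `rounds` iterations; return the exponential loss
    (1/m) * sum_i exp(-y_i F(x_i)) recorded after each round."""
    m, n = H.shape
    F = np.zeros(m)                      # margins F(x_i) of the combined hypothesis
    losses = []
    for _ in range(rounds):
        w = np.exp(-y * F)               # unnormalized example weights
        D = w / w.sum()                  # AdaBoost's distribution over examples
        edges = D @ (y[:, None] * H)     # edge of each weak hypothesis under D
        j = int(np.argmax(np.abs(edges)))          # pick the best weak hypothesis
        gamma = np.clip(edges[j], -1 + 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 + gamma) / (1 - gamma))
        F += alpha * H[:, j]             # greedy coordinate step
        losses.append(np.mean(np.exp(-y * F)))
    return losses

# Toy usage with random +/-1 weak hypotheses and labels.
rng = np.random.default_rng(0)
H = rng.choice([-1.0, 1.0], size=(50, 20))
y = rng.choice([-1.0, 1.0], size=50)
print(adaboost_exp_loss(H, y, rounds=10))
```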
Citation: Mukherjee, Indraneel, Rudin, Cynthia, and Schapire, Robert E. "The Rate of Convergence of AdaBoost." Journal of Machine Learning Research.
Type of Material: Journal Article
Journal/Proceeding Title: Journal of Machine Learning Research
Version: Final published version. This is an open access article.



Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.