The Rate of Convergence of AdaBoost

Author(s): Mukherjee, Indraneel; Rudin, Cynthia; Schapire, Robert E

Download

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1n24p

Abstract:	The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration $t$, the exponential loss of AdaBoost's computed parameter vector will be at most $\epsilon$ more than that of any parameter vector of $\ell_1$-norm bounded by $B$ in a number of rounds that is at most a polynomial in $B$ and $1/\epsilon$. We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within $C/\epsilon$ iterations, AdaBoost achieves a value of the exponential loss that is at most $\epsilon$ more than the best possible value, where $C$ depends on the dataset. We show that this dependence of the rate on $\epsilon$ is optimal up to constant factors, i.e., at least $\Omega(1/\epsilon)$ rounds are necessary to achieve within $\epsilon$ of the optimal exponential loss.
Citation:	Mukherjee, Indraneel, Rudin, Cynthia, Schapire, Robert E. (The Rate of Convergence of AdaBoost
Type of Material:	Journal Article
Journal/Proceeding Title:	Journal of Machine Learning Research
Version:	Final published version. This is an open access article.