
Bounded regret in stochastic multi-armed bandits

Author(s): Bubeck, Sébastien; Perchet, Vianney; Rigollet, Philippe

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr16j5x
Full metadata record
DC Field | Value | Language
dc.contributor.author | Bubeck, Sébastien | -
dc.contributor.author | Perchet, Vianney | -
dc.contributor.author | Rigollet, Philippe | -
dc.date.accessioned | 2020-03-02T22:14:56Z | -
dc.date.available | 2020-03-02T22:14:56Z | -
dc.date.issued | 2013-02 | en_US
dc.identifier.citation | Bubeck, Sébastien, Perchet, Vianney, Rigollet, Philippe. (2013). Bounded regret in stochastic multi-armed bandits. https://arxiv.org/abs/1302.1611v2 | en_US
dc.identifier.uri | https://arxiv.org/abs/1302.1611v2 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr16j5x | -
dc.description.abstract | We study the stochastic multi-armed bandit problem when one knows the value $\mu^{(\star)}$ of an optimal arm, as well as a positive lower bound on the smallest positive gap $\Delta$. We propose a new randomized policy that attains a regret {\em uniformly bounded over time} in this setting. We also prove several lower bounds, which show in particular that bounded regret is not possible if one only knows $\Delta$, and bounded regret of order $1/\Delta$ is not possible if one only knows $\mu^{(\star)}$. | en_US
dc.format.extent | 1 - 14 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | Journal of Machine Learning Research | en_US
dc.rights | Author's manuscript | en_US
dc.title | Bounded regret in stochastic multi-armed bandits | en_US
dc.type | Journal Article | en_US
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article | en_US
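The abstract describes a policy that exploits knowledge of the optimal mean $\mu^{(\star)}$ and a lower bound on the gap $\Delta$. The paper's actual randomized policy is not reproduced in this record; the sketch below is a hypothetical elimination-style illustration, assuming Bernoulli rewards, of how knowing $\mu^{(\star)}$ lets a learner permanently discard confidently suboptimal arms. The function name, confidence radius, and all parameters are illustrative assumptions, not the authors' algorithm.

```python
import math
import random

def mu_star_elimination(means, mu_star, delta_lb, horizon, seed=0):
    """Toy elimination policy using the known optimal mean mu_star.

    Pulls every surviving arm in round-robin order and permanently
    discards an arm once its empirical mean plus a confidence radius
    falls below mu_star - delta_lb / 2.  Illustrative sketch only; NOT
    the randomized policy of Bubeck, Perchet and Rigollet (2013).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k       # number of pulls per arm
    sums = [0.0] * k       # cumulative reward per arm
    active = set(range(k))
    t = 0
    while t < horizon and len(active) > 1:
        for arm in sorted(active):
            if t >= horizon:
                break
            # Bernoulli reward with the arm's true mean (simulation only).
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
            t += 1
        # Discard arms whose optimistic estimate is still below the
        # threshold implied by the known mu_star and gap lower bound.
        for arm in list(active):
            n = counts[arm]
            radius = math.sqrt(2.0 * math.log(max(t, 2)) / n)
            if sums[arm] / n + radius < mu_star - delta_lb / 2.0:
                active.discard(arm)
    return counts, active

# Example run: three Bernoulli arms, optimal mean 0.9, gap bound 0.4.
counts, active = mu_star_elimination([0.9, 0.5, 0.4], 0.9, 0.4, 2000)
```

Because elimination is permanent, the number of pulls wasted on a suboptimal arm stops growing once that arm is discarded, which is the intuition behind regret that is bounded uniformly over time rather than growing logarithmically.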

Files in This Item:
File | Description | Size | Format
BoundedRegretMultiArmStochastic.pdf | | 188.74 kB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.