Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

Agarwal, Alekh; Hsu, Daniel; Kale, Satyen; Langford, John; Li, Lihong; Schapire, Robert E

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1v255

Full metadata record

DC Field	Value	Language
dc.contributor.author	Agarwal, Alekh	-
dc.contributor.author	Hsu, Daniel	-
dc.contributor.author	Kale, Satyen	-
dc.contributor.author	Langford, John	-
dc.contributor.author	Li, Lihong	-
dc.contributor.author	Schapire, Robert E	-
dc.date.accessioned	2021-10-08T19:48:29Z	-
dc.date.available	2021-10-08T19:48:29Z	-
dc.date.issued	2014	en_US
dc.identifier.citation	Agarwal, Alekh, Hsu, Daniel, Kale, Satyen, Langford, John, Li, Lihong, Schapire, Robert E. (2014). Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits. 31st International Conference on Machine Learning.	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr1v255	-
dc.description.abstract	We present a new algorithm for the contextual bandit learning problem, where the learner repeatedly takes one of $K$ actions in response to the observed context, and observes the reward only for that chosen action. Our method assumes access to an oracle for solving fully supervised cost-sensitive classification problems and achieves the statistically optimal regret guarantee with only $\tilde{O}(\sqrt{KT/\log N})$ oracle calls across all $T$ rounds, where $N$ is the number of policies in the policy class we compete against. By doing so, we obtain the most practical contextual bandit learning algorithm amongst approaches that work for general policy classes. We further conduct a proof-of-concept experiment which demonstrates the excellent computational and prediction performance of (an online variant of) our algorithm relative to several baselines.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	31st International Conference on Machine Learning	en_US
dc.rights	Author's manuscript	en_US
dc.title	Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits	en_US
dc.type	Conference Article	en_US
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding	en_US
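
The interaction protocol described in the abstract (observe a context, take one of $K$ actions, see the reward for that action only) can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it swaps in plain epsilon-greedy exploration over a tiny hand-written policy class, and the "oracle" is a brute-force argmax over inverse-propensity-scored (IPS) reward estimates. The function name `run_epsilon_greedy`, the toy policies, and the reward model are all assumptions made for illustration.

```python
import random


def run_epsilon_greedy(policies, draw_context, reward_fn, K, T, epsilon=0.1, seed=0):
    """Toy contextual bandit loop over a finite policy class.

    Illustrative only (epsilon-greedy, NOT the paper's method): each round
    reveals the reward of the chosen action and nothing else.
    """
    rng = random.Random(seed)
    # Running sum of IPS reward estimates, one entry per policy.
    ips_sum = [0.0] * len(policies)
    total_reward = 0.0
    for _ in range(T):
        x = draw_context(rng)
        # Stand-in for the oracle call: brute-force search for the policy
        # with the highest estimated reward so far (ties go to the lowest index).
        best = max(range(len(policies)), key=lambda i: ips_sum[i])
        greedy_action = policies[best](x)
        # Explore uniformly with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            a = rng.randrange(K)
        else:
            a = greedy_action
        # Probability with which action a was selected in this round.
        p = epsilon / K + (1.0 - epsilon) * (1.0 if a == greedy_action else 0.0)
        r = reward_fn(x, a, rng)  # bandit feedback: chosen action only
        total_reward += r
        # Credit r / p to every policy that would have chosen the same action.
        for i, pi in enumerate(policies):
            if pi(x) == a:
                ips_sum[i] += r / p
    best_index = max(range(len(policies)), key=lambda i: ips_sum[i])
    return best_index, total_reward


# Hypothetical toy problem: binary context, K = 2 actions,
# reward 1.0 when the action equals the context and 0.0 otherwise.
policies = [
    lambda x: 0,      # always action 0
    lambda x: 1,      # always action 1
    lambda x: x,      # copy the context (optimal for this reward model)
    lambda x: 1 - x,  # flip the context
]
T = 2000
best_index, total_reward = run_epsilon_greedy(
    policies,
    draw_context=lambda rng: rng.randrange(2),
    reward_fn=lambda x, a, rng: 1.0 if a == x else 0.0,
    K=2,
    T=T,
)
print(best_index, total_reward)
```

Dividing each observed reward by the probability of the chosen action keeps the per-policy estimates unbiased even though only one action's reward is seen per round. Enumerating every policy, as done here, costs time linear in $N$ per round, which is what access to a cost-sensitive classification oracle is meant to avoid.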

Files in This Item:

File	Description	Size	Format
TamingMonsterFastSimpleAlgorithmContextualBandits.pdf		295.75 kB	Adobe PDF
1402.0555v2.pdf		404.53 kB	Adobe PDF