Distributed cooperative decision-making in multiarmed bandits: Frequentist and Bayesian algorithms

Landgren, P; Srivastava, V; Leonard, NE

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr15t0j

Full metadata record

DC Field	Value	Language
dc.contributor.author	Landgren, P	en_US
dc.contributor.author	Srivastava, V	en_US
dc.contributor.author	Leonard, NE	en_US
dc.date.accessioned	2018-07-20T15:08:33Z	-
dc.date.available	2018-07-20T15:08:33Z	-
dc.date.issued	2016-12-27	en_US
dc.identifier.citation	Landgren, P, Srivastava, V, Leonard, NE. (2016). Distributed cooperative decision-making in multiarmed bandits: Frequentist and Bayesian algorithms. 2016 IEEE 55th Conference on Decision and Control, CDC 2016, 167 - 172. doi:10.1109/CDC.2016.7798264	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr15t0j	-
dc.description.abstract	© 2016 IEEE. We study distributed cooperative decision-making under the explore-exploit tradeoff in the multiarmed bandit (MAB) problem. We extend state-of-the-art frequentist and Bayesian algorithms for single-agent MAB problems to cooperative distributed algorithms for multi-agent MAB problems in which agents communicate according to a fixed network graph. We rely on a running consensus algorithm for each agent's estimation of mean rewards from its own rewards and the estimated rewards of its neighbors. We prove the performance of these algorithms and show that they asymptotically recover the performance of a centralized agent. Further, we rigorously characterize the influence of the communication graph structure on the decision-making performance of the group.	en_US
dc.format.extent	167 - 172	en_US
dc.relation.ispartof	2016 IEEE 55th Conference on Decision and Control, CDC 2016	en_US
dc.title	Distributed cooperative decision-making in multiarmed bandits: Frequentist and Bayesian algorithms	en_US
dc.type	Conference Proceeding	-
dc.identifier.doi	doi:10.1109/CDC.2016.7798264	en_US
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding	en_US
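
The abstract above mentions a running consensus algorithm through which each agent estimates mean rewards from its own rewards and its neighbors' estimates. The record does not give the update rule, so the following is only a minimal sketch of one common form of running consensus, under stated assumptions: `P` is a hypothetical row-stochastic weight matrix matching the fixed communication graph, each agent keeps a running reward sum `s` and a running pull count `n` for a single arm, and the names `running_consensus_step` and `estimated_means` are illustrative rather than taken from the paper.

```python
def running_consensus_step(P, s, n, rewards, pulls):
    """One running-consensus update for a single arm (illustrative sketch).

    P       -- assumed row-stochastic weight matrix (list of lists), one row per agent
    s, n    -- per-agent running estimates of total reward and number of pulls
    rewards -- reward each agent received from this arm this round (0.0 if not pulled)
    pulls   -- 1 if the agent pulled this arm this round, else 0
    """
    m = len(P)
    # Each agent first adds its own new observation to its running estimate.
    s_in = [s[k] + rewards[k] for k in range(m)]
    n_in = [n[k] + pulls[k] for k in range(m)]
    # Then it averages with its neighbors using the weights in its row of P.
    s_new = [sum(P[k][j] * s_in[j] for j in range(m)) for k in range(m)]
    n_new = [sum(P[k][j] * n_in[j] for j in range(m)) for k in range(m)]
    return s_new, n_new


def estimated_means(s, n):
    """Per-agent estimate of the arm's mean reward; 0.0 where no information has arrived."""
    return [sk / nk if nk > 0 else 0.0 for sk, nk in zip(s, n)]
```

In this sketch an agent that never pulls the arm can still form an estimate once information from a neighbor reaches it through `P`. A full bandit algorithm would add an exploration term on top of these estimates, which is not shown here.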

Files in This Item:

File	Description	Size	Format
Distributed cooperative decision-making in multiarmed bandits Frequentist and Bayesian algorithms.pdf		1.32 MB	Adobe PDF
