
On distributed cooperative decision-making in multiarmed bandits

Author(s): Landgren, P; Srivastava, V; Leonard, NE

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr12m3g
Full metadata record
DC Field | Value | Language
dc.contributor.author | Landgren, P | en_US
dc.contributor.author | Srivastava, V | en_US
dc.contributor.author | Leonard, NE | en_US
dc.date.accessioned | 2018-07-20T15:09:14Z | -
dc.date.available | 2018-07-20T15:09:14Z | -
dc.date.issued | 2017-01-06 | en_US
dc.identifier.citation | Landgren, P, Srivastava, V, Leonard, NE. (2017). On distributed cooperative decision-making in multiarmed bandits. 2016 European Control Conference, ECC 2016, 243 - 248. doi:10.1109/ECC.2016.7810293 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr12m3g | -
dc.description.abstract | © 2016 EUCA. We study the explore-exploit tradeoff in distributed cooperative decision-making using the context of the multiarmed bandit (MAB) problem. For the distributed cooperative MAB problem, we design the cooperative UCB algorithm that comprises two interleaved distributed processes: (i) running consensus algorithms for estimation of rewards, and (ii) upper-confidence-bound-based heuristics for selection of arms. We rigorously analyze the performance of the cooperative UCB algorithm and characterize the influence of communication graph structure on the decision-making performance of the group. | en_US
dc.format.extent | 243 - 248 | en_US
dc.relation.ispartof | 2016 European Control Conference, ECC 2016 | en_US
dc.title | On distributed cooperative decision-making in multiarmed bandits | en_US
dc.type | Conference Proceeding | -
dc.identifier.doi | doi:10.1109/ECC.2016.7810293 | en_US
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US
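
The abstract describes two interleaved processes: running consensus for reward estimation and an upper-confidence-bound rule for arm selection. The following is a minimal illustrative sketch of that general idea, not the paper's algorithm as published. The record does not give the update equations, so the Metropolis mixing weights, the UCB1-style exploration bonus, the unit-variance Gaussian rewards, and all function and parameter names (metropolis_weights, mix, cooperative_ucb_sketch, adjacency, arm_means, horizon, seed) are assumptions made here for illustration.

```python
import math
import random


def metropolis_weights(adjacency):
    """Build a row-stochastic mixing matrix from an undirected adjacency list.

    Assumption: Metropolis weights; the record does not name a weighting scheme.
    The adjacency list must not contain self-loops.
    """
    n = len(adjacency)
    degree = [len(neighbors) for neighbors in adjacency]
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in adjacency[i]:
            weights[i][j] = 1.0 / (1 + max(degree[i], degree[j]))
        # Remaining mass stays on the agent itself so each row sums to 1.
        weights[i][i] = 1.0 - sum(weights[i])
    return weights


def mix(weights, values):
    """One consensus step: each agent takes a weighted average over agents."""
    n_agents, n_arms = len(values), len(values[0])
    return [
        [sum(weights[k][j] * values[j][i] for j in range(n_agents))
         for i in range(n_arms)]
        for k in range(n_agents)
    ]


def cooperative_ucb_sketch(adjacency, arm_means, horizon, seed=0):
    """Return pulls[k][i], the number of times agent k chose arm i."""
    rng = random.Random(seed)
    n_agents, n_arms = len(adjacency), len(arm_means)
    weights = metropolis_weights(adjacency)
    counts = [[0.0] * n_arms for _ in range(n_agents)]  # consensus pull estimates
    totals = [[0.0] * n_arms for _ in range(n_agents)]  # consensus reward sums
    pulls = [[0] * n_arms for _ in range(n_agents)]     # actual local pulls

    for t in range(1, horizon + 1):
        for k in range(n_agents):
            if t <= n_arms:
                arm = t - 1  # initialization: every agent tries each arm once
            else:
                # Assumed UCB1-style index built from the agent's own estimates.
                arm = max(
                    range(n_arms),
                    key=lambda i: totals[k][i] / counts[k][i]
                    + math.sqrt(2.0 * math.log(t) / counts[k][i]),
                )
            reward = rng.gauss(arm_means[arm], 1.0)  # assumed Gaussian rewards
            counts[k][arm] += 1.0
            totals[k][arm] += reward
            pulls[k][arm] += 1
        # Running consensus: fold in the new observations, then mix over the graph.
        counts = mix(weights, counts)
        totals = mix(weights, totals)
    return pulls


if __name__ == "__main__":
    line_graph = [[1], [0, 2], [1]]  # three agents connected in a line
    print(cooperative_ucb_sketch(line_graph, [0.0, 2.0], horizon=400))
```

Changing the adjacency list (for example, from a line to a fully connected graph) is one way to experiment with how the communication structure affects each agent's choices, which is the question the abstract says the paper analyzes.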

Files in This Item:
File | Description | Size | Format
On distributed cooperative decision-making in multiarmed bandits.pdf | | 1.17 MB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.