Humans use directed and random exploration to solve the explore–exploit dilemma.

Author(s): Wilson, Robert C.; Geana, Andra; White, John M.; Ludvig, Elliot A.; Cohen, Jonathan D.

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr17x6v
Full metadata record
DC Field | Value | Language
dc.contributor.author | Wilson, Robert C. | -
dc.contributor.author | Geana, Andra | -
dc.contributor.author | White, John M. | -
dc.contributor.author | Ludvig, Elliot A. | -
dc.contributor.author | Cohen, Jonathan D. | -
dc.date.accessioned | 2019-10-28T15:54:25Z | -
dc.date.available | 2019-10-28T15:54:25Z | -
dc.date.issued | 2014 | en_US
dc.identifier.citation | Wilson, Robert C, Geana, Andra, White, John M, Ludvig, Elliot A, Cohen, Jonathan D. (2014). Humans use directed and random exploration to solve the explore–exploit dilemma. Journal of Experimental Psychology: General, 143 (6), 2074 - 2081. doi:10.1037/a0038199 | en_US
dc.identifier.issn | 0096-3445 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr17x6v | -
dc.description.abstract | All adaptive organisms face the fundamental tradeoff between pursuing a known reward (exploitation) and sampling lesser-known options in search of something better (exploration). Theory suggests at least two strategies for solving this dilemma: a directed strategy in which choices are explicitly biased toward information seeking, and a random strategy in which decision noise leads to exploration by chance. In this work we investigated the extent to which humans use these two strategies. In our “Horizon task,” participants made explore–exploit decisions in two contexts that differed in the number of choices that they would make in the future (the time horizon). Participants were allowed to make either a single choice in each game (horizon 1), or 6 sequential choices (horizon 6), giving them more opportunity to explore. By modeling the behavior in these two conditions, we were able to measure exploration-related changes in decision making and quantify the contributions of the two strategies to behavior. We found that participants were more information seeking and had higher decision noise with the longer horizon, suggesting that humans use both strategies to solve the exploration–exploitation dilemma. We thus conclude that both information seeking and choice variability can be controlled and put to use in the service of exploration. | en_US
dc.format.extent | 2074 - 2081 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | Journal of Experimental Psychology: General | en_US
dc.rights | Author's manuscript | en_US
dc.title | Humans use directed and random exploration to solve the explore–exploit dilemma. | en_US
dc.type | Journal Article | en_US
dc.identifier.doi | doi:10.1037/a0038199 | -
dc.date.eissued | 2014 | en_US
dc.identifier.eissn | 1939-2222 | -
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article | en_US

Files in This Item:
File | Description | Size | Format
Humans_Use_Directed_Random_Exploration_2014.pdf | - | 1.49 MB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.