A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

Yang, R; Sun, X; Narasimhan, Karthik

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1cr9k

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yang, R	-
dc.contributor.author	Sun, X	-
dc.contributor.author	Narasimhan, Karthik	-
dc.date.accessioned	2021-10-08T19:47:09Z	-
dc.date.available	2021-10-08T19:47:09Z	-
dc.date.issued	2019-08-01	en_US
dc.identifier.citation	Yang, R, Sun, X, Narasimhan, K. (2019). A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation. eprint arXiv:1908.08342, arXiv - 1908.08342	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr1cr9k	-
dc.description.abstract	We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear preferences, with the goal of enabling few-shot adaptation to new tasks. In MORL, the aim is to learn policies over multiple competing objectives whose relative importance (preferences) is unknown to the agent. While this alleviates dependence on scalar reward design, the expected return of a policy can change significantly with varying preferences, making it challenging to learn a single model to produce optimal policies under different preference conditions. We propose a generalized version of the Bellman equation to learn a single parametric representation for optimal policies over the space of all possible preferences. After this initial learning phase, our agent can quickly adapt to any given preference, or automatically infer an underlying preference with very few samples. Experiments across four different domains demonstrate the effectiveness of our approach.	en_US
dc.format.extent	arXiv - 1908.08342	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	eprint arXiv:1908.08342	en_US
dc.rights	Author's manuscript	en_US
dc.title	A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation	en_US
dc.type	Journal Article	en_US
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article	en_US

Files in This Item:

File	Description	Size	Format
MultiObjectiveReinforcementLearningPolicyAdaptation.pdf		1.89 MB	Adobe PDF
A generalized algorithm for multi-objective reinforcement learning and policy adaptation.pdf		5.18 MB	Adobe PDF
