Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods

Singh, S; Lacotte, J; Majumdar, Anirudha; Pavone, M

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1vk31

Full metadata record

DC Field	Value	Language
dc.contributor.author	Singh, S	-
dc.contributor.author	Lacotte, J	-
dc.contributor.author	Majumdar, Anirudha	-
dc.contributor.author	Pavone, M	-
dc.date.accessioned	2021-10-08T20:20:08Z	-
dc.date.available	2021-10-08T20:20:08Z	-
dc.date.issued	2018	en_US
dc.identifier.citation	Singh, S, Lacotte, J, Majumdar, A, Pavone, M. (2018). Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods. International Journal of Robotics Research, 37, 1713 - 1740. doi:10.1177/0278364918772017	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr1vk31	-
dc.description.abstract	The literature on inverse reinforcement learning (IRL) typically assumes that humans take actions to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive (RS) IRL to explicitly account for a human’s risk sensitivity. To this end, we propose a flexible class of models based on coherent risk measures, which allow us to capture an entire spectrum of risk preferences from risk neutral to worst case. We propose efficient non-parametric algorithms based on linear programming and semi-parametric algorithms based on maximum likelihood for inferring a human’s underlying risk measure and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with 10 human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk averse to risk neutral in a data-efficient manner. Moreover, comparisons of the RS-IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively, especially in scenarios where catastrophic outcomes such as collisions can occur.	en_US
dc.format.extent	1713 - 1740	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	International Journal of Robotics Research	en_US
dc.rights	Author's manuscript	en_US
dc.title	Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods	en_US
dc.type	Journal Article	en_US
dc.identifier.doi	doi:10.1177/0278364918772017	-
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article	en_US

Files in This Item:

File	Description	Size	Format
Risk-sensitive Inverse Reinforcement Learning via Semi- and Non-Parametric Methods.pdf		6.28 MB	Adobe PDF
