Can Rationalization Improve Robustness?

Chen, Howard; He, Jacqueline; Narasimhan, Karthik; Chen, Danqi

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr18s4jp4r

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chen, Howard	-
dc.contributor.author	He, Jacqueline	-
dc.contributor.author	Narasimhan, Karthik	-
dc.contributor.author	Chen, Danqi	-
dc.date.accessioned	2023-12-14T14:31:28Z	-
dc.date.available	2023-12-14T14:31:28Z	-
dc.date.issued	2022-07	en_US
dc.identifier.citation	Chen, Howard, He, Jacqueline, Narasimhan, Karthik and Chen, Danqi. "Can Rationalization Improve Robustness?" Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2022): 3792-3805. doi:10.18653/v1/2022.naacl-main.278	en_US
dc.identifier.uri	http://arks.princeton.edu/ark:/88435/pr18s4jp4r	-
dc.description.abstract	A growing line of work has investigated the development of neural NLP models that can produce rationales, that is, subsets of the input that explain the model's predictions. In this paper, we ask whether such rationale models can provide robustness to adversarial attacks in addition to their interpretable nature. Since these models need to first generate rationales (“rationalizer”) before making predictions (“predictor”), they have the potential to ignore noise or adversarially added text by simply masking it out of the generated rationale. To this end, we systematically generate various types of ‘AddText’ attacks for both token and sentence-level rationalization tasks and perform an extensive empirical evaluation of state-of-the-art rationale models across five different tasks. Our experiments reveal that the rationale models promise to improve robustness against AddText attacks, but they struggle in certain scenarios, namely when the rationalizer is sensitive to position bias or to the lexical choices of the attack text. Further, leveraging human rationales as supervision does not always translate to better performance. Our study is a first step towards exploring the interplay between interpretability and robustness in the rationalize-then-predict framework.	en_US
dc.format.extent	3792 - 3805	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartof	Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies	en_US
dc.rights	Final published version. This is an open access article.	en_US
dc.title	Can Rationalization Improve Robustness?	en_US
dc.type	Conference Article	en_US
dc.identifier.doi	10.18653/v1/2022.naacl-main.278	-
pu.type.symplectic	http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding	en_US
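
The rationalize-then-predict idea summarized in the abstract can be illustrated with a small toy sketch. This is not the authors' code or models: the word weights, the salient-word set, and every function name below are hypothetical and hand-written for illustration, standing in for a learned rationalizer and predictor. The sketch shows an AddText-style insertion, a case where masking the inserted text keeps the prediction stable, and a case where the rationalizer itself is misled by the lexical choices of the attack text.

```python
from typing import Dict, List, Set

MASK = "[MASK]"

# Hypothetical predictor weights. "tv" and "monday" stand in for spurious
# cues that a full-context model might have picked up.
WEIGHTS: Dict[str, float] = {
    "great": 2.0, "tasty": 1.0, "awful": -2.0, "terrible": -2.0,
    "tv": -1.5, "monday": -1.5,
}
# Hypothetical set of words the toy rationalizer treats as evidence.
SALIENT: Set[str] = {"great", "tasty", "awful", "terrible"}


def add_text(tokens: List[str], attack: List[str], position: int) -> List[str]:
    """AddText-style attack: insert extra tokens at a given position."""
    return tokens[:position] + attack + tokens[position:]


def rationalize(tokens: List[str]) -> List[int]:
    """Toy rationalizer: binary mask, 1 = keep the token in the rationale."""
    return [1 if t in SALIENT else 0 for t in tokens]


def predict(tokens: List[str]) -> str:
    """Toy predictor: sign of the summed word weights."""
    score = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return "positive" if score > 0 else "negative"


def rationalize_then_predict(tokens: List[str]) -> str:
    """Mask everything outside the rationale, then predict on what is left."""
    mask = rationalize(tokens)
    kept = [t if m else MASK for t, m in zip(tokens, mask)]
    return predict(kept)


review = "the food was great and tasty".split()

# Attack 1: inserted noise that the rationalizer does not select.
noise = add_text(review, "tv monday tv".split(), len(review))

# Attack 2: inserted text whose wording the rationalizer does select.
lexical = add_text(review, "the tv was awful and terrible".split(), len(review))

print("clean, full context:   ", predict(review))
print("noise, full context:   ", predict(noise))
print("noise, rationalized:   ", rationalize_then_predict(noise))
print("lexical, rationalized: ", rationalize_then_predict(lexical))
```

In this toy setting the full-context predictor flips on the noise attack while the masked pipeline does not, and the second attack succeeds because the inserted words look like evidence to the rationalizer. Real rationale models learn the mask rather than reading it from a fixed word list.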

Files in This Item:

File	Description	Size	Format
RationalizationImproveRobustness.pdf		846.68 kB	Adobe PDF