Calibration, Entropy Rates, and Memory in Language Models
Author(s): Braverman, Mark; Chen, Xinyi; Kakade, Sham; Narasimhan, Karthik; Zhang, Cyril; Zhang, Yi
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1f859
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Braverman, Mark | - |
dc.contributor.author | Chen, Xinyi | - |
dc.contributor.author | Kakade, Sham | - |
dc.contributor.author | Narasimhan, Karthik | - |
dc.contributor.author | Zhang, Cyril | - |
dc.contributor.author | Zhang, Yi | - |
dc.date.accessioned | 2021-10-08T19:51:00Z | - |
dc.date.available | 2021-10-08T19:51:00Z | - |
dc.date.issued | 2020 | en_US |
dc.identifier.citation | Braverman, Mark, Xinyi Chen, Sham Kakade, Karthik Narasimhan, Cyril Zhang, and Yi Zhang. "Calibration, Entropy Rates, and Memory in Language Models." In Proceedings of the 37th International Conference on Machine Learning (2020): pp. 1089-1099. | en_US |
dc.identifier.uri | http://proceedings.mlr.press/v119/braverman20a/braverman20a.pdf | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1f859 | - |
dc.description.abstract | Building accurate language models that capture meaningful long-term dependencies is a core challenge in natural language processing. Towards this end, we present a calibration-based approach to measure long-term discrepancies between a generative sequence model and the true distribution, and use these discrepancies to improve the model. Empirically, we show that state-of-the-art language models, including LSTMs and Transformers, are miscalibrated: the entropy rates of their generations drift dramatically upward over time. We then provide provable methods to mitigate this phenomenon. Furthermore, we show how this calibration-based approach can also be used to measure the amount of memory that language models use for prediction. | en_US |
dc.format.extent | 1089 - 1099 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | Proceedings of the 37th International Conference on Machine Learning | en_US |
dc.rights | Final published version. Article is made available in OAR by the publisher's permission or policy. | en_US |
dc.title | Calibration, Entropy Rates, and Memory in Language Models | en_US |
dc.type | Conference Article | en_US |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |
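The abstract above reports that the entropy rates of model generations drift dramatically upward over time. As a rough illustration of how one might probe this phenomenon, here is a minimal sketch (not the authors' code) that samples a continuation from an off-the-shelf model and tracks the entropy of its next-token distribution across positions. It assumes the Hugging Face `transformers` library and GPT-2 weights, neither of which is named in this record.

```python
# Minimal sketch (assumed setup, not the paper's implementation):
# sample text from a language model and check whether the per-step
# entropy of its predictive distribution drifts upward over time.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = tokenizer("The meaning of life is", return_tensors="pt").input_ids
with torch.no_grad():
    # Pure sampling (top_k=0 disables top-k filtering), so the text is
    # drawn from the same distribution we measure below.
    generated = model.generate(
        prompt, do_sample=True, max_length=200, top_k=0,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Re-score the generation to get the model's next-token
    # distribution at every position: shape (1, T, vocab_size).
    logits = model(generated).logits

log_probs = torch.log_softmax(logits, dim=-1)
probs = log_probs.exp()
# Shannon entropy (in nats) of the predictive distribution per position.
entropy = -(probs * log_probs).sum(dim=-1).squeeze(0)

# Compare early vs. late entropy; the paper reports an upward drift.
T = entropy.shape[0]
print(f"mean entropy, first quarter: {entropy[: T // 4].mean().item():.3f} nats")
print(f"mean entropy, last quarter:  {entropy[-T // 4:].mean().item():.3f} nats")
```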
Files in This Item:
File | Description | Size | Format
---|---|---|---
CalibrationEntropyRates.pdf | | 674.74 kB | Adobe PDF
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.