Calibration, Entropy Rates, and Memory in Language Models
Author(s): Braverman, Mark; Chen, X; Kakade, SM; Narasimhan, Karthik; Zhang, C; Zhang, Y
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr17z5t
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Braverman, Mark | - |
dc.contributor.author | Chen, X | - |
dc.contributor.author | Kakade, SM | - |
dc.contributor.author | Narasimhan, Karthik | - |
dc.contributor.author | Zhang, C | - |
dc.contributor.author | Zhang, Y | - |
dc.date.accessioned | 2021-10-08T19:47:10Z | - |
dc.date.available | 2021-10-08T19:47:10Z | - |
dc.date.issued | 2019-06-01 | en_US |
dc.identifier.citation | Braverman, M., Chen, X., Kakade, S.M., Narasimhan, K., Zhang, C., Zhang, Y. (2019). Calibration, Entropy Rates, and Memory in Language Models. eprint arXiv:1906.05664, arXiv - 1906.05664 | en_US |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr17z5t | - |
dc.description.abstract | Building accurate language models that capture meaningful long-term dependencies is a core challenge in natural language processing. Towards this end, we present a calibration-based approach to measure long-term discrepancies between a generative sequence model and the true distribution, and use these discrepancies to improve the model. Empirically, we show that state-of-the-art language models, including LSTMs and Transformers, are *miscalibrated*: the entropy rates of their generations drift dramatically upward over time. We then provide provable methods to mitigate this phenomenon. Furthermore, we show how this calibration-based approach can also be used to measure the amount of memory that language models use for prediction. | en_US |
dc.format.extent | arXiv - 1906.05664 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | eprint arXiv:1906.05664 | en_US |
dc.rights | Author's manuscript | en_US |
dc.title | Calibration, Entropy Rates, and Memory in Language Models | en_US |
dc.type | Journal Article | en_US |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article | en_US |
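The abstract above diagnoses miscalibration by tracking how the entropy rate of a model's generations drifts over time. A minimal toy sketch of that diagnostic (not the paper's actual method or models — the toy distribution, drift parameter, and function names here are illustrative assumptions) might look like:

```python
import math

# Toy autoregressive model over a tiny vocabulary. Its conditional
# distribution flattens with position t (via a growing "temperature"),
# mimicking the upward entropy-rate drift the abstract describes.
VOCAB = ["a", "b", "c", "d"]

def conditional_probs(t, temperature_drift=0.02):
    # Fixed logits softened by a position-dependent temperature, so the
    # distribution at step t has higher entropy than at step 0.
    logits = [2.0, 1.0, 0.5, 0.1]
    temp = 1.0 + temperature_drift * t
    exps = [math.exp(l / temp) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(probs):
    # Shannon entropy in nats.
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_rate_curve(horizon=100):
    # Per-position conditional entropy H(X_t | X_<t); in this toy model
    # it depends only on t, so no sampling is needed.
    return [entropy(conditional_probs(t)) for t in range(horizon)]

curve = entropy_rate_curve()
# Miscalibration diagnostic: does the entropy rate drift upward over time?
drift = curve[-1] - curve[0]
```

A positive `drift` flags the kind of miscalibration the paper studies; for a real language model one would instead average the model's per-token conditional entropies over many sampled generations at each position.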
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
CalibrationEntropyRatesMemoryLanguageModels.pdf | | 674.74 kB | Adobe PDF | View/Download |
Calibration, Entropy Rates, and Memory in Language Models.pdf | | 408.67 kB | Adobe PDF | View/Download |
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.