Grow and Prune Compact, Fast, and Accurate LSTMs
Author(s): Dai, Xiaoliang; Yin, Hongxu; Jha, Niraj K
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1zk55m4w
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Dai, Xiaoliang | - |
dc.contributor.author | Yin, Hongxu | - |
dc.contributor.author | Jha, Niraj K | - |
dc.date.accessioned | 2023-12-24T15:38:13Z | - |
dc.date.available | 2023-12-24T15:38:13Z | - |
dc.date.issued | 2019-11-20 | en_US |
dc.identifier.citation | Dai, Xiaoliang, Yin, Hongxu, Jha, Niraj K. (2020). Grow and Prune Compact, Fast, and Accurate LSTMs. IEEE Transactions on Computers, 69 (3), 441 - 452. doi:10.1109/tc.2019.2954495 | en_US |
dc.identifier.issn | 0018-9340 | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1zk55m4w | - |
dc.description.abstract | Long short-term memory (LSTM) has been widely used for sequential data modeling. Researchers have increased LSTM depth by stacking LSTM cells to improve performance. This incurs model redundancy, increases run-time delay, and makes the LSTMs more prone to overfitting. To address these problems, we propose a hidden-layer LSTM (H-LSTM) that adds hidden layers to LSTM's original one-level nonlinear control gates. H-LSTM increases accuracy while employing fewer external stacked layers, thus reducing the number of parameters and run-time latency significantly. We employ grow-and-prune (GP) training to iteratively adjust the hidden layers through gradient-based growth and magnitude-based pruning of connections. This learns both the weights and the compact architecture of H-LSTM control gates. We have GP-trained H-LSTMs for image captioning, speech recognition, and neural machine translation applications. For the NeuralTalk architecture on the MSCOCO dataset, our three models reduce the number of parameters by 38.7× [floating-point operations (FLOPs) by 45.5×], run-time latency by 4.5×, and improve the CIDEr-D score by 2.8 percent, respectively. For the DeepSpeech2 architecture on the AN4 dataset, the first model we generated reduces the number of parameters by 19.4× and run-time latency by 37.4 percent. The second model reduces the word error rate (WER) from 12.9 to 8.7 percent. For the encoder-decoder sequence-to-sequence network on the IWSLT 2014 German-English dataset, the first model we generated reduces the number of parameters by 10.8× and run-time latency by 14.2 percent. The second model increases the BLEU score from 30.02 to 30.98. Thus, GP-trained H-LSTMs can be seen to be compact, fast, and accurate. | en_US |
dc.format.extent | 441 - 452 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | IEEE Transactions on Computers | en_US |
dc.rights | Author's manuscript | en_US |
dc.title | Grow and Prune Compact, Fast, and Accurate LSTMs | en_US |
dc.type | Journal Article | en_US |
dc.identifier.doi | doi:10.1109/tc.2019.2954495 | - |
dc.identifier.eissn | 1557-9956 | - |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article | en_US |
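The abstract above describes two technical ideas: control gates realized as small multi-layer networks inside each LSTM cell (H-LSTM), and grow-and-prune (GP) training that alternates gradient-based growth with magnitude-based pruning of connections. The following is a minimal sketch of the gate structure and the pruning step, assuming PyTorch; `HLSTMCell`, `prune_by_magnitude`, and the layer sizes are illustrative choices, not the authors' released implementation, and the gradient-based growth step is omitted.

```python
# Minimal sketch (assumed PyTorch); names and sizes are illustrative only.
import torch
import torch.nn as nn


class HLSTMCell(nn.Module):
    """LSTM cell whose control gates are small MLPs (one hidden layer each)
    instead of single affine transforms, as the abstract describes."""

    def __init__(self, input_size, hidden_size, gate_hidden=64):
        super().__init__()

        def gate():
            # Hidden layer inside the gate: [x; h] -> gate_hidden -> hidden_size
            return nn.Sequential(
                nn.Linear(input_size + hidden_size, gate_hidden),
                nn.ReLU(),
                nn.Linear(gate_hidden, hidden_size),
            )

        self.f_gate, self.i_gate = gate(), gate()
        self.o_gate, self.g_gate = gate(), gate()

    def forward(self, x, state):
        h, c = state
        xh = torch.cat([x, h], dim=-1)
        f = torch.sigmoid(self.f_gate(xh))   # forget gate
        i = torch.sigmoid(self.i_gate(xh))   # input gate
        o = torch.sigmoid(self.o_gate(xh))   # output gate
        g = torch.tanh(self.g_gate(xh))      # candidate cell update
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c


def prune_by_magnitude(module, threshold=1e-2):
    """Magnitude-based pruning step of GP training (sketch): zero out
    weights whose absolute value falls below a threshold."""
    with torch.no_grad():
        for p in module.parameters():
            p.mul_((p.abs() >= threshold).float())


# Usage sketch: one forward pass, then one pruning pass.
cell = HLSTMCell(input_size=32, hidden_size=128)
h0, c0 = torch.zeros(1, 128), torch.zeros(1, 128)
x = torch.randn(1, 32)
h1, c1 = cell(x, (h0, c0))
prune_by_magnitude(cell)
```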
Files in This Item:
File | Description | Size | Format
---|---|---|---
1805.11797.pdf | | 470.07 kB | Adobe PDF
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.