Submitted by fromnighttilldawn t3_y11a7r in MachineLearning
visarga t1_irzdrho wrote
Reply to comment by _Arsenie_Boca_ in [D] Looking for some critiques on recent development of machine learning by fromnighttilldawn
> if LSTMs would have received the amount of engineering attention that went into making transformers better and faster
There was a short period when people tried to improve LSTMs using genetic algorithms or reinforcement learning (RL):
- An Empirical Exploration of Recurrent Network Architectures (2015, Sutskever)
- LSTM: A Search Space Odyssey (2015, Schmidhuber)
- Neural Architecture Search with Reinforcement Learning (2016, Quoc Le)
The conclusion was that the LSTM cell is somewhat arbitrary and many other architectures work just as well, but none much better. So people stuck with classic LSTMs.