Submitted by hardmaru t3_ys36do in MachineLearning
vjb_reddit_scrap t1_ivymo0p wrote
IIRC Hinton et al. had a paper about initializing RNNs with the identity matrix, and it solved many of the problems that LSTMs solve.
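A minimal sketch of that initialization (the paper is likely Le, Jaitly & Hinton 2015, "A Simple Way to Initialize Recurrent Networks of Rectified Linear Units"; PyTorch and the specific sizes here are my own illustration, not from the thread):

    import torch
    import torch.nn as nn

    # IRNN-style setup: a plain RNN with ReLU units, recurrent weights
    # initialized to the identity, and biases set to zero.
    hidden_size = 128
    rnn = nn.RNN(input_size=64, hidden_size=hidden_size, nonlinearity='relu')

    with torch.no_grad():
        # With W_hh = I and zero bias, h_t ~ h_{t-1} for small inputs, so
        # gradients neither explode nor vanish at the start of training.
        rnn.weight_hh_l0.copy_(torch.eye(hidden_size))
        rnn.bias_hh_l0.zero_()
        rnn.bias_ih_l0.zero_()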
DrXaos t1_iw04agd wrote
That’s a different scenario, and one that is clearly justified on dynamical grounds.
Any recurrent neural network is a nonlinear dynamical system. Learning happens best on the boundary between dissipation and chaos, i.e., between vanishing and exploding gradients.
The additive incorporation of new information in LSTM/GRU cells greatly ameliorates the usual problem of RNNs with random transition matrices, where perturbations evolve multiplicatively. Initializing an RNN to a zero Lyapunov exponent via the identity matrix helps for the same reason.
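A toy way to see the boundary claim (my own sketch, not from the comment; the helper name is made up): for a linear recurrence delta_{t+1} = W delta_t, perturbations grow or shrink at a rate set by the largest Lyapunov exponent, which is exactly zero for W = I.

    import torch

    def lyapunov_estimate(W, steps=200):
        # Average log growth rate of a random perturbation under delta <- W @ delta.
        delta = torch.randn(W.shape[0])
        delta = delta / delta.norm()
        total = 0.0
        for _ in range(steps):
            delta = W @ delta
            total += torch.log(delta.norm()).item()
            delta = delta / delta.norm()  # renormalize to avoid overflow/underflow
        return total / steps

    n = 128
    print(lyapunov_estimate(1.5 * torch.randn(n, n) / n ** 0.5))  # > 0: perturbations explode (chaos)
    print(lyapunov_estimate(0.5 * torch.eye(n)))                  # < 0: perturbations vanish (dissipation)
    print(lyapunov_estimate(torch.eye(n)))                        # ~ 0: identity sits on the boundary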