DaLameLama t1_iyc7nha wrote on November 30, 2022 at 8:48 AM

I don't think that's true. It would imply that Bi-LSTMs reach good performance faster than Transformers, and Transformers catch up later during training.

I've never seen proof of that, nor does my own experience confirm it.