Submitted by bo_peng t3_1135aew in MachineLearning
csreid t1_j8p5z30 wrote
Reply to comment by farmingvillein in [R] RWKV-4 14B release (and ChatRWKV) - a surprisingly strong RNN Language Model by bo_peng
But RNNs theoretically support infinite context length. Getting it in practice is a problem to be solved, not a fundamental incompatibility like it is with transformers.
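To illustrate the point, here's a minimal NumPy sketch of a generic RNN step (a toy cell, not RWKV's actual formulation): the hidden state has a fixed size, so per-token memory cost is constant no matter how long the stream gets.

```python
import numpy as np

def rnn_step(state, token_embedding, W_h, W_x):
    # One recurrent update: the new state depends only on the previous
    # state and the current token, never on the full history.
    return np.tanh(W_h @ state + W_x @ token_embedding)

d = 8                                   # toy hidden/embedding size
rng = np.random.default_rng(0)
W_h = rng.normal(size=(d, d)) * 0.1     # toy recurrent weights
W_x = rng.normal(size=(d, d)) * 0.1     # toy input weights

state = np.zeros(d)
for t in range(100_000):                # arbitrarily long stream
    token_embedding = rng.normal(size=d)
    state = rnn_step(state, token_embedding, W_h, W_x)

print(state.shape)                      # (8,) -- memory never grows with t
```

Whether the fixed-size state actually *retains* information from early tokens is exactly the "problem to be solved" part.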
farmingvillein t1_j8p7lci wrote
Neither really works for super long contexts, so it is kind of a moot point.
Empirically, both end up with bolt-on approaches to enhance memory over very long contexts, so it isn't really clear a priori that RNNs have a true advantage here.
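For a concrete sense of what "bolt-on" means on the transformer side, here's a minimal sketch of segment-level cached-memory attention in the spirit of Transformer-XL; all names and sizes are illustrative assumptions, not any specific model's API.

```python
import numpy as np

def attend(q, k, v):
    # Standard scaled dot-product attention over whatever keys/values exist.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

d, window = 8, 4
rng = np.random.default_rng(0)
memory = np.zeros((0, d))               # cached activations from past segments

for segment in range(3):                # stream arrives in fixed-size chunks
    x = rng.normal(size=(window, d))
    kv = np.concatenate([memory, x])    # "bolt-on": prepend cached context
    out = attend(x, kv, kv)
    memory = kv[-window:]               # keep only a bounded cache

print(out.shape)                        # (4, 8) per segment
```

The cache extends effective context, but it is still a finite add-on rather than true unbounded memory, which is why neither family clearly wins at long range.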