bo_peng OP t1_j61fdtp wrote
Reply to comment by Gody_Godee in [P] RWKV 14B Language Model & ChatRWKV : pure RNN (attention-free), scalable and parallelizable like Transformers by bo_peng
No. It's highly competitive.
Gody_Godee t1_j6ayw0r wrote
Your idea looks like this one from 3 years ago: https://arxiv.org/abs/2006.16236
bo_peng OP t1_j6gnqrp wrote
2006.16236 is bad at any nontrivial task such as language modeling.