luxsteele t1_jb1b68d wrote
Reply to comment by _Arsenie_Boca_ in [R] RWKV (100% RNN) can genuinely model ctx4k+ documents in Pile, and RWKV model+inference+generation in 150 lines of Python by bo_peng
Totally agree.
I have been following this for some time, but I can't fully understand it or explain it to my collaborators.
I work in ML and have quite a bit of experience with transformers, and I still can't fully get it, let alone convince some of my collaborators that it is worth pursuing.
It is paramount that we have a paper that explains this in more detail if we want the community to take it seriously.
Please do it!