Aran_Komatsuzaki t1_jbkyegs wrote
Reply to comment by LetterRip in [D] Why isn't everyone using RWKV if it's so much better than transformers? by ThePerson654321
> Thanks for sharing your results. It is being tuned to longer context lengths, current is
I tried the one w/ context length = 4096 for RWKV :)
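For context, "the one w/ context length = 4096" refers to an RWKV-4 checkpoint trained with ctx_len 4096. A minimal sketch of running such a checkpoint with the `rwkv` pip package follows; the checkpoint filename and tokenizer path are illustrative, not a pointer to the exact model used here.

```python
# Minimal sketch: generating text with an RWKV-4 ctx4096 checkpoint
# via the `rwkv` pip package (pip install rwkv). Model/tokenizer paths
# are placeholders, assumed to be downloaded locally.
from rwkv.model import RWKV
from rwkv.utils import PIPELINE

# Strategy string controls device/precision, e.g. 'cuda fp16' or 'cpu fp32'.
model = RWKV(model='RWKV-4-Pile-14B-ctx4096', strategy='cuda fp16')
pipeline = PIPELINE(model, '20B_tokenizer.json')

prompt = 'RWKV combines RNN-style inference with transformer-style training because'
print(pipeline.generate(prompt, token_count=64))
```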
> Could you clarify - was one of those meant to be former and the other latter?
Sorry for the typo. The second 'former' was meant to be 'latter'.