[R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM)
Submitted by bo_peng t3_11teywc on March 17, 2023 at 2:49 AM in MachineLearning
32 comments, 101
acertainmoment t1_jcm22en wrote on March 17, 2023 at 8:29 PM
tried a small modification to one of the examples on huggingface :) https://ibb.co/zNS3H1J