[R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 tokens/s on a 3090 after the latest optimization (16 GB VRAM is enough, and you can stream layers to save even more VRAM)
Submitted by bo_peng (t3_11teywc) to r/MachineLearning on March 17, 2023 at 2:49 AM · 32 comments · 101 points (a layer-streaming sketch follows this list)
[R] RWKV (100% RNN) can genuinely model ctx4k+ documents in the Pile, and RWKV model+inference+generation in 150 lines of Python
Submitted by bo_peng (t3_11iwt1b) to r/MachineLearning on March 5, 2023 at 1:11 PM · 26 comments · 63 points (the core recurrence is sketched after this list)
[P] ChatRWKV v2 (can run RWKV 14B with 3 GB VRAM), the RWKV pip package, and finetuning to ctx16K
Submitted by bo_peng (t3_11f9k5g) to r/MachineLearning on March 1, 2023 at 5:23 PM · 37 comments · 89 points (a pip-package usage sketch follows this list)
[R] RWKV-4 14B release (and ChatRWKV) - a surprisingly strong RNN Language Model
Submitted by bo_peng (t3_1135aew) to r/MachineLearning on February 15, 2023 at 6:44 PM · 37 comments · 268 points
[P] RWKV 14B Language Model & ChatRWKV: a pure RNN (attention-free), scalable and parallelizable like Transformers
Submitted by bo_peng (t3_10eh2f3) to r/MachineLearning on January 17, 2023 at 4:54 PM · 19 comments · 110 points
[R] RWKV-4 7B release: an attention-free RNN language model matching GPT-J performance (14B training in progress)
Submitted by bo_peng (t3_yxt8sa) to r/MachineLearning on November 17, 2022 at 3:32 PM · 22 comments · 172 points
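
The layer streaming mentioned in the March 17 post is controlled by a "strategy" string when loading the model. Below is a minimal sketch, assuming the rwkv pip package and the strategy-string conventions described in the ChatRWKV README; the model path and the layer split are illustrative, not prescriptive:

```python
from rwkv.model import RWKV

# Illustrative checkpoint path: point this at a real RWKV-4 14B .pth file.
MODEL_PATH = 'RWKV-4-Pile-14B-ctx8192'

# All layers on GPU in fp16 (fastest; needs the full ~16 GB of VRAM for 14B):
model = RWKV(model=MODEL_PATH, strategy='cuda fp16')

# First 10 layers on GPU quantized to int8, the rest in fp16 (less VRAM):
model = RWKV(model=MODEL_PATH, strategy='cuda fp16i8 *10 -> cuda fp16')

# Keep 10 layers resident on the GPU and stream the remaining layers to it
# on demand (the '+' suffix): slowest option, but minimizes VRAM.
model = RWKV(model=MODEL_PATH, strategy='cuda fp16i8 *10+')
```

The tradeoff is the one the post title describes: fewer resident layers means less VRAM but more host-to-device traffic per token, hence lower tokens/s.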
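The "150 lines of Python" post refers to a minimal NumPy implementation of RWKV-4. Its heart is the WKV recurrence, which replaces attention with an exponentially decayed weighted average over past tokens. Here is a sketch of one numerically stabilized step, following the published RWKV-4 formulation; the variable names mirror the reference code, but this is a simplified extract, not the full model:

```python
import numpy as np

def wkv_step(k, v, aa, bb, pp, time_first, time_decay):
    """One step of the RWKV-4 WKV recurrence, numerically stabilized.

    aa, bb     : decayed numerator / denominator accumulated over past tokens
    pp         : running max of the log-scale exponents (for stability)
    time_first : per-channel bonus (u) applied to the current token
    time_decay : per-channel decay (w); negative in the real model
    All arguments are float vectors over the channel dimension.
    """
    # Output: blend the accumulated past with the current token's k, v.
    ww = time_first + k
    qq = np.maximum(pp, ww)
    e1 = np.exp(pp - qq)
    e2 = np.exp(ww - qq)
    wkv = (e1 * aa + e2 * v) / (e1 * bb + e2)

    # State update: decay the past by time_decay, then fold in the current token.
    ww = pp + time_decay
    qq = np.maximum(ww, k)
    e1 = np.exp(ww - qq)
    e2 = np.exp(k - qq)
    return wkv, e1 * aa + e2 * v, e1 * bb + e2, qq

# Initial state before the first token (D = channel dimension):
D = 4
aa, bb, pp = np.zeros(D), np.zeros(D), np.full(D, -1e30)
```

Because the state (aa, bb, pp) is a fixed-size vector per channel, the cost per generated token is constant regardless of context length, which is what lets a pure RNN handle ctx4k+ documents and still train in parallel like a Transformer.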
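The RWKV pip package from the March 1 post wraps loading, tokenization, and sampling. A minimal generation sketch, assuming `pip install rwkv`, a downloaded checkpoint, and the 20B_tokenizer.json tokenizer file alongside it; paths and sampling values are illustrative:

```python
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Illustrative paths: point these at a real checkpoint and tokenizer file.
model = RWKV(model='RWKV-4-Pile-14B-ctx8192', strategy='cuda fp16')
pipeline = PIPELINE(model, '20B_tokenizer.json')

args = PIPELINE_ARGS(
    temperature=1.0,
    top_p=0.85,
    alpha_frequency=0.2,  # frequency penalty, discourages verbatim repetition
    alpha_presence=0.2,   # presence penalty, encourages new tokens
)

prompt = 'Here is a short story about an RNN that scaled:\n'
out = pipeline.generate(prompt, token_count=100, args=args)
print(out)
```

The same strategy strings shown in the earlier sketch apply here, so the 3 GB VRAM figure from the post title corresponds to an aggressive quantize-and-stream strategy rather than a different model.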