Submitted by [deleted] t3_11v4h5z in MachineLearning
MysteryInc152 t1_jcrnqc8 wrote
You can try training ChatGLM: 6B parameters, initially trained on 1T English/Chinese tokens, and completely open source. However, it has already been fine-tuned and gone through RLHF, and that was optimized for Chinese Q&A, so it could use some English work.
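A minimal sketch of loading ChatGLM-6B for inference, assuming the public THUDM/chatglm-6b checkpoint on Hugging Face and a GPU with enough memory for fp16; not the commenter's exact setup:

```python
# Rough sketch: load ChatGLM-6B via transformers (custom modeling code, hence trust_remote_code).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# The checkpoint ships a chat() helper that takes a query and a running history list.
response, history = model.chat(tokenizer, "Hello, who are you?", history=[])
print(response)
```

From here, fine-tuning for English Q&A would mean continuing training on an English instruction dataset rather than using the chat helper as-is.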
Another option is RWKV. There are 7B and 14B models (I would go with the 14B; it's the better of the two) fine-tuned to a context length of 8192 tokens. He plans on increasing the context further too.
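A rough sketch of running the 14B model with the pip `rwkv` package from ChatRWKV, assuming a downloaded RWKV-4 checkpoint (the file path and the token ids here are placeholders):

```python
# Rough sketch: RWKV carries context in a recurrent state, so tokens can be fed in chunks.
from rwkv.model import RWKV

model = RWKV(model="RWKV-4-Pile-14B-ctx8192", strategy="cuda fp16")  # placeholder path

state = None  # recurrent state; passing it back in continues from the previous tokens
out, state = model.forward([187, 510, 1563], state)  # first chunk of token ids
out, state = model.forward([310, 247], state)        # next chunk, same conversation
print(out.shape)  # logits over the vocabulary for the most recent token
```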
Craiglbl t1_jcrxjy2 wrote
ChatGLM is really good. I sometimes have a hard time distinguishing its Chinese outputs from those of ChatGPT.
Sadly, its English could use some improvement, as it usually uses Chinese adjectives when English lacks a close equivalent.
cthorrez t1_jcvhg41 wrote
RWKV is recurrent, right? Why is it token-limited?