MysteryInc152 t1_jcrnqc8 wrote

You can try training ChatGLM. 6B parameters and initially trained on 1T English/Chinese tokens. Also completely open source. However, it has already been fine-tuned and gone through RLHF, but that was optimized for Chinese Q&A, so it could use some English work.
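
If you want to poke at it before committing to a fine-tune, here's a minimal sketch of loading the open checkpoint through Hugging Face transformers (assuming the THUDM/chatglm-6b repo and its custom modeling code; the prompt is just an example):

```python
# Minimal sketch: load the open ChatGLM-6B checkpoint and run one chat turn.
# Assumes the THUDM/chatglm-6b repo on Hugging Face and a GPU with enough memory for fp16.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# The repo's custom modeling code exposes a chat() helper that tracks conversation history.
response, history = model.chat(tokenizer, "Explain RLHF in one sentence.", history=[])
print(response)
```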

Another option is RWKV. There are 7B and 14B models (I would go with the 14B, it's the better of the two) fine-tuned to a context length of 8192 tokens. He plans on increasing the context further too.
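
For anyone who wants to try it, a rough sketch with BlinkDL's rwkv pip package might look like this (the checkpoint and tokenizer paths are placeholders, swap in whichever 14B ctx8192 files you actually downloaded):

```python
# Rough sketch using the rwkv pip package (pip install rwkv).
# The checkpoint and tokenizer paths below are placeholders, not exact release names.
import os
os.environ["RWKV_JIT_ON"] = "1"  # enable the package's JIT path

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Path is given without the .pth extension; the loader appends it.
model = RWKV(model="path/to/RWKV-4-Pile-14B-ctx8192", strategy="cuda fp16")
pipeline = PIPELINE(model, "path/to/20B_tokenizer.json")

args = PIPELINE_ARGS(temperature=1.0, top_p=0.85)
print(pipeline.generate("The quick brown fox", token_count=64, args=args))
```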

17

Craiglbl t1_jcrxjy2 wrote

ChatGLM is really good. I sometimes have a hard time distinguishing its Chinese outputs from those of ChatGPT.

Sadly, its English could use some improvement, as it often uses Chinese adjectives when similar words are lacking in English.

8

cthorrez t1_jcvhg41 wrote

RWKV is recurrent, right? Why is it token-limited?

1