Submitted by [deleted] t3_11v4h5z in MachineLearning
MysteryInc152 t1_jcrnqc8 wrote
You can try training ChatGLM: 6B parameters, initially trained on 1T English/Chinese tokens, and completely open source. However, it has already been fine-tuned and gone through RLHF, and that was optimized for Chinese Q&A, so it could use some English work.
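A minimal sketch of loading ChatGLM-6B for inference, assuming the public THUDM/chatglm-6b checkpoint on Hugging Face and a GPU with enough memory for fp16; not the commenter's exact setup:

```python
# Rough sketch: load ChatGLM-6B via transformers (custom modeling code, hence trust_remote_code).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# The checkpoint ships a chat() helper that takes a query and a running history list.
response, history = model.chat(tokenizer, "Hello, who are you?", history=[])
print(response)
```

From here, fine-tuning for English Q&A would mean continuing training on an English instruction dataset rather than using the chat helper as-is.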
Another option is RWKV. There are 7B and 14B models (I would go with the 14B; it's the better of the two) fine-tuned to a context length of 8192 tokens. He plans on increasing the context further too.
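A rough sketch of running the 14B model with the pip `rwkv` package from ChatRWKV, assuming a downloaded RWKV-4 checkpoint (the file path and the token ids here are placeholders):

```python
# Rough sketch: RWKV carries context in a recurrent state, so tokens can be fed in chunks.
from rwkv.model import RWKV

model = RWKV(model="RWKV-4-Pile-14B-ctx8192", strategy="cuda fp16")  # placeholder path

state = None  # recurrent state; passing it back in continues from the previous tokens
out, state = model.forward([187, 510, 1563], state)  # first chunk of token ids
out, state = model.forward([310, 247], state)        # next chunk, same conversation
print(out.shape)  # logits over the vocabulary for the most recent token
```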
Craiglbl t1_jcrxjy2 wrote
ChatGLM is really good. I sometimes have a hard time distinguishing its Chinese outputs from those of ChatGPT.
Sadly, its English could use some improvement, as it usually uses Chinese adjectives when English lacks a close equivalent.
cthorrez t1_jcvhg41 wrote
RWKV is recurrent, right? Why is it token-limited?