
liyanjia92 OP t1_jdjwfnh wrote

It may be better to submit an issue on GitHub so that I can point you to some code with context. If you are talking about my code, you need to convert the weights and load them into the GPT class before running SFT training. Otherwise there might be a mismatch in the weights, and the model could just output random stuff.
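A minimal sketch of the kind of sanity check this implies, not taken from the repo: before SFT, compare the parameter names and shapes the model expects against what the converted checkpoint provides. The function name and the example parameter names and shapes are all hypothetical, made up for illustration.

```python
def find_weight_mismatches(expected, loaded):
    """Compare two {parameter_name: shape} mappings.

    Returns three sorted lists: names the model expects but the checkpoint
    lacks, names the checkpoint has but the model does not, and names
    present in both with different shapes.
    """
    missing = sorted(k for k in expected if k not in loaded)
    unexpected = sorted(k for k in loaded if k not in expected)
    wrong_shape = sorted(
        k for k in expected
        if k in loaded and tuple(expected[k]) != tuple(loaded[k])
    )
    return missing, unexpected, wrong_shape


# Hypothetical names and shapes, for illustration only.
model_shapes = {
    "wte.weight": (50257, 768),
    "lm_head.weight": (50257, 768),
}
ckpt_shapes = {
    "transformer.wte.weight": (50257, 768),  # prefix was not stripped during conversion
    "lm_head.weight": (768, 50257),          # stored transposed
}

missing, unexpected, wrong_shape = find_weight_mismatches(model_shapes, ckpt_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("wrong shape:", wrong_shape)
```

If the model is a PyTorch module, the two mappings could be built with something like `{k: tuple(v.shape) for k, v in model.state_dict().items()}`. Any non-empty list here means some weights would stay randomly initialized or fail to load, which would fit the random-output symptom.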
