Civil_Collection7267 t1_jd0pcqf wrote

Untuned 30B LLaMA, you're saying? It's excellent at storywriting, chatting, and so on, and at 4-bit precision it can output faster than ChatGPT. While I'm not into this myself, I understand there's a very large RP community at subs like CharacterAI and Pygmalion, and the 30B model is genuinely great at feeling like talking to a real person. I'm running it with text-generation-webui and custom parameters rather than the llama.cpp implementation.

For assistant tasks, I've been using either the ChatLLaMA 13B LoRA or the Alpaca 7B LoRA, both of which are very good as well. ChatLLaMA, for instance, correctly answered a reasoning question that GPT-3.5 got wrong, but it has drawbacks in other areas.
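
For anyone curious how these LoRAs get applied in practice, a minimal sketch with transformers + peft looks roughly like this (the base checkpoint and adapter paths below are placeholders, not necessarily what I ran):

```python
# Rough sketch: load base LLaMA weights, then apply a LoRA adapter on top with peft.
# Repo/paths are placeholders for whichever base model and adapter you actually use.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

BASE = "path/to/llama-7b-hf"          # placeholder: converted base LLaMA checkpoint
ADAPTER = "path/to/alpaca-7b-lora"    # placeholder: LoRA adapter weights

tokenizer = LlamaTokenizer.from_pretrained(BASE)
model = LlamaForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER)  # wraps the base model with the LoRA adapter
model.eval()
```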

The limitations so far are that none of these models can answer programming questions competently yet, and a finetune for that will be needed. They also have the tendency to hallucinate frequently unless parameters are made more restrictive.
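
By "more restrictive" I mean the sampling settings. Something roughly like this (illustrative values only, continuing from the sketch above) keeps the output closer to the model's high-confidence tokens:

```python
# Illustrative only: tighter sampling settings that tend to cut down on hallucination,
# at the cost of less varied output. Exact values are examples, not a recommendation.
prompt = "Q: What year did the Apollo 11 mission land on the Moon?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.3,          # low temperature -> stick to high-probability tokens
    top_p=0.75,               # nucleus sampling: drop the unlikely tail
    top_k=40,                 # only ever consider the 40 most likely tokens
    repetition_penalty=1.15,  # discourage looping
    max_new_tokens=128,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```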

39

hosjiu t1_jd1a6az wrote

"They also have the tendency to hallucinate frequently unless parameters are made more restrictive."

I don't really understand this point on a technical level.

0

royalemate357 t1_jd1stda wrote

Not OP, but I imagine they're referring to the sampling hyperparameters that control the text generation process. For example, there is a temperature setting: a lower temperature makes the model sample more from its most likely choices, so the output is potentially more precise/accurate but also less diverse and creative.
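
A tiny toy example (plain numpy, not tied to any particular library) of what temperature does to the next-token distribution:

```python
# Temperature scaling: divide the logits by T before the softmax.
# Lower T sharpens the distribution, so sampling concentrates on the top tokens.
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())   # subtract max for numerical stability
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.2, -1.0])  # made-up next-token scores

print(softmax_with_temperature(logits, 1.0))  # ~[0.63, 0.23, 0.10, 0.03] - fairly spread out
print(softmax_with_temperature(logits, 0.5))  # ~[0.86, 0.12, 0.02, 0.00] - concentrated on the top token
```

At higher temperatures the distribution flattens out, which is where the extra "creativity" (and the extra hallucination risk) comes from.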

1