Submitted by austintackaberry t3_120usfk in MachineLearning
light24bulbs t1_jdntdbb wrote
Reply to comment by baffo32 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
I'm not hoping to do instruction tuning; I want to do additional pre-training.
baffo32 t1_jdo24su wrote
It is the same thing. The alpaca data is just further pretraining data consisting of instructions and responses. Doing this is called finetuning.
baffo32 t1_jdrhj77 wrote
I was still confused by your response. I'm thinking that if you wanted a model to behave as if it had been given different pretraining data, you would probably first finetune on the different bulk data, and then after this finetune on the target task, such as instruction following.
Instruction following is indeed just predicting the next word: on data where the next word obeys the instructions preceding it.
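To make that point concrete: an instruction/response pair is flattened into one token stream, and the training targets are just the inputs shifted by one position. A minimal sketch in plain Python with a toy whitespace tokenizer (all names here are illustrative, not from any specific library):

```python
def make_causal_lm_example(instruction, response, tokenize=str.split):
    # Flatten the instruction/response pair into one token stream;
    # the model simply learns to predict each next token.
    tokens = tokenize(instruction) + tokenize(response)
    inputs = tokens[:-1]   # what the model sees
    targets = tokens[1:]   # what it must predict (shifted by one)
    return inputs, targets

inputs, targets = make_causal_lm_example("Translate to French: cat", "chat")
# Each target token is the "next word" for the corresponding input prefix.
```

The response tokens are not treated specially at all; they are just the next words the model is trained to predict.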
light24bulbs t1_jdrm9kh wrote
That's the part I wasn't getting. I assumed the fine tuning involved a different process. I see now that it is in fact just more training data, often templated into a document in such a way that it's framed clearly for the LLM.
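The Alpaca-style template is a good example of that framing: each record is flattened into plain text before training. A sketch (the field names mirror the public Alpaca format, but treat the exact wording as illustrative):

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def to_document(record):
    # Once templated, the record is just more text
    # for ordinary next-token training.
    return ALPACA_TEMPLATE.format(**record)

doc = to_document({"instruction": "Name a color.", "response": "Blue"})
```

After templating there is nothing instruction-specific left in the training loop itself.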
The confusing thing is that most of the LLM-as-a-service companies, OpenAI included, will ONLY take data in the question-and-answer format, as if that's the only data you'd want to use to fine-tune.
What if i want to feed a book in so we can talk about the book? A set of legal documents? Documentation of my project? Transcriptions of TV shows?
There are so many use cases for training on top of an already pre-trained LLM that aren't just question answering.
I'm into training llama now. I simply took some training code I found, removed the JSON parsing and question-answer templating stuff, and done.
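Stripping out the Q&A templating and training on bulk text roughly amounts to chunking one long document into fixed-length token blocks. A hedged sketch of that idea (hypothetical helper with a toy tokenizer, not code from any repo mentioned here):

```python
def chunk_text(text, block_size=512, tokenize=str.split):
    # Split one long document (a book, transcripts, docs)
    # into fixed-size token blocks for further causal-LM training.
    tokens = tokenize(text)
    return [tokens[i:i + block_size]
            for i in range(0, len(tokens), block_size)]

blocks = chunk_text("word " * 1000, block_size=512)
# 1000 tokens -> two blocks: one of 512 tokens, one of 488.
```

Each block then feeds the same next-token objective that the Q&A records would have.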
nemorocksharder t1_jdz8kt5 wrote
What you're describing is exactly what I have been looking to do too, and am really surprised I'm not hearing more about it. Have you found any useful approaches to essentially adding to the LLM's Corpus with target material/text? or anyone else trying to do this?
light24bulbs t1_jdzzeh4 wrote
Yes, I'm into it now. Code like this can be adapted to load bulk data instead of q&a.
I suspect some of the training parameters need to be adjusted a bit to prevent overfitting, and obviously the data loading and templating need to be removed.
https://github.com/lxe/llama-tune Or for a cooler approach where you make a Lora layer https://github.com/serp-ai/LLaMA-8bit-LoRA
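The LoRA approach in that second repo comes down to adding a trainable low-rank update on top of frozen base weights. A NumPy sketch of the core idea (shapes and scaling follow the LoRA paper; the variable names are mine, and real implementations operate on specific attention projections):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    # Frozen base weight W plus trainable low-rank update B @ A.
    # B starts at zero, so training begins exactly at the base
    # model's behavior and only the small A/B matrices are learned.
    r = A.shape[0]
    return x @ W.T + (x @ A.T @ B.T) * (alpha / r)

d, r = 8, 2
rng = np.random.default_rng(0)
x = rng.normal(size=(1, d))
W = rng.normal(size=(d, d))        # frozen
A = rng.normal(size=(r, d))        # trainable, random init
B = np.zeros((d, r))               # trainable, zero init: no-op at start
y = lora_forward(x, W, A, B)
```

Because only A and B (rank r, far smaller than d) receive gradients, the adapter is cheap to train and store compared to finetuning the full weight matrix.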