Submitted by austintackaberry t3_120usfk in MachineLearning
dreamingleo12 t1_jdll44j wrote
Reply to comment by SeymourBits in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
I’ve been experimenting with Alpaca and was able to fine-tune it on the provided dataset in 40 minutes using 8 A100 spot instances. It actually works well.
Daveboi7 t1_jdm8aby wrote
What platform are you using for training?
dreamingleo12 t1_jdn511a wrote
By platform you mean?
Daveboi7 t1_jdnczd9 wrote
My bad. Did you train the model locally on your PC or using cloud?
dreamingleo12 t1_jdndszl wrote
I trained the model in the cloud.
Daveboi7 t1_jdndvq0 wrote
With databricks?
dreamingleo12 t1_jdndzmt wrote
No I don’t use databricks. I only tried LLaMA and Alpaca.
Daveboi7 t1_jdnedrd wrote
But which cloud service did you use to train them?
I tried using databricks to train a model but the setup was too complicated.
I’m wondering if there is a more straightforward platform to train on?
dreamingleo12 t1_jdnel6b wrote
You can just follow the Stanford Alpaca GitHub instructions, as long as you have the LLaMA weights. It’s straightforward.
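The run command in that repo looks roughly like the sketch below (angle brackets are placeholders; the flags are standard Hugging Face Trainer arguments, but check the Alpaca README for the exact, current invocation):

torchrun --nproc_per_node=8 --master_port=<port> train.py \
    --model_name_or_path <path_to_hf_converted_llama_weights> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --learning_rate 2e-5 \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --fsdp "full_shard auto_wrap"

--nproc_per_node=8 matches an 8-GPU node like the one mentioned above; in the original recipe, FSDP full-sharding is what lets the 7B model fit in GPU memory during full fine-tuning.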
Daveboi7 t1_jdneqdx wrote
Ah. I’m trying to train the Dolly model developed by Databricks.
dreamingleo12 t1_jdnewt2 wrote
It’s just Alpaca with a different base model. Databricks boasted too much.
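For illustration only (this is the Alpaca-style train.py, not Databricks’ actual training code): the base model is just whatever --model_name_or_path points at, so a Dolly-style run swaps the LLaMA checkpoint for GPT-J and keeps the same instruction data, e.g.:

torchrun --nproc_per_node=8 --master_port=<port> train.py \
    --model_name_or_path EleutherAI/gpt-j-6B \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --learning_rate 2e-5

(The LLaMA-specific FSDP wrapping flag from the Alpaca README would need to be adjusted for GPT-J’s layer class, so treat this as a sketch rather than a drop-in command.)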
Daveboi7 t1_jdnf18o wrote
Yeah but the comparisons I have seen between Dolly and Alpaca look totally different.
Somehow the Dolly answers look much better imo
Edit: spelling
dreamingleo12 t1_jdnf4qn wrote
I don’t trust DB’s results tbh. LLaMA is a better model than GPT-J.
Daveboi7 t1_jdnf96e wrote
Somebody posted results on Twitter and they looked pretty good. I don’t think he worked for DB either. But who knows really