
dreamingleo12 t1_jdll44j wrote

I’ve been experimenting with Alpaca and was able to fine-tune it on the provided dataset in 40 minutes using 8 A100 spot instances. It actually works well.

3

Daveboi7 t1_jdm8aby wrote

What platform are you using for training?

2

dreamingleo12 t1_jdn511a wrote

By platform you mean?

2

Daveboi7 t1_jdnczd9 wrote

My bad. Did you train the model locally on your PC or using cloud?

1

dreamingleo12 t1_jdndszl wrote

I trained the model in the cloud.

2

Daveboi7 t1_jdndvq0 wrote

With databricks?

1

dreamingleo12 t1_jdndzmt wrote

No I don’t use databricks. I only tried LLaMA and Alpaca.

1

Daveboi7 t1_jdnedrd wrote

But which cloud service did you use to train them?

I tried using databricks to train a model but the setup was too complicated.

I’m wondering if there’s a more straightforward platform to train on?

1

dreamingleo12 t1_jdnel6b wrote

You can just follow Stanford Alpaca’s github instructions, as long as you have LLaMA weights. It’s straightforward.

2
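
For context, Stanford Alpaca’s fine-tuning instructions boil down to a multi-GPU `torchrun` launch of the repo’s `train.py` with your LLaMA weights. A rough sketch of that kind of command follows; the placeholder paths and the specific hyperparameter values here are illustrative assumptions, not guaranteed to match the repo’s current README:

```shell
# Sketch of a Stanford Alpaca-style fine-tuning launch (illustrative values).
# <path_to_llama_weights> and <output_dir> are placeholders you must supply;
# run from a checkout of the stanford_alpaca repo with its dependencies installed.
torchrun --nproc_per_node=8 --master_port=29500 train.py \
    --model_name_or_path <path_to_llama_weights> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --learning_rate 2e-5 \
    --fsdp "full_shard auto_wrap"
```

On 8 A100s the sharded (FSDP) launch above is what makes a ~40-minute run plausible; the main prerequisite is having the converted LLaMA weights on disk.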

Daveboi7 t1_jdneqdx wrote

Ah. I’m trying to train the Dolly model developed by Databricks.

1

dreamingleo12 t1_jdnewt2 wrote

It’s just Alpaca with a different base model. Databricks boasted too much.

1

Daveboi7 t1_jdnf18o wrote

Yeah but the comparisons I have seen between Dolly and Alpaca look totally different.

Somehow the Dolly answers look much better imo

Edit: spelling

1

dreamingleo12 t1_jdnf4qn wrote

I don’t trust DB’s results tbh. LLaMA is a better model than GPT-J.

2

Daveboi7 t1_jdnf96e wrote

Somebody posted results on Twitter; they looked pretty good. I don’t think he worked for DB either. But who knows really.

1