
dreamingleo12 t1_jdll44j wrote

I’ve been experimenting with Alpaca and was able to fine-tune it on the provided dataset in 40 minutes using 8 A100 spot instances. It actually works well.

3

Daveboi7 t1_jdm8aby wrote

What platform are you using for training?

2

dreamingleo12 t1_jdn511a wrote

By platform you mean?

2

Daveboi7 t1_jdnczd9 wrote

My bad. Did you train the model locally on your PC or using cloud?

1

dreamingleo12 t1_jdndszl wrote

I trained the model in the cloud.

2

Daveboi7 t1_jdndvq0 wrote

With databricks?

1

dreamingleo12 t1_jdndzmt wrote

No I don’t use databricks. I only tried LLaMA and Alpaca.

1

Daveboi7 t1_jdnedrd wrote

But which cloud service did you use to train them?

I tried using databricks to train a model but the setup was too complicated.

I’m wondering if there’s a more straightforward platform to train on?

1

dreamingleo12 t1_jdnel6b wrote

You can just follow Stanford Alpaca’s github instructions, as long as you have LLaMA weights. It’s straightforward.

2
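
For context, Stanford Alpaca’s fine-tuning instructions boil down to a multi-GPU `torchrun` launch of the repo’s `train.py` with your LLaMA weights. A rough sketch of that kind of command follows; the placeholder paths and the specific hyperparameter values here are illustrative assumptions, not guaranteed to match the repo’s current README:

```shell
# Sketch of a Stanford Alpaca-style fine-tuning launch (illustrative values).
# <path_to_llama_weights> and <output_dir> are placeholders you must supply;
# run from a checkout of the stanford_alpaca repo with its dependencies installed.
torchrun --nproc_per_node=8 --master_port=29500 train.py \
    --model_name_or_path <path_to_llama_weights> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <output_dir> \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --learning_rate 2e-5 \
    --fsdp "full_shard auto_wrap"
```

On 8 A100s the sharded (FSDP) launch above is what makes a ~40-minute run plausible; the main prerequisite is having the converted LLaMA weights on disk.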

Daveboi7 t1_jdneqdx wrote

Ah. I’m trying to train the Dolly model developed by Databricks.

1

dreamingleo12 t1_jdnewt2 wrote

It’s just Alpaca with a different base model. Databricks boasted too much.

1

Daveboi7 t1_jdnf18o wrote

Yeah but the comparisons I have seen between Dolly and Alpaca look totally different.

Somehow the Dolly answers look much better imo

Edit: spelling

1

dreamingleo12 t1_jdnf4qn wrote

I don’t trust DB’s results tbh. LLaMA is a better model than GPT-J.

2

Daveboi7 t1_jdnf96e wrote

Somebody posted results on Twitter; they looked pretty good. I don’t think he worked for DB either. But who knows really.

1