cbsudux t1_jd1qzp7 wrote

How long did the training take on an A100?

benfavre t1_jd2n1cg wrote

One epoch of finetuning the 30B model with a llama-lora implementation (mini-batch-size=2, maxlen=384) takes about 11 hours.
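
For reference, here is a minimal sketch of what a comparable setup looks like, assuming the Hugging Face transformers + peft stack that alpaca-lora-style repos use. Only the micro-batch size of 2, the max length of 384, and the single epoch come from the comment above; the checkpoint id, LoRA rank/alpha/dropout, and learning rate are illustrative assumptions, not the commenter's actual script.

```python
import torch
from datasets import Dataset
from transformers import (
    LlamaForCausalLM,
    LlamaTokenizer,
    Trainer,
    TrainingArguments,
    DataCollatorForLanguageModeling,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

BASE_MODEL = "huggyllama/llama-30b"  # hypothetical hub id; the commenter didn't name a checkpoint
CUTOFF_LEN = 384                     # maxlen from the comment
MICRO_BATCH = 2                      # mini-batch-size from the comment

tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token_id = 0           # LLaMA has no pad token; common workaround is to reuse id 0

model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    load_in_8bit=True,               # 8-bit base weights so 30B fits on a single (80 GB) A100
    torch_dtype=torch.float16,
    device_map="auto",
)
model = prepare_model_for_int8_training(model)

# Wrap the frozen base model with trainable low-rank adapters on the attention projections.
model = get_peft_model(model, LoraConfig(
    r=8,                             # illustrative rank/alpha/dropout, not from the comment
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    bias="none",
    task_type="CAUSAL_LM",
))

# Placeholder data: in practice this would be the full instruction dataset,
# tokenized and truncated to CUTOFF_LEN.
data = Dataset.from_dict({"text": ["### Instruction:\nSay hi.\n\n### Response:\nHi!"]})
data = data.map(
    lambda row: tokenizer(row["text"], truncation=True, max_length=CUTOFF_LEN),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    train_dataset=data,
    args=TrainingArguments(
        per_device_train_batch_size=MICRO_BATCH,
        num_train_epochs=1,
        learning_rate=3e-4,          # illustrative
        fp16=True,
        logging_steps=10,
        output_dir="llama-30b-lora",
    ),
    # mlm=False makes the collator copy input_ids into labels for causal-LM training
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("llama-30b-lora")  # saves only the small adapter weights
```

Since only the LoRA adapters are trained and the 8-bit base weights stay frozen, a single A100 can hold the 30B model; an 11-hour epoch at this micro-batch size and sequence length is plausible for that configuration.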