Submitted by Zondartul t3_zrbfcr in MachineLearning
artsybashev t1_j154fhy wrote
Reply to comment by caedin8 in [D] Running large language models on a home PC? by Zondartul
That's just the inference. Training from scratch requires more like 100x A100s and a cluster to run them on, so on the order of a million dollars just to get started.
AltruisticNight8314 t1_j1ohh7u wrote
What hardware would be required to (i) train or (ii) fine-tune the weights (i.e. run a few epochs on my own data) of medium-sized transformers (500M-15B parameters)?
I do research in proteomics and have a very specific problem where even fine-tuning the weights of a pretrained transformer (such as ESM-2) might work well.
Of course, there's always the poor man's alternative of building a supervised model on the embeddings returned by the encoder.
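A minimal sketch of that route (the checkpoint name and toy data are illustrative choices, not from the thread): freeze a small ESM-2 encoder from HuggingFace, mean-pool its hidden states into per-sequence embeddings, and fit a classical classifier on top.

```python
# Poor man's alternative: frozen encoder + classical supervised model.
# Assumes the HuggingFace ESM-2 checkpoints; sequences/labels are toy data.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

device = "cuda" if torch.cuda.is_available() else "cpu"
name = "facebook/esm2_t12_35M_UR50D"  # small ESM-2 checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).to(device).eval()

sequences = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "MGSSHHHHHHSSGLVPRGSH"]  # toy
labels = [0, 1]

embeddings = []
with torch.no_grad():
    for seq in sequences:
        inputs = tokenizer(seq, return_tensors="pt").to(device)
        hidden = model(**inputs).last_hidden_state      # (1, seq_len, dim)
        embeddings.append(hidden.mean(dim=1).squeeze(0).cpu().numpy())  # mean-pool

clf = LogisticRegression(max_iter=1000).fit(embeddings, labels)
```

No GPU gradients or optimizer states are needed here, so it runs on modest hardware; the encoder is only ever used in inference mode.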
artsybashev t1_j1ph7f3 wrote
One A100 80GB will get you started with models in the 500M-15B range. You can rent one for about $50 per day. See where that takes you in a week.
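For context on why a single 80GB card can be enough (my back-of-envelope numbers, not from the thread): full Adam fine-tuning in mixed precision costs roughly 16 bytes per parameter (fp16 weights and gradients plus fp32 master weights and optimizer states), so a 15B model would want ~240 GB. At that end of the range you'd lean on parameter-efficient methods such as LoRA, which trains only small adapter matrices. A hedged sketch with the `peft` library, using a 3B ESM-2 checkpoint; the hyperparameters are illustrative, not a tested recipe:

```python
# Parameter-efficient fine-tuning via LoRA adapters (peft library).
# Checkpoint and LoRA settings below are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM
from peft import LoraConfig, get_peft_model

model = AutoModelForMaskedLM.from_pretrained(
    "facebook/esm2_t36_3B_UR50D",  # 3B ESM-2 checkpoint as an example
    torch_dtype=torch.float16,
)
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query", "value"],  # ESM attention projections (BERT-style names)
    lora_dropout=0.05,
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

With only the adapter matrices receiving gradients and optimizer states, the training memory footprint shrinks to roughly the frozen fp16 weights plus a small overhead, which is what brings a multi-billion-parameter model back within one card.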
AltruisticNight8314 t1_j1soeji wrote
Thanks!