Submitted by Zondartul t3_zrbfcr in MachineLearning
arg_max t1_j136y5q wrote
Reply to comment by arg_max in [D] Running large language models on a home PC? by Zondartul
Just to give you an idea of what an "optimal configuration" looks like, though, this is way beyond desktop PC levels:
You will need at least 350 GB of GPU memory across the entire cluster to serve the OPT-175B model. For example, you could use 4 x AWS p3.16xlarge instances, which provide 4 (instances) x 8 (GPUs/instance) x 16 (GB/GPU) = 512 GB of GPU memory.
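The arithmetic above can be sanity-checked in a few lines. This is just an illustrative sketch of where the 350 GB and 512 GB figures come from, assuming fp16 weights (2 bytes per parameter) and the V100-16GB GPUs in a p3.16xlarge:

```python
# Memory needed for the model weights alone:
# OPT-175B in fp16 is ~175e9 params * 2 bytes/param.
params = 175e9
bytes_per_param = 2  # fp16
weights_gb = params * bytes_per_param / 1e9  # 350.0 GB

# Memory available on the example cluster:
# an AWS p3.16xlarge has 8 x V100 GPUs with 16 GB each.
instances = 4
gpus_per_instance = 8
gb_per_gpu = 16
cluster_gb = instances * gpus_per_instance * gb_per_gpu  # 512 GB

print(f"weights: {weights_gb} GB, cluster: {cluster_gb} GB")
print("fits:", cluster_gb >= weights_gb)
```

Note this only counts the weights; activations, KV cache, and framework overhead eat into the remaining ~162 GB, which is why 350 GB is a floor rather than a comfortable target.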