kryatoshi t1_jc8suy9 wrote on March 15, 2023 at 2:43 AM
Reply to [P] vanilla-llama an hackable plain-pytorch implementation of LLaMA that can be run on any system (if you have enough resources) by poppear
You can fit a 4-bit quantized 65B model on an M1 Max with 64GB RAM; it uses about 40GB of unified memory. See here: https://twitter.com/tljstewart/status/1635326012346048528?s=20
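As a rough back-of-the-envelope check (my own estimate, not from the linked tweet): 65B parameters at 4 bits each is about 32.5 GB, and per-group quantization scales plus runtime buffers push that toward the ~40 GB figure. A minimal sketch, assuming group-wise quantization with an fp16 scale per 64-weight group and a guessed few GB of overhead:

```python
# Back-of-the-envelope memory estimate for a 4-bit quantized 65B model.
# Assumptions (mine, not from the tweet): packed 4-bit weights, one fp16
# scale per group of 64 weights, and a rough overhead for KV cache/buffers.

def quantized_weight_gb(n_params: float, bits: int = 4, group_size: int = 64) -> float:
    weight_bytes = n_params * bits / 8          # packed low-bit weights
    scale_bytes = (n_params / group_size) * 2   # fp16 scale per group
    return (weight_bytes + scale_bytes) / 1e9

if __name__ == "__main__":
    base = quantized_weight_gb(65e9)            # ~34.5 GB of quantized weights
    overhead = 6.0                              # KV cache, activations, buffers (rough guess)
    print(f"weights ~{base:.1f} GB, total ~{base + overhead:.1f} GB")
```

That lands around 40 GB total, consistent with the reported unified memory usage.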