[P] vanilla-llama: a hackable, plain-PyTorch implementation of LLaMA that can be run on any system (if you have enough resources) Submitted by poppear t3_11ozl85 on March 12, 2023 at 12:07 AM in MachineLearning 8 comments 83
kryatoshi t1_jc8suy9 wrote on March 15, 2023 at 2:43 AM You can fit a 4-bit quantized 65B model on an M1 Max with 64GB RAM; it takes about 40GB of unified memory. See here: https://twitter.com/tljstewart/status/1635326012346048528?s=20 Permalink 1
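A back-of-envelope check of that figure: at 4 bits per parameter, the packed weights of a 65B model come to roughly 32.5 GB, and per-group quantization scales add a little more. The group size of 128 below is a common GPTQ-style default, not something stated in the comment; KV cache and activations would account for the remaining memory up to the observed ~40 GB.

```python
# Rough memory estimate for a 4-bit quantized 65B model (illustrative only).

def quantized_weight_gb(n_params: float, bits: int = 4, group_size: int = 128) -> float:
    """Estimate weight memory in GB: packed low-bit weights plus one fp16 scale per group."""
    packed = n_params * bits / 8          # bytes for the quantized weights
    scales = n_params / group_size * 2    # fp16 scale factors, one per group
    return (packed + scales) / 1e9

est = quantized_weight_gb(65e9)
print(f"~{est:.1f} GB for weights alone")  # KV cache and activations add the rest
```

This lands at roughly 33.5 GB for the weights, which is consistent with the ~40 GB of unified memory reported once runtime buffers are included.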