Submitted by Destiny_Knight t3_11tab5h in singularity
pokeuser61 t1_jcj294w wrote
Reply to comment by FoxlyKei in Those who know... by Destiny_Knight
Don't even need a gaming rig; https://github.com/ggerganov/llama.cpp
FoxlyKei t1_jcj30yc wrote
How much VRAM do I need, then? I'm looking forward to a larger model trained on GPT-4 output; I can only imagine what the next month will bring. I'm excited and scared at the same time.
bemmu t1_jcj6zrc wrote
You can try Alpaca out super easily. I heard about it last night, just followed the instructions, and had it running in 5 minutes on my GPU-less old Mac mini:
Download the file ggml-alpaca-7b-q4.bin, then in terminal:
git clone https://github.com/antimatter15/alpaca.cpp
cd alpaca.cpp
make chat
./chat
XagentVFX t1_jcl71ht wrote
Dude, thank you so much. I was trying to download llama a different way but flopped. Then resorted to GPT-2. But this was super easy.
testfujcdujb t1_jcrtze8 wrote
It is very bad though. A lot worse than ChatGPT.
R1chterScale t1_jcj4i3i wrote
Not GPU, CPU. So it uses normal RAM, not VRAM, and takes about 8 GB or so to itself.
FoxlyKei t1_jcj6xmh wrote
Oh? So this only uses RAM? I had just assumed Stable Diffusion requires VRAM, but I guess that's only because it's processing images. Most people have plenty of RAM. Nice.
R1chterScale t1_jcjgd0x wrote
Models can use either VRAM or RAM depending on whether they're accelerated with a GPU. It has nothing to do with what they're actually processing; they're just different implementations.
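To put rough numbers on why a quantized model fits in ordinary RAM, here's a back-of-envelope sketch (assuming ~2 bytes per weight at fp16 and ~0.5 bytes per weight at 4-bit quantization, as in the ggml-alpaca-7b-q4.bin file mentioned above; weight storage only, not runtime buffers):

```shell
# approximate weight storage for a 7B-parameter model
PARAMS=7                          # billions of parameters
echo "fp16: $((PARAMS * 2)) GB"   # ~2 bytes per weight
echo "q4:   $((PARAMS / 2)) GB"   # ~0.5 bytes per weight (4-bit quantized)
```

So the 4-bit file is small enough for most desktops, with some extra memory needed on top for the context and scratch buffers.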
iiioiia t1_jckjt70 wrote
Any rough idea what the performance difference is vs. a GPU (of various powers)?
And does more RAM help?
Straight-Comb-6956 t1_jcj7fn3 wrote
llama.cpp runs on the CPU and uses plain RAM.
I've managed to launch the 7B Facebook LLaMA model with 5 GB of memory consumption, and the 65B model with just 43 GB.
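Those figures roughly line up with a quick sanity check, assuming 4-bit (q4) weights at about half a byte per parameter; the observed usage is a bit higher because of the context and scratch buffers on top of the weights:

```shell
# weight storage alone, at ~0.5 bytes per 4-bit weight
for B in 7 65; do                  # model sizes in billions of parameters
  echo "${B}B: ~$((B / 2)) GB of weights"
done
```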
GreenMirage t1_jcjnxyv wrote
holy crap, thanks man.