toothpastespiders t1_jc01mr9 wrote on March 13, 2023 at 1:12 AM

Reply to comment by Amazing_Painter_7692 in [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692

> BUT someone has already made a webUI like the automatic1111 one!

There's a subreddit for it over at /r/Oobabooga too that deserves more attention. I've only had a little time to play around with it but it's a pretty sleek system from what I've seen.

> it looked really complicated for me to set up with 4-bits weights

I'd like to say that the warnings make it more intimidating than it really is. For me it was just copying and pasting four or five lines into a terminal. Then again, I also couldn't get it to work, so I might be doing something wrong. I'm guessing it's just that my weirdo GPU wasn't really accounted for somewhere. I'm going to bang my head against it when I've got time, just because it's frustrating having tons of VRAM to spare and not getting the most out of it.

remghoost7 t1_jc0bymy wrote on March 13, 2023 at 2:34 AM

~~I'm having an issue with the C++ compiler on the last step.~~

~~I've been trying to use Python 3.10.9 though, so maybe that's my problem...? My venv is set up correctly as well.~~

~~Not specifically looking for help.~~

Apparently this person posted a guide on it in that subreddit. Will report back if I am successful.

Edit: Success! But using WSL instead of Windows (because Windows was a freaking headache). WSL worked the first time following the instructions on the GitHub page. Would highly recommend using WSL to install it instead of trying to force Windows to figure it out.

Pathos14489 t1_jc0dame wrote on March 13, 2023 at 2:45 AM

r/Oobabooga isn't accessible for me.