wirefire07 t1_jcgx51q wrote on March 16, 2023 at 7:15 PM
Reply to [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692
Already heard about this project? https://github.com/ggerganov/llama.cpp -> It's very fast!!