wirefire07

wirefire07 t1_jcgx51q wrote on March 16, 2023 at 7:15 PM

Reply to [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692

Have you already heard about this project? https://github.com/ggerganov/llama.cpp -> It's very fast!