[P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM github.com Submitted by Amazing_Painter_7692 t3_11pmz69 on March 12, 2023 at 7:13 PM in MachineLearning 51 comments 320
wirefire07 t1_jcgx51q wrote on March 16, 2023 at 7:15 PM Have you already heard about this project? https://github.com/ggerganov/llama.cpp -> It's very fast! Permalink 1