Amazing_Painter_7692 OP t1_jbzoq05 wrote

There's an inference engine class if you want to build out your own API:

https://github.com/AmericanPresidentJimmyCarter/yal-discord-bot/blob/main/bot/llama_model/engine.py#L56-L96
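
For anyone wondering what "build out your own API" could look like, here is a rough sketch using only the Python standard library. It does not use the linked engine class. `StubEngine` and its `generate(prompt, max_tokens)` method are hypothetical stand-ins, and the JSON fields (`prompt`, `max_tokens`, `text`) and the port are assumptions made for illustration, so check the real class before adapting any of it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple


class StubEngine:
    """Hypothetical stand-in for a real engine; NOT the linked class's interface."""

    def generate(self, prompt: str, max_tokens: int = 16) -> str:
        # A real engine would run the quantized model here.
        return prompt + " ..."


def handle_generate(engine, body: bytes) -> Tuple[int, Dict[str, str]]:
    """Turn a raw JSON request body into (HTTP status code, response dict)."""
    try:
        payload = json.loads(body)
        prompt = payload["prompt"]
        max_tokens = int(payload.get("max_tokens", 16))
    except (ValueError, KeyError, TypeError, AttributeError):
        return 400, {"error": "expected a JSON object with a 'prompt' field"}
    if not isinstance(prompt, str):
        return 400, {"error": "'prompt' must be a string"}
    return 200, {"text": engine.generate(prompt, max_tokens)}


def make_handler(engine):
    """Build a request handler class bound to one engine instance."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            status, result = handle_generate(engine, self.rfile.read(length))
            data = json.dumps(result).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return Handler


def serve(engine, port: int = 8000) -> None:
    """Serve on localhost until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), make_handler(engine)).serve_forever()
```

Calling `serve(StubEngine())` starts a local server that accepts a POST with a body like `{"prompt": "Hello"}`. Keeping `handle_generate` separate from the HTTP handler means the request logic can be tried out without opening a socket.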

And there's a simple text inference script here:

https://github.com/AmericanPresidentJimmyCarter/yal-discord-bot/blob/main/bot/llama_model/llama_inference.py

Or in the original repo:

https://github.com/qwopqwop200/GPTQ-for-LLaMa

BUT someone has already made a webUI like the automatic1111 one!

https://github.com/oobabooga/text-generation-webui

Unfortunately, it looked really complicated to set up with 4-bit weights, and I tend to do everything over a Linux terminal. :P
