Submitted by ortegaalfredo t3_11kr20f in MachineLearning

Installed LLaMA 13B (legitimate download) on a Dual RTX 3090 server and created a discord bot to interact with it.

As it's quite fast, I'm opening it to the public; here is the Discord invite. No registration or payments, completely free.

Instructions in comments as I cannot post an invite directly here.

83

Comments


ortegaalfredo OP t1_jb8kdzj wrote

Here are the instructions; all you need is a Discord account. There are no limits or rules on what you can ask it, but please behave, as any spam will have to be removed:

https://twitter.com/ortegaalfredo/status/1632903130416308229

Code for the bot is here:

https://github.com/ortegaalfredo/celery-ai/blob/main/discord/bot.py

12
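The linked bot.py wires a Discord client to a local LLaMA process. The core pattern can be sketched in a few lines of plain Python (the function names and mention handling here are illustrative, not the repo's actual code):

```python
from typing import Optional

# Sketch of the message-handling core a Discord LLM bot needs.
DISCORD_MSG_LIMIT = 2000  # Discord rejects messages longer than this


def generate(prompt: str) -> str:
    """Placeholder for the local LLaMA inference call."""
    return "LLaMA completion for: " + prompt


def handle_mention(content: str, bot_name: str = "BasedGPT") -> Optional[str]:
    """Strip the bot mention, run inference, and fit the reply to Discord."""
    mention = f"@{bot_name}"
    if mention not in content:
        return None  # ignore messages that don't address the bot
    prompt = content.split(mention, 1)[1].strip()
    reply = generate(prompt)
    return reply[:DISCORD_MSG_LIMIT]
```

In the real bot this handler would sit inside a discord.py `on_message` event, with `generate` calling into the model running on the 3090s.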

ReginaldIII t1_jb9goco wrote

Link to your code? It needs to be GPLv3 to be compliant with LLaMA's licensing.

How are you finding the quality of the output? I've had a little play around with the model but wasn't overly impressed. That said, a model with this many parameters is a nice test bed for things like pruning methods.

−4

abnormal_human t1_jb9kyzr wrote

Actually, it doesn't. GPLv3 just requires that if OP distributes a binary to someone, the source used to produce that binary is also made available. With server side code the binary isn't being distributed, so no obligation to distribute source.

13

ReginaldIII t1_jb9xlil wrote

Fair enough, I didn't realize that hosting a publicly available service is not the same as distributing.

3

ortegaalfredo OP t1_jbi81mn wrote

I posted the GitHub repo in the original post. The output is bad because Meta's original generator is quite bad. I upgraded it today and it's much better now. Still not ChatGPT.

1

SrPeixinho t1_jb96nyt wrote

Can I donate or help somehow to make it 65B?

11

ortegaalfredo OP t1_jbaaqv5 wrote

The most important thing is multi-GPU int8 quantization; that would let it run on 4x RTX 3090 cards. Right now it requires 8x 3090s, which is way over my budget.

Or just wait a few days; I'm told some people have 2x A100 cards and will open a 65B model to the public this week.

11
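The GPU counts above follow from simple weight-size arithmetic: fp16 weights take 2 bytes per parameter, int8 takes 1, and a 3090 has 24 GiB of VRAM. A rough sketch (weights only; activations, KV cache, and framework overhead are why the practical counts of 8 and 4 cards exceed these minimums):

```python
import math

# Back-of-the-envelope VRAM needed just for the model weights.
GIB = 1024**3
GPU_VRAM_GIB = 24  # one RTX 3090


def weight_gib(params_billion: float, bytes_per_param: int) -> float:
    """GiB of VRAM occupied by the weights alone."""
    return params_billion * 1e9 * bytes_per_param / GIB


def gpus_needed(params_billion: float, bytes_per_param: int) -> int:
    """Minimum 3090 count for the weights, before any runtime overhead."""
    return math.ceil(weight_gib(params_billion, bytes_per_param) / GPU_VRAM_GIB)

# 65B in fp16: ~121 GiB of weights -> at least 6 cards, 8 with headroom
# 65B in int8:  ~61 GiB of weights -> at least 3 cards, 4 with headroom
# 13B in fp16:  ~24 GiB of weights -> fits the dual-3090 setup in the post
```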

SpaceCockatoo t1_jblj2so wrote

4bit quant already out

2

ortegaalfredo OP t1_jbov7dl wrote

I tried the 8-bit; 4-bit doesn't work for me yet for some reason.

Problem is, those are very, very slow, about 1 token/s, compared to the ~100 tokens/s I'm getting with 13B.

1

wywywywy t1_jb97nl6 wrote

Nice one.

With dual 3090s, I think 30B should be possible in 8-bit?

5

phamtuanminhmeo t1_jb9owvy wrote

Did you wrap the prompt in the text "The answer for the question "&lt;prompt&gt;" would be:" before using it as the input? I think that would constrain the generated text a lot, since it gives the model a fixed context. Can we please try without it?

3

ortegaalfredo OP t1_jbaadnz wrote

Yes, you can send raw prompts using 'raw' like this:

'@ BasedGPT raw The recipe of a chocolate cake is'

This will send whatever you write raw, without any wrapping or added text. But you have to write the prompt as a continuation, like with every other LLM before ChatGPT.

3
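Putting the two modes together, the bot's input handling can be sketched as follows (the wrapper template is the one quoted in the question above; the bot name and exact parsing are assumptions, not the repo's actual code):

```python
def build_model_input(message: str, bot_name: str = "BasedGPT") -> str:
    """Turn a Discord message into a model prompt.

    "@BasedGPT raw <text>" -> pass text through unmodified (continuation style)
    "@BasedGPT <text>"     -> wrap as a question the model should answer
    """
    body = message.split(f"@{bot_name}", 1)[1].strip()
    if body.startswith("raw "):
        return body[len("raw "):]
    return f'The answer for the question "{body}" would be:'
```

In raw mode the model simply continues your text, so a prompt like "The recipe of a chocolate cake is" works the way completion-style LLMs always have.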

blablanonymous t1_jba5dai wrote

Why are you saying it’s unhinged? It just feels to me like it’s simply not constrained in the same way ChatGPT is, which is a very important part of providing a good experience, isn’t it?

3

ortegaalfredo OP t1_jbat4qi wrote

Just joking; even unbounded, LLaMA is actually more restrained than the original Bing or a jailbroken ChatGPT.

6

smallfried t1_jbbs0hg wrote

Nice, it works quite well with a fancy prompt (in bold):

"A user is talking to a very imaginative writer, famous for writing in flowery language.

User: What inspired you to become a writer?

Writer: Reading and storytelling have been my passions since I was a little child. I delight in weaving new worlds and characters with the magic of words. Writing is also a way for me to pour out my soul and share my thoughts and feelings with others.

User: Can you describe a world in which fish are ruling the planet?

Writer: I am fascinated with the ocean because I feel like it has a life of its own, and I enjoy reading novels where characters live underwater. Imagine a planet where we and all other land animals have been forced underground by the fish. Fish have taken over the top layer of the planet and now everything has to be built submerged. Houses, cars, grocery stores, and schools—everything has to be submerged to allow fish to walk on top. I imagine that humans would still be living underground, walking through tunnels and using cars, trains, and subways to travel to different areas of the planet. I imagine that everything would be lit by small lights that shine from the bottoms of houses, and that it would rain a lot because the fish would need the water for their bodies. I can also imagine fish having to adapt to the planet aboveground, and that they would want to take over the world above the ocean. They would fight a big war against humans and would want to live above ground, but humans would fight for their right to live in their own planet.

User: Great! Thanks for the story, can you give me a parting gift?

Writer: Sure, I will give you a pencil!

Writer is a student and lives in an imaginary house. I used to be obsessed with the world underwater. I imagined that houses and cars were underwater and that the outside world was more like a vacation destination for humans. I liked to imagine that, if I could go to school underwater, I would be a very good student. I would be one of the best students and would never have to repeat a grade."

2

Radiant_Rhino t1_jd0gncq wrote

Could you please share the server invitation link with me? 🙏🏻

And is version 65b publicly available on Discord?

2

H0PEN1K t1_jdk0zgk wrote

My friend, could you please send me the discord server link?

2

firecz t1_jc1bwjs wrote

I would love to see this as a (Windows) GUI, similar to what some Stable Diffusion solutions do (nmkd, grisk...) - the entire thing running offline on your PC, not sending anything to Discord or elsewhere.
This would open it up to the masses, which in turn would pour more money into research.

1

anandm21096 t1_je8255p wrote

hey there, can you send me the invite?

1