Submitted by Haghiri75 t3_11wdi8m in deeplearning
Recently, I installed dalai on my MacBook Pro (late 2019, i7 processor, 16GB of RAM) along with the Alpaca-7B model. Now when I ask it to write a tweet, it writes a Wikipedia article instead, and it does this pretty much every time 😂
First, should I fine-tune it?
Second, is there any "prompt magic" going on here?
P.S.: Using this one, I got much better results. What's the difference between the two?
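On the prompt question: from what I've read, Stanford Alpaca was fine-tuned on data wrapped in a specific instruction template, and I'm not sure whether dalai applies it for me. A minimal sketch of that format, using my tweet request as the instruction (the wording is copied from my reading of the Alpaca repo, so treat the exact text as an assumption):

```python
# Sketch of the Alpaca instruction format (wording is my assumption based on the
# Stanford Alpaca repo; dalai may or may not add this wrapper automatically).
PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Write a tweet about deep learning.\n\n"
    "### Response:\n"
)
print(PROMPT)
```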
Haghiri75 OP t1_jcxt80d wrote
I guess I found the reason. The dalai system quantizes the models, which makes them incredibly fast, but the cost of that quantization is reduced coherence.
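Here is a minimal toy sketch of the tradeoff I mean, in plain NumPy; it has nothing to do with dalai's actual GGML quantization code, which I have not read:

```python
# Toy symmetric 4-bit quantization round trip (illustration only, not dalai's code).
# Storing weights as small integers is what makes inference fast and memory-light;
# the round-trip error on every weight is where the coherence can go.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=8).astype(np.float32)  # fake layer weights

# Scale so the largest-magnitude weight maps to +/-7, then round to the 4-bit range.
scale = np.abs(weights).max() / 7.0
q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)  # stored form
dequant = q.astype(np.float32) * scale                         # what inference sees

print("original :", weights)
print("restored :", dequant)
print("max error:", np.abs(weights - dequant).max())
```

Each weight keeps only 16 possible values per scaling group here, which is exactly the kind of rounding that trades coherence for speed.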