sqweeeeeeeeeeeeeeeps t1_jcngpzd wrote on March 18, 2023 at 2:42 AM

If you were going to use gpt-3.5-turbo, just wait for GPT-4 before you spend $600 on compute costs.

[deleted] OP t1_jcojdrv wrote on March 18, 2023 at 10:32 AM

My budget is $3k; hopefully that's enough to make something decent.

Also, data generation is surprisingly cheap: 50k examples, mostly long responses, from gpt-3.5-turbo cost $20.

danielbln t1_jcpfkwg wrote on March 18, 2023 at 3:26 PM

Only about $100 of that was compute. The bulk of the $600 cost came from generating the data via text-davinci-003, which is 10x as expensive as gpt-3.5-turbo.
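To make the cost comparison in this exchange concrete, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in the thread ($20 for 50k examples from gpt-3.5-turbo, and a 10x price ratio for davinci). The function name and the assumption that cost scales linearly with the number of examples are illustrative, and the result is not a reconstruction of anyone's actual bill.

```python
def estimate_generation_cost(num_examples, cost_per_example, price_multiplier=1.0):
    """Linear estimate: examples x per-example cost x relative model price.

    Assumes every example costs about the same, which is a simplification.
    """
    return num_examples * cost_per_example * price_multiplier


# Figure quoted in the thread: 50k examples from gpt-3.5-turbo cost about $20.
turbo_cost_per_example = 20 / 50_000

turbo_total = estimate_generation_cost(50_000, turbo_cost_per_example)

# Same workload on a model that is 10x as expensive (the ratio quoted for davinci).
davinci_total = estimate_generation_cost(
    50_000, turbo_cost_per_example, price_multiplier=10
)

print(f"gpt-3.5-turbo estimate: ${turbo_total:.2f}")
print(f"10x-priced model estimate: ${davinci_total:.2f}")
```

Real costs depend on token counts per example rather than on the number of examples, so treat this as an order-of-magnitude check only.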

Disastrous_Elk_6375 t1_jco0q62 wrote on March 18, 2023 at 6:05 AM

> can I expect better outputs than what Stanford Alpaca achieved?

I think better is a bit subjective. They do note that the answers are generally shorter than ChatGPT, because they used text-davinci-003. Using gpt-3.5-turbo would get your answers closer to ChatGPT, but it could also "grab" that boring monotone "firstly... secondly... in conclusion" that often gives it away.

[deleted] OP t1_jcok6zs wrote on March 18, 2023 at 10:43 AM

Yeah, it does that. I can modify ChatGPT behavior through the system message, which should change the personality and response type in the final data. I could maybe start training it with examples on how it should act when the system message is present.

Example:

### System:{Act as a best friend}

### Instruction:{hi}

### Input:{noinput}

### Response: Hey there! What's up? How's your day going?

I could feed the model thousands of examples like this, which should result in a complete personality change whenever the system message is present.
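A minimal sketch of how training strings in this style might be assembled. The `### System:` / `### Instruction:` / `### Input:` / `### Response:` field names come from the example above; the exact whitespace, the `noinput` placeholder default, and the `build_example` helper are assumptions made for illustration, not a confirmed format from Stanford Alpaca or from the poster.

```python
def build_example(instruction, response, input_text="noinput", system=None):
    """Return one training string in a '### Field:' layout.

    The System block is only emitted when a system message is given, so the
    same function covers examples with and without a persona.
    """
    lines = []
    if system:
        lines.append(f"### System:\n{system}\n")
    lines.append(f"### Instruction:\n{instruction}\n")
    lines.append(f"### Input:\n{input_text}\n")
    lines.append(f"### Response:\n{response}")
    return "\n".join(lines)


with_system = build_example(
    instruction="hi",
    response="Hey there! What's up? How's your day going?",
    system="Act as a best friend",
)
without_system = build_example(instruction="hi", response="Hello! How can I help?")

print(with_system)
print("-" * 20)
print(without_system)
```

Mixing examples with and without a system block in the same dataset is one way to keep the default behavior intact when no system message is supplied, though whether that works well in practice would need to be tested.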

KingsmanVince t1_jcn734b wrote on March 18, 2023 at 1:22 AM

On the upside, maybe your fine-tuned LLaMA ends up better than Alpaca if you do that.

On the downside, maybe it ends up responding a lot like ChatGPT.

[deleted] OP t1_jcolakv wrote on March 18, 2023 at 10:58 AM

I can modify the response behavior through the system message. What would you suggest as a system message to get away from that boring ChatGPT response style? For example, if I set the system message to "act as a pirate" I can grab thousands of responses to various topics in that style. Here is an example of how the system message can completely change the response type:

Question: what is apple Inc.
No system message answer: Apple Inc. is a multinational technology company that designs, develops, and sells consumer electronics, computer software, and online services. It is known for creating popular products such as the iPhone, MacBook, and Apple Watch, and operating services such as iTunes, Apple Music, and iCloud.

Now with the system message "Avoid providing boring responses, make sure the user is entertained": Apple Inc. is much more than just a company that produces those fancy iPhones and Macbooks that you probably spend way too much time drooling over. It's a technological powerhouse that has been responsible for some of the biggest innovations in consumer technology over the past few decades.
From the iconic iPod that revolutionized the music industry, to the iPhone that has turned us all into smartphone-addicted zombies, Apple has been at the forefront of tech innovation for as long as anyone can remember.
The company was founded in 1976 by Steve Jobs and Steve Wozniak, and has since grown into one of the most influential companies in the world. Love them or hate them (or just be indifferent, whatever), there's no denying that Apple has left an indelible mark on the world of technology.
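For reference, a small sketch of how the same question could be paired with different system messages when generating data. It only builds request payloads using the `role`/`content` message layout of the chat completions format; no API call is made, and the `build_chat_request` helper and the list of styles are hypothetical names chosen for this illustration.

```python
import json


def build_chat_request(question, system_message=None, model="gpt-3.5-turbo"):
    """Build a chat-style request payload; the system message is optional."""
    messages = []
    if system_message:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": question})
    return {"model": model, "messages": messages}


question = "what is apple Inc."

# None means "no system message", i.e. the default response style.
styles = [
    None,
    "Avoid providing boring responses, make sure the user is entertained",
    "act as a pirate",
]

payloads = [build_chat_request(question, style) for style in styles]

# Inspect one payload; sending it to an API is intentionally left out.
print(json.dumps(payloads[1], indent=2))
```

Storing the system message alongside each generated response makes it possible to include it later in the fine-tuning prompt, so the model can learn to condition on it.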

[deleted] OP t1_jcolkjv wrote on March 18, 2023 at 11:01 AM

One thing to note is that different people prefer different types of responses. Some might like the default ChatGPT style, some might prefer something else. So the best thing to do is to teach the chatbot to adapt its behavior to the system message.

gwbyrd t1_jcp0czq wrote on March 18, 2023 at 1:33 PM

Does anyone know how to fine-tune the LLaMA or Alpaca model on my own Facebook posts and comments, and could guide me through it, so that I could create a virtual avatar of myself?

[D] Newbie question about Stanford Alpaca 7b fine-tuning
