TiredMoose69 t1_j4l525j wrote
Reply to comment by ephemeral_happiness_ in [D] Simple Questions Thread by AutoModerator
No :( But I did train a GPT-2 355M model on chatbot-like data. The output was fun, but not that great hahaha
I am now looking into something like this:
https://github.com/daveshap/LongtermChatExternalSources
I think I will use OpenAI's API to load messages like this so that the bot can "remember" them every time I prompt it. If you're interested in working on something similar, PM me and we can share ideas.
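In case it helps anyone else reading this, here is a minimal sketch of that retrieval idea: embed each past message, pull the most similar ones at query time, and prepend them to the prompt. This assumes the pre-v1.0 openai Python package; the model names and helper functions are just illustrative.

```python
# Minimal sketch of prompt-level "memory": embed past messages, retrieve
# the most similar ones, and prepend them to each new prompt.
# Assumes the pre-v1.0 openai Python client; models are illustrative.
import numpy as np
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

memory = []  # list of (text, embedding) pairs

def embed(text):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

def remember(text):
    memory.append((text, embed(text)))

def recall(query, k=3):
    # Rank stored messages by cosine similarity to the query.
    q = embed(query)
    scored = sorted(
        memory,
        key=lambda m: np.dot(q, m[1]) / (np.linalg.norm(q) * np.linalg.norm(m[1])),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]

def chat(user_message):
    context = "\n".join(recall(user_message))
    prompt = f"Relevant past messages:\n{context}\n\nUser: {user_message}\nBot:"
    resp = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=150
    )
    reply = resp["choices"][0]["text"].strip()
    remember(f"User: {user_message}")
    remember(f"Bot: {reply}")
    return reply
```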
TiredMoose69 t1_j3w9j4x wrote
Reply to [D] Simple Questions Thread by AutoModerator
I would like to start a project (possibly) using OpenAI's API to make a GPT-3-based bot fine-tuned on Messenger/WhatsApp chat logs of mine. Any suggestions on which model to use?
From what I see, I have around 100M tokens for it to learn from. I am currently converting the exported data (a JSON file and a TXT file) into the required format, but I am confused about which model to use. I think davinci-003 is overkill for something like that.
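For anyone attempting the same conversion, here is a rough sketch of turning a Messenger-style JSON export into the JSONL prompt/completion format that OpenAI's legacy fine-tuning endpoint expects. The field names ("messages", "sender_name", "content") follow Facebook's export format, and FRIEND is a hypothetical placeholder; adjust both to your data.

```python
# Rough sketch: Messenger JSON export -> JSONL of prompt/completion pairs
# for the legacy OpenAI fine-tuning endpoint. Field names follow
# Facebook's export format; adjust if your export differs.
import json

FRIEND = "My Friend"  # hypothetical: the person the bot should imitate

def to_finetune_jsonl(export_path, out_path):
    with open(export_path, encoding="utf-8") as f:
        messages = json.load(f)["messages"]

    # Facebook exports are newest-first; walk oldest-first instead.
    messages = list(reversed(messages))

    with open(out_path, "w", encoding="utf-8") as out:
        for prev, cur in zip(messages, messages[1:]):
            # Pair each message with the reply my friend sent next.
            if cur.get("sender_name") == FRIEND and "content" in prev and "content" in cur:
                example = {
                    "prompt": prev["content"] + "\n\n###\n\n",
                    "completion": " " + cur["content"] + " END",
                }
                # ensure_ascii=False keeps Greek text readable in the file.
                out.write(json.dumps(example, ensure_ascii=False) + "\n")

to_finetune_jsonl("message_1.json", "train.jsonl")
```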
(My data is in Greek. I used a GPT-2 Small (125M) model trained for many hours in a Colab notebook, and the results were not that great :D, which is why I wanted to try a bigger model.)
Ideally, I want to use it as a chatbot: you ask something and it replies the way the other person (my friend) would.
Do you think it's possible to train it on my local PC (RTX 3060 Ti) for privacy reasons?
Any help/suggestions would be highly appreciated!
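Local training is feasible for GPT-2-sized models. Here is a minimal sketch using Hugging Face transformers, which keeps the chat logs on your own machine; the model choice, file names, and hyperparameters are illustrative. An 8 GB RTX 3060 Ti should handle GPT-2 medium with fp16 and a small batch size.

```python
# Minimal local fine-tuning sketch with Hugging Face transformers.
# Model, file names, and hyperparameters are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2-medium"  # 355M; fits on an 8 GB card with fp16, batch size 1
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# One chat turn per line in a plain-text file (hypothetical layout).
dataset = load_dataset("text", data_files={"train": "chatlog.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="gpt2-chat",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size 8
    fp16=True,                      # halves activation/weight memory
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

One caveat: GPT-2's byte-level BPE tokenizer was trained mostly on English text, so Greek splits into many more tokens per word, which may partly explain the weak results from the smaller model.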
TiredMoose69 t1_jdf31vk wrote
Reply to [D] Simple Questions Thread by AutoModerator
Why does LLaMA 7B (full precision) perform so MUCH better than Alpaca 30B (4-bit)?
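For context on what 4-bit quantization does to the weights, here is a toy block-wise absmax scheme in NumPy. It is a deliberate simplification, not the actual GPTQ or llama.cpp algorithm, but it shows the rounding error that aggressive quantization introduces and why it can cost quality.

```python
# Toy block-wise 4-bit absmax quantization, to illustrate the precision
# loss. This is a simplification, not the real GPTQ/llama.cpp scheme.
import numpy as np

def quantize_4bit(w, block=64):
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7  # symmetric int4: -7..7
    q = np.clip(np.round(w / scale), -7, 7)           # 4-bit codes
    return q, scale

def dequantize(q, scale):
    return (q * scale).reshape(-1)

w = np.random.randn(4096 * 64).astype(np.float32)  # fake weight tensor
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
print("mean abs rounding error:", np.abs(w - w_hat).mean())
```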