Submitted by 00001746 t3_1244q71 in MachineLearning
EvilMegaDroid t1_je0d2a0 wrote
Reply to comment by Professional-Gap-243 in [D] FOMO on the rapid pace of LLMs by 00001746
There are many open source projects which in theory can do better than ChatGPT.
The issue? You have to spend millions of dollars on the data to feed them.
Open source LLMs are useless; the data is the important part.
Google, Microsoft, etc. can feed them their own data and they still spend millions of dollars. Imagine how much it would cost the average Joe to buy that data, plus the operating costs.
I doubt there will ever be an open source ChatGPT that just works.
Zealousideal-Ice9957 t1_je5vo1c wrote
You'd better have a look at the OpenAssistant initiative by LAION. Their human-assisted data collection process is said to be of very good quality compared to the underpaid crowdworker-based one used by OpenAI.
EvilMegaDroid t1_je6s99k wrote
Good idea, but I'm kinda skeptical whether enough users would complete tasks for it to get enough data.
Not impossible though; there are huge open source projects, so who knows.
Zealousideal-Ice9957 t1_jebdm73 wrote
They just completed the data collection a few days ago, and they claim the prompts are of really high quality thanks to a strict filtering algorithm and the community's drive to create a better open source alternative to OAI.
EvilMegaDroid t1_jec89t6 wrote
That would be insane (I mean, as noted, it was not impossible given that people have come together to build big open source projects like Linux, mpv, etc.).
I checked it out for a while but got confused: is everyone supposed to be able to access the data? Because I could not.
HerculeanSubmarine t1_jeaeqow wrote
Alpaca LoRA cost pretty much nothing to get the dataset from GPT-3.
GPT4All was fine-tuned using a 430k dataset that cost $100 in OpenAI API fees.