Submitted by jaqws t3_10dljs6 in MachineLearning
As the title says, I'm curious about using open source models like GPT-J, GPT-NeoX, Bloom, or OPT to compete with ChatGPT for *specific use-cases* such as explaining what a bit of code does. ChatGPT does this task quite well, but its closed-source nature prevents it from being useful for documenting or commenting proprietary code. There are also limitations on how much text ChatGPT will accept or produce in a single exchange.
Getting beyond these limitations is something I'm interested in pursuing, perhaps with the help of someone in this subreddit. Some assumptions you can safely make:
- We can get (lots of) funding for the training, hardware, etc.
- The end product should run on-premises
- The inference does not actually need to run very quickly. If buying enough GPUs just to satisfy VRAM requirements would cost millions, we could simply run on CPUs and use system RAM instead, as long as inference can finish a few times per day (see the sketch after this list).
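For concreteness, here's the kind of CPU-only inference I have in mind. This is a minimal sketch with Hugging Face transformers; the model name and prompt are just placeholders, not recommendations:

```python
# Minimal sketch of CPU-only inference with Hugging Face transformers.
# The model name and prompt below are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6B"  # any open checkpoint that fits in RAM
tokenizer = AutoTokenizer.from_pretrained(model_name)

# fp32 on CPU sidesteps VRAM limits entirely: a 6B-parameter model needs
# roughly 24 GB of system RAM, and larger models scale proportionally.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)
model.eval()

prompt = "Explain what the following code does:\n\ndef double(x):\n    return x * 2\n"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Slow, but for a few runs per day that's acceptable.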
So I guess my questions are: where would we start? Which model is best to fine-tune? And how, specifically, would you fine-tune it to improve performance on particular use cases?
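For reference, the sort of supervised fine-tuning I'd imagine as a starting point looks roughly like this. It's a rough Hugging Face sketch, not a plan; the dataset file, field names, and hyperparameters are made-up placeholders:

```python
# Rough sketch: supervised fine-tuning on (code, explanation) pairs.
# Dataset path, field names, and hyperparameters are assumptions;
# GPT-J stands in for whichever base model is chosen. Assumes GPU training.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-J has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical JSONL file with "code" and "explanation" fields.
dataset = load_dataset("json", data_files="code_explanations.jsonl")["train"]

def format_example(example):
    # Frame each pair as prompt + response so the model learns the task format.
    text = f"### Code:\n{example['code']}\n### Explanation:\n{example['explanation']}"
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(format_example, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gptj-code-explainer",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,  # effective batch size of 16
        num_train_epochs=3,
        learning_rate=1e-5,
        fp16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```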
avocadoughnut t1_j4m12v2 wrote
There's a project in progress called OpenAssistant. To my understanding, it's being organized by Yannic Kilcher and some LAION members. Their current goal is to build interfaces for gathering data, and then to train a model using RLHF. You can find a ton of discussion in the LAION Discord; there's a dedicated channel for the project.
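To give a flavor of what that RLHF step looks like in code, here's a toy PPO sketch using the trl library. Everything here is a placeholder: gpt2 is a tiny stand-in model, the constant reward replaces the reward model that would actually be trained on human preference data, and the trl API may have changed since early 2023:

```python
# Toy RLHF loop with trl's PPOTrainer (API as of early 2023; details may differ).
# gpt2 is a tiny stand-in; the constant reward below replaces a real reward
# model trained on human preference rankings.
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "gpt2"
config = PPOConfig(model_name=model_name, batch_size=1, mini_batch_size=1)
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)  # frozen KL anchor
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

# One PPO step: generate a response to a query, score it, and update the policy.
query = tokenizer("Explain what `x << 1` does:", return_tensors="pt").input_ids[0]
response = ppo_trainer.generate(query, max_new_tokens=32).squeeze()[len(query):]
reward = torch.tensor(1.0)  # placeholder: a reward model would score the response
stats = ppo_trainer.step([query], [response], [reward])
```

The real training loop does this over batches of queries from the collected data, with the reward model scoring each response.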