Submitted by [deleted] t3_110chc7 in MachineLearning
[deleted]
What I meant by “not too concerned” is just that, for the moment, once people give some suggestions I’ll look at them, see what would work with my system, and go from there. I have been to Hugging Face before, but I wanted to hear the community’s opinions/experiences on the matter. I have edited my post; I hope the edit helps.
>I would like this model to end up functioning like ChatGPT: not only responding like a human/NLP system, but also giving me full technical answers, descriptions, and simple, specific answers to my questions. In the future I will update the model’s data/knowledge and also train it to do new tasks like image recognition, and so on.
Based on your edit, I'm not sure you realize the scope of what you are trying to do. ChatGPT required almost 200 billion parameters, multiple NVIDIA A100s, and many terabytes of RAM to train.
You simply cannot expect to create a general-purpose, human-sounding AI that does everything you want to train it to do on a home computer, even if you were somehow a brilliant data scientist.
You could either use Hugging Face Transformers, which has a lot of pre-trained models you can customize, or a fine-tuner like this one: a toolkit for fine-tuning multiple models.
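For a sense of how little code the pre-trained route takes, here's a minimal sketch using the Transformers pipeline API; the "gpt2" checkpoint is just an example, not a recommendation:

```python
# pip install transformers torch
from transformers import pipeline

# Download a small pre-trained causal language model from the Hub and
# generate text with it. Swap "gpt2" for any checkpoint your hardware fits.
generator = pipeline("text-generation", model="gpt2")

result = generator("A transformer model is", max_new_tokens=40)
print(result[0]["generated_text"])
```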
I don’t think you have a sound idea of what you are trying to do. You want ChatGPT + extra! What you are asking for does not exist, at least not currently. Training a model the size of ChatGPT would cost at least $5M and is absolutely not possible locally; you would need a distributed setup, not to mention dealing with all the technical difficulties of building one.
Check this out: https://huggingface.co/models
You can download models and try them out locally, depending on your specs. It's unlikely you'll find a single model that does everything you need, but there's a chance you can combine several models to get close to what you want. You'll need to be a bit more specific about your end goals to get better-suited suggestions.
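As a rough illustration of combining models, here's a hedged sketch that chains two pipelines, a summarizer feeding a question-answering model; both checkpoint names are just examples:

```python
from transformers import pipeline

# Two separate pre-trained models chained together: one condenses a long
# document, the other answers questions over the condensed text.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

document = (
    "The James Webb Space Telescope launched in December 2021. It observes "
    "in the infrared and has already returned images of distant galaxies. "
    "Its primary mirror is 6.5 metres across."
)
summary = summarizer(document, max_length=40, min_length=10)[0]["summary_text"]
answer = qa(question="When did the telescope launch?", context=summary)
print(answer["answer"])
```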
And a slightly more advanced version, with information about fine-tuning: https://huggingface.co/docs/transformers/tasks/question_answering
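Condensed from that tutorial, a minimal fine-tuning sketch (distilbert-base-uncased and the small SQuAD slice are just example choices; assumes `pip install transformers datasets`):

```python
from datasets import load_dataset
from transformers import (AutoModelForQuestionAnswering, AutoTokenizer,
                          DefaultDataCollator, Trainer, TrainingArguments)

squad = load_dataset("squad", split="train[:1000]")  # small slice to start
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")

def preprocess(examples):
    # Tokenize question/context pairs and map each answer's character span
    # onto token positions (this is the core of the linked tutorial).
    inputs = tokenizer(
        [q.strip() for q in examples["question"]],
        examples["context"],
        max_length=384,
        truncation="only_second",
        return_offsets_mapping=True,
        padding="max_length",
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(inputs.pop("offset_mapping")):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        seq_ids = inputs.sequence_ids(i)
        # Locate the context's token range within the packed sequence.
        ctx_start = seq_ids.index(1)
        ctx_end = len(seq_ids) - 1 - seq_ids[::-1].index(1)
        if offsets[ctx_start][0] > start_char or offsets[ctx_end][1] < end_char:
            # Answer was truncated away; label the example (0, 0).
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = ctx_start
            while idx <= ctx_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = ctx_end
            while idx >= ctx_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)
    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

tokenized = squad.map(preprocess, batched=True, remove_columns=squad.column_names)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="qa-finetune", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=tokenized,
    data_collator=DefaultDataCollator(),
)
trainer.train()
```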
ChatGPT was trained on 1,024 GPUs. Let that sink in before you set out to do something similar at home.
I'd go with RWKV: a clever architecture that lets you train an RNN like a normal transformer model.
https://github.com/BlinkDL/RWKV-LM
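If you just want to try it, RWKV checkpoints have also been converted for the Hugging Face Hub; a hedged sketch, assuming the RWKV/rwkv-4-169m-pile conversion and a transformers release recent enough to include the RWKV architecture:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# RWKV/rwkv-4-169m-pile is a small Hub conversion of an RWKV checkpoint.
tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-4-169m-pile")
model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-4-169m-pile")

inputs = tokenizer("The advantage of an RNN at inference time is",
                   return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(out[0]))
```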
You can use a quantized variant to run larger models on modest hardware (int8 or mixed int8/int4 has been shown to work well with LLMs).
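For example, a sketch of int8 loading via bitsandbytes (assumes `pip install transformers accelerate bitsandbytes` and a CUDA GPU; the checkpoint name is just an example). Int8 roughly halves weight memory versus fp16, which is what lets bigger checkpoints fit on a single consumer GPU:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-1.3B"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # let accelerate place layers across GPU/CPU
    load_in_8bit=True,   # int8 weights via bitsandbytes (LLM.int8())
)

inputs = tokenizer("Quantization lets you", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```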
Malignant-Koala t1_j886t96 wrote
> I’m not too concerned about the size and spec requirements
You should be. Many of the things you're talking about run on massive cloud infrastructure using multiple GPUs.
But if you're a reasonably advanced Python dev, you could look at Hugging Face, which has some excellent AI models that might let you build your own simplistic versions of some of the things you want.
But it sounds like you want to run a local version of a general-purpose AI like ChatGPT, and that's going to be extremely hard to do on home hardware and will require you to be a fairly advanced developer.