Submitted by AutoModerator t3_11pgj86 in MachineLearning
No_Complaint_1304 t1_jc7szbg wrote
Complete beginner looking for insight
I made an extremely efficient algorithm in C that skim through a data base and search for words, I want to add a feature that if it is not found the program can somehow understand the context and predict what is the actual word intended and also conjugate the verbs accordingly. I have no idea if what I am saying is crazy hard to implement or can easily be done by someone with experience. This field interest me a lot and i will definitely come back to this sub sooner or later, but right now i don’t have time to dig in this subject, I just want to finish this project, slap a good looking gui and get over with it. Can I achieve what i stated above in a week or am i just dreaming? If it is possible what resources do you think I should be looking at? Ty :>
LeN3rd t1_jcgq97y wrote
You will need more than a week. If you just want to predict the next word in a sentence, take a look at large language models. ChatGPT being one of them. BERT is a research alternative afaik. If you aim to learn the probabilities yourself, you will need at least a few months.
In general what you want is a generative model that can sample from the conditional probability distribution. In sequences usually transformers like BERT and chatgpt are state of the art. You can also take a look at normalizing flows and diffusion models to learn probability distributions. But this needs some maths, and i unfortunatly do not know what smaller models can be used for computational linguistic applications like this.
No_Complaint_1304 t1_jcgshk7 wrote
Well I did expect this but still month’s! I’ll look into everything you mentioned. And I’ll drop the project for now, if I can’t finish it by studying heavily, I might as well learn slowly but surely, absorb all the information and then go back to make a project that involve predictions and analyzing data. ty4ur help
LeN3rd t1_jcguoiy wrote
I didn't mean to discourage you. Its a fascinating field, but it is its own field of research for a reason. Start with BERT and see where that gets you.
These ones are also a nice small watch:
No_Complaint_1304 t1_jchh2u9 wrote
Damn I hope no one got me wrong I wanted to learn the basics in a week (and finish my side project asap) I didn’t claim I could study such a large and complex field in a week.
LeN3rd t1_jchkhuv wrote
What you can try is to start with linear or log Regression and try to learn on Wikipedia. That might be fun and give you decent results.
Viewing a single comment thread. View all comments