currentscurrents t1_j9nqcno wrote
Reply to comment by Seankala in [D] 14.5M-15M is the smallest number of parameters I could find for current pretrained language models. Are there any that are smaller? by Seankala
What are you trying to do? Most of the cool features of language models only emerge at much larger scales.
Seankala OP t1_j9nqmf5 wrote
That's true for all of the models. I don't really need anything cool though, all I need is a solid model that can perform simple tasks like text classification or NER well.
currentscurrents t1_j9nr930 wrote
Might want to look into something like https://spacy.io/
Friktion t1_j9oxnz6 wrote
I have some experience with FastText for e-commerce product classification. It's super lightweight and performs well as an MVP.
cantfindaname2take t1_j9qohtm wrote
Yeah, I would try FastText before LLMs.
cantfindaname2take t1_j9qov0f wrote
For simple NER tasks, some simpler models might work too, like conditional random fields. The crfsuite package has a very easy-to-use implementation, and it uses a C library under the hood for model training.