Seankala OP t1_j9npctd wrote
Reply to comment by adt in [D] 14.5M-15M is the smallest number of parameters I could find for current pretrained language models. Are there any that are smaller? by Seankala
Thanks for the detailed answer! My use case is that the company I work at currently uses image-based models for e-commerce purposes, but we want to use text-based models as well. The image-based model(s) already take up around 30-50M parameters, so I didn't want to just bring in a 100M+ parameter model on top of that. Even 15M seems quite big.
currentscurrents t1_j9nqcno wrote
What are you trying to do? Most of the cool features of language models only emerge at much larger scales.
Seankala OP t1_j9nqmf5 wrote
That's true. I don't really need anything cool, though; all I need is a solid model that performs simple tasks like text classification or NER well.
currentscurrents t1_j9nr930 wrote
Might want to look into something like https://spacy.io/
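For NER, a minimal sketch with spaCy's small pretrained English pipeline (assumes you've run `python -m spacy download en_core_web_sm` first):

```python
import spacy

# Load the small English pipeline (~13MB on disk)
nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple AirPods Pro 2nd generation, released September 2022")

# Print each recognized entity with its predicted label
for ent in doc.ents:
    print(ent.text, ent.label_)
```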
Friktion t1_j9oxnz6 wrote
I have some experience with FastText for e-commerce product classification. It's super lightweight and performs well as an MVP.
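For reference, a minimal sketch of a supervised fastText classifier; `train.txt` is a hypothetical file in fastText's `__label__<label> text` format, and the hyperparameters are just illustrative:

```python
import fasttext

# train.txt: one example per line, e.g.
#   __label__electronics wireless bluetooth earbuds with charging case
model = fasttext.train_supervised(input="train.txt", epoch=25, lr=0.5, wordNgrams=2)

# Predict the top label (and its probability) for a new product title
labels, probs = model.predict("bluetooth noise cancelling headphones")
print(labels[0], probs[0])
```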
cantfindaname2take t1_j9qohtm wrote
Yeah, I would try FastText before LLMs.
cantfindaname2take t1_j9qov0f wrote
For simple NER tasks, some simpler models might work too, like conditional random fields (CRFs). The crfsuite package has a very easy-to-use implementation, and it uses a C library under the hood for model training.
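As a sketch, here's what that can look like with the `sklearn-crfsuite` Python wrapper; the feature dicts and BIO labels below are toy placeholders (a real setup would add context windows, casing, suffixes, etc.):

```python
import sklearn_crfsuite

# Each sentence is a list of per-token feature dicts; labels use BIO tags
X_train = [
    [{"word": "apple", "is_title": False}, {"word": "airpods", "is_title": False}],
    [{"word": "samsung", "is_title": False}, {"word": "tv", "is_title": False}],
]
y_train = [["B-BRAND", "B-PRODUCT"], ["B-BRAND", "B-PRODUCT"]]

# L-BFGS training with L1/L2 regularization, backed by the C library
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X_train, y_train)

print(crf.predict([[{"word": "sony", "is_title": False}]]))
```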