currentscurrents t1_j9nqcno wrote
Reply to comment by Seankala in [D] 14.5M-15M is the smallest number of parameters I could find for current pretrained language models. Are there any that are smaller? by Seankala
What are you trying to do? Most of the cool features of language models only emerge at much larger scales.
Seankala OP t1_j9nqmf5 wrote
That's true for all of the models. I don't really need anything cool though, all I need is a solid model that can perform simple tasks like text classification or NER well.
currentscurrents t1_j9nr930 wrote
Might want to look into something like https://spacy.io/
Friktion t1_j9oxnz6 wrote
I have some experience with FastText for e-commerce product classification. It's super lightweight and performs well as an MVP.
cantfindaname2take t1_j9qohtm wrote
Yeah, I would try FastText before LLMs.
cantfindaname2take t1_j9qov0f wrote
For simple NER tasks, some simpler models might work too, like conditional random fields. The crfsuite package has a very easy-to-use implementation, and it uses a C library under the hood for model training.