[D] Embedding bags for LLMs — submitted by WigglyHypersurface in r/MachineLearning

WigglyHypersurface (OP), replying to terath:
The reason I'm curious is that FastText embeddings tend to work better on small corpora. I'm wondering whether, if you took one of the small-data-efficient LLMs that you can train yourself on a few A100s (like ELECTRA) and swapped its token embeddings for a bag of character n-grams, you'd see further gains on small training sets.
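For anyone unfamiliar with the idea, here's a minimal PyTorch sketch of a FastText-style bag-of-character-n-grams embedding layer that could stand in for a model's token embedding table. The class name, bucket count, and use of Python's built-in `hash` are my own illustration, not FastText's or ELECTRA's actual code:

```python
import torch
import torch.nn as nn

class CharNgramEmbedding(nn.Module):
    """FastText-style embedding: each token's vector is the mean of
    hashed character n-gram vectors, computed with nn.EmbeddingBag."""

    def __init__(self, dim=128, num_buckets=2**20, min_n=3, max_n=6):
        super().__init__()
        self.min_n, self.max_n = min_n, max_n
        self.num_buckets = num_buckets
        self.bag = nn.EmbeddingBag(num_buckets, dim, mode="mean")

    def ngram_ids(self, token: str) -> list[int]:
        # Add boundary markers as FastText does, then hash each n-gram
        # into a fixed number of buckets. Note: Python's hash() is salted
        # per process; a real implementation would use a stable hash
        # (e.g. FNV) so embeddings are reproducible across runs.
        s = f"<{token}>"
        grams = [s[i:i + n]
                 for n in range(self.min_n, self.max_n + 1)
                 for i in range(len(s) - n + 1)]
        return [hash(g) % self.num_buckets for g in grams]

    def forward(self, tokens: list[str]) -> torch.Tensor:
        # Flatten all n-gram ids and record where each token's bag starts.
        ids, offsets = [], []
        for tok in tokens:
            offsets.append(len(ids))
            ids.extend(self.ngram_ids(tok))
        return self.bag(torch.tensor(ids), torch.tensor(offsets))

# Usage: one vector per token, shape (3, 128).
emb = CharNgramEmbedding(dim=128)
vecs = emb(["embedding", "embeddings", "bag"])
```

The appeal for small corpora is that morphologically related tokens ("embedding" / "embeddings") share most of their n-grams, so their vectors share parameters instead of being learned independently; in principle you could drop a layer like this into the model wherever `nn.Embedding` sits today.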