WigglyHypersurface OP t1_j5ldsn7 wrote on January 23, 2023 at 8:52 PM

Reply to comment by terath in [D] Embedding bags for LLMs by WigglyHypersurface

The reason I'm curious is that FastText embeddings tend to work better on small corpora. I'm wondering: if you took one of the small-data-efficient LLMs that you can train yourself on a few A100s (like ELECTRA) and swapped its embeddings for bags of character n-grams, would you see further gains on small training sets?
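
To make "bag of character n-grams" concrete, here is a rough pure-Python sketch of the idea: split a token into character n-grams, hash each n-gram into a fixed number of buckets, and average the bucket vectors to get the token's embedding. This is only an illustration, not FastText's actual code or anything from ELECTRA. The names `char_ngrams` and `NgramBagEmbedding`, the CRC32 hash, the bucket count, and the vector size are all made up for the example, and in a real model the table would be a trainable layer rather than fixed random numbers.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=5):
    """Return the character n-grams of `word`, using < and > as boundary markers."""
    marked = f"<{word}>"
    return [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]


class NgramBagEmbedding:
    """Hypothetical embedding table indexed by hashed character n-grams."""

    def __init__(self, num_buckets=1000, dim=8, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        self.dim = dim
        # Random vectors stand in for weights that a real model would learn.
        self.table = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)]
            for _ in range(num_buckets)
        ]

    def bucket(self, ngram):
        # CRC32 is an arbitrary choice here; it is deterministic across runs.
        return zlib.crc32(ngram.encode("utf-8")) % self.num_buckets

    def embed(self, token):
        # Fall back to the marked token itself if it is too short for any n-gram.
        grams = char_ngrams(token) or [f"<{token}>"]
        rows = [self.table[self.bucket(g)] for g in grams]
        # The token embedding is the element-wise mean over its n-gram vectors.
        return [sum(col) / len(rows) for col in zip(*rows)]


emb = NgramBagEmbedding()
trigrams = char_ngrams("where", 3, 3)  # ['<wh', 'whe', 'her', 'ere', 're>']
vector = emb.embed("where")            # list of 8 floats
```

Because related word forms like "train" and "training" share most of their n-grams, their vectors share most of their parameters, which is the usual intuition for why this could help when training data is scarce.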