Submitted by kkimdev t3_124er9o in MachineLearning
SOTA LLMs are getting too big, and often aren't even publicly available. For an individual researcher who wants to try different pre-training strategies/architectures and potentially publish meaningful research, what would be the best way to proceed? Is there a smaller model suitable for this, one whose results people would still take seriously?
asdfzzz2 t1_jdzbuav wrote
"Cramming: Training a Language Model on a Single GPU in One Day" (https://arxiv.org/abs/2212.14034) might be a good starting point.
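
For a sense of the scale involved: the paper trains a BERT-style model within a one-GPU-day compute budget. Below is a minimal sketch of instantiating a comparably small masked-LM with the HuggingFace transformers library; the hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
from transformers import BertConfig, BertForMaskedLM

# Scaled-down BERT-style config in the spirit of a single-GPU budget.
# These values are illustrative, not the cramming paper's exact recipe.
config = BertConfig(
    vocab_size=32768,             # vocab size is an assumption
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    max_position_embeddings=128,  # short sequences keep per-step cost low
)

model = BertForMaskedLM(config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```

At roughly 100M parameters, a model like this fits comfortably on a single consumer GPU, which is what makes controlled ablations of pre-training strategies feasible for an individual researcher.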