Submitted by kkimdev t3_124er9o in MachineLearning
SOTA LLMs are getting too big, and many aren't even publicly available. For an individual researcher who wants to try different pre-training strategies/architectures and potentially publish meaningful research, what would be the best way to proceed? Is there a smaller model suitable for this (one whose results people would still take seriously)?
Nezarah t1_jdz1zqc wrote
For specifically personal use and research, not commercial? LLaMA is a good place to start, and/or Alpaca 7B. They're small-scale (can run locally on most hardware) and can be LoRA-trained and fine-tuned. They also have a reasonably high context limit (2048 tokens, if I remember right).
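A minimal sketch of what LoRA fine-tuning looks like with the Hugging Face transformers and peft libraries; the checkpoint name and hyperparameters below are placeholders, not recommendations:

```python
# Minimal LoRA fine-tuning sketch using Hugging Face transformers + peft.
# "huggyllama/llama-7b" is a placeholder; substitute whatever LLaMA/Alpaca
# weights you actually have access to.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "huggyllama/llama-7b"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA: freeze the base weights and train small low-rank adapter matrices
# injected into the attention projections.
lora_config = LoraConfig(
    r=8,                                   # rank of the adapter matrices
    lora_alpha=16,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # LLaMA attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 7B params

# From here, train with the usual transformers Trainer (or your own loop)
# on your instruction/fine-tuning dataset.
```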
Outputs can be comparable to GPT-3, and they can be improved further by feeding extra context into the prompt (in-context learning).
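For example, a rough sketch of that kind of pre-context prompting (again with a placeholder checkpoint name):

```python
# Sketch of prompting with prepended background context (in-context learning):
# the model conditions on the context before answering the question.
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "huggyllama/llama-7b"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

context = ("Background: LoRA adds small low-rank adapter matrices to a frozen "
           "base model, so only a tiny fraction of parameters are trained.")
question = "Why is LoRA cheaper than full fine-tuning?"
prompt = f"{context}\n\nQuestion: {question}\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```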
You can also add chaining/branching functionality through the LangChain library.
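A rough sketch of chaining two prompt steps with LangChain, based on the early-2023 API (import paths may have moved in newer releases); the checkpoint name and prompts are just illustrative:

```python
# Chain two prompt steps (draft, then refine) with LangChain.
# The local model is wrapped in a HuggingFacePipeline so LangChain can call it
# like any other LLM. "huggyllama/llama-7b" is a placeholder checkpoint name.
from transformers import pipeline
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

generator = pipeline("text-generation", model="huggyllama/llama-7b",
                     max_new_tokens=128)
llm = HuggingFacePipeline(pipeline=generator)

draft_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["question"],
    template="Answer the question concisely.\nQuestion: {question}\nAnswer:"))
refine_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["draft"],
    template="Improve the following answer:\n{draft}\nImproved answer:"))

workflow = SimpleSequentialChain(chains=[draft_chain, refine_chain])
print(workflow.run("What context length does LLaMA 7B support?"))
```

SimpleSequentialChain just pipes the output of one step into the next; for actual branching you'd look at LangChain's router chains or agents instead.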