
machineko t1_je05orp wrote

I agree. While these giant centralized models are all over the news, there are ways to make smaller models much more efficient (e.g. LoRA, mentioned above). And while working with these techniques, we may well discover new methods and architectures.
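To make the LoRA point concrete, here is a minimal sketch of the idea (this is an illustration, not xturing's actual implementation; all names are made up): the pretrained weight is frozen, and only two small low-rank factors are trained, so the trainable parameter count drops from d_in*d_out to r*(d_in + d_out).

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass with a LoRA adapter: x @ (W + alpha * B @ A).T,
    computed without materializing the merged weight."""
    return x @ W.T + alpha * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4                # rank r much smaller than d_in, d_out

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection, init to zero

x = rng.normal(size=(8, d_in))
y = lora_forward(x, W, A, B)

# With B initialized to zero, the adapter starts as a no-op:
assert np.allclose(y, x @ W.T)

# Trainable parameters: 512 (LoRA) vs 4096 (full fine-tuning) for this layer.
print(r * (d_in + d_out), "vs", d_in * d_out)
```

The zero initialization of B is the standard trick: training starts exactly at the pretrained model and the adapter only gradually perturbs it.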

We are working on an open-source project focused on making fine-tuning of LLMs simple, fast, and efficient: https://github.com/stochasticai/xturing.

OP, we still have a ton of things we want to try to make fine-tuning faster and more compute/memory efficient, if you are interested in contributing.
