Is there a blog post or a paper comparing open source / open weights models? I know Flan-T5 is really good at instruction following, but I am specifically referring to performance after fine-tuning. Preferably it would compare models somewhere in the range of 1B to 11B parameters.

Comments

borisfin t1_j8upatv wrote on February 17, 2023 at 1:57 AM

There are some interesting comparisons in the Flan-T5 paper. Check out "Scaling Instruction-Finetuned Language Models". Hope this helps.

adt t1_j8v1vlp wrote on February 17, 2023 at 3:37 AM

For models, see my up-to-date list of models:

https://docs.google.com/spreadsheets/d/1O5KVQW1Hx5ZAkcg8AIRjbQLQzx2wVaLl0SqUu-ir9Fs/edit#gid=1158069878

For performance, Papers with Code keeps good benchmarks:

https://paperswithcode.com/area/natural-language-processing

https://paperswithcode.com/task/question-answering

Franck_Dernoncourt t1_j8v206a wrote on February 17, 2023 at 3:38 AM

For summarization: Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B. Hashimoto. Benchmarking Large Language Models for News Summarization. arXiv:2301.13848.