Submitted by Vegetable-Skill-9700 t3_121agx4 in deeplearning
Vegetable-Skill-9700 OP t1_jdpr474 wrote
Reply to comment by Jaffa6 in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Thanks for sharing! It's a great read. I agree, most of the current models are most likely under-trained.
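Assuming the linked read is the Chinchilla scaling-laws paper (Hoffmann et al., 2022), its rule of thumb is roughly 20 training tokens per model parameter for compute-optimal training. A minimal sketch of that heuristic (the function name is my own, and 20 is the commonly cited approximation, not an exact constant):

```python
# Hypothetical sketch of the Chinchilla compute-optimal heuristic:
# roughly 20 training tokens per parameter (an approximation, not exact).
def chinchilla_optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal training-token count for a model size."""
    return 20.0 * n_params

# Under this rule, a 100B-parameter model would want ~2 trillion tokens;
# many large models were trained on far fewer, hence "under-trained".
print(f"{chinchilla_optimal_tokens(100e9):.0e}")  # → 2e+12
```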
Jaffa6 t1_jdq1rua wrote
It's worth noting that some models were designed according to this once it came out, and I believe it did have some impact in the community, but yeah, it wouldn't surprise me if it's still a problem.
Glad you liked it!