36% of HellaSwag benchmark contains errors [D] Submitted by BB4evaTB12 t3_zff5mh on December 7, 2022 at 9:51 PM in MachineLearning 6 comments 33
Mefaso t1_izdjye3 wrote on December 8, 2022 at 8:51 AM The fact that BIG-bench includes a kanji ASCII-art classification task is pretty funny. But I guess if you want to have over a hundred tasks in a benchmark, you run out of ideas at some point. Permalink 5