Submitted by matus_pikuliak t3_124frc3 in MachineLearning
rshah4 t1_je0crbz wrote
Reply to comment by matus_pikuliak in [P] ChatGPT Survey: Performance on NLP datasets by matus_pikuliak
I agree, these baselines are useful. I think what we should push for is more human baselines for these benchmarks. That would help us figure out how far we have left to go.