turnip_burrito t1_j9kgb2q wrote
Reply to comment by gelukuMLG in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
We already knew parameter count isn't everything, or else we'd just be using really large feedforward networks for everything. Architecture, data, and other tricks matter too.