Ambiwlans t1_j9lab3g wrote
Reply to comment by gelukuMLG in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
At this point we don't really know what is bottlenecking. More params is an easyish way to capture more knowledge if you have the architecture and the $$... but there are a lot of other techniques available that increase the efficiency of the parameters.
Viewing a single comment thread. View all comments