ElbowWavingOversight

ElbowWavingOversight t1_j870smg wrote

No. Not until these LLMs came around, anyway. What other examples do you have of this? Even few-shot or zero-shot learning, which allows a model to generalize to classes it never saw during training, is limited to the associations between classes it learns at training time. It can't learn new associations from new data after the fact without rerunning the training loop and updating the parameters.
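To make the conventional picture concrete, here's a toy sketch (my own, not from any paper): for an ordinary model to absorb a new input-output association, its parameters have to change via an explicit training loop. Without the update step, the model's predictions never move.

```python
import numpy as np

# Toy example of the conventional training loop: learning the
# association between X and y requires explicit parameter updates.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w                      # the "new data" to be learned

w = np.zeros(4)                     # model parameters
lr = 0.1
for _ in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(X)
    w -= lr * grad                  # without this line, w never changes

print(np.allclose(w, true_w, atol=1e-2))
```

The point of the thread is that in-context learning breaks this picture: the `w -= lr * grad` step never runs at inference time, yet the model still adapts.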

20

ElbowWavingOversight t1_j86z5rp wrote

> I'm sorry, isn't this just how ML models are implemented?

No. The novel discovery is the fact that these large language models appear to have learned a form of gradient descent at inference time. This is why they appear to be able to learn even without updates to the weights. FTA:

> We show that it is possible for these models to learn from examples on the fly without any parameter updates applied to the model.

This bodes well for the generalizability of these models, because it means they can pick up new associations purely from the context supplied at inference time, rather than needing that data ahead of time as part of the training set.
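A minimal numerical sketch of the duality these papers point at (my own construction, not the authors' code): one step of gradient descent on the in-context examples, starting from zero weights, produces exactly the same prediction as a linear-attention-style readout over the prompt. No stored parameter changes; the "learning" happens entirely in the forward pass.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 16
X_ctx = rng.normal(size=(n, d))     # in-context (x, y) examples
w_task = rng.normal(size=d)         # task defined only in the prompt
y_ctx = X_ctx @ w_task
x_query = rng.normal(size=d)        # the query to answer

eta = 0.1
# (a) explicit gradient descent: one step from w = 0 on squared loss
w0 = np.zeros(d)
grad = -2 * X_ctx.T @ (y_ctx - X_ctx @ w0) / n
w_one_step = w0 - eta * grad
pred_gd = w_one_step @ x_query

# (b) linear-attention form: a weighted sum of context targets,
#     keyed by the inner product between query and context inputs
pred_attn = (2 * eta / n) * y_ctx @ (X_ctx @ x_query)

print(np.isclose(pred_gd, pred_attn))   # the two predictions coincide
```

So "learning without a parameter update" isn't magic: an attention layer can compute the same function a gradient step would have produced, using only the examples sitting in the context window.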

75