[R] Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers Submitted by currentscurrents t3_10ly7rw on January 26, 2023 at 6:06 PM in MachineLearning 32 comments 235
ApprehensiveNature69 t1_j61wedt wrote on January 27, 2023 at 3:01 AM This is really awesome! Permalink 2
Viewing a single comment thread. View all comments