master3243 t1_j62aoln wrote
Reply to comment by currentscurrents in [R] Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers by currentscurrents
You're right that they've been around for 5 years (and the idea of attention even longer), but almost every major conference still has new papers coming out offering more insight into transformers (and sometimes into algorithms/methods older than that).
I just don't want to see titles flooded with terms like "secretly", "hidden", or "mysterious"; I feel it replaces scientific terms with less scientific but more eye-catching ones.
Again, I totally understand why they would choose this phrasing, and I probably would too, but in a blog post title, not a research paper title.
But once again, the actual work seems great, and that's all that really matters.