master3243 t1_j62aoln wrote
Reply to comment by currentscurrents in [R] Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers by currentscurrents
You're right that they've been around for 5 years (and the idea of attention even longer), but almost every major conference still has new papers coming out offering more insight into transformers (and sometimes into algorithms/methods older than that).
I just don't want to see titles flooded with terms like "secretly", "hidden", or "mysterious"; I feel it replaces scientific terms with less scientific but more eye-catching ones.
Again, I totally understand why they would choose this phrasing, and I probably would too, but in a blog post title, not a research paper title.
But once again, the actual work seems great, and that's all that really matters.