master3243 t1_j61wtpt wrote
This is great work in collaboration with Microsoft Research. I'll have to read more than just the abstract and a quick skim.
My only slight annoyance is the word "secretly" in the title. I just feel "implicitly" would be a better word, and it would also be less clickbait-y.
currentscurrents OP t1_j627rd0 wrote
Meh, transformers have been around for like 5 years and nobody figured this out until now.
I think this mostly speaks to how hard it is to figure out what neural networks are doing. Complexity is irrelevant to the training process (or any other optimization process), so the algorithms they implement are arbitrarily complex.
(or in practice, as arbitrarily complex as the model size and dataset size allow)
master3243 t1_j62aoln wrote
You're right, they've been around for 5 years (and the idea of attention even longer), but almost every major conference still has new papers coming out giving more insight into transformers (and sometimes into algorithms/methods even older than them).
I just don't want to see titles flooded with terms like "secretly" or "hidden" or "mysterious"; I feel it replaces scientific terms with less scientific but more eye-catching ones.
Again, I totally understand why they would choose this phrasing, and I probably would too, but in a blog post title, not a research paper title.
But once again, the actual work seems great and that's all that matters really.