undefdev t1_izbui6y wrote
Reply to comment by JustOneAvailableName in [D] If you had to pick 10-20 significant papers that summarize the research trajectory of AI from the past 100 years what would they be by versaceblues
> So I am not gonna cite Fast Weight Programmers when I want to write about transformers.
I think you are probably referring to this paper: "Linear Transformers Are Secretly Fast Weight Programmers".
It seems like they showed that linear transformers are equivalent to fast weight programmers. If linear transformers are relevant to your research, why not cite fast weight programmers as well? Credit is cheap, right? We can still call them linear transformers.
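To make the claimed equivalence concrete, here is a minimal numpy sketch (my own illustration with arbitrary names and dimensions, not the paper's code): unnormalized causal linear attention can be computed either as a sum over past key-value pairs, or by maintaining a "fast weight" matrix updated with an outer product per step.

```python
import numpy as np

rng = np.random.default_rng(0)
d_k, d_v, T = 4, 3, 5          # key dim, value dim, sequence length (arbitrary)
K = rng.standard_normal((T, d_k))  # keys
V = rng.standard_normal((T, d_v))  # values
Q = rng.standard_normal((T, d_k))  # queries

# (1) Attention view: causal linear attention (no softmax):
#     y_t = sum_{i <= t} (q_t . k_i) * v_i
y_attn = np.array(
    [sum((Q[t] @ K[i]) * V[i] for i in range(t + 1)) for t in range(T)]
)

# (2) Fast weight view: each (k, v) pair "programs" a weight matrix W
#     via an outer-product update, and the output is a read-out W q_t.
W = np.zeros((d_v, d_k))
y_fw = np.zeros((T, d_v))
for t in range(T):
    W += np.outer(V[t], K[t])  # write: W <- W + v_t k_t^T
    y_fw[t] = W @ Q[t]         # read: y_t = W q_t

# Both views give the same outputs, since sum_i (q.k_i) v_i = (sum_i v_i k_i^T) q
assert np.allclose(y_attn, y_fw)
```

The identity is just linearity: pulling the query out of the sum turns attention over the past into a single matrix-vector product against an accumulated weight matrix.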
JustOneAvailableName t1_izbzbaq wrote
Because "Schmidhuber claims transformers are based on his work" was a meme for 3-4 years before he actually made that claim. Like here.
There are hundreds of more relevant papers to cite and read on (linearly scaling) transformers.
undefdev t1_izc3tr1 wrote
> Because Schmidhuber claiming that transformers are based on his work was a meme for 3-4 years before he actually did that. Like here.
But why should memes be relevant in science? Not citing someone because there are memes about them seems kind of arbitrary. If it's just memes, maybe we shouldn't take them too seriously.