Submitted by windoze t3_ylixp5 in MachineLearning
JackandFred t1_iuzb389 wrote
I feel like if you're going to include transformers, you should include the "Attention Is All You Need" paper.
PassionatePossum t1_iv05451 wrote
I would only include it as a historical reference. It is certainly not a "must read" paper. It is written so poorly that you are better off just looking at the code.
ukshin-coldi t1_iv0qocf wrote
Any good resources on writing a well-written ML paper?
Intelligent-Aioli-43 t1_iv1lgvq wrote
Check out MLRC
flaghacker_ t1_iv5jf05 wrote
What's wrong with it? They explain all the components of their model in enough detail (in particular the multi-head attention stuff), provide intuition behind certain decisions, include clear results, and have nice pictures... What could have been improved about it?
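For readers following along: the scaled dot-product attention being discussed is defined in the paper as Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Here is a minimal NumPy sketch of that formula (single head, toy shapes chosen for illustration, not the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per the paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n_queries, n_keys) similarity scores
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # weighted sum of values

# Toy example: 2 queries, 3 key/value pairs, d_k = d_v = 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

Multi-head attention in the paper runs several of these in parallel on learned linear projections of Q, K, and V, then concatenates the results.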
onyx-zero-software t1_iv0bgza wrote
Agreed