Submitted by windoze t3_ylixp5 in MachineLearning
JackandFred t1_iuzb389 wrote
I feel like if you're going to include transformers, you should include the "Attention Is All You Need" paper.
PassionatePossum t1_iv05451 wrote
I would only include it as a historical reference. It is certainly not a "must read" paper. It is written so poorly that you are better off just looking at the code.
ukshin-coldi t1_iv0qocf wrote
Any good resources on writing a well-written ML paper?
Intelligent-Aioli-43 t1_iv1lgvq wrote
Check out MLRC
flaghacker_ t1_iv5jf05 wrote
What's wrong with it? They explain all the components of their model in enough detail (in particular the multi-head attention stuff), provide intuition behind certain decisions, include clear results, and have nice pictures... What could have been improved about it?
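For readers following along: the scaled dot-product attention being discussed is defined in the paper as Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Here is a minimal NumPy sketch of that formula (single head, toy shapes chosen for illustration, not the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per the paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n_queries, n_keys) similarity scores
    # Numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # weighted sum of values

# Toy example: 2 queries, 3 key/value pairs, d_k = d_v = 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4): one output vector per query
```

Multi-head attention in the paper runs several of these in parallel on learned linear projections of Q, K, and V, then concatenates the results.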
onyx-zero-software t1_iv0bgza wrote
Agreed