Submitted by fxmarty t3_z1titt in MachineLearning
fxmarty OP t1_ixd5yaq wrote
Reply to comment by visarga in [P] BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models by fxmarty
I believe it does not in PyTorch 1.13. However, if you try the PyTorch nightlies, there is support for FlashAttention and MemoryEfficientAttention. Example notebook: https://colab.research.google.com/drive/1eCDJ4pql8102J_BtGSyjCRJwLp3TTN_h . Digging into the PyTorch source code, we indeed see them.
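For context, the fused kernels (FlashAttention, memory-efficient attention) compute standard scaled dot-product attention, just without materializing the full attention matrix. A minimal NumPy sketch of the reference math (function and variable names here are illustrative, not PyTorch's API):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (seq_len, head_dim) for a single head.
    # Fused kernels produce the same output while avoiding the
    # (seq_len, seq_len) intermediate in high-bandwidth memory.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # attention logits
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = (rng.random((4, 8)) for _ in range(3))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```

This is only the single-head reference computation; the speedup in the nightlies comes from how the kernels tile and fuse these steps on the GPU, not from a different formula.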
However, this is limited to inference for now; given the ongoing work from the PyTorch team to include this natively, I would expect to see support for training in the future!