Submitted by fxmarty t3_z1titt in MachineLearning
fxmarty OP t1_ixd5yaq wrote
Reply to comment by visarga in [P] BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models by fxmarty
I believe it does not in PyTorch 1.13. However, if you try the PyTorch nightlies, there is support for FlashAttention and MemoryEfficientAttention. Example notebook: https://colab.research.google.com/drive/1eCDJ4pql8102J_BtGSyjCRJwLp3TTN_h . Digging into the PyTorch source code, we indeed see them.
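For context, the fused kernels (FlashAttention, memory-efficient attention) compute standard scaled dot-product attention, just without materializing the full attention matrix. A minimal NumPy sketch of the reference math (function and variable names here are illustrative, not PyTorch's API):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (seq_len, head_dim) for a single head.
    # Fused kernels produce the same output while avoiding the
    # (seq_len, seq_len) intermediate in high-bandwidth memory.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # attention logits
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = (rng.random((4, 8)) for _ in range(3))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```

This is only the single-head reference computation; the speedup in the nightlies comes from how the kernels tile and fuse these steps on the GPU, not from a different formula.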
However, this is limited to inference for now; given the ongoing work from the PyTorch team to include this natively, I would expect to see support for training in the future!