Lolologist t1_itux8cs wrote on October 26, 2022 at 2:24 PM
Reply to [P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels by pommedeterresautee
This all looks very impressive!
I'm not terribly well-versed in the nitty-gritty of ML's underpinnings, so forgive me if this is a dumb question, but:
How might we apply your speedup to, say, spaCy? Is this something that can just be dropped in somewhere? (Rough sketch of what I mean below.)
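To make the question concrete, here's the kind of drop-in usage I'm imagining. The `optimize_model` call is what I gather your library exposes from the README; the attribute path for reaching the Hugging Face model inside a spaCy pipeline is purely my guess and probably varies by spacy-transformers version:

```python
# Hypothetical sketch of what "dropping it in" might look like.
# Assumptions: kernl exposes optimize_model(model), and the Hugging Face
# model inside a spaCy pipeline is reachable at the path guessed below.
import spacy
import torch
from kernl.model_optimization import optimize_model  # assumed entry point

spacy.require_gpu()  # the kernels target CUDA inference
nlp = spacy.load("en_core_web_trf")
trf = nlp.get_pipe("transformer")

# Guess: spacy-transformers hides a Hugging Face model inside a Thinc
# wrapper; the exact attribute depends on the spacy-transformers version.
hf_model = trf.model.transformer  # assumption -- may live elsewhere

hf_model.eval()
optimize_model(hf_model)  # assumed to swap forward() for the Triton kernels

with torch.inference_mode(), torch.cuda.amp.autocast():
    doc = nlp("Would something like this just work?")
```

If it's roughly that simple, fantastic; if the spaCy wrapper gets in the way, I'd love to know what the right integration point is.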