zzzthelastuser t1_j7ulu8h wrote

> CUDA graphs require us to capture a graph per input tensor shape, there is a non-negligible warmup time. We measure around 10mn on 2 different machines / GPUs (down from 50mn in our previous Kernl version). One user reported with the new version a bit more than 20mn of warmup time. We are aware of obvious ways to decrease it significantly.

Dumb question, but what's mn? Milliseconds?

pommedeterresautee OP t1_j7uml76 wrote

lol unfortunately no, minutes :(

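For anyone wondering why "a graph per input tensor shape" adds up to minutes of warmup, here is a rough sketch of the idea in plain Python. It is not Kernl's code and does not touch CUDA: `PerShapeGraphCache` and `fake_capture` are hypothetical names, and `fake_capture` only stands in for the expensive capture step. The point is that every new shape pays the capture cost once, and later calls with the same shape just replay.

```python
from typing import Callable, Dict, List, Tuple

Shape = Tuple[int, ...]
Replay = Callable[[List[float]], List[float]]


class PerShapeGraphCache:
    """Keeps one captured 'graph' per input shape (hypothetical helper)."""

    def __init__(self, capture: Callable[[Shape], Replay]) -> None:
        self._capture = capture
        self._graphs: Dict[Shape, Replay] = {}
        self.captures = 0  # how many times the slow capture path ran

    def run(self, shape: Shape, data: List[float]) -> List[float]:
        graph = self._graphs.get(shape)
        if graph is None:
            # First time this shape is seen: pay the capture (warmup) cost.
            graph = self._capture(shape)
            self._graphs[shape] = graph
            self.captures += 1
        # Known shape: replay the cached graph.
        return graph(data)


def fake_capture(shape: Shape) -> Replay:
    # Assumption: stands in for real CUDA graph capture, which is the slow part.
    def replay(data: List[float]) -> List[float]:
        return [x * 2 for x in data]

    return replay


cache = PerShapeGraphCache(fake_capture)
cache.run((1, 16), [1.0, 2.0])  # new shape, triggers a capture
cache.run((1, 16), [3.0, 4.0])  # same shape, replay only
cache.run((1, 32), [5.0, 6.0])  # new shape, triggers another capture
print(cache.captures)
```

With many distinct input shapes (for example, many sequence lengths), the number of captures grows with the number of shapes, which is where the warmup time would come from under this model.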