jeankaddour t1_ir4vzq6 wrote
Reply to comment by bernhard-lehner in [R] Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging by rlresearcher
Hi, the author here. Thank you for your comment.
My goal with the paper was not to present weight averaging as a novel approach, but rather to study the empirical convergence speed-ups in more detail.
Please have a look at the related work section, where I discuss previous work using weight averaging, and feel free to let me know if I missed any that focuses on speed-ups.