muchcharles t1_j65b3a6 wrote on January 27, 2023 at 8:34 PM

Reply to comment by element8 in [R] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot by Secure-Technology-78

DeepMind put out a paper on adjusting the pruning mask during training: pruned weights get revived if a transiently stored gradient exceeds some threshold.

The paper is called Rigging the Lottery (a reference to the lottery ticket hypothesis about initial weights), and the method is RigL, I think.
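
For anyone who wants the mask adjustment described above in concrete terms, here is a rough sketch of the thresholded version. It is not the paper's code, and the paper's exact selection rule may differ. The function name `revive_pruned`, the flat-list layout, and the threshold value are all made up for illustration.

```python
def revive_pruned(mask, grads, threshold):
    """Return a new mask in which pruned entries with large gradients are re-enabled.

    mask:      list of 0/1 flags, 0 meaning the weight is currently pruned
    grads:     gradients for every weight, pruned ones included (hypothetical:
               assumed to be computed only briefly, at mask-update time)
    threshold: hypothetical cutoff on gradient magnitude
    """
    new_mask = []
    for m, g in zip(mask, grads):
        if m == 0 and abs(g) > threshold:
            new_mask.append(1)  # revive: gradient says this weight would matter
        else:
            new_mask.append(m)  # leave active and low-gradient pruned weights alone
    return new_mask


mask = [1, 0, 0, 1, 0]
grads = [0.1, 0.9, -0.05, 0.3, -0.7]
new_mask = revive_pruned(mask, grads, threshold=0.5)
print(new_mask)  # [1, 1, 0, 1, 1]
```

Note this only grows the mask. To hold sparsity constant you would also have to prune the same number of active weights at each update, which the sketch leaves out.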