Submitted by Secure-Technology-78 t3_10mdhxb in MachineLearning
muchcharles t1_j65b3a6 wrote
Reply to comment by element8 in [R] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot by Secure-Technology-78
DeepMind put out a paper on adjusting the pruning mask during training (by reviving pruned weights whose transiently computed gradient magnitude is large enough).
The paper is called Rigging the Lottery (a reference to the lottery ticket hypothesis), and the method is called RigL, I think.
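To make the idea concrete, here is a rough NumPy sketch of one drop-and-grow mask update in that spirit. It is a simplification written from memory, not the paper's code: the function name `update_mask`, the `drop_fraction` parameter, and the choice to swap a fixed number of weights (top-k by gradient magnitude rather than a hard threshold) are all assumptions for illustration.

```python
import numpy as np

def update_mask(weights, grads, mask, drop_fraction=0.3):
    """One drop-and-grow step (illustrative sketch, hypothetical helper).

    weights, grads: float arrays of the same shape.
    mask: boolean array of that shape, True where a weight is active.
    Returns a new boolean mask with the same number of active weights.
    """
    active = np.flatnonzero(mask)      # flat indices of currently active weights
    inactive = np.flatnonzero(~mask)   # flat indices of currently pruned weights
    # Number of connections to swap; cannot exceed the available pruned slots.
    k = min(int(drop_fraction * active.size), inactive.size)
    if k == 0:
        return mask.copy()

    flat_w = weights.ravel()
    flat_g = grads.ravel()

    # Drop: the k active weights with the smallest magnitude.
    drop = active[np.argsort(np.abs(flat_w[active]))[:k]]
    # Grow: the k pruned weights with the largest gradient magnitude.
    # Assumption: weights dropped in this same step are not regrow candidates.
    grow = inactive[np.argsort(-np.abs(flat_g[inactive]))[:k]]

    new_mask = mask.ravel().copy()
    new_mask[drop] = False
    new_mask[grow] = True
    return new_mask.reshape(mask.shape)
```

In a training loop you would call something like this every so often with a dense gradient computed just for that step, then start the newly revived weights at zero so the network's output does not jump.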