Submitted by Secure-Technology-78 t3_10mdhxb in MachineLearning
element8 t1_j64uglo wrote
Is network pruning in this case analogous to discarding specific evidence in favor of more general intuitions, or is that over-anthropomorphizing? How does it affect future training once pruned? Can the pruning mask be applied during training, since the method is operating within a local subset?
muchcharles t1_j65b3a6 wrote
DeepMind put out a paper on adjusting the pruning mask during training, by reviving pruned connections whose transiently computed (dense) gradients have the largest magnitude.
The paper is called Rigging the Lottery (a nod to the Lottery Ticket Hypothesis) and the method is RigL, I think.
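As a rough illustration of that drop-and-grow idea, here is a minimal sketch (not the official RigL implementation): it drops the smallest-magnitude active weights and regrows the same number of pruned connections where the dense gradient is largest. The function name `rigl_update_mask`, the `update_fraction` parameter, and the flat-array layout are all illustrative assumptions.

```python
# Minimal sketch of a RigL-style "drop and grow" mask update (illustrative,
# not the official implementation). Operates on a single flat weight tensor.
import numpy as np

def rigl_update_mask(weights, grads, mask, update_fraction=0.1):
    """Drop the smallest-magnitude active weights and regrow the same number
    of currently pruned connections where the dense gradient magnitude is
    largest (the transiently stored gradient mentioned above)."""
    active = np.flatnonzero(mask)
    pruned = np.flatnonzero(~mask)
    k = min(int(update_fraction * active.size), pruned.size)
    if k == 0:
        return mask

    # Drop: active weights with the smallest absolute value.
    drop = active[np.argsort(np.abs(weights[active]))[:k]]
    # Grow: pruned positions with the largest gradient magnitude.
    grow = pruned[np.argsort(np.abs(grads[pruned]))[-k:]]

    new_mask = mask.copy()
    new_mask[drop] = False
    new_mask[grow] = True
    # In the paper, newly grown weights are initialized to zero; that reset of
    # weights[grow] is left to the training loop in this sketch.
    return new_mask

# Usage: weights and grads are flat float arrays; mask is a boolean array.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
g = rng.normal(size=1000)
m = rng.random(1000) < 0.2            # ~20% of connections active to start
m = rigl_update_mask(w, g, m)         # one drop-and-grow step
```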