Submitted by Secure-Technology-78 t3_10mdhxb in MachineLearning
element8 t1_j64uglo wrote
Is network pruning in this case analogous to discarding specific evidence in favor of more general intuitions, or is that over-anthropomorphizing? How does it affect future training once pruned? Can the pruning mask be applied during training, since the method is operating within a local subset?
muchcharles t1_j65b3a6 wrote
DeepMind put out a paper on adjusting the pruning mask during training, by reviving pruned connections whose transiently computed (dense) gradients have the largest magnitude.
The paper is called Rigging the Lottery (a nod to the Lottery Ticket Hypothesis) and the method is RigL, I think.
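As a rough illustration of that drop-and-grow idea, here is a minimal sketch (not the official RigL implementation): it drops the smallest-magnitude active weights and regrows the same number of pruned connections where the dense gradient is largest. The function name `rigl_update_mask`, the `update_fraction` parameter, and the flat-array layout are all illustrative assumptions.

```python
# Minimal sketch of a RigL-style "drop and grow" mask update (illustrative,
# not the official implementation). Operates on a single flat weight tensor.
import numpy as np

def rigl_update_mask(weights, grads, mask, update_fraction=0.1):
    """Drop the smallest-magnitude active weights and regrow the same number
    of currently pruned connections where the dense gradient magnitude is
    largest (the transiently stored gradient mentioned above)."""
    active = np.flatnonzero(mask)
    pruned = np.flatnonzero(~mask)
    k = min(int(update_fraction * active.size), pruned.size)
    if k == 0:
        return mask

    # Drop: active weights with the smallest absolute value.
    drop = active[np.argsort(np.abs(weights[active]))[:k]]
    # Grow: pruned positions with the largest gradient magnitude.
    grow = pruned[np.argsort(np.abs(grads[pruned]))[-k:]]

    new_mask = mask.copy()
    new_mask[drop] = False
    new_mask[grow] = True
    # In the paper, newly grown weights are initialized to zero; that reset of
    # weights[grow] is left to the training loop in this sketch.
    return new_mask

# Usage: weights and grads are flat float arrays; mask is a boolean array.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
g = rng.normal(size=1000)
m = rng.random(1000) < 0.2            # ~20% of connections active to start
m = rigl_update_mask(w, g, m)         # one drop-and-grow step
```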