Submitted by hardmaru t3_ys36do in MachineLearning
ThisIsMyStonerAcount wrote
Reply to comment by jrkirby in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
Knowing about subgradients (see other answers) is nice and all, but in the real world what matters is what your framework does. Last time I checked, both PyTorch and JAX say that the derivative of max(x, 0) is 0 when x=0.
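To make the subgradient point concrete: at x = 0, any slope g in [0, 1] satisfies the subgradient inequality f(y) >= f(x0) + g*(y - x0) for f(x) = max(x, 0), so a framework picking 0 there is one valid choice among many. A minimal pure-Python sketch of that check (the helper names are made up for illustration, no framework needed):

```python
def relu(x):
    return max(x, 0.0)

def is_subgradient(g, x0, samples):
    # g is a subgradient of f at x0 iff f(y) >= f(x0) + g * (y - x0)
    # holds for all y; we check it on a handful of sample points.
    return all(relu(y) >= relu(x0) + g * (y - x0) for y in samples)

samples = [-2.0, -0.5, 0.0, 0.5, 2.0]

# Every slope in [0, 1] passes the inequality at x0 = 0 ...
print(all(is_subgradient(g, 0.0, samples) for g in (0.0, 0.25, 0.5, 1.0)))  # True
# ... while a slope outside [0, 1] fails (e.g. at y = 0.5).
print(is_subgradient(1.5, 0.0, samples))  # False
```

So "the derivative is 0 at x=0" is really "the framework returns one particular element of the subdifferential [0, 1]".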
samloveshummus wrote
Good point. But it's not the end of the world; those frameworks are open source, after all!