Submitted by hardmaru t3_ys36do in MachineLearning
ThisIsMyStonerAcount wrote
Reply to comment by jrkirby in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
Knowing about subgradients (see other answers) is nice and all, but in the real world what matters is what your framework does. Last time I checked, both PyTorch and JAX say that the derivative of max(x, 0) is 0 when x=0.
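To make the subgradient point concrete: at x = 0, any slope g in [0, 1] satisfies the subgradient inequality f(y) >= f(x0) + g*(y - x0) for f(x) = max(x, 0), so a framework picking 0 there is one valid choice among many. A minimal pure-Python sketch of that check (the helper names are made up for illustration, no framework needed):

```python
def relu(x):
    return max(x, 0.0)

def is_subgradient(g, x0, samples):
    # g is a subgradient of f at x0 iff f(y) >= f(x0) + g * (y - x0)
    # holds for all y; we check it on a handful of sample points.
    return all(relu(y) >= relu(x0) + g * (y - x0) for y in samples)

samples = [-2.0, -0.5, 0.0, 0.5, 2.0]

# Every slope in [0, 1] passes the inequality at x0 = 0 ...
print(all(is_subgradient(g, 0.0, samples) for g in (0.0, 0.25, 0.5, 1.0)))  # True
# ... while a slope outside [0, 1] fails (e.g. at y = 0.5).
print(is_subgradient(1.5, 0.0, samples))  # False
```

So "the derivative is 0 at x=0" is really "the framework returns one particular element of the subdifferential [0, 1]".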
samloveshummus wrote
Good point. But it's not the end of the world; those frameworks are open source, after all!