Mr_Smartypants
Mr_Smartypants t1_jc9ki0n wrote
Reply to comment by jujujuice92 in My family doesn't eat meat after 6pm because of The Consequence by WeirdBryceGuy
I want to know when it resets.
Is it midnight, like when people fast for Lent?
Or is it sun-up, like with that Mogwai rule?
Mr_Smartypants t1_jb3tlu5 wrote
> We begin our investigation into dropout training dynamics by making an intriguing observation on gradient norms, which then leads us to a key empirical finding: during the initial stages of training, dropout reduces gradient variance across mini-batches and allows the model to update in more consistent directions. These directions are also more aligned with the entire dataset’s gradient direction (Figure 1).
Interesting. Has anyone looked at optimally controlling the gradient variance by other means, e.g., minibatch size?
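For anyone curious, it's easy to compare the two knobs empirically. Here's a minimal PyTorch sketch (toy model and data are my own placeholders, not the paper's setup) that measures gradient variance across mini-batches as a function of both dropout rate and batch size:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy data and model; everything here is illustrative, not the paper's setup.
X, y = torch.randn(4096, 32), torch.randn(4096, 1)
model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(), nn.Dropout(p=0.0), nn.Linear(64, 1)
)

def grad_variance(p_drop, batch_size, n_batches=16):
    """Sum of per-parameter gradient variances across mini-batches."""
    model[2].p = p_drop  # same weights, only the dropout rate changes
    model.train()
    grads = []
    for i in range(n_batches):
        xb = X[i * batch_size:(i + 1) * batch_size]
        yb = y[i * batch_size:(i + 1) * batch_size]
        model.zero_grad()
        F.mse_loss(model(xb), yb).backward()
        grads.append(torch.cat([p.grad.flatten() for p in model.parameters()]))
    return torch.stack(grads).var(dim=0).sum().item()

for p_drop in (0.0, 0.5):
    for bs in (32, 128):
        print(f"p={p_drop} batch={bs} grad var={grad_variance(p_drop, bs):.4f}")
```

Larger batches reduce variance mechanically (averaging over more samples), so the interesting question is whether dropout's early-training variance reduction buys you anything a bigger batch wouldn't.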
Mr_Smartypants t1_j3n7tt0 wrote
Reply to comment by Hunter-of-darkness in A terrifying confession from my childhood by Hunter-of-darkness
.22 should be fine if you can get some silver bullets.
It will be among the cheapest silver ammo, thanks to the small caliber.
Mr_Smartypants t1_j228iec wrote
Reply to So I probably should have listened better when the rep was touring me around the apartment. by AsALark
What's a kidney for a filtered pool? You got two!
Mr_Smartypants t1_ir7rbxp wrote
Reply to comment by harharveryfunny in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
At the end of RL training, they don't just have an efficient matrix multiplication algorithm (sequence of steps), they also have the policy they learned.
I don't know what that adds, though. Maybe it generalizes across input sizes?
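For context, the algorithms AlphaTensor outputs are just (U, V, W) factor triples of a low-rank decomposition of the matmul tensor; once found, you can execute them with no policy at all. Here's a small numpy sketch of that execution step, using Strassen's classic rank-7 factors for 2x2 matrices (factors written out by hand here, not DeepMind's discovered ones):

```python
import numpy as np

# Strassen's rank-7 decomposition of the 2x2 matmul tensor, in the
# (U, V, W) factor form AlphaTensor searches over. The policy is what
# *finds* factors like these; running them needs no policy.
U = np.array([[ 1, 0, 0,  1],   # m1 uses (a11 + a22)
              [ 0, 0, 1,  1],   # m2
              [ 1, 0, 0,  0],   # m3
              [ 0, 0, 0,  1],   # m4
              [ 1, 1, 0,  0],   # m5
              [-1, 0, 1,  0],   # m6
              [ 0, 1, 0, -1]])  # m7
V = np.array([[ 1, 0, 0,  1],
              [ 1, 0, 0,  0],
              [ 0, 1, 0, -1],
              [-1, 0, 1,  0],
              [ 0, 0, 0,  1],
              [ 1, 1, 0,  0],
              [ 1, 0, 1,  1]])
W = np.array([[1,  0, 0, 1, -1, 0, 1],   # c11 = m1 + m4 - m5 + m7
              [0,  0, 1, 0,  1, 0, 0],   # c12 = m3 + m5
              [0,  1, 0, 1,  0, 0, 0],   # c21 = m2 + m4
              [1, -1, 1, 0,  0, 1, 0]])  # c22 = m1 - m2 + m3 + m6

def bilinear_matmul(A, B):
    """Multiply 2x2 matrices using 7 scalar products instead of 8."""
    m = (U @ A.flatten()) * (V @ B.flatten())  # the 7 products
    return (W @ m).reshape(2, 2)

A, B = np.random.randn(2, 2), np.random.randn(2, 2)
assert np.allclose(bilinear_matmul(A, B), A @ B)
```

So the extra thing the policy carries is how to *search* for such factors, which might transfer to tensor sizes it wasn't trained on.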
Mr_Smartypants t1_jdgxsnc wrote
Reply to I have a paralyzing fear of water. After I became a dad, I found out why. by scarymaxx
Earplugs. You can learn to sleep with them, no problem. I did it in college!