Mr_Smartypants
Mr_Smartypants t1_jc9ki0n wrote
Reply to comment by jujujuice92 in My family doesn't eat meat after 6pm because of The Consequence by WeirdBryceGuy
I want to know when it resets.
Is it midnight, like when people fast for Lent?
Or is it sun-up, like with that Mogwai rule?
Mr_Smartypants t1_jb3tlu5 wrote
> We begin our investigation into dropout training dynamics by making an intriguing observation on gradient norms, which then leads us to a key empirical finding: during the initial stages of training, dropout reduces gradient variance across mini-batches and allows the model to update in more consistent directions. These directions are also more aligned with the entire dataset’s gradient direction (Figure 1).
Interesting. Has anyone looked at optimally controlling the gradient variance by other means, e.g., minibatch size?
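For anyone curious, it's easy to compare the two knobs empirically. Here's a minimal PyTorch sketch (toy model and data are my own placeholders, not the paper's setup) that measures gradient variance across mini-batches as a function of both dropout rate and batch size:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy data and model; everything here is illustrative, not the paper's setup.
X, y = torch.randn(4096, 32), torch.randn(4096, 1)
model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(), nn.Dropout(p=0.0), nn.Linear(64, 1)
)

def grad_variance(p_drop, batch_size, n_batches=16):
    """Sum of per-parameter gradient variances across mini-batches."""
    model[2].p = p_drop  # same weights, only the dropout rate changes
    model.train()
    grads = []
    for i in range(n_batches):
        xb = X[i * batch_size:(i + 1) * batch_size]
        yb = y[i * batch_size:(i + 1) * batch_size]
        model.zero_grad()
        F.mse_loss(model(xb), yb).backward()
        grads.append(torch.cat([p.grad.flatten() for p in model.parameters()]))
    return torch.stack(grads).var(dim=0).sum().item()

for p_drop in (0.0, 0.5):
    for bs in (32, 128):
        print(f"p={p_drop} batch={bs} grad var={grad_variance(p_drop, bs):.4f}")
```

Larger batches reduce variance mechanically (averaging over more samples), so the interesting question is whether dropout's early-training variance reduction buys you anything a bigger batch wouldn't.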
Mr_Smartypants t1_j3n7tt0 wrote
Reply to comment by Hunter-of-darkness in A terrifying confession from my childhood by Hunter-of-darkness
.22 should be fine if you can get some silver bullets.
It will be among the cheapest silver ammo, thanks to the small caliber.
Mr_Smartypants t1_j228iec wrote
Reply to So I probably should have listened better when the rep was touring me around the apartment. by AsALark
What's a kidney for a filtered pool? You got two!
Mr_Smartypants t1_ir7rbxp wrote
Reply to comment by harharveryfunny in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
At the end of RL training, they don't just have an efficient matrix multiplication algorithm (sequence of steps), they also have the policy they learned.
I don't know what that adds, though. Maybe it generalizes across input sizes?
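For context, the algorithms AlphaTensor outputs are just (U, V, W) factor triples of a low-rank decomposition of the matmul tensor; once found, you can execute them with no policy at all. Here's a small numpy sketch of that execution step, using Strassen's classic rank-7 factors for 2x2 matrices (factors written out by hand here, not DeepMind's discovered ones):

```python
import numpy as np

# Strassen's rank-7 decomposition of the 2x2 matmul tensor, in the
# (U, V, W) factor form AlphaTensor searches over. The policy is what
# *finds* factors like these; running them needs no policy.
U = np.array([[ 1, 0, 0,  1],   # m1 uses (a11 + a22)
              [ 0, 0, 1,  1],   # m2
              [ 1, 0, 0,  0],   # m3
              [ 0, 0, 0,  1],   # m4
              [ 1, 1, 0,  0],   # m5
              [-1, 0, 1,  0],   # m6
              [ 0, 1, 0, -1]])  # m7
V = np.array([[ 1, 0, 0,  1],
              [ 1, 0, 0,  0],
              [ 0, 1, 0, -1],
              [-1, 0, 1,  0],
              [ 0, 0, 0,  1],
              [ 1, 1, 0,  0],
              [ 1, 0, 1,  1]])
W = np.array([[1,  0, 0, 1, -1, 0, 1],   # c11 = m1 + m4 - m5 + m7
              [0,  0, 1, 0,  1, 0, 0],   # c12 = m3 + m5
              [0,  1, 0, 1,  0, 0, 0],   # c21 = m2 + m4
              [1, -1, 1, 0,  0, 1, 0]])  # c22 = m1 - m2 + m3 + m6

def bilinear_matmul(A, B):
    """Multiply 2x2 matrices using 7 scalar products instead of 8."""
    m = (U @ A.flatten()) * (V @ B.flatten())  # the 7 products
    return (W @ m).reshape(2, 2)

A, B = np.random.randn(2, 2), np.random.randn(2, 2)
assert np.allclose(bilinear_matmul(A, B), A @ B)
```

So the extra thing the policy carries is how to *search* for such factors, which might transfer to tensor sizes it wasn't trained on.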
Mr_Smartypants t1_jdgxsnc wrote
Reply to I have a paralyzing fear of water. After I became a dad, I found out why. by scarymaxx
Earplugs. You can learn to sleep with them, no problem. I did it in college!