notdelet t1_j9g627c wrote
Reply to comment by [deleted] in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
> Assuming Gaussianity and then using maximum likelihood yields an L2 error minimization problem.

Incorrect; that's only true if you fix the scale parameter. I normally wouldn't nitpick like this, but your unnecessary use of bold made me.
> (if you interpret training as maximum likelihood estimation)
> a squared loss does not "hide a Gaussian assumption".
It does... if you interpret training as (conditional) MLE. Give me a non-Gaussian distribution whose MLE yields the MSE loss. Also, residuals are explicitly not orthogonal projections whenever the variables are dependent.
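To make the point concrete, here is a minimal numeric sketch (toy data, scale fixed at sigma = 1 as assumed above) showing that the mu-dependent part of the Gaussian negative log-likelihood is proportional to the squared error, so the two objectives share a minimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(3.0, 1.0, size=1000)

def gaussian_nll(mu, sigma=1.0):
    # NLL of y under N(mu, sigma^2); with sigma fixed, only the
    # sum-of-squares term depends on mu.
    n = len(y)
    return 0.5 * n * np.log(2 * np.pi * sigma**2) + np.sum((y - mu)**2) / (2 * sigma**2)

def mse(mu):
    return np.mean((y - mu)**2)

mus = np.linspace(2.0, 4.0, 401)
i_nll = np.argmin([gaussian_nll(m) for m in mus])
i_mse = np.argmin([mse(m) for m in mus])
assert i_nll == i_mse  # same minimizer: the sample mean
```

With a free (input-dependent) scale, the extra log-sigma term no longer separates out, and the objective stops being plain MSE.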
notdelet t1_j7vv9pi wrote
Reply to comment by tdgros in [D] Constrained Optimization in Deep Learning by d0cmorris
You can get constrained optimization in general for nonlinear problems (see the work N. Sahinidis has done on BARON). The feasible sets are defined in the course of solving the problem by introducing branches. But that approach is slow, doesn't scale to NN sizes, and doesn't really answer the question ML folks are asking (see the talk at the IAS on "Is Optimization the Right Language for ML").
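For intuition on the branching idea (a toy sketch, nothing like BARON's actual convex relaxations): a spatial branch-and-bound can globally minimize a 1-D function by bounding each sub-interval via a Lipschitz constant and pruning branches that cannot beat the incumbent:

```python
import math

def f(x):
    return math.sin(3 * x) + 0.3 * x * x

LIP = 5.0  # valid Lipschitz constant for f on [-3, 3]: |f'(x)| <= 3 + 1.8

def lower_bound(lo, hi):
    # On [lo, hi]: f(x) >= f(mid) - LIP * (hi - lo) / 2
    mid = (lo + hi) / 2
    return f(mid) - LIP * (hi - lo) / 2

best = f(0.0)           # incumbent objective value
stack = [(-3.0, 3.0)]   # boxes still to explore
while stack:
    lo, hi = stack.pop()
    if lower_bound(lo, hi) >= best - 1e-6:
        continue        # prune: this branch cannot improve the incumbent
    mid = (lo + hi) / 2
    best = min(best, f(mid))
    if hi - lo > 1e-4:  # otherwise branch into two children
        stack += [(lo, mid), (mid, hi)]
```

The pruning step is where the "feasible sets defined in the course of solving" come from; real solvers do the same over boxes of many variables, with constraint relaxations instead of a Lipschitz bound.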
notdelet t1_j552geq wrote
Reply to comment by SearchAtlantis in [R] Researchers out there: which are current research directions for tree-based models? by BenXavier
Haha yes, Cynthia Rudin.
notdelet t1_j4wi2gy wrote
Reply to [R] Researchers out there: which are current research directions for tree-based models? by BenXavier
Well, there is the stuff that Rudin is doing with Rashomon sets/small explainable trees. Then there is the stuff on optimal decision trees using mixed-integer programs. I'm not working in the area at the moment, but those are the things I have heard people talk about recently.
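As a toy illustration of what "optimal" means there (the data and helper below are made up; real MIP formulations handle deeper trees and larger datasets): for a depth-1 tree, the search over all (feature, threshold) splits can be done exhaustively, guaranteeing a provably best stump:

```python
# Exhaustive search for an optimal depth-1 tree (a stump) on toy data.
X = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
y = [0, 0, 1, 1]

def stump_error(j, t):
    # Misclassifications when predicting the majority label
    # on each side of the split X[i][j] <= t.
    left = [yi for xi, yi in zip(X, y) if xi[j] <= t]
    right = [yi for xi, yi in zip(X, y) if xi[j] > t]
    err = 0
    for side in (left, right):
        if side:
            maj = max(set(side), key=side.count)
            err += sum(1 for yi in side if yi != maj)
    return err

# Only observed feature values need to be tried as thresholds.
best = min(
    ((j, xi[j]) for xi in X for j in range(2)),
    key=lambda jt: stump_error(*jt),
)
```

The MIP work generalizes this certificate of optimality to deeper trees, where greedy CART-style splitting can be arbitrarily suboptimal.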
notdelet t1_j48yvht wrote
Reply to [D] Bitter lesson 2.0? by Tea_Pearce
Hot take: "foundation models" is pure branding, so if they say it's foundation models, then it's foundation models we'll all be using.
notdelet t1_iyjhvqd wrote
Reply to comment by whatsafrigger in [R] Statistical vs Deep Learning forecasting methods by fedegarzar
If you use a flawed evaluation procedure, does a solid baseline do you any good?
notdelet t1_j9gija3 wrote
Reply to comment by notdelet in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
In the future, know that blocking someone right after replying to them prevents them from responding to your reply. That gives everyone else the false impression that I simply chose not to respond, when in fact I can't.