I-am_Sleepy t1_j7ybb41 wrote
Reply to comment by Ulfgardleo in [D] Critique of statistics research from machine learning perspectives (and vice versa)? by fromnighttilldawn
I don’t think it’s that ML researchers don’t care about model calibration or tail risks; it just often doesn’t come up in experimental settings
It also depends on the objective. If your goal is regression or classification, then tail risk and model calibration may well be necessary as supporting metrics
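For classification, a calibration metric like expected calibration error (ECE) is cheap to report alongside accuracy. A minimal numpy sketch (the function name and the equal-width binning scheme are my own illustration, not any particular library's API):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Bin predictions by confidence, then compare average confidence
    to empirical accuracy inside each bin (equal-width bins assumed)."""
    confidences = probs.max(axis=1)       # model's confidence per sample
    predictions = probs.argmax(axis=1)    # predicted class per sample
    accuracies = (predictions == labels).astype(float)

    ece = 0.0
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            # weight each bin's |accuracy - confidence| gap by bin size
            ece += mask.mean() * abs(accuracies[mask].mean() - confidences[mask].mean())
    return ece
```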
But for more abstract use cases such as generative modeling, it is debatable whether tail risk and model calibration actually matter. For example, GAN models can experience mode collapse, where the generated data isn’t as diverse as the original data distribution. But that doesn’t mean the model is totally garbage either
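On synthetic benchmarks where the true modes are known (say, a ring of Gaussians), mode collapse is at least easy to quantify. A toy sketch; `mode_coverage` and its threshold are my own illustrative choices:

```python
import numpy as np

def mode_coverage(generated, mode_centers, threshold=0.5):
    """Count how many known modes receive at least one generated
    sample within `threshold` of the mode center."""
    # pairwise distances: (n_samples, n_modes)
    d = np.linalg.norm(generated[:, None, :] - mode_centers[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    close = d.min(axis=1) < threshold
    return len(np.unique(nearest[close])), len(mode_centers)

# 8 true modes on a ring; a collapsed "generator" only hits 2 of them
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
centers = np.stack([np.cos(angles), np.sin(angles)], axis=1)
fake = centers[np.random.randint(0, 2, size=1000)] + 0.05 * np.random.randn(1000, 2)
print(mode_coverage(fake, centers))  # -> (2, 8): severe mode collapse
```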
Also, I don’t think statistics and ML are totally different, because most statistical fundamentals are also ML fundamentals. As such, many ML metrics are derived directly from fundamental statistics and/or related fields
Ulfgardleo t1_j7yd02x wrote
You are right, but the point I was making is that in ML, in general, those are not of high importance, and this already holds for rather basic questions like:
"For your chosen learning algorithm, under which conditions holds that: in expectation over all training datasets of size n, the Bayes risk is not monotonously increasing with n"
One would think that this question is of rather central importance. Yet no one cares, and answering it is non-trivial already for linear classification. Stats cares a lot about this question. While the math behind both fields is the same (all applied math is a subset of math, except if you ask people who identify as one of the two), the communities have different goals.
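You can at least probe the question empirically for a fixed problem: average the test error of a linear classifier over many random training sets of each size n and see whether the curve decreases monotonically. A Monte Carlo sketch (the two-Gaussian setup and sklearn's LogisticRegression are my stand-ins, not a general answer):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample(n):
    """Two overlapping Gaussian classes in 2D."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + 1.5 * y[:, None]
    return X, y

X_test, y_test = sample(20_000)  # large held-out set approximates the true risk

for n in [4, 8, 16, 32, 64, 128]:
    errs = []
    for _ in range(500):  # expectation over random training sets of size n
        X_tr, y_tr = sample(n)
        if len(np.unique(y_tr)) < 2:  # skip degenerate draws with one class
            continue
        clf = LogisticRegression().fit(X_tr, y_tr)
        errs.append(1.0 - clf.score(X_test, y_test))
    print(n, round(float(np.mean(errs)), 4))
```

Whether the resulting curve is monotone, and proving when it must be, are of course different things; the simulation only illustrates what the theory question is asking about.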