pm_me_your_ensembles t1_j01xzcw wrote

The two are not comparable. In a multi-class, single-label problem, you compute K distinct projections, one for each class, which are then combined via a softmax to give you something that resembles a probability distribution. In the multi-label case no such normalizing function is applied, so it's not possible to compare the two outputs: they don't influence each other in any way.
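
Roughly, in a quick PyTorch sketch (synthetic scores, just to illustrate the coupling):

```python
import torch

logits = torch.randn(4, 3)  # synthetic [batch, K] per-class scores

# Single-label: the K projections compete through a softmax, so the
# outputs form one distribution and are directly comparable.
single_label = torch.softmax(logits, dim=1)
print(single_label.sum(dim=1))  # each row sums to 1

# Multi-label: each class gets an independent sigmoid; a 0.8 for one
# class and a 0.6 for another are not on a shared scale.
multi_label = torch.sigmoid(logits)
print(multi_label.sum(dim=1))  # generally not 1
```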

However, you shouldn't treat whatever a NN outputs as a probability, even if it's within [0, 1], as NNs are known to be overconfident.

7

alkaway OP t1_j01zhl7 wrote

Thanks so much for your response!

This makes sense. Are you aware of any techniques that can be used to make these probabilities comparable?

I understand that the outputs shouldn't necessarily be treated as probabilities. I simply want a relative ordering of the pixels in terms of "likelihood."

3

trajo123 t1_j023qfb wrote

You could reformulate your problem to output 4 channels: "only disease A", "only disease B", "both disease A and disease B", and "no disease". This way a softmax can be applied to these outputs, and the per-pixel probabilities sum to 1.

[EDIT] corrected number of classes
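
A hypothetical PyTorch sketch of this label-powerset setup for the two-disease case (all names and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

# Encode the two binary masks (A, B) as 4 mutually exclusive classes:
# 0 = no disease, 1 = only A, 2 = only B, 3 = both A and B.
def powerset_target(mask_a: torch.Tensor, mask_b: torch.Tensor) -> torch.Tensor:
    return mask_a.long() + 2 * mask_b.long()

# Toy per-pixel logits with 4 channels: [batch, classes, H, W].
logits = torch.randn(1, 4, 8, 8)
mask_a = torch.randint(0, 2, (1, 8, 8))
mask_b = torch.randint(0, 2, (1, 8, 8))

probs = F.softmax(logits, dim=1)  # channels now sum to 1 at every pixel
loss = F.cross_entropy(logits, powerset_target(mask_a, mask_b))
```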

7

alkaway OP t1_j024u31 wrote

Thanks for your response -- this is an interesting idea! Unfortunately, I am actually training my network to predict 1000+ classes, for which such an idea would be computationally intractable: the number of label combinations grows as 2^K...

2

trajo123 t1_j029y2r wrote

Ah, yes, it doesn't really make sense for more than a couple of classes. So if you can't make your problem multi-class, have you tried any probability calibration on the model outputs? This should make them "more comparable"; I think that's the best you can do with a deep learning model.
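
One common option is temperature scaling (Guo et al., 2017): fit a single scalar T on held-out validation logits and divide the logits by it before the sigmoid. A rough sketch with synthetic data:

```python
import torch

def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> torch.Tensor:
    """Fit a single scalar temperature on held-out logits."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=100)

    def closure():
        opt.zero_grad()
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().detach()

# Synthetic demo: logits are 3x too confident, so the fitted T should be near 3.
val_logits = 3.0 * torch.randn(10_000)
val_labels = (torch.rand(10_000) < torch.sigmoid(val_logits / 3.0)).float()
T = fit_temperature(val_logits, val_labels)
calibrated = torch.sigmoid(val_logits / T)  # use the same T at inference time
```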

But why do you want to rank the outputs per pixel? Wouldn't some per-image aggregate over the channels make more sense?

3

alkaway OP t1_j02owfb wrote

Thanks so much for your response! Are you aware of any calibration methods I could try? Preferably ones which won't take long to implement / incorporate :P

2

trajo123 t1_j031wsx wrote

Perhaps scikit-learn's "Probability calibration" section would be a good place to start. Good luck!
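
For example, something along these lines with scikit-learn (synthetic scores; `calibration_curve` to check calibration, then simple Platt scaling):

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic overconfident scores: the true positive rate is milder than claimed.
raw = rng.uniform(0.01, 0.99, size=5000)
y = (rng.uniform(size=5000) < 0.25 + 0.5 * raw).astype(int)

# Reliability diagram data: observed frequency vs. mean prediction per bin.
frac_pos, mean_pred = calibration_curve(y, raw, n_bins=10)

# Platt scaling: a logistic regression fitted on the logit of the raw score.
logit = np.log(raw / (1 - raw)).reshape(-1, 1)
platt = LogisticRegression().fit(logit, y)
calibrated = platt.predict_proba(logit)[:, 1]
```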

2

[deleted] t1_j023o61 wrote

[deleted]

1

alkaway OP t1_j02675d wrote

I'm not sure I understand. Are you suggesting I normalize each pixel in each NxN label-map to have mean 0 and std 1? And then use this normalized label-map during training?

1

pm_me_your_ensembles t1_j02eijz wrote

Never mind my previous comment.

You could normalize both channels, i.e. for label 1, normalize the NxN tensor of pixel scores, and do the same for label 2.
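
Something like this (illustrative PyTorch, z-scoring each label's NxN score map independently):

```python
import torch

# Synthetic per-pixel scores: [labels, N, N], one channel per disease.
scores = torch.randn(2, 64, 64)

# Z-score each channel independently so the two rankings share a scale.
mean = scores.mean(dim=(1, 2), keepdim=True)
std = scores.std(dim=(1, 2), keepdim=True)
normalized = (scores - mean) / (std + 1e-8)  # eps avoids division by zero
```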

1