Submitted by AutoModerator t3_11pgj86 in MachineLearning
LeN3rd t1_jcgo5ro wrote
Reply to comment by PhysZhongli in [D] Simple Questions Thread by AutoModerator
You should take a look at uncertainty estimation in general. What you are trying to do is quantify epistemic uncertainty (google "epistemic vs. aleatoric uncertainty").
One thing that works well is Monte Carlo dropout: keep a dropout layer active during prediction (in TensorFlow you have to pass training=True into the call to keep it active at inference). Sample like 100 predictions and calculate the standard deviation across them. This gives you a general "I don't know" signal from the network. Alternatively, you can train ~20 models as an ensemble and compare their 20 outputs. Either way, you can assign the 101 label whenever the uncertainty is too high.
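A minimal numpy sketch of the MC-dropout idea, using a toy untrained two-layer network with made-up weights (in a real Keras model you'd get stochastic passes with `model(x, training=True)` instead of the hand-rolled dropout here; the 0.15 threshold and the class-101 fallback are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer classifier with random weights; stands in for a trained model.
W1 = rng.normal(size=(4, 32))
W2 = rng.normal(size=(32, 10))

def predict_with_dropout(x, p=0.5):
    """One stochastic forward pass: dropout stays ON at prediction time."""
    h = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
    mask = rng.random(h.shape) > p            # dropout mask, drawn per pass
    h = h * mask / (1.0 - p)                  # inverted-dropout scaling
    logits = h @ W2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)  # softmax probabilities

def mc_dropout(x, n_samples=100):
    """Sample many stochastic passes; mean = prediction, std = uncertainty."""
    samples = np.stack([predict_with_dropout(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)

x = rng.normal(size=(1, 4))
mean_probs, std_probs = mc_dropout(x)

# Assign the "I don't know" label (class 101 here) when the spread is large;
# the 0.15 cutoff is a hypothetical threshold you would tune on held-out data.
pred = 101 if std_probs.max() > 0.15 else int(mean_probs.argmax())
```

Swapping the repeated stochastic passes for forward passes through 20 separately trained models gives you the deep-ensemble variant of the same recipe.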
In my experience you should stay away from Bayesian neural networks, since they are extremely hard to train and cannot model multimodal uncertainty. (Dropout can't either, but it is WAAAAYYY easier to train.)