I_draw_boxes wrote:
Reply to comment by chatterbox272 in [D] Focal loss - why it scales down the loss of minority class? by Lugi
>The alpha term is therefore being set to re-adjust the background class back up, so it doesn't become too easy to ignore.
This is it. Background anchors in RetinaNet far outnumber foreground anchors, so the network's default prediction is background, which generates very little loss per anchor in their formulation. Focal loss without alpha is symmetrical, but the targets and behavior of RetinaNet are not.
Alpha might be intended to bring the loss from the common negative examples back up so it stays in balance with the foreground loss. It might also be intended to bring up the loss on false positives, which are rarer still than foreground examples.
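To make the weighting concrete, here is a minimal PyTorch sketch of the alpha-balanced focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), with alpha_t = alpha for foreground and 1 - alpha for background. The formula is from the focal loss paper; the function and variable names are my own illustration, not RetinaNet's actual code:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Alpha-balanced focal loss over per-anchor binary classification.

    logits:  raw per-anchor scores, shape (N,)
    targets: 1.0 for foreground anchors, 0.0 for background, shape (N,)
    """
    # Plain BCE gives -log(p_t) per anchor.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    # p_t: the predicted probability of the true class for each anchor.
    p_t = p * targets + (1 - p) * (1 - targets)
    # The focusing term (1 - p_t)**gamma crushes the loss on easy examples.
    # Since almost all background anchors are easy, this is why the
    # unweighted loss is symmetrical in form but lopsided in practice.
    focal = (1 - p_t) ** gamma * ce
    # alpha_t = alpha for foreground, (1 - alpha) for background. With the
    # paper's alpha = 0.25, background gets the larger weight (0.75),
    # re-adjusting its contribution back up as described above.
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # RetinaNet normalizes by the number of foreground anchors; a plain sum
    # is used here to keep the sketch short.
    return (alpha_t * focal).sum()

# Toy usage: 1000 anchors with ~1% foreground, mimicking the imbalance above.
logits = torch.randn(1000)
targets = (torch.rand(1000) < 0.01).float()
print(focal_loss(logits, targets))
```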