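pocolai t1_irw0b82 wrote on October 11, 2022 at 1:45 PM
Reply to [D] Classification with final layer having no activation? by AbIgnorantesBurros
This is just for numerical stability when computing the loss. The loss function typically takes the raw logits and applies log-softmax internally, which is more stable than taking the log of already-normalized probabilities. At inference time, the user can apply softmax to the last layer's outputs to get probabilities.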
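To make the stability point concrete, here is a minimal pure-Python sketch (not taken from any particular framework; the function names `softmax`, `cross_entropy_from_logits`, and `cross_entropy_from_probs` are made up for illustration). It compares a loss computed directly from logits via the log-sum-exp trick against one computed from softmax outputs, using deliberately extreme logits.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating so exp() cannot overflow.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, target):
    # Stable form: -log softmax(z)[target] = logsumexp(z) - z[target]
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def cross_entropy_from_probs(probs, target):
    # Naive form: the information is already lost if the
    # probability underflowed to exactly 0.0 inside softmax.
    p = probs[target]
    return float("inf") if p == 0.0 else -math.log(p)

# Illustrative, deliberately extreme logits (an assumption, not real model output).
logits = [1000.0, 0.0, -1000.0]
target = 2

probs = softmax(logits)                                  # what you would use at inference
stable_loss = cross_entropy_from_logits(logits, target)  # finite
naive_loss = cross_entropy_from_probs(probs, target)     # infinite

print(probs, stable_loss, naive_loss)
```

With these inputs the target probability underflows to zero, so the probability-based loss blows up while the logit-based loss stays finite. That is the reason to leave the final layer without an activation during training and only call softmax when you actually need probabilities.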