mrpogiface t1_irz4o45 wrote on October 12, 2022 at 2:51 AM

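The theoretical justification for having the softmax in the loss is nice. Aside from the numerical stability bit, using softmax with cross entropy makes sense probabilistically: the softmax turns the logits into a categorical distribution over classes, and minimizing cross entropy against the labels is the same as maximizing the likelihood of the labels under that distribution.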
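
To make both points concrete, here is a minimal sketch in plain Python (no framework assumed). The function names `log_softmax`, `cross_entropy_from_logits`, and `naive_cross_entropy` are hypothetical, chosen for illustration and not taken from any particular library. It computes the loss directly from logits using the log-sum-exp trick, and compares it with a naive version that applies softmax first and takes the log afterwards.

```python
import math


def log_softmax(logits):
    # Log-sum-exp trick: shift by the max so that exp() never sees a large
    # positive argument. The shift cancels out mathematically.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_entropy_from_logits(logits, target):
    # Negative log-likelihood of the target class under the categorical
    # distribution softmax(logits).
    return -log_softmax(logits)[target]


def naive_cross_entropy(logits, target):
    # Softmax first, log afterwards. Same value on paper, but exp() can
    # overflow for large logits.
    exps = [math.exp(x) for x in logits]
    return -math.log(exps[target] / sum(exps))
```

For moderate logits such as `[2.0, 1.0, 0.1]` the two functions agree to floating-point precision. For logits like `[1000.0, 0.0]` the naive version raises `OverflowError` inside `math.exp`, while the fused version still returns a finite loss. That is roughly why keeping the softmax inside the loss tends to be the safer default.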