
suflaj t1_j5r4u61 wrote

Dropout is not strictly a linear function (for any given random mask it can behave like one, but the mask is resampled on every call), and chances are it adds non-linearity for p > 0, so yeah, that probably made the difference.
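If you want to poke at this yourself, here is a rough sketch in plain Python, not taken from any framework. It assumes the usual "inverted" dropout (drop with probability p, scale survivors by 1/(1-p)); the helper names `dropout`, `is_additive` and `fixed_mask_dropout` are made up for illustration. It checks the additivity property f(x + y) = f(x) + f(y) that any linear map has to satisfy.

```python
import random

def dropout(xs, p, rng):
    """Inverted dropout (assumed variant): zero each element with
    probability p and scale the survivors by 1 / (1 - p)."""
    keep = 1.0 - p
    return [x / keep if rng.random() >= p else 0.0 for x in xs]

def is_additive(f, xs, ys, tol=1e-9):
    """Check f(x + y) == f(x) + f(y) elementwise, as a linear map requires."""
    lhs = f([x + y for x, y in zip(xs, ys)])
    fx, fy = f(xs), f(ys)
    return all(abs(l - (a + b)) <= tol for l, a, b in zip(lhs, fx, fy))

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ys = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# p = 0: nothing is dropped, so dropout is the identity and additive.
additive_p0 = is_additive(lambda v: dropout(v, 0.0, rng), xs, ys)

# p = 0.5: every call samples a fresh mask, so the three outputs
# disagree on which elements survive and additivity breaks.
additive_p05 = is_additive(lambda v: dropout(v, 0.5, rng), xs, ys)

# One mask sampled once and reused: an elementwise scaling, which is linear.
mask = [rng.random() >= 0.5 for _ in xs]

def fixed_mask_dropout(vs):
    return [v / 0.5 if m else 0.0 for v, m in zip(vs, mask)]

additive_fixed = is_additive(fixed_mask_dropout, xs, ys)

print(additive_p0, additive_p05, additive_fixed)
```

With 1000 elements this should come out as additive for p = 0, not additive for p = 0.5 with fresh masks, and additive again once the mask is frozen. That only shows the resampled mask breaks linearity as a function of the input; it says nothing on its own about how much that matters for a particular model.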
