Submitted by MLNoober t3_xuogm3 in MachineLearning
jackfaker t1_iqzixrs wrote
A common theme in these topics is people observe the status quo design decisions, such as linear layers connected by relu, and then try and backwards rationalize it with relatively hand-wavy mathematical justification. Citing things such as the universal approximation theorem which are not particularly relevant.
The reality is that this field is heavily driven by empirical results, and I would be highly skeptical of anyone saying that "xyz is the clear best way to do it".
Viewing a single comment thread. View all comments