Submitted by MLNoober t3_xuogm3 in MachineLearning
ZestyData t1_iqwjiua wrote
- Layered linear nodes can model non-linear behaviours.
- Computational complexity. It's more efficient to use the aforementioned layers of linear nodes than it is to use non-linear functions directly.
MrFlufypants t1_iqx6bhc wrote
The activation functions are key. A linear combination of linear combinations is itself a linear combination, so 10 layers would collapse to a single layer, which is only capable of so much. The activation functions destroy the linearity, though, and are the key ingredient there.
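To make the collapse argument concrete, here is a rough NumPy sketch. The weight matrices, layer sizes, and helper names (`relu`, `with_relu`, `without_relu`) are made up for illustration, and the layers are assumed to be bias-free; with biases the stack would collapse to a single affine layer instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for three bias-free linear layers (3 -> 4 -> 5 -> 2).
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(5, 4))
W3 = rng.normal(size=(2, 5))
x = rng.normal(size=3)

# Passing x through the three layers in turn...
stacked = W3 @ (W2 @ (W1 @ x))
# ...matches one layer whose weight matrix is the product of the three.
collapsed = (W3 @ W2 @ W1) @ x
layers_collapse = np.allclose(stacked, collapsed)


def relu(z):
    """Element-wise ReLU activation."""
    return np.maximum(z, 0.0)


# Tiny hand-picked two-layer net (1 -> 2 -> 1) to show what ReLU changes.
A = np.array([[1.0], [-1.0]])
B = np.array([[1.0, 1.0]])


def with_relu(v):
    # relu(v) + relu(-v), i.e. the absolute value of v
    return (B @ relu(A @ np.array([v])))[0]


def without_relu(v):
    # B @ A is the 1x1 zero matrix, so the output is 0 for every input
    return (B @ (A @ np.array([v])))[0]


print(layers_collapse)
print(with_relu(2.0), with_relu(-2.0))
print(without_relu(2.0), without_relu(-2.0))
```

With these particular weights, the ReLU version computes |v|, which no single linear layer can represent, while the version without the activation reduces to multiplying by a single matrix (here the zero matrix).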