Submitted by MLNoober t3_xuogm3 in MachineLearning
MrFlufypants t1_iqx6bhc wrote
Reply to comment by ZestyData in [D] Why restrict to using a linear function to represent neurons? by MLNoober
The activation functions are key. A linear combination of linear combinations is provably just another linear combination, so 10 stacked linear layers collapse to a single layer, which can only represent linear functions. The activation functions destroy that linearity and are the key ingredient there.
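A minimal NumPy sketch of this point (the matrices and the ReLU choice are just illustrative): two weight matrices composed without an activation are exactly equivalent to one matrix, while inserting a nonlinearity breaks that collapse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation: x -> W2 @ (W1 @ x)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Composing the two linear layers...
two_layer = W2 @ (W1 @ x)

# ...is exactly one linear layer whose weights are the product W2 @ W1.
one_layer = (W2 @ W1) @ x
assert np.allclose(two_layer, one_layer)

# A nonlinearity (ReLU here) between the layers breaks the collapse:
# in general no single matrix W gives W @ x == W2 @ relu(W1 @ x) for all x.
relu = lambda z: np.maximum(z, 0)
nonlinear_out = W2 @ relu(W1 @ x)
```

The depth of a purely linear network therefore adds no expressive power; only the interleaved nonlinearities do.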