Submitted by MLNoober t3_xuogm3 in MachineLearning
SleekEagle t1_ir0hr1m wrote
The outputs of neurons are passed through a nonlinearity, which is essential to any complex learning process. If we didn't do this, the NN would be a composition of affine functions, which is itself just a single affine function (pretty boring).
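For concreteness, here's a minimal NumPy sketch of that collapse (the layer sizes and random weights are just made up for illustration): two stacked linear layers with no nonlinearity in between are exactly equivalent to one linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two affine layers with no nonlinearity in between
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)

# Stacked "network": layer 2 applied to the output of layer 1
deep = W2 @ (W1 @ x + b1) + b2

# Equivalent single affine layer: W = W2 W1, b = W2 b1 + b2
W, b = W2 @ W1, W2 @ b1 + b2
shallow = W @ x + b

print(np.allclose(deep, shallow))  # True: the extra depth bought us nothing
```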
As for why we choose to operate on inputs with an affine transformation before putting them through a nonlinearity, I see two reasons. The first is that linear transformations are well understood and easy to reason about theoretically. The second is that computers (GPUs in particular) are very good at matrix multiplication, so we do the "heavy lifting" with matrix multiplies and then just pass the result through a cheap elementwise nonlinearity, so that learning doesn't collapse into the boring linear case above.
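To show the pattern concretely, here's a quick sketch of a single dense layer (I'm picking ReLU as the nonlinearity just for illustration; the original point holds for any nonlinearity):

```python
import numpy as np

def dense_layer(X, W, b):
    """One standard layer: affine transform, then elementwise nonlinearity.

    The matmul (X @ W) is the GPU-friendly heavy lifting; the ReLU is a
    cheap elementwise op that makes the whole layer nonlinear.
    """
    return np.maximum(X @ W + b, 0.0)  # ReLU

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))                  # batch of 32 inputs
W, b = rng.normal(size=(3, 4)), np.zeros(4)   # layer parameters
H = dense_layer(X, W, b)                      # (32, 4) activations
```

Almost all of the FLOPs here are in the single `X @ W`, which is exactly the kind of dense matrix multiply GPUs are built for.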
Just my 2 cents, happy for input/feedback!