Submitted by currentscurrents t3_10n5e8z in MachineLearning
One problem with distributed learning with backprop is that the first layer can't update its weights until the computation has travelled all the way down to the last layer and then backpropagated back up. If all your layers are on different machines connected by a high-latency internet connection, this round trip takes a long time.
In forward-forward learning, updates are local - each layer learns independently and only needs to communicate with the layers directly above and below it.
The results are almost-but-not-quite as good as backprop. But each layer can immediately update its weights based only on the activations it received from the previous layer. Network latency no longer matters; the limit is just the bandwidth of the slowest machine.
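To make the "local updates" idea concrete, here's a minimal NumPy sketch in the spirit of forward-forward learning. Each layer raises its own "goodness" (sum of squared activations) on positive samples and lowers it on negative ones, using only its own input and output - no gradient ever flows backward between layers. All shapes, hyperparameters, and the toy data are my own assumptions, not anything from the post or Hinton's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class FFLayer:
    """One layer trained purely locally (forward-forward style sketch).
    It nudges W so that 'goodness' (sum of squared ReLU activations)
    rises for positive samples and falls for negative ones."""
    def __init__(self, n_in, n_out, lr=0.03, threshold=2.0):
        self.W = rng.normal(0.0, 0.1, (n_in, n_out))
        self.lr = lr
        self.threshold = threshold  # hypothetical goodness threshold

    def forward(self, x):
        # Normalize so only the input's direction carries information
        # to this layer (keeps goodness from trivially leaking forward).
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return np.maximum(0.0, x @ self.W)

    def local_update(self, x, positive):
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        h = np.maximum(0.0, x @ self.W)
        goodness = (h ** 2).sum(axis=1)
        # p = probability this sample is "positive" under this layer
        p = 1.0 / (1.0 + np.exp(-(goodness - self.threshold)))
        label = 1.0 if positive else 0.0
        # Local gradient ascent on log-likelihood: coefficient is
        # (label - p), and d(goodness)/dh = 2h (ReLU mask is implicit).
        grad_h = (label - p)[:, None] * 2.0 * h
        self.W += self.lr * x.T @ grad_h / len(x)
        return h  # output for the next layer; no backward pass needed

# Two layers updating independently - each could live on its own machine,
# since an update needs only the previous layer's output, never a
# backpropagated signal from later layers.
layers = [FFLayer(8, 16), FFLayer(16, 16)]
pos = rng.normal(1.0, 0.5, (32, 8))   # toy "positive" data
neg = rng.normal(0.0, 0.5, (32, 8))   # toy "negative" data

for _ in range(50):
    x_pos, x_neg = pos, neg
    for layer in layers:
        x_pos = layer.local_update(x_pos, positive=True)
        x_neg = layer.local_update(x_neg, positive=False)

good_pos = (layers[0].forward(pos) ** 2).sum(axis=1).mean()
good_neg = (layers[0].forward(neg) ** 2).sum(axis=1).mean()
print(good_pos, good_neg)
```

Note how `local_update` returns its output and is done - once layer 1 hands its activations to layer 2, it can start on the next batch immediately, which is exactly why latency stops mattering in the distributed setting described above.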
[deleted] t1_j670cvp wrote
Commenting because I’m also interested.