>Cool! Why do you think that, for the base FF, the memory requirement keeps increasing with the number of layers?
Hi u/ainap__! The memory usage of the forward-forward algorithm does increase with the number of layers, but significantly less than for backpropagation. The reason is that for forward-forward the increase comes only from the number of parameters of the network: each layer contains a 2000x2000 weight matrix (~4M parameters), which, when trained with the Adam optimizer (weights, gradients, and the two moment estimates, all in fp32), occupies approximately 64 MB. The difference in total memory between n_layers=2 and n_layers=47 is approximately 2.8 GB, which corresponds to 64 MB * 45 layers.
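As a quick back-of-the-envelope check (just a sketch of the arithmetic, assuming fp32 tensors and standard Adam state, not code from the repo):

```python
# Rough per-layer memory estimate for a 2000x2000 fully connected layer
# trained with Adam (assumption: fp32, standard Adam keeping exp_avg and exp_avg_sq).
n_params = 2000 * 2000        # ~4M weights per layer
bytes_per_value = 4           # fp32
copies = 4                    # weights + gradients + exp_avg + exp_avg_sq
per_layer_bytes = n_params * bytes_per_value * copies

print(f"per layer: {per_layer_bytes / 1e6:.0f} MB")             # ~64 MB
print(f"45 extra layers: {45 * per_layer_bytes / 1e9:.2f} GB")  # ~2.88 GB, i.e. roughly the 2.8 GB observed
```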
Hi u/kevin_malone_bacon, I saw that repo, very good work by Mohammad! But to collect these results I needed a full implementation of the paper: for instance, Mohammad's implementation includes neither the recurrent network for MNIST nor the NLP benchmark.
Another thing I wanted to test was concatenating the one-hot representation of the labels in the FF baseline instead of replacing the values of the first 10 pixels, so that I could apply the same network to new datasets in the future.
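For reference, here is a minimal sketch of the two label-embedding strategies (the function names and shapes are illustrative, not taken from the repo): the overlay used in Hinton's paper overwrites the first 10 pixel values with a one-hot label, while the concatenation variant appends the one-hot vector to the flattened image, so the input no longer needs 10 "spare" pixels and the same scheme works on other datasets.

```python
import torch
import torch.nn.functional as F

def overlay_label(x: torch.Tensor, y: torch.Tensor, num_classes: int = 10) -> torch.Tensor:
    """Replace the first `num_classes` pixels of each flattened image with a one-hot label."""
    x = x.clone()
    x[:, :num_classes] = F.one_hot(y, num_classes).float()
    return x

def concat_label(x: torch.Tensor, y: torch.Tensor, num_classes: int = 10) -> torch.Tensor:
    """Append a one-hot label to each flattened image, keeping all pixel values intact.
    The first layer's input size grows by `num_classes`."""
    return torch.cat([x, F.one_hot(y, num_classes).float()], dim=1)

# Example with MNIST-shaped inputs: a batch of 32 flattened 28x28 images
x = torch.rand(32, 784)
y = torch.randint(0, 10, (32,))
print(overlay_label(x, y).shape)  # torch.Size([32, 784])
print(concat_label(x, y).shape)   # torch.Size([32, 794])
```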