galaxy_dweller OP t1_j13sd41 wrote

>Cool! Why do you think that for the base FF the memory requirement keeps increasing with the number of layers?

Hi u/ainap__! The memory usage of the forward-forward algorithm does increase with the number of layers, but significantly less than with backpropagation. The reason is that the increase for forward-forward is related only to the number of parameters of the network: each layer contains 2000x2000 parameters, which, when trained with the Adam optimizer, occupy approximately 64 MB. The total difference in memory usage between n_layers=2 and n_layers=47 is approximately 2.8 GB, which corresponds to 64 MB * 45 layers.
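
A quick back-of-the-envelope check of that arithmetic (a sketch only, assuming float32 weights and the usual four tensors per parameter with Adam: weight, gradient, and the two moment buffers):

```python
# Rough per-layer memory estimate for FF trained with Adam (float32)
bytes_per_float = 4
params_per_layer = 2000 * 2000        # one 2000x2000 fully connected layer
states_per_param = 4                  # weight + gradient + Adam m + Adam v

per_layer_mb = params_per_layer * states_per_param * bytes_per_float / 1e6
print(f"per layer: {per_layer_mb:.0f} MB")          # ~64 MB

extra_layers = 47 - 2
print(f"n_layers=2 -> 47: +{per_layer_mb * extra_layers / 1e3:.2f} GB")  # ~2.8-2.9 GB
```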


galaxy_dweller OP t1_j13ri6w wrote

Hi u/kevin_malone_bacon, I saw that repo, very nice work by Mohammad! But to collect the results I needed the full implementation of the paper. For instance, Mohammad's implementation includes neither the recurrent network for MNIST nor the NLP benchmark.
Another thing I wanted to test was concatenating the one-hot representation of the labels in the FF baseline instead of replacing the values of the first 10 pixels, so that I could apply the same network to new datasets in the future. A minimal sketch of the two strategies is below.
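
To make the difference concrete, here is a minimal sketch of the two label-embedding strategies (function names and shapes are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def embed_label_overwrite(x, y, num_classes=10):
    """Hinton-style embedding: replace the first `num_classes` pixel
    values of the flattened image with the one-hot label."""
    x = x.clone()
    x[:, :num_classes] = F.one_hot(y, num_classes).float()
    return x

def embed_label_concat(x, y, num_classes=10):
    """Alternative discussed above: append the one-hot label to the input,
    leaving the original pixels untouched (input dim grows by num_classes)."""
    return torch.cat([x, F.one_hot(y, num_classes).float()], dim=1)

# hypothetical usage with a batch of flattened 28x28 MNIST images
x = torch.rand(32, 784)
y = torch.randint(0, 10, (32,))
print(embed_label_overwrite(x, y).shape)  # torch.Size([32, 784])
print(embed_label_concat(x, y).shape)     # torch.Size([32, 794])
```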
