Mental-Swordfish7129 t1_j2twm92 wrote on January 3, 2023 at 10:37 PM

Reply to comment by t98907 in [R] Do we really need 300 floats to represent the meaning of a word? Representing words with words - a logical approach to word embedding using a self-supervised Tsetlin Machine Autoencoder. by olegranmo

This is the big deal. Interpretability is so important and I think it will only become more desirable to understand the details of these models we're building. This has been an important design criterion for me as well. I feel like I have a deep intuitive understanding of the models I've built recently and it has helped me improve them rapidly.

currentscurrents t1_j2uwlrh wrote on January 4, 2023 at 2:47 AM

I think interpretability will help us build better models too. For example, in this paper they deeply analyzed a model trained to do a toy problem - addition mod 113.

They found that it was actually working by doing a Discrete Fourier Transform to turn the numbers into sinewaves. Sinewaves are great for gradient descent because they're easily differentiable (unlike modular addition on the natural numbers, which is not differentiable), and if you choose the right frequency it'll repeat every 113 numbers. The modular addition algorithm worked by doing a bunch of addition and multiplication operations on these sinewaves, which gave the same result as modular addition.

This lets you answer an important question: why didn't the network generalize to moduli other than 113? Well, the frequency of the sinewaves was hardcoded into the network, so it couldn't work for any other modulus.

This opens the possibility of doing neural network surgery: changing the frequency so it works with any modulus.
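To make the sinewave trick concrete, here's a rough sketch in plain Python. It is not the network from the paper, just the same kind of trig-identity arithmetic done by hand. The function name `mod_add_via_sinewaves` and the particular frequencies `(1, 2, 5)` are my own assumptions for illustration; a trained network would pick its own set.

```python
import math

def mod_add_via_sinewaves(a, b, p=113, freqs=(1, 2, 5)):
    """Return (a + b) % p using only sums and products of sines and cosines.

    Hypothetical illustration: the frequencies are arbitrary choices,
    not values taken from any trained model.
    """
    scores = [0.0] * p
    for k in freqs:
        w = 2 * math.pi * k / p  # this sinewave repeats every p integers
        # Angle-addition identities give cos(w(a+b)) and sin(w(a+b))
        # from the separate waves for a and b.
        cos_ab = math.cos(w * a) * math.cos(w * b) - math.sin(w * a) * math.sin(w * b)
        sin_ab = math.sin(w * a) * math.cos(w * b) + math.cos(w * a) * math.sin(w * b)
        for c in range(p):
            # This equals cos(w(a+b-c)), which hits its maximum of 1
            # exactly when c == (a + b) mod p.
            scores[c] += cos_ab * math.cos(w * c) + sin_ab * math.sin(w * c)
    # The candidate with the highest total score is the answer.
    return max(range(p), key=scores.__getitem__)
```

In this toy version the "surgery" is just passing a different `p`, since the frequency is computed from it: `mod_add_via_sinewaves(5, 9, p=7)` picks out `(5 + 9) % 7`. In a real trained network the frequency is baked into the weights, which is why it would take actual editing to change it.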

Mental-Swordfish7129 t1_j2v20d2 wrote on January 4, 2023 at 3:27 AM

That's amazing. We probably haven't fully realized the great powers of analysis we have available using Fourier transform and wavelet transform and other similar strategies.

[deleted] t1_j2zn5o5 wrote on January 5, 2023 at 1:21 AM

I think that's primarily how neural networks do their magic, really. It's frequencies and probabilities all the way down.

Mental-Swordfish7129 t1_j310xxm wrote on January 5, 2023 at 8:46 AM

Yes! I'm currently playing around with modifying a Kuramoto model to function as a neural network and it seems very promising.

[deleted] t1_j3152ys wrote on January 5, 2023 at 9:41 AM

Wellllll that seems cool as hell... Seems like steam punk neuroscience hahaha. I love it!