Mental-Swordfish7129 t1_j2y3l18 wrote on January 4, 2023 at 7:27 PM

Reply to comment by maizeq in [R] Do we really need 300 floats to represent the meaning of a word? Representing words with words - a logical approach to word embedding using a self-supervised Tsetlin Machine Autoencoder. by olegranmo

>I usually see attention in PP implemented, conceptually at least, as variance parameterisation/optimisation over a continuous space.

Continuous spaces are simply not necessary for what I'm doing. I avoid infinite precision because there is little need for precision beyond a certain threshold.

Also, I'm just a regular guy. I do this in my limited spare time, and I only have relatively weak computational resources and hardware. I'm trying to be more efficient anyway, like the brain. Staying discrete makes it all very efficient because there is not a floating-point operation in sight.

A discrete space works just fine, and there is no ambiguity about what a particular index of the space represents. In a continuous space, you'd have to worry that something has been truncated or rounded away.
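
To make the rounding point concrete, here is a minimal Python sketch. The bit-set encoding and the `encode` and `overlap` helpers are assumptions made up for illustration; they are not the actual implementation discussed in this thread.

```python
# Illustrative only: the encoding and helper names are hypothetical.

# Continuous side: binary floating point cannot represent 0.1 exactly,
# so even a simple sum picks up rounding error.
float_sum = 0.1 + 0.2
float_exact = (float_sum == 0.3)  # False

# Discrete side: a state is a set of active indices packed into one int.
def encode(active_indices):
    """Pack a collection of active indices into a single integer bit set."""
    bits = 0
    for i in active_indices:
        bits |= 1 << i
    return bits

def overlap(a, b):
    """Count the indices active in both bit sets, using integer ops only."""
    return bin(a & b).count("1")

x = encode([1, 4, 7])
y = encode([4, 7, 9])
shared = overlap(x, y)  # 2 (indices 4 and 7)
```

Every index is either set or not, so the comparison is exact and nothing can be rounded away.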

Idk. Maybe my reasons are ridiculous.