Mental-Swordfish7129 t1_j2y3l18 wrote
Reply to comment by maizeq in [R] Do we really need 300 floats to represent the meaning of a word? Representing words with words - a logical approach to word embedding using a self-supervised Tsetlin Machine Autoencoder. by olegranmo
>I usually see attention in PP implemented, conceptually at least, as variance parameterisation/optimisation over a continuous space.
Continuous spaces are simply not necessary for what I'm doing. I avoid infinite precision because there is little need for precision beyond a certain threshold.
Also, I'm just a regular guy. I do this in my limited spare time, and I only have relatively weak computational resources and hardware. I'm trying to be more efficient anyway, like the brain. Sticking to a discrete space makes it all very efficient because there is not a floating-point operation in sight.
Discrete space works just fine and there is no ambiguity possible for what a particular index of the space represents. In a continuous space, you'd have to worry that something has been truncated or rounded away.
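A minimal sketch of that idea, assuming states are stored as sets of active indices packed into integer bitmasks. The names `encode`, `overlap`, and `mismatch` are hypothetical, chosen only for illustration, and are not taken from any actual implementation. Every operation is integer-only, and each bit position means exactly one thing:

```python
def encode(active_indices):
    """Pack a collection of active indices into one integer bitmask."""
    mask = 0
    for i in active_indices:
        mask |= 1 << i  # bit i set means "index i is active"
    return mask


def overlap(a, b):
    """Count the indices active in both masks (bitwise AND, then popcount)."""
    return bin(a & b).count("1")


def mismatch(a, b):
    """Count the indices active in exactly one mask (bitwise XOR, then popcount)."""
    return bin(a ^ b).count("1")


# Hypothetical example: a predicted state vs. an observed state.
predicted = encode([1, 4, 7, 9])
observed = encode([1, 4, 8, 9])

print(overlap(predicted, observed))   # 3 (indices 1, 4, 9 agree)
print(mismatch(predicted, observed))  # 2 (indices 7 and 8 differ)
```

Nothing here can be rounded or truncated: two masks either share a bit or they don't.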
Idk. Maybe my reasons are ridiculous.