banmeyoucoward t1_jdhg7kt wrote
I'd bet that screen recordings + mouse clicks + keyboard inputs made their way into the training data too.
nmkd t1_jdhmgpm wrote
Nope, it's multimodal in terms of understanding language and images. It wasn't trained on mouse movement because that's neither language nor imagery.
Jean-Porte t1_jdjagqg wrote
> use 2 images
> movement
> boom
snylekkie t1_jdo5afc wrote
Absolutely mental