nmkd t1_jdhmgpm wrote on March 24, 2023 at 1:40 PM

Reply to comment by banmeyoucoward in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

Nope, it's multimodal in terms of understanding language and images. It wasn't trained on mouse movement because that's neither language nor imagery.

Jean-Porte t1_jdjagqg wrote on March 24, 2023 at 8:09 PM

> use 2 images
> movement
> boom

snylekkie t1_jdo5afc wrote on March 25, 2023 at 9:38 PM

Absolutely mental