Submitted by Angry_Grandpa_ t3_y92cl1 in singularity
newDeckardCain t1_it3ihks wrote
This is something interesting that stability.ai should do. A further interesting iteration would be to associate an image (i.e., the current frame in the video) with each token, which might prompt the model to also form a world model.
Like what Yann LeCun has been advocating for.
visarga t1_it4qoq7 wrote
After text, image, and video (+ audio), I think we've got all the bases covered. Nobody can claim AI is not grounded anymore. And with this grounding comes a nuanced, semantic understanding of the world. It's like an upload, but not of a person: the whole culture gets uploaded at once.