
[deleted] t1_iuu6k35 wrote

I think most of what they demoed is in the phase of "technically possible, but not consumer-ready yet".

Like, Codec Avatars. They initially pulled off 1.0 with a big camera-sphere. Neat, but not practical - we can't have every person visit a commercial camera-sphere to get an avatar.

So then they figured out how to do it in a way similar to FaceID: take a video of your face from a bunch of angles with a smartphone, run a bunch of photogrammetry post-processing on it, and build a map of the user's face. Consumers can do that with devices they already have. I think they've said the processing still takes many hours, and Codec 2.0 still requires the elongated headset the other presenter was wearing to animate the mouth properly, but that's what's coming for consumers. Now that they're sure it's technically possible, they can start optimizing toward that very desirable endpoint, making the process faster and easier.
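The geometric core of that photogrammetry step is recovering depth from how features shift between views. As a toy illustration only (not Meta's actual pipeline, and all numbers are made up), here's the simplest rectified-stereo case, where depth falls straight out of the pixel disparity between two frames:

```python
# Toy sketch of the geometry behind multi-view face capture:
# in a rectified stereo pair, depth = focal_length * baseline / disparity.
# This is an illustrative simplification, not any vendor's pipeline.

def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Depth in meters of a feature matched across two rectified views."""
    if disparity_px <= 0:
        raise ValueError("feature must shift between the two views")
    return focal_px * baseline_m / disparity_px

# Hypothetical numbers: two phone frames ~6 cm apart, 1000 px focal
# length, and a nose-tip feature that shifts 20 px between frames.
print(depth_from_disparity(focal_px=1000, baseline_m=0.06,
                           disparity_px=20))  # prints 3.0 (meters)
```

A full photogrammetry pass repeats this kind of triangulation for thousands of matched features across many frames, then fits a mesh to the resulting point cloud - which is part of why the processing takes hours.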

Now, they also have to combine this stuff with high-res environments to avoid it getting too uncanny; you don't want your high-res avatar standing in a cartoon environment. This is where item scanning comes in. It starts small, using the same basic technology as face scanning, but it ends with a user being able to digitally import a whole room, or an intersection of a major city, or whatever.

Luckily, game engines and hardware are "cooperating" with this timeline. You can look at Unreal Engine 5 demos, like the Matrix city demo or the train station demo, to see where environments will be in the near future. Intel and Nvidia are also constantly out there showing new real-time raytracing demos as lighting continues to be optimized.

> I can't imagine the connection between what we got and what they have in the labs.

If I were to hazard a guess, it's partly them struggling to normalize/introduce it to people, and partly shipping an MVP so they can observe how people actually use it, iterating as they discover what the real sticking points of the tech are. I think everyone knows that VR has an "input mechanism problem" on several fronts, and you can see them moving toward fixing it.

From a "hands" perspective, they introduced tracked controllers as the obvious MVP, but they're clearly also examining the minimum hand tracking needed to give a user complex, useful input options in a way that's unobtrusive and intuitive, using on-device processing of small motor movements.

You instinctively want to "move" in VR, but that isn't compatible with the average person's real environment. If you virtualize movement instead, you get an inner-ear disconnect, and that makes people sick. Many companies, including Meta, are choosing native AR as the short-to-medium-term solution, marrying the virtual and real environments together so the user can navigate their real surroundings safely - since nobody but enthusiasts is willing or able to set aside a "VR room" to facilitate safe movement.
