Submitted by alik31239 t3_117blae in MachineLearning
marixer t1_j9b0x65 wrote
The step you're missing there is recovering the camera positions and orientations with something like COLMAP, which estimates them by extracting features from the images, matching them across views, and triangulating. That data is then used alongside the RGB images to train the NeRF.
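Concretely, once the poses are registered, training comes down to shooting a ray through each pixel from its camera and supervising the NeRF's rendered color against the photo. A minimal NumPy sketch of that ray-generation step (the intrinsics/pose conventions here are my own assumptions, not something from this thread):

```python
# Turn a registered camera pose and pixel grid into world-space rays for NeRF
# training. Assumes a simple pinhole camera looking down -z and a 4x4
# camera-to-world matrix, which is a common but not universal convention.
import numpy as np

def pixel_rays(H, W, focal, cam_to_world):
    """Return per-pixel ray origins and directions in world space."""
    i, j = np.meshgrid(np.arange(W), np.arange(H), indexing="xy")
    # Direction of each pixel in camera coordinates
    dirs = np.stack([(i - W / 2) / focal,
                     -(j - H / 2) / focal,
                     -np.ones_like(i, dtype=np.float64)], axis=-1)
    # Rotate into world space; the camera center is the origin of every ray
    rays_d = dirs @ cam_to_world[:3, :3].T
    rays_o = np.broadcast_to(cam_to_world[:3, 3], rays_d.shape)
    return rays_o, rays_d

# Example: a 640x480 camera with a 500px focal length at the world origin
o, d = pixel_rays(480, 640, 500.0, np.eye(4))
```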
buyIdris666 t1_j9ctyh8 wrote
Yup. NeRF just replaced the reconstruction step after you "register" all the camera positions using traditional algorithms, usually via COLMAP.
Not that this is a bad thing: existing algorithms are already good at estimating camera positions and parameters. It was the 3D reconstruction step that was previously lacking.
For anyone wanting to try this, I suggest using NeRF-W. The original NeRF required extremely accurate camera parameter estimates that you're not going to get with a phone camera and COLMAP. NeRF-W can make some fine adjustments as it runs, and it even works decently when reconstructing scenes from random internet photos.
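The "fine adjustments as it runs" idea is roughly: keep a small learnable correction per image and optimize it jointly with the NeRF weights, so slightly-off COLMAP poses get nudged during training. A rough PyTorch sketch of that general idea (the names here are hypothetical; NeRF-W itself also relies on per-image appearance/transient embeddings, which are left out):

```python
import torch
import torch.nn as nn

class PoseRefiner(nn.Module):
    """Hypothetical per-image pose correction, optimized jointly with the NeRF."""
    def __init__(self, num_images):
        super().__init__()
        # One small rotation (axis-angle) and translation delta per training image
        self.delta_r = nn.Parameter(torch.zeros(num_images, 3))
        self.delta_t = nn.Parameter(torch.zeros(num_images, 3))

    def forward(self, image_idx, cam_to_world):
        # First-order rotation update R_delta = I + skew(delta_r), which is a
        # reasonable approximation while the learned correction stays small
        r = self.delta_r[image_idx]
        zero = r.new_zeros(())
        skew = torch.stack([
            torch.stack([zero, -r[2], r[1]]),
            torch.stack([r[2], zero, -r[0]]),
            torch.stack([-r[1], r[0], zero]),
        ])
        R_delta = torch.eye(3, device=r.device) + skew
        top = torch.cat([
            R_delta @ cam_to_world[:3, :3],
            (cam_to_world[:3, 3] + self.delta_t[image_idx]).unsqueeze(-1),
        ], dim=1)
        bottom = torch.tensor([[0.0, 0.0, 0.0, 1.0]], device=r.device)
        return torch.cat([top, bottom], dim=0)

# Trained together with the NeRF itself, e.g.
# optim = torch.optim.Adam(list(nerf.parameters()) + list(refiner.parameters()), lr=5e-4)
```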
The workflow is: run COLMAP to register the camera positions the pictures were taken from and estimate the camera parameters, then export those into the NeRF model. Most of the NeRF repos are already set up to make this easy.
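If you drive COLMAP from its command line, the registration step is essentially three commands (feature_extractor, exhaustive_matcher, mapper). A sketch with placeholder paths; whichever NeRF repo you use will have its own converter for the resulting sparse model:

```python
import os
import subprocess

db, images, sparse = "colmap.db", "images/", "sparse/"  # placeholder paths
os.makedirs(sparse, exist_ok=True)

# Detect and describe features in every image
subprocess.run(["colmap", "feature_extractor",
                "--database_path", db, "--image_path", images], check=True)
# Match features between all image pairs
subprocess.run(["colmap", "exhaustive_matcher", "--database_path", db], check=True)
# Incremental SfM: triangulate points and recover camera poses and intrinsics
subprocess.run(["colmap", "mapper",
                "--database_path", db, "--image_path", images,
                "--output_path", sparse], check=True)
```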
This paper is a good overview of how to build a NeRF from random unaligned images. They did it using frames from a sitcom, but you could take a similar approach to NeRF almost anything: https://arxiv.org/abs/2207.14279