still_tyler t1_iz587yc wrote on December 6, 2022 at 4:02 PM

What's the best way to go about a multiclass classification problem in which 3 of the features are x, y, z coordinates, where each row has a single location and a single outcome?

I'd like the model to take advantage of the spatial correlation in the outcomes (e.g. one record close to another in x, y, z will likely have a similar outcome). The spatial components make me want to use a CNN, but each input being just a 1x3 vector rather than something bigger makes me think that's not possible?

(fwiw, XGBoost has the best predictive accuracy so far. I tried a Gaussian process too, but XGBoost still beat it. I was thinking there might be a NN approach, but Google has not been fruitful.)

trnka t1_iz9ol1s wrote on December 7, 2022 at 2:44 PM

> one record close to another in x, y, z will likely have a similar outcome

That sounds a lot like k-nearest neighbors, or an SVM with an RBF kernel. Might be worth giving those a shot. That said, XGBoost is effective on a wide range of problems, so I wouldn't be surprised if it's tough to beat. Under the hood its trees split on one feature at a time, so it's probably learning approximate axis-aligned bounding boxes for your classes.
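To make the nearest-neighbor idea concrete, here's a minimal sketch in plain Python. The toy coordinates, the labels, and the `knn_predict` helper are all made up for illustration; on real data you'd more likely reach for a library implementation and scale the features first.

```python
import math
from collections import Counter

def knn_predict(train_xyz, train_labels, query_xyz, k=3):
    """Majority vote among the k training points closest to query_xyz."""
    # Euclidean distance from the query to every training point.
    dists = [(math.dist(p, query_xyz), label)
             for p, label in zip(train_xyz, train_labels)]
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two spatial clusters with made-up class labels.
train_xyz = [(0.0, 0.0, 0.0), (0.1, 0.2, 0.0), (0.2, 0.1, 0.1),
             (5.0, 5.0, 5.0), (5.1, 4.9, 5.2), (4.8, 5.1, 5.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_xyz, train_labels, (0.1, 0.1, 0.1)))
print(knn_predict(train_xyz, train_labels, (5.0, 5.0, 4.9)))
```

The prediction depends only on which labeled points are nearby, which is exactly the "close in x, y, z means similar outcome" assumption.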

I haven't heard of CNNs being used for this kind of problem. I've mostly seen CNNs used for spatial processing when the data is represented differently, for example if each input were a 3D shape represented by a 3D tensor rather than coordinates.

still_tyler t1_iz9prl7 wrote on December 7, 2022 at 2:53 PM

Yeah, XGB still outperforms kNN and SVM here. There's a bunch of other non-coordinate covariates that contribute and XGB just kicks butt in this case. Fair enough, thanks for the response!

csreid t1_izbgnyy wrote on December 7, 2022 at 9:51 PM

> The spatial components make me want to use a CNN, but each input being just a 1x3 vector rather than something bigger makes me think that's not possible?

The point of the convolution is to efficiently capture information from surrounding pixels when considering a single pixel. Back in the pre-DL olden days, computer vision stuff still involved convolutions, they were just handcrafted -- we had a lot of signal processing machinery we could use to, e.g., detect edges and such. In your case, each input is a single point, so you don't really have anything to convolve over.
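For a sense of what "handcrafted" means here, this is a rough sketch of a Sobel-style edge filter slid over a tiny grid. The image and the `filter2d` helper are made up for illustration. Note that it needs a grid of neighboring values to work on, which a lone (x, y, z) row doesn't give you.

```python
# Handcrafted 3x3 kernel that responds to left-to-right brightness changes.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def filter2d(image, kernel):
    """Slide the kernel over the image (no padding, no kernel flip, as CNN layers do)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the pixel neighborhood under the kernel.
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# Toy 4x6 image (hypothetical): dark on the left, bright on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

edges = filter2d(image, SOBEL_X)
for row in edges:
    print(row)
```

The output is nonzero only where the neighborhood straddles the dark-to-bright boundary. A CNN learns kernels like this instead of having them written by hand.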

You could try just feeding the coordinates into an MLP with the other covariates and it should be able to capture that spatial component.
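As a rough illustration of that suggestion, here's a tiny one-hidden-layer MLP in plain Python that takes the coordinates concatenated with one extra covariate. Everything in it is an assumption for the sake of the sketch: the synthetic clusters, the single noise covariate, the layer sizes, and the learning rate. In practice you'd use a proper framework and standardize the inputs.

```python
import math
import random

def init_mlp(n_in, n_hidden, n_out, rng):
    def layer(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W1": layer(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": layer(n_out, n_hidden), "b2": [0.0] * n_out}

def forward(params, x):
    """Return hidden activations and softmax class probabilities."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(params["W1"], params["b1"])]
    logits = [sum(w * hi for w, hi in zip(row, h)) + b
              for row, b in zip(params["W2"], params["b2"])]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return h, [e / total for e in exps]

def sgd_step(params, x, y, lr):
    """One stochastic gradient step on the cross-entropy loss."""
    h, probs = forward(params, x)
    d_logits = [p - (1.0 if k == y else 0.0) for k, p in enumerate(probs)]
    # Backpropagate into the hidden layer before W2 is updated.
    d_h = [sum(d_logits[k] * params["W2"][k][j] for k in range(len(d_logits)))
           * (1 - h[j] ** 2) for j in range(len(h))]
    for k, dk in enumerate(d_logits):
        for j in range(len(h)):
            params["W2"][k][j] -= lr * dk * h[j]
        params["b2"][k] -= lr * dk
    for j, dj in enumerate(d_h):
        for i in range(len(x)):
            params["W1"][j][i] -= lr * dj * x[i]
        params["b1"][j] -= lr * dj

def mean_loss(params, X, Y):
    return sum(-math.log(forward(params, x)[1][y]) for x, y in zip(X, Y)) / len(X)

def accuracy(params, X, Y):
    hits = 0
    for x, y in zip(X, Y):
        probs = forward(params, x)[1]
        hits += probs.index(max(probs)) == y
    return hits / len(X)

# Synthetic data (hypothetical): the class depends on which center a point is near.
rng = random.Random(0)
centers = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
X, Y = [], []
for _ in range(150):
    y = rng.randrange(len(centers))
    xyz = [c + rng.gauss(0, 0.3) for c in centers[y]]
    covariate = rng.gauss(0, 1)  # stand-in for a non-coordinate feature
    X.append(xyz + [covariate])
    Y.append(y)

params = init_mlp(n_in=4, n_hidden=8, n_out=3, rng=rng)
initial_loss = mean_loss(params, X, Y)
for epoch in range(50):
    for x, y in zip(X, Y):
        sgd_step(params, x, y, lr=0.1)
final_loss = mean_loss(params, X, Y)
print(initial_loss, final_loss, accuracy(params, X, Y))
```

The network never gets told that x, y, z are spatial. It just sees four numbers per row and has to learn the class regions from them, which is why it's not guaranteed to beat a tree model that does the same thing with axis-aligned splits.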