backtorealite

backtorealite t1_iy7z5lu wrote

But I feel like this is equivalent to a statistician telling me not to trust my XGBoost model with 99% accuracy while being fine with my linear model with 80% accuracy. If it works, it works. Unrealistic data transformations happen in all types of models, and as long as you aren’t just selecting the prettiest picture that you arrived at by chance, I see no problem with relying on an unsupervised transformation that may consist of some unrealistic transformations if it is still fundamentally highly effective at getting what you want. If I know my data has interaction and nonlinear effects but don’t know which variables have them, then a UMAP or t-SNE transformation to two dimensions seems like a perfectly reasonable option, and preferable to PCA in that situation. I feel like the problems you describe are mostly solved by just adjusting the parameters and making sure the clusters you find are robust to those alterations.
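One way to make "robust to those alterations" concrete is to cluster the embedding under several parameter settings and measure how well the resulting labelings agree. The sketch below is only an illustration under stated assumptions: the label lists are made up, the setting names are hypothetical, and the adjusted Rand index (ARI) is my choice of agreement measure, not something prescribed above. In practice the labels would come from clustering each UMAP or t-SNE embedding of the same points.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Agreement between two clusterings of the same points.

    1.0 means identical partitions (cluster names may differ);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same points")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: trivially identical

    # Count point pairs placed together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (together_both - expected) / (maximum - expected)


def min_pairwise_ari(runs):
    """Worst-case agreement across all pairs of parameter settings."""
    return min(
        adjusted_rand_index(runs[a], runs[b])
        for a, b in combinations(runs, 2)
    )


if __name__ == "__main__":
    # Hypothetical cluster labels for the same six points, obtained
    # under three made-up parameter settings (illustration only).
    runs = {
        "perplexity=10": [0, 0, 0, 1, 1, 1],
        "perplexity=30": [1, 1, 1, 0, 0, 0],  # same partition, renamed
        "perplexity=50": [0, 0, 1, 1, 1, 1],  # one point switches cluster
    }
    for a, b in combinations(runs, 2):
        print(a, "vs", b, round(adjusted_rand_index(runs[a], runs[b]), 3))
    print("worst-case agreement:", round(min_pairwise_ari(runs), 3))
```

A worst-case ARI that stays close to 1 across settings would support treating the clusters as stable; a low value suggests they depend on the parameter choice. What counts as "close enough" is a judgment call and is not settled by this sketch.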

1

backtorealite t1_iy6uivr wrote

But saying it’s good for visualization is equivalent to saying it’s good for decomposing data into a 2D framework. So either it’s completely useless and shouldn’t be used for visualization, or it has some utility in downstream analysis. It doesn’t really make sense to claim both that it’s good for visualization and that it’s useless for anything else. And we all know it’s not completely useless, so I think it’s a bit unfair to say it should only ever be used for visualization.

2