Simusid OP t1_jbu3q8m wrote on March 11, 2023 at 6:52 PM

Reply to comment by polandtown in [Discussion] Compare OpenAI and SentenceTransformer Sentence Embeddings by Simusid

Here is some explanation about UMAP axes and why they should usually be ignored: https://stats.stackexchange.com/questions/527235/how-to-interpret-axis-of-umap

Basically, it's because the projection is nonlinear, so the axes themselves carry no direct meaning.

onkus t1_jbwftny wrote on March 12, 2023 at 6:21 AM

Doesn’t this also make it essentially impossible to compare the two figures you’ve shown?

Thog78 t1_jbyh4w1 wrote on March 12, 2023 at 6:24 PM

What you're looking for when comparing UMAPs is whether the local relationships are the same. Try to recognize clusters and look at their neighbors, and at whether the clusters are distinct or not. Coloring the points by a much finer clustering computed on another reduction (typically PCA) helps with that. Without clustering, you can only try to recognize landmarks from their size and shape.
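To make "are the local relationships the same" a bit more concrete, here is a rough sketch of one way to put a number on it: for each point, take its k nearest neighbors in both 2D layouts and measure how much the two neighbor sets overlap. This is only an illustration, not what anyone in this thread actually ran. The function names (`knn_sets`, `neighbor_overlap`, `rotate`), the choice of Jaccard overlap, and the toy random data are all assumptions, and the brute-force neighbor search is only meant for small examples. It uses the standard library only, so the layouts are plain lists of (x, y) tuples rather than real UMAP output.

```python
import math
import random


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbors."""
    if not 1 <= k < len(points):
        raise ValueError("k must be at least 1 and smaller than the number of points")
    neighbors = []
    for i, p in enumerate(points):
        # Brute-force search: sort all other points by Euclidean distance to p.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def neighbor_overlap(emb_a, emb_b, k=5):
    """Mean Jaccard overlap of k-NN sets for two layouts of the same items.

    Both layouts must list the same items in the same order.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("both layouts must describe the same items in the same order")
    sets_a = knn_sets(emb_a, k)
    sets_b = knn_sets(emb_b, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)]
    return sum(scores) / len(scores)


def rotate(points, angle):
    """Rotate 2D points about the origin; this changes the axes, not the structure."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


# Toy data standing in for a 2D layout (hypothetical, not real embeddings).
rng = random.Random(0)
layout = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]

# Same structure, different axes: a rotated copy of the layout.
rotated = rotate(layout, math.pi / 3)

# Different structure: the same coordinates assigned to the wrong items.
shuffled = layout[:]
rng.shuffle(shuffled)

same_structure = neighbor_overlap(layout, rotated)
different_structure = neighbor_overlap(layout, shuffled)
print(f"rotated copy:  {same_structure:.2f}")
print(f"shuffled copy: {different_structure:.2f}")
```

The rotated copy should score at or very near 1 even though every coordinate has changed, while the shuffled copy should score close to 0. That is the sense in which two plots can be compared even though their axes cannot.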

polandtown t1_jbu56lb wrote on March 11, 2023 at 7:02 PM

Thanks!