
Simusid OP t1_jbsyp5n wrote

Yesterday I set up a paid account at OpenAI. I have been using the free sentence-transformers library and models for many months with good results. I compared the two by encoding 20K texts from this repo: https://github.com/mhjabreel/CharCnn_Keras. I did no preprocessing or cleanup of the input text. The OpenAI model is text-embedding-ada-002 and the SentenceTransformer model is all-mpnet-base-v2. The plots are simple UMAP(), with all defaults.

I also built a very generic classifier with 3 dense layers, nothing fancy. I trained it ten times on each set of embeddings, fitting with EarlyStopping and evaluating on held-out data. The average accuracy was 89% for the SentenceTransformer (HF) embeddings and 91.1% for OpenAI. This is not rigorous or conclusive, but for my purposes I'm happy sticking with SentenceTransformers. If I ever need to chase decimal points of performance, I will use OpenAI.

Edit - The second graph should be titled "SentenceTransformer" not HuggingFace.
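For anyone who wants to reproduce this, here is a minimal sketch of the pipeline. The model names are the ones above; the batching and the OpenAI() client usage are my assumptions about a current openai-python setup, and the text loading is a placeholder.

```python
import numpy as np
import umap
from openai import OpenAI
from sentence_transformers import SentenceTransformer

texts = ["placeholder one", "placeholder two"]  # in practice: the ~20K snippets from CharCnn_Keras

# SentenceTransformer embeddings (768-dim for all-mpnet-base-v2)
st_model = SentenceTransformer("all-mpnet-base-v2")
st_emb = st_model.encode(texts, show_progress_bar=True)

# OpenAI embeddings (1536-dim for text-embedding-ada-002),
# batched to stay under per-request input limits
client = OpenAI()  # reads OPENAI_API_KEY from the environment
oa_emb = []
for i in range(0, len(texts), 1000):
    resp = client.embeddings.create(model="text-embedding-ada-002",
                                    input=texts[i:i + 1000])
    oa_emb.extend(d.embedding for d in resp.data)
oa_emb = np.array(oa_emb)

# 2-D projections with all-default UMAP, one plot per embedding set
st_2d = umap.UMAP().fit_transform(st_emb)
oa_2d = umap.UMAP().fit_transform(oa_emb)
```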

79

ID4gotten t1_jbt63ni wrote

Maybe I'm being "dense", but what task was your network trained to accomplish? That wasn't clear to me from your description.

43

Simusid OP t1_jbt91tb wrote

My main goal was just to visualize the embeddings to see if they are grossly different. They are not; that is just a qualitative view. My second goal was to use the embeddings with a trivial supervised classifier. The dataset is labeled with four classes, so I made a generic network to see if there was any consistency in the training. Regardless of hyperparameters, the OpenAI embeddings seemed to always outperform the SentenceTransformer embeddings, slightly but consistently.

This was not meant to be rigorous. I did this to get a general feel of the quality of the embeddings, plus to get a little experience with the OpenAI API.
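For concreteness, here is a sketch of roughly what such a generic network looks like in Keras; the layer widths, optimizer, and patience are illustrative guesses, not my exact setup.

```python
from tensorflow import keras

def build_classifier(input_dim, num_classes=4):
    """Three dense layers, nothing fancy."""
    model = keras.Sequential([
        keras.layers.Input(shape=(input_dim,)),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# X: embedding matrix (768-dim ST or 1536-dim ada-002); y: integer labels 0-3
# model = build_classifier(X.shape[1])
# early = keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)
# model.fit(X_train, y_train, validation_split=0.1, epochs=100, callbacks=[early])
# model.evaluate(X_test, y_test)
```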

30

quitenominal t1_jbtr6g7 wrote

FWIW, this has also been my finding when comparing these two embeddings for classification tasks: better, but not enough to justify the cost.

8

polandtown t1_jbu2zqe wrote

Learning here, but how are your axes defined? Some kind of factor(s) or component(s) extracted from each individual embedding? Thanks for the visualization, it made me curious! Good work!

6

Simusid OP t1_jbu3q8m wrote

Here is some explanation about UMAP axes and why they should usually be ignored: https://stats.stackexchange.com/questions/527235/how-to-interpret-axis-of-umap

Basically, it's because the projection is nonlinear: distances and directions in a UMAP plot don't correspond to anything meaningful in the original embedding space, so the axes carry no interpretable units.

12

onkus t1_jbwftny wrote

Doesn’t this also make it essentially impossible to compare the two figures you’ve shown?

6

Thog78 t1_jbyh4w1 wrote

What you're looking for when comparing UMAPs is whether the local relationships are the same. Try to recognize clusters and look at their neighbors, and at whether clusters stay distinct or blend together. A much finer-grained clustering, computed on another reduction (typically PCA) and used to color the points, helps with that; without clustering, you can only try to recognize landmarks by their size and shape.
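Something along these lines, for example (the PCA dimensionality and cluster count are arbitrary illustrative choices, and the variable names reuse the embedding/UMAP sketch upthread):

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def pca_cluster_labels(embeddings, n_components=50, n_clusters=20):
    """Cluster in a PCA-reduced space, not in the 2-D UMAP plane."""
    reduced = PCA(n_components=n_components).fit_transform(embeddings)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(reduced)

# Color both UMAP plots by their PCA-space clusters, then compare which
# clusters sit next to which across the two figures.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
ax1.scatter(st_2d[:, 0], st_2d[:, 1], c=pca_cluster_labels(st_emb), s=2, cmap="tab20")
ax1.set_title("SentenceTransformer")
ax2.scatter(oa_2d[:, 0], oa_2d[:, 1], c=pca_cluster_labels(oa_emb), s=2, cmap="tab20")
ax2.set_title("OpenAI")
plt.show()
```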

2

Geneocrat t1_jbu4law wrote

Thanks for asking the seemingly obvious questions so that I don't have to wonder.

2

imaginethezmell t1_jbszsey wrote

OpenAI's max input is 8k tokens.

How about SentenceTransformer?

9

montcarl t1_jbtexjk wrote

This is an important point. The similar performance suggests that the text lengths in the 20k dataset were mostly within the SentenceTransformer max-length cutoff. It would be nice to confirm this, and also to run another test with longer examples; that test should show a larger performance gap.
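One way to sanity-check it, assuming the dataset is loaded in a list called texts: sentence-transformers exposes the truncation cutoff directly.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")
print(model.max_seq_length)  # 384 by default; longer inputs are silently truncated

# Fraction of the dataset that would be truncated
n_tokens = [len(model.tokenizer.tokenize(t)) for t in texts]
frac = sum(n > model.max_seq_length for n in n_tokens) / len(n_tokens)
print(f"{frac:.1%} of texts exceed the cutoff")
```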

10

Simusid OP t1_jbt13iy wrote

8K? I'm not sure what you're referring to.

3