Why are you using Euclidean distance? Use cosine distance. The former cares about vector magnitude, the latter doesn't. As a general rule of thumb when comparing vector embeddings, you don't care about magnitude; at best, it typically captures document length.
Do you have more than product titles, such as product descriptions? Where do you get the user queries from? Do you use a default tokenizer for BERT?
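To make the magnitude point concrete, here is a minimal sketch (the vectors are toy stand-ins for real embeddings): two vectors pointing in the same direction but with very different norms are far apart under Euclidean distance yet essentially identical under cosine distance.

```python
import numpy as np

def euclidean(a, b):
    return np.linalg.norm(a - b)

def cosine_distance(a, b):
    # 1 - cosine similarity; only the direction matters, not the magnitude
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Two embeddings with the same direction but different magnitudes,
# e.g. a short vs a long document about the same topic.
a = np.array([1.0, 2.0, 3.0])
b = 10 * a

print(euclidean(a, b))        # large, because the magnitudes differ
print(cosine_distance(a, b))  # ~0.0, because the direction is identical
```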
I have product brand, type, and color in addition to titles. Yes, I'll try cosine distance next. The user queries are just tests written by me, since there's no other way to get them short of A/B testing.
Thank you.
If you are looking for product-query similarity, you could try using a Word2Vec model. You can train a Word2Vec model on your dataset, and then use the model to find the most similar words for each product title and user query. This should give you a better understanding of the similarity between the two.
You can also try using an embedding-based approach, such as using an embedding layer in a neural network. This would enable you to learn more complex relationships between product titles and user queries.
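At its core, an embedding layer is just a trainable lookup table: one row per vocabulary token. A minimal NumPy sketch of the forward pass (the vocabulary and random weights here are assumptions; in a real network the table is learned end to end):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny vocabulary; in practice this is built from your product/query text.
vocab = {"red": 0, "blue": 1, "cotton": 2, "shirt": 3, "shoes": 4}
emb_dim = 8

# The embedding layer: one learnable row per token (random here for illustration).
embedding = rng.normal(size=(len(vocab), emb_dim))

def encode(tokens):
    # Look up each token's row and mean-pool into one fixed-size vector.
    ids = [vocab[t] for t in tokens]
    return embedding[ids].mean(axis=0)

def cosine_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = encode(["red", "shirt"])
title = encode(["red", "cotton", "shirt"])
print(cosine_sim(query, title))
```

With trained weights, the same lookup-and-pool structure (or a deeper two-tower network on top of it) is what lets the model capture richer query-title relationships.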
You could also try using a matrix factorization technique such as Singular Value Decomposition (SVD) or Non-Negative Matrix Factorization (NMF). These methods can help you to identify latent features in your dataset, which can be used to generate better recommendations.
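A small sketch of the SVD route using a toy term-count matrix (the matrix and its interpretation are assumptions; on real data you would build it from your titles, e.g. with TF-IDF): truncating to the top-k singular triplets projects each product into a k-dimensional latent space where similar products land close together.

```python
import numpy as np

# Toy product x term count matrix (rows: products, columns: vocabulary terms).
X = np.array([
    [2, 1, 0, 0],   # product about terms 0-1
    [1, 2, 0, 0],   # similar product
    [0, 0, 3, 1],   # product about terms 2-3
    [0, 0, 1, 3],
], dtype=float)

# Full SVD, then keep only the top-k singular triplets (truncated SVD).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
product_latent = U[:, :k] * s[:k]   # products in the k-dim latent space

# Products 0 and 1 should land close together, far from products 2 and 3.
print(np.round(product_latent, 2))
```

Queries can be folded into the same latent space (multiply by `Vt[:k].T`) and compared to products with cosine similarity.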