
A_Again t1_j716fo0 wrote

You could always correlate the existing weights to the existing classes in the dataset, wipe the lowest-N correlated weights from each layer, and add a new output with fresh weights. This could catastrophically hurt performance, but it would also guarantee you minimize the impact on the existing classes ...

I work with AI, but I can't guarantee this works, since you have no notion of how weights earlier in the network impact later layers....
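A rough sketch of the idea, not the exact method: "importance" here is plain weight magnitude standing in for the class-correlation score (which depends on your setup), and the helper names are made up for illustration.

```python
import torch
import torch.nn as nn

def prune_lowest_n(layer: nn.Linear, n: int) -> None:
    """Zero the n smallest-magnitude weights in this layer (ties included)."""
    if n < 1:
        return
    with torch.no_grad():
        flat = layer.weight.abs().flatten()
        threshold = flat.kthvalue(min(n, flat.numel())).values
        layer.weight[layer.weight.abs() <= threshold] = 0.0

def add_output_class(head: nn.Linear) -> nn.Linear:
    """Return a new classifier head with one extra output; old rows are copied over."""
    new_head = nn.Linear(head.in_features, head.out_features + 1)
    with torch.no_grad():
        new_head.weight[: head.out_features] = head.weight
        new_head.bias[: head.out_features] = head.bias
    return new_head

# Toy usage: prune every linear layer, then grow the 10-class head to 11.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
for m in model:
    if isinstance(m, nn.Linear):
        prune_lowest_n(m, n=100)
model[2] = add_output_class(model[2])
```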

1

A_Again t1_iu23iwf wrote

Hello friend, learning from the contrast between embeddings without supervised labels is usually referred to as "contrastive predictive coding". Contrastive because you're using contrast between embeddings, predictive because...well, you get it
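If it helps, here's a minimal sketch of the InfoNCE-style objective that CPC builds on; the function name and temperature value are just illustrative. Each item's positive is its paired view, and everything else in the batch serves as a negative.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """z1, z2: (batch, dim) embeddings of two views of the same items."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # pairwise cosine similarities
    labels = torch.arange(z1.size(0))    # positives sit on the diagonal
    return F.cross_entropy(logits, labels)
```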

For your consideration though, the term "triplet loss" applies to what you're describing...just in a supervised mode (quick sketch below). And dual-encoder retrieval systems are used for semantic search
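A minimal sketch of that supervised version using PyTorch's built-in nn.TripletMarginLoss; the toy encoder and batch sizes are made up. It pulls an anchor embedding toward a same-class positive and pushes it away from a different-class negative by at least the margin.

```python
import torch
import torch.nn as nn

embed = nn.Linear(128, 32)              # toy encoder, stands in for yours
loss_fn = nn.TripletMarginLoss(margin=1.0)

anchor   = embed(torch.randn(16, 128))  # batch of anchor inputs
positive = embed(torch.randn(16, 128))  # same-class examples
negative = embed(torch.randn(16, 128))  # different-class examples

loss = loss_fn(anchor, positive, negative)
loss.backward()
```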

I'm on a bus rn and gotta run. I hope these at least help you in your search :)

3