Submitted by seraphaplaca2 t3_122fj05 in MachineLearning
In the last few days I had a thought, and I don't know if it is possible or has already been done somewhere: is it possible to merge the weights of two transformer models the way people merge stable diffusion models? For example, could I merge BioBert and LegalBert and get a model that can do both?
tdgros t1_jdqbgqy wrote
The model merging offered by some stable diffusion UIs does not merge the weights of a network! It merges the denoising results for a single diffusion step from two different denoisers, which is very different!
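A minimal sketch of that per-step blending, assuming two diffusers-style UNets with matching input/output shapes (the function name and the fixed alpha blend are illustrative, not any particular UI's implementation):

```python
import torch

def merged_denoise_step(unet_a, unet_b, latents, timestep, cond, alpha=0.5):
    """Blend the noise predictions of two denoisers for one diffusion step.

    This averages the *outputs* of the two networks at each step,
    not their weights. Names and alpha are hypothetical.
    """
    with torch.no_grad():
        eps_a = unet_a(latents, timestep, encoder_hidden_states=cond).sample
        eps_b = unet_b(latents, timestep, encoder_hidden_states=cond).sample
    # Weighted average of the two predictions, not of the networks' parameters.
    return alpha * eps_a + (1.0 - alpha) * eps_b
```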
Merging the weights of two different models does not, in general, produce something functional, and it can only even be attempted for two models with exactly the same structure. It certainly does not "mix their functionality".
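For concreteness, here is a minimal sketch of what naively averaging two checkpoints would look like. The checkpoint names are placeholders, and the sketch assumes both models share exactly the same architecture and parameter names; even when that holds, the result is usually not a working blend of the two models.

```python
import torch
from transformers import AutoModel

# Placeholder checkpoint names: in practice BioBert and LegalBert already
# differ structurally (e.g. different vocabularies), so they would not
# even pass the shape check below.
model_a = AutoModel.from_pretrained("model-a-checkpoint")  # placeholder
model_b = AutoModel.from_pretrained("model-b-checkpoint")  # placeholder

state_a = model_a.state_dict()
state_b = model_b.state_dict()

merged = {}
for name, tensor_a in state_a.items():
    tensor_b = state_b[name]
    assert tensor_a.shape == tensor_b.shape, f"structural mismatch at {name}"
    if tensor_a.is_floating_point():
        merged[name] = 0.5 * tensor_a + 0.5 * tensor_b  # naive linear interpolation
    else:
        merged[name] = tensor_a  # integer buffers (e.g. position ids) copied as-is

model_a.load_state_dict(merged)  # a valid state dict, but rarely a useful model
```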