radi-cho OP t1_j99fh5s wrote on February 20, 2023 at 6:57 AM

Reply to comment by walkingsparrow in [R] [N] In this paper, we show how a conversational model, 3.5x smaller than SOTA, can be optimized to outperform the baselines through Auxiliary Learning. Published in the ACL Anthology: "Efficient Task-Oriented Dialogue Systems with Response Selection as an Auxiliary Task." by radi-cho

About the intuition that it would produce responses further from the human ones (in fact, we see that the BLEU is lower for this variant): in a way, it could work as a regularizer that encourages more diverse responses and prevents some overfitting. That loss mostly affects the weights of the additional head, which is removed during inference, but we also multiply it by an optimal constant to be sure it doesn't affect the whole architecture too much. I've sent you a PM in case you'd like more details or empirical insights.
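To make the two ideas above concrete (scaling the auxiliary loss by a constant, and dropping the extra head at inference), here is a minimal sketch in plain Python. It is not the code from the paper: `ToyDialogueModel`, `combined_loss`, `strip_auxiliary_heads`, the head names, and the weight value `0.5` are all hypothetical and only illustrate the general pattern.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDialogueModel:
    # Hypothetical stand-in for a real network: shared backbone weights
    # plus a dictionary of named output heads.
    backbone: list
    heads: dict = field(default_factory=dict)

    def strip_auxiliary_heads(self, keep=("lm",)):
        """Return a copy that keeps only the heads needed at inference."""
        return ToyDialogueModel(
            backbone=list(self.backbone),
            heads={name: w for name, w in self.heads.items() if name in keep},
        )


def combined_loss(lm_loss, aux_loss, aux_weight):
    """Main generation loss plus the auxiliary loss scaled by a constant."""
    return lm_loss + aux_weight * aux_loss


# Training time: both heads exist, and the auxiliary loss is down-weighted.
model = ToyDialogueModel(
    backbone=[0.1, 0.2],
    heads={"lm": [0.3], "response_selection": [0.4]},
)
total = combined_loss(lm_loss=2.0, aux_loss=1.0, aux_weight=0.5)  # assumed weight

# Inference time: the auxiliary head is dropped, the backbone is untouched.
inference_model = model.strip_auxiliary_heads()
```

A smaller `aux_weight` limits how much the auxiliary objective can pull on the shared backbone, while the head that absorbs most of its effect never reaches inference.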

walkingsparrow t1_j9b7j3d wrote on February 20, 2023 at 5:29 PM

I think I understand now. Thanks for the explanation.