Submitted by PleaseKillMeNowOkay t3_xtadfd in deeplearning
sydjashim t1_iqtu9hp wrote
Did you keep the same initial weights for both networks?
PleaseKillMeNowOkay OP t1_iqu4rs2 wrote
Same initialization scheme, but not the exact same weights. However, I've run the experiments enough times with the same result to be confident that the initial weights aren't the issue.
sydjashim t1_ique162 wrote
I have a quick guess that might help: take the first n-1 layers' weights from your first trained model, fine-tune with the 4 outputs, and observe whether your validation loss improves.
If so, you can then take the untrained initial weights of your first model (up to the (n-1)th layer) and train them to convergence with 4 outputs. That way you have a model trained from scratch on the 4-output task, but sharing the same initial weights as the first model. A sketch of both steps follows below.
Why am I saying this?
Well, I think you could try it this way since you want to keep as many parameters as possible (especially the model weights) identical while running the comparison between the two.
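To make the two steps concrete, here is a minimal PyTorch sketch, assuming a simple feed-forward model; the `Net` class, layer sizes, and the first model's 2-output head are placeholders I made up for illustration, not details from the thread.

```python
import copy
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, n_outputs):
        super().__init__()
        # the "first n-1 layers": a shared feature extractor (sizes are placeholders)
        self.body = nn.Sequential(
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        # final layer, whose width differs between the two experiments
        self.head = nn.Linear(64, n_outputs)

    def forward(self, x):
        return self.head(self.body(x))

torch.manual_seed(0)
model_a = Net(n_outputs=2)  # the first model; 2 outputs is an assumed placeholder
init_body = copy.deepcopy(model_a.body.state_dict())  # save the *untrained* n-1 layers

# ... train model_a to convergence on its original task here ...

# Step 1: fine-tune the *trained* n-1 layers with a fresh 4-output head,
# and check whether validation loss improves.
finetuned = Net(n_outputs=4)
finetuned.body.load_state_dict(model_a.body.state_dict())

# Step 2: train from scratch with 4 outputs, reusing the *same initial*
# (untrained) n-1 layer weights, so both runs share one starting point.
scratch = Net(n_outputs=4)
scratch.body.load_state_dict(init_body)
```

With this setup, the only difference between `finetuned` and `scratch` is whether the shared layers start from trained or untrained values, which is exactly the comparison being suggested.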
PleaseKillMeNowOkay OP t1_iqufvaa wrote
This seems interesting. I'll give this a shot. Thanks!