mishtimoi OP t1_ir9wwlx wrote
Reply to comment by suflaj in Time Complexity of Detach() in torch "[R]" by mishtimoi
Yeah, this makes sense. If I only detach once for all layers, it's like the .eval() method, which probably needs to make a copy of the whole model footprint once (as per your explanation). In this case, though, it has to keep multiple copies, one at every point where I detach, I guess.