Submitted by smallest_meta_review t3_yng63w in MachineLearning
anonymousTestPoster t1_ival53k wrote
How is this idea different to using pre-trained networks (functions) then adapting these for a new problem context?
smallest_meta_review OP t1_ivancqx wrote
Good question. I feel it's going one step further and saying why not reuse prior computational work (e.g., existing learned agents) on the same problem, especially if that problem is computationally demanding (large-scale RL efforts already do this, but research papers typically don't). So, the next time we train a new RL agent, we reuse prior computation rather than starting from scratch (e.g., we train new agents on Atari games given a pretrained DQN agent from 2015).
Also, in reincarnating RL, we aren't tied to the pretrained network's architecture and can try a different architecture for the new agent.
luchins t1_ivbuz90 wrote
> I feel it's going one step further and saying why not reuse prior computational work (e.g., existing learned agents) in the same problem
Could you give me an example, please? I don't get what you mean by using agents with different architectures.
smallest_meta_review OP t1_ivcghme wrote
Oh, so one of the examples in the blog post is that we start with a DQN agent with a 3-layer CNN architecture and reincarnate a Rainbow agent with a ResNet architecture (Impala-CNN) using the QDagger approach. Once reincarnated, the ResNet Rainbow agent is further trained with RL to maximize reward. See the paper here for more details: https://openreview.net/forum?id=t3X5yMI_4G2
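For intuition, here's a rough sketch of what a QDagger-style update might look like: the new (student) agent is trained with its usual TD loss plus a distillation term that pushes its policy toward the frozen teacher's. This is not the paper's actual implementation; the function names, batch layout, and hyperparameters below are placeholders, and the teacher/student can have completely different architectures.

```python
# Minimal QDagger-style sketch (PyTorch). Assumes a pretrained, frozen
# `teacher_q` network and a `student_q` network with a different architecture
# (e.g., Impala-CNN). All names and hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def qdagger_loss(student_q, teacher_q, batch, gamma=0.99, temp=1.0, distill_coef=1.0):
    obs, actions, rewards, next_obs, dones = batch

    # Standard 1-step TD loss on the student (a target network is omitted
    # here for brevity).
    q = student_q(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * (1 - dones) * student_q(next_obs).max(dim=1).values
    td_loss = F.smooth_l1_loss(q, target)

    # Distillation term: match the student's softmax policy over Q-values
    # to the frozen teacher's policy on the same observations.
    with torch.no_grad():
        teacher_probs = F.softmax(teacher_q(obs) / temp, dim=1)
    student_log_probs = F.log_softmax(student_q(obs) / temp, dim=1)
    distill_loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

    # In practice, distill_coef is decayed as the student catches up to the
    # teacher, so training ends up as plain RL.
    return td_loss + distill_coef * distill_loss
```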