Submitted by smallest_meta_review t3_yng63w in MachineLearning
anonymousTestPoster t1_ival53k wrote
How is this idea different to using pre-trained networks (functions) then adapting these for a new problem context?
smallest_meta_review OP t1_ivancqx wrote
Good question. I feel it's going one step further and saying why not reuse prior computational work (e.g., existing learned agents) on the same problem, especially if that problem is computationally demanding (large-scale RL efforts already do this, but research papers typically don't). So, the next time we train a new RL agent, we reuse prior computation rather than starting from scratch (e.g., we train new agents on Atari games given a pretrained DQN agent from 2015).
Also, in reincarnating RL, we aren't tied to the pretrained network's architecture and can try a different architecture for the new agent.
luchins t1_ivbuz90 wrote
> I feel it's going one step further and saying why not reuse prior computational work (e.g., existing learned agents) in the same problem
Could you give me an example, please? I don't get what you mean by using agents with different architectures.
smallest_meta_review OP t1_ivcghme wrote
Oh, so one of the examples in the blog post is that we start with a DQN agent with a 3-layer CNN architecture and reincarnate a Rainbow agent with a ResNet architecture (Impala-CNN) using the QDagger approach. Once reincarnated, the ResNet Rainbow agent is further trained with RL to maximize reward. See the paper here for more details: https://openreview.net/forum?id=t3X5yMI_4G2
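For intuition, here's a rough sketch of what a QDagger-style update might look like: the new (student) agent is trained with its usual TD loss plus a distillation term that pushes its policy toward the frozen teacher's. This is not the paper's actual implementation; the function names, batch layout, and hyperparameters below are placeholders, and the teacher/student can have completely different architectures.

```python
# Minimal QDagger-style sketch (PyTorch). Assumes a pretrained, frozen
# `teacher_q` network and a `student_q` network with a different architecture
# (e.g., Impala-CNN). All names and hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def qdagger_loss(student_q, teacher_q, batch, gamma=0.99, temp=1.0, distill_coef=1.0):
    obs, actions, rewards, next_obs, dones = batch

    # Standard 1-step TD loss on the student (a target network is omitted
    # here for brevity).
    q = student_q(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * (1 - dones) * student_q(next_obs).max(dim=1).values
    td_loss = F.smooth_l1_loss(q, target)

    # Distillation term: match the student's softmax policy over Q-values
    # to the frozen teacher's policy on the same observations.
    with torch.no_grad():
        teacher_probs = F.softmax(teacher_q(obs) / temp, dim=1)
    student_log_probs = F.log_softmax(student_q(obs) / temp, dim=1)
    distill_loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

    # In practice, distill_coef is decayed as the student catches up to the
    # teacher, so training ends up as plain RL.
    return td_loss + distill_coef * distill_loss
```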