smallest_meta_review OP t1_ivam34g wrote on November 6, 2022 at 3:50 PM

Yeah, or even across different classes of RL methods: reusing a policy to train a value-based method (e.g., DQN) or a model-based RL method.