JClub OP t1_j4v5d0y wrote on January 18, 2023 at 2:06 PM

Ah right, then you can just use the model's reward directly, or pass it through a sigmoid so that the reward is between 0 and 1!

Do you think that the sigmoid is needed?
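
To make the two options concrete, here is a minimal sketch in plain Python. The function name `squash_reward` and the example scores are hypothetical, chosen only for illustration; the raw score stands in for whatever scalar the model outputs.

```python
import math

def squash_reward(raw_reward: float) -> float:
    """Map an unbounded scalar reward into the range 0 to 1 with a logistic sigmoid.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    # Branch on the sign so math.exp never receives a large positive argument,
    # which would raise OverflowError.
    if raw_reward >= 0:
        return 1.0 / (1.0 + math.exp(-raw_reward))
    z = math.exp(raw_reward)
    return z / (1.0 + z)

# Illustrative raw scores, not taken from any real model.
raw_scores = [-3.0, 0.0, 2.0]

# Option 1: use the raw scores directly as rewards.
direct_rewards = raw_scores

# Option 2: squash each score so the reward is bounded.
squashed_rewards = [squash_reward(r) for r in raw_scores]
```

The sigmoid is monotonic, so the ordering of the rewards is unchanged; only their scale is, with large positive or negative scores compressed toward 1 or 0.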