Submitted by EmbarrassedFuel t3_10w5f9u in MachineLearning
I have a problem I need to solve that, as far as I can tell, doesn't fit very well into most of the existing RL literature.
Essentially, the task is to create an optimal plan over a time horizon extending a flexible number of steps into the future. The action space is both discrete and continuous: there are multiple distinct actions available, some of which take continuous (but constrained) parameters.
In this problem, however, the state of the environment is known ahead of time for all future time steps, and the agent's updated state after each action can be calculated deterministically from the action and the environment state.
Modelling the entire problem as a MILP is not feasible due to the size of the action and state space, and we have a very large data set for agent and environment state to play with. Does anyone have any suggestions for papers or models that might be appropriate for this scenario?
UnusualClimberBear t1_j7lvpz8 wrote
Looks like an optimal control problem rather than an RL one. RL is for situations where no good model is available. If stochasticity is present, but you still have a good model once the uncertainty is known, then model predictive control (MPC) is a good way to go.
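To make the receding-horizon idea concrete, here is a minimal sketch on a made-up toy problem with the same structure as the one in the post: the environment (a price series) is known in advance, the agent state (a storage level) updates deterministically, and each action is a discrete kind plus a bounded continuous amount. Everything here is hypothetical and only for illustration: the names `PRICES`, `CAPACITY`, `ACTIONS`, `step`, `plan` and `run_mpc`, the storage setting, and the coarse grid over the continuous parameter. The brute-force search in `plan` stands in for whatever solver you would actually use over the short window.

```python
from itertools import product

# Hypothetical toy problem: a storage unit trading against a known price series.
PRICES = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]   # environment state, known in advance
CAPACITY = 2.0                            # upper bound on the agent state

# Hybrid action = (discrete kind, continuous amount). The continuous
# parameter is restricted to a coarse grid here purely to keep the sketch short.
ACTIONS = [("idle", 0.0)] + [
    (kind, amount) for kind in ("buy", "sell") for amount in (0.5, 1.0)
]

def step(level, action, price):
    """Deterministic transition: (next_level, reward), or None if infeasible."""
    kind, amount = action
    if kind == "buy":
        if level + amount > CAPACITY:
            return None
        return level + amount, -price * amount
    if kind == "sell":
        if amount > level:
            return None
        return level - amount, price * amount
    return level, 0.0

def plan(level, t, horizon):
    """Exhaustively search action sequences over the next `horizon` steps
    and return the first action of the best feasible sequence."""
    window = PRICES[t:t + horizon]
    best_value, best_first = float("-inf"), ("idle", 0.0)
    for seq in product(ACTIONS, repeat=len(window)):
        lvl, total, feasible = level, 0.0, True
        for action, price in zip(seq, window):
            result = step(lvl, action, price)
            if result is None:
                feasible = False
                break
            lvl, reward = result
            total += reward
        if feasible and total > best_value:
            best_value, best_first = total, seq[0]
    return best_first

def run_mpc(horizon=3):
    """Receding horizon: plan over a short window, apply only the first
    action, then re-plan from the resulting state."""
    level, total, taken = 0.0, 0.0, []
    for t, price in enumerate(PRICES):
        action = plan(level, t, horizon)
        level, reward = step(level, action, price)
        total += reward
        taken.append(action)
    return taken, total, level

if __name__ == "__main__":
    taken, total, level = run_mpc(horizon=3)
    print(taken, total, level)
```

The point of the short window is that the search cost depends on the window length rather than the full planning horizon, which is what makes this kind of approach worth considering when a single MILP over the whole problem is too large. The enumeration above grows exponentially with the window, so in practice it would be replaced by a small MILP or another optimizer per window.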