leocus4

leocus4 t1_j1cigff wrote

In a paper, I used a decision tree with Q-learning to solve LunarLander. While it's not exactly what you asked, you can see a DT as a way to discretize the state space behind the Q-table, so basically that decision tree corresponds to a Q-table with 5 discretized states.
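
To make the idea a bit more concrete, here's a rough sketch of a tree used as a state discretizer for tabular Q-learning. This is not the code from the paper: the split thresholds, the function names (`leaf_index`, `act`, `q_update`) and the hyperparameters are all made up for illustration. I'm only assuming the standard LunarLander setup with an 8-dimensional observation (index 3 is vertical velocity, index 4 is the angle) and 4 discrete actions.

```python
import numpy as np

N_LEAVES = 5   # one discretized state per leaf of the tree
N_ACTIONS = 4  # LunarLander: no-op, left engine, main engine, right engine

def leaf_index(obs):
    """Hypothetical hand-written tree: maps a continuous observation to a leaf id.

    The thresholds below are illustrative assumptions, not learned values.
    """
    x, vy, angle = obs[0], obs[3], obs[4]
    if angle > 0.1:
        return 0
    if angle < -0.1:
        return 1
    if vy < -0.3:
        return 2
    if x > 0.0:
        return 3
    return 4

# The Q-table has one row per leaf instead of one row per raw state.
q_table = np.zeros((N_LEAVES, N_ACTIONS))

def act(obs, epsilon, rng):
    """Epsilon-greedy action selection over the leaf the observation falls into."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_table[leaf_index(obs)]))

def q_update(obs, action, reward, next_obs, done, alpha=0.1, gamma=0.99):
    """Standard one-step Q-learning update, indexed by leaf instead of raw state."""
    s, s_next = leaf_index(obs), leaf_index(next_obs)
    target = reward if done else reward + gamma * np.max(q_table[s_next])
    q_table[s, action] += alpha * (target - q_table[s, action])
```

In a training loop you would call `act` on each observation from the environment and then `q_update` on the resulting transition. All the continuous observations that end up in the same leaf share one row of the table, which is why the tree behaves like a Q-table with 5 states.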

If you're interested, I can expand on this explanation, just let me know!
