leocus4 t1_j1cigff wrote on December 23, 2022 at 7:40 AM

In a paper, I used a decision tree with Q-learning to solve LunarLander. While it's not exactly what you asked, you can see a DT as a way to discretize the state space behind the Q-table, with each leaf acting as one discrete state, so basically that decision tree corresponds to a Q-table with 5 discretized states.
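To make the idea concrete, here is a minimal sketch of a tree used as a state discretizer for tabular Q-learning. It is not the tree from the paper: the splits, thresholds, and the names `leaf_index` and `q_update` are made up for illustration. It only assumes the standard LunarLander layout of an 8-dimensional observation and 4 discrete actions.

```python
import numpy as np

N_LEAVES = 5   # one Q-table row per leaf, i.e. 5 discretized states
N_ACTIONS = 4  # LunarLander has 4 discrete actions

def leaf_index(obs):
    """Hypothetical hand-written tree: map an 8-dim observation to a leaf id in 0..4.

    The split features and thresholds are invented for this example.
    """
    x, y, vx, vy, angle, ang_vel, left_leg, right_leg = obs
    if left_leg > 0.5 and right_leg > 0.5:
        return 0  # both legs in contact with the ground
    if angle > 0.1:
        return 1  # tilted one way
    if angle < -0.1:
        return 2  # tilted the other way
    if vy < -0.3:
        return 3  # falling fast
    return 4      # everything else

# The Q-table is indexed by leaf instead of by raw (continuous) state.
q_table = np.zeros((N_LEAVES, N_ACTIONS))

def q_update(q, obs, action, reward, next_obs, done, alpha=0.1, gamma=0.99):
    """Standard tabular Q-learning update, applied to the leaf the state falls into."""
    s, s_next = leaf_index(obs), leaf_index(next_obs)
    target = reward if done else reward + gamma * q[s_next].max()
    q[s, action] += alpha * (target - q[s, action])
    return s

# Usage with a made-up transition: a fast-falling state followed by a landed state.
obs = np.array([0.0, 1.0, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0])
next_obs = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0])
visited_leaf = q_update(q_table, obs, action=2, reward=1.0, next_obs=next_obs, done=True)
```

Here the tree structure is fixed by hand and only the per-leaf Q-values are learned; how the tree itself is obtained is a separate question that this sketch does not cover.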

If you're interested, I can expand on this explanation. Just let me know!