
abstractcontrol t1_ivjjbfq wrote

Poker really brings out all the weaknesses of deep learning; it is hardly a solved thing. For example, if you log into Stars and play a HU SNG, you'll see that you start with 1,000-chip stacks and 10/20 blinds. That means you have 960 different raise sizes, plus call and fold, to account for just in that small game. You also have large reward variance that deep RL algorithms can't deal with properly. Some algos like categorical DRL are just too memory-inefficient to be used even on moderately large games. You'd be amazed at how much memory around 1,000 different actions takes up once you start using mini-batches.
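To put a rough number on the memory point, here is a back-of-the-envelope sketch. It assumes a C51-style categorical head with 51 atoms per action and float32 values. The function name and the batch sizes are made up for illustration, and it counts only the output tensor, not hidden activations, gradients, or target-network copies.

```python
def categorical_output_bytes(num_actions, num_atoms=51, batch_size=32, bytes_per_float=4):
    """Bytes for one mini-batch of per-action return distributions.

    Assumption: the head emits num_atoms probabilities for every action,
    as in C51-style categorical DRL. Only the output tensor is counted.
    """
    return batch_size * num_actions * num_atoms * bytes_per_float


# 960 raise sizes from the comment above, plus call and fold.
num_actions = 960 + 2

for batch_size in (32, 256, 1024):  # hypothetical mini-batch sizes
    categorical = categorical_output_bytes(num_actions, batch_size=batch_size)
    # A plain Q-value head is the same calculation with one value per action.
    scalar = categorical_output_bytes(num_actions, num_atoms=1, batch_size=batch_size)
    print(
        f"batch={batch_size:5d}  "
        f"categorical={categorical / 2**20:8.1f} MiB  "
        f"scalar={scalar / 2**20:6.2f} MiB"
    )
```

The categorical head costs 51 times the scalar one under these assumptions, and training needs several tensors of that shape (online output, target output, gradients), so the real footprint is a multiple of what this prints.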

The academic SOTA is to just stick a tabular algorithm on top of some deep net, which is hardly elegant. All these algorithms are just hacks and I wouldn't use them for real money play.

6

LetterRip t1_ivk4gfy wrote

> The academic SOTA is to just stick a tabular algorithm on top of some deep net, which is hardly elegant. All these algorithms are just hacks and I wouldn't use them for real money play.

They absolutely crush the best players in the game, and beat weaker players by absurd amounts.

While there is a huge action space, it turns out that very few bet sizes are needed on early streets (4 is generally adequate), and the final street can be solved on the go.
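A minimal sketch of what that kind of bet-size abstraction can look like. The specific sizes here (half pot, pot, twice pot, all-in) are my own illustrative picks, not the menu any particular solver or bot uses, and `abstract_bets` is a hypothetical helper.

```python
def abstract_bets(pot, stack, min_raise, fractions=(0.5, 1.0, 2.0)):
    """Collapse the full range of raise sizes into a short menu.

    Each pot fraction is clipped to the legal range [min_raise, stack],
    and all-in is always included. The fractions are assumptions chosen
    for illustration only.
    """
    sizes = set()
    for fraction in fractions:
        size = int(round(pot * fraction))
        size = max(min_raise, min(size, stack))  # keep the bet legal
        sizes.add(size)
    sizes.add(stack)  # all-in
    return sorted(sizes)


# Hypothetical spot: 200 in the pot, 900 behind, minimum raise of 20.
print(abstract_bets(pot=200, stack=900, min_raise=20))

# Early in the hand with 10/20 blinds, small fractions collapse onto the
# minimum raise, so the menu gets even shorter.
print(abstract_bets(pot=30, stack=1000, min_raise=40))
```

With a menu like this the early-street tree branches over at most four raise sizes plus call and fold, instead of hundreds.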

5