Like AlphaZero, the program that reshaped computer chess, a new bot called ReBeL (short for Recursive Belief-based Learning) can achieve superhuman performance in heads-up no-limit hold’em through “self-play reinforcement learning.”
The bot is apparently even stronger than the 2019 poker AI called Pluribus. Versions of poker AI have been able to beat top human players in heads-up no-limit hold’em since 2017, when one called Libratus took down a group of elite poker pros.
Unlike chess, poker is a game in which players don’t have access to all the important information: each player’s two hole cards are hidden from the opponent. That makes poker “an imperfect-information game.” According to the paper, previous self-learning AI had trouble with games like poker.
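To make the hidden-information point concrete, here is a minimal Python sketch. It is not taken from the paper; the function name and the card encoding are illustrative assumptions. It counts the two-card hands an opponent could hold given the cards one player can see, and assigns each a uniform probability. A real poker AI would presumably weight those hands by the betting so far rather than treat them as equally likely.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]  # standard 52-card deck


def opponent_hand_beliefs(known_cards):
    """Uniform belief over the opponent's possible hole cards.

    known_cards: the cards this player can see (own hole cards plus any
    board cards). Hypothetical helper for illustration only.
    """
    seen = set(known_cards)
    unseen = [card for card in DECK if card not in seen]
    hands = list(combinations(unseen, 2))
    prob = 1.0 / len(hands)
    return {hand: prob for hand in hands}


# Before the flop we see only our own two hole cards.
preflop = opponent_hand_beliefs(["As", "Kd"])
# After the flop we also see three board cards.
flop = opponent_hand_beliefs(["As", "Kd", "7h", "2c", "Qs"])

print(len(preflop))  # 1225 possible opponent hands
print(len(flop))     # 1081 possible opponent hands
```

Even in this simplified view, a player must reason over more than a thousand possible opponent hands at once, whereas a chess position is fully visible to both sides.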
The work comes from Noam Brown, Anton Bakhtin, Adam Lerer, and Qucheng Gong at Facebook.
“Our goal in this paper is not to chase state-of-the-art performance by any means necessary,” they wrote. “Instead, our goal is to develop a simple, flexible, effective algorithm that leverages as little expert domain knowledge as possible. Experimental results show that despite its simplicity, ReBeL is effective in large-scale two-player zero-sum imperfect-information games and defeats a top human professional with statistical significance in the benchmark game of heads-up no-limit Texas hold’em poker while using far less expert domain knowledge than any previous poker AI.”
AlphaZero, developed by Google-owned DeepMind and similar in approach to ReBeL, set the chess world on fire with its relentlessly attacking style of play, which suggested a deeper understanding of that game. It appears that ReBeL could also raise the bar for poker, albeit in a way that may be hard for human players to perceive.
While ReBeL achieved so-called “superhuman” performance in heads-up no-limit hold’em, it’s not the first poker AI to reach that level. Previous bots have done so, but according to the paper, ReBeL marks a further advance in computer science’s mastery of poker.
The paper made numerous references to ReBeL being fast. It learned to play poker without pre-computed shortcuts or built-in expert knowledge of how to play without being exploitable.
In a match against poker pro Dong Kim, ReBeL “played faster than 2 seconds per hand and never needed more than 5 seconds for a decision,” the paper said.
It also beat Kim more convincingly than previous AI had, over a sample size of 7,500 hands.
The speed of ReBeL allows it to be versatile across various stack sizes. ReBeL was “trained on all stack sizes between 5,000 and 25,000 chips, rather than just the standard 20,000,” stated the paper. The emphasis here is on all stack sizes. This apparently makes ReBeL an even greater threat to the integrity of online poker games because players buy in and play with arbitrary amounts, which also fluctuate over the course of the game as chips are won and lost.
“The most immediate risk posed by this work is its potential for cheating in recreational games such as poker,” the researchers wrote. “While AI algorithms already exist that can achieve superhuman performance in poker, these algorithms generally assume that participants have a certain number of chips or use certain bet sizes. Retraining the algorithms to account for arbitrary chip stacks or unanticipated bet sizes requires more computation than is feasible in real time. However, ReBeL can compute a policy for arbitrary stack sizes and arbitrary bet sizes in seconds. Partly for this reason, we have decided not to release the code for poker.”
If ReBeL were sitting at a live poker cash game table, its speed would be meaningless unless, of course, it could call the clock on its opponent. Poker AI isn’t that sophisticated. However, in online poker, where players have a limited amount of time to make a decision, speed is strength. ReBeL’s playing speed in a game against a human player could lead to that player making more mistakes.