In September, chess world champion Magnus Carlsen accused a younger grandmaster, Hans Niemann, of cheating after losing a game to him in the 2022 Sinquefield Cup. After weeks of speculation, a Chess.com investigation found that Niemann had likely cheated over 100 times in online chess, lending greater credence to Carlsen’s accusation.
While this particular case has captivated the public because of Carlsen’s involvement, as well as rumors about how Niemann might have gained the upper hand, many top players fear that it’s easy to cheat and get away with it.
“There is a great deal of paranoia,” grandmaster Jon Tisdall said in a recent conversation with me on the Second Captains podcast. “And this is the word that the top players use about it. My general view is that all of them suspect some of their colleagues—not necessarily the same ones, and not necessarily for the same things.” This paranoia poses an existential threat to the professional game of chess and must be addressed with changes to how cheating is detected, reported, and investigated by organizers and the governing bodies of chess.
How exactly would someone cheat during a chess game? While there is speculation about the possible use of high-tech gadgets, or of a small vibrating device as was alleged in the case of Borislav Ivanov (who was suspected of hiding a device in his shoe), the most common form of cheating in tournaments is much less glamorous: the use of cell phones in bathrooms. Such was the case of Igors Rausis, who was caught at the 2019 Strasbourg Open when a photograph leaked of him in the bathroom, analyzing his game on his cell phone. After Rausis admitted to cheating and announced his retirement from professional chess, FIDE, the international governing body of chess, revoked his grandmaster title and banned him from playing FIDE-rated events for six years.
In another well-known case, several members of the French national team were caught in a complex scheme during the 2010 Chess Olympiad. One player followed a broadcast of the games from a remote location and analyzed them on a computer. He then texted the best moves to the team’s captain, who was present in the playing hall. The team captain then communicated the moves to the player at the board using a convoluted visual code.
Since 2006, when the cheating scandal known as “toiletgate” rocked the chess world, FIDE has used statistician Kenneth Regan’s model to analyze chess games and make determinations about cheating in situations where there is no concrete evidence.
Regan’s model determines the likelihood of cheating by analyzing the moves in a player’s game compared to the expected performance based on their rating. It is not designed to flag anyone who may be cheating, only to catch those who almost certainly have cheated. Such determinations are much easier to make with weaker players. Recently, Regan told ChessBase that there is no reason whatsoever to suspect Hans Niemann of cheating—a verdict now in doubt, given the Chess.com report.
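The statistical logic can be sketched in drastically simplified form. This is not Regan's actual model, which is far more sophisticated; it only illustrates the underlying idea of comparing observed play to a rating-based expectation. All numbers here are invented for illustration.

```python
import math

def cheat_z_score(matches, total_moves, expected_match_rate):
    """Toy illustration of a Regan-style test (not the real model).

    Compares how often a player's moves matched a strong engine's top
    choice against the rate expected for their rating, and returns a
    z-score: the number of standard deviations above expectation.
    """
    observed_rate = matches / total_moves
    # Standard error of a binomial proportion under the null hypothesis
    # that the player performs at their rating-based expectation.
    std_err = math.sqrt(
        expected_match_rate * (1 - expected_match_rate) / total_moves
    )
    return (observed_rate - expected_match_rate) / std_err

# Hypothetical figures: if a player of this rating were expected to match
# the engine 55 percent of the time, matching 70 percent over 200 moves
# would sit more than four standard deviations above expectation.
z = cheat_z_score(matches=140, total_moves=200, expected_match_rate=0.55)
```

The key point the sketch captures: the shorter the sample and the stronger the player, the smaller the gap between honest brilliance and engine assistance, which is why such determinations are easier to make for weaker players.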
The core issue is that FIDE’s system for handling cheating allegations has as its North Star the idea that a false positive—a case where an innocent player is wrongly accused—must be avoided at all costs, because the potential damage to a player’s reputation is severe. FIDE, using Regan’s model, requires a confidence level approaching 99 percent to find a player guilty. In addition, a player found to have made a baseless accusation of cheating could have their case sent to the Ethics and Disciplinary Committee, which has at its disposal a wide range of punishments, from warnings to outright bans from play. This explains Carlsen’s restrained statements about Niemann: “There is more that I would like to say. Unfortunately, at this time I am limited in what I can say without explicit permission from Niemann to speak openly.”
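That 99 percent bar amounts to a decision rule: convert the strength of the statistical evidence into a confidence level, and act only when it clears the threshold. The sketch below is a toy illustration of that rule, not FIDE's actual procedure.

```python
import math

def confidence_from_z(z):
    """One-sided confidence that performance exceeds expectation,
    computed from a z-score via the normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def verdict(z, threshold=0.99):
    """Fail-safe decision rule (hypothetical simplification).

    Only near-certain results count as guilt; everything below the
    threshold is tolerated as false-negative risk, which is exactly
    why top players suspect real cheating slips through.
    """
    return "flag" if confidence_from_z(z) >= threshold else "no action"
```

Note what the rule implies: a player performing two standard deviations above expectation (roughly 97.7 percent confidence) is never flagged, even though honest players rarely sustain that level.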
All this has created a problem: an anti-cheat system that may be rife with false negatives and unreported cases. “Personally, I take [the finding from Regan’s model] with a grain of salt,” Canadian grandmaster Eric Hansen said recently on his YouTube channel Chess Brah. “I think most of the high-level cheating or cheating that could involve high-level players would bypass his model, because his model has to be conservative, understandably.”
Grandmaster Fabiano Caruana, in a recent episode of his new podcast C-Squared, also said he would take Regan’s analysis with a large grain of salt. “The reason why is not because I have any insight into his algorithm or his methods, but because I know of a case—a very high-profile case—where with absolute certainty I can say that someone was cheating in an important event,” Caruana said. “And the person was investigated and was also exonerated based on Regan’s analysis. And I am certain that there was cheating. There is no doubt in my mind that this person was cheating and they got away with it.”
So why did Regan’s model find that Niemann hadn’t cheated, while Chess.com’s did? First, the specific examples of cheating in the Chess.com report came from games played on Chess.com’s website, not from over-the-board tournaments, which are held in person. The report explicitly states that it did not find conclusive evidence of cheating in Niemann’s over-the-board play. Regan’s model, on the other hand, is applied only to over-the-board games and is calibrated to operate as a fail-safe mechanism that catches only the most egregious cases.
Plus, online chess sites host more games in a single week than have been played over the board in all of recorded history. The resulting data from online games is vast and varied enough to validate inferences that Regan told me he would stop short of making in over-the-board slow chess. For example, the Chess.com model determines the playing tendencies of a particular player, which can be used as a baseline for detecting deviations in the quality or style of play. Regan’s model, however, uses a single baseline drawn from players of a given rating.
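The difference between the two baselines can be sketched like this: with thousands of a player's own games, a platform can flag deviations from that player's established tendencies rather than from a rating-wide average. The figures and the three-sigma threshold are hypothetical, chosen only to make the contrast concrete.

```python
from statistics import mean, stdev

def personal_baseline(match_rates):
    """Per-player baseline, as a platform with a large history of the
    player's own games could compute (hypothetical simplification).

    match_rates: engine-match rate for each past game, e.g. 0.51 means
    the player matched the engine's top choice on 51% of moves.
    """
    return mean(match_rates), stdev(match_rates)

def deviates(new_rate, baseline_mean, baseline_std, sigmas=3.0):
    """Flag a game whose engine-match rate sits far above the player's
    own established tendencies, not just the rating-wide average."""
    return new_rate > baseline_mean + sigmas * baseline_std
```

A rating-wide baseline, by contrast, must be loose enough to cover every playing style at that rating, which is part of why it can only catch the most egregious outliers.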
Not every part of online cheat detection applies to in-person chess. But some factors of Chess.com’s method—such as reviewing a player’s time usage against the difficulty of the moves on the board—map perfectly to the over-the-board format. Such determinations can be made only by humans comparing the data with the games and drawing conclusions based on their expertise, taking into consideration factors such as clock times and whether the playing style matches the player’s profile. This is the “secret sauce” of Chess.com’s anti-cheat system. While its internal monitoring may flag many players for further investigation, a team of human grandmasters reviewing the play cuts down severely on the number of false positives the model ultimately produces, and this approach should also be adopted for over-the-board games that are flagged for review.
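The time-usage idea can be illustrated with a toy heuristic: if a player repeatedly finds difficult moves almost instantly, the game deserves human review. The difficulty scores and cutoffs below are invented for illustration; in practice this signal is only one input to the expert review the article describes.

```python
def hard_move_speed_ratio(moves, fast_seconds=10.0, hard_threshold=0.8):
    """Hypothetical timing heuristic, not Chess.com's actual method.

    moves: list of (difficulty, seconds_used) pairs, where difficulty
    is a score in [0, 1] (e.g. from engine analysis of the position).
    Returns the fraction of moves that were both hard and played fast.
    Honest players tend to spend more clock time on harder positions.
    """
    hard_and_fast = [
        (d, t) for d, t in moves
        if d >= hard_threshold and t <= fast_seconds
    ]
    return len(hard_and_fast) / max(len(moves), 1)
```

A high ratio would not prove anything on its own; it would simply route the game to human experts, mirroring the review pipeline described above.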
Given the very ugly feud now playing out in headlines around the world, there is clearly a crisis of confidence in the current system and a lack of faith in the determinations made by Regan’s model.
Much to their credit, groups like the St. Louis Chess Club are playing a leadership role in prevention. The St. Louis club added measures at the US Chess Championship such as metal-detecting wands, radio frequency scanners, and even scanners that check for silicon devices. In addition, they have implemented a 30-minute delay on their broadcast.
However, what’s lacking is leadership on how to restore credibility to FIDE’s probabilistic assessment of events and players’ performance over time. FIDE already tracks some statistics in each player’s profile. Given the data already available, it should not be too difficult to increase transparency by adding more information to this existing dashboard—specifically, performance variance versus computer moves, database moves, or player performance variance measured against peers. FIDE could even update its Fair Play review protocols to provide for an automatic review once certain probabilistic thresholds are met, and publish aggregate findings quarterly to increase transparency into the process. This would make a very personal and ugly feud like the one now playing out in headlines less likely to happen again.
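In skeletal form, the aggregate quarterly disclosure proposed above might look something like this, publishing only counts, never names. The format is hypothetical; no such FIDE report currently exists.

```python
def quarterly_report(flagged_cases):
    """Aggregate-only disclosure sketch (hypothetical format).

    flagged_cases: one dict per automatically flagged case, with a
    "confirmed" flag set after expert review. Only totals are
    published, protecting the reputations of cleared players.
    """
    reviewed = len(flagged_cases)
    confirmed = sum(1 for case in flagged_cases if case["confirmed"])
    return {
        "reviewed": reviewed,
        "confirmed": confirmed,
        "cleared": reviewed - confirmed,
    }
```

Publishing only aggregates would let the public see that the review process is active and calibrated, without exposing any individual player to the reputational damage the current system is designed to avoid.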
Players may feel slighted when their play is automatically flagged for review. But if the process were transparent and credible, being flagged for outperforming one’s expected playing strength on several metrics could become a badge of honor.