Chess Rating (for Laymen)

Paper by Dr Vladica Andrejić, University of Belgrade, Faculty of Mathematics, Serbia


Introduction

In 1970 the World Chess Federation (FIDE) adopted the rating system devised by Arpad Elo, and it has been functioning in much the same way ever since. The first rating lists included fewer than 600 players, and the rating floor was 2200 points, which corresponded to a decent understanding of chess, roughly at candidate master level.

In the meantime a lot of things have changed, but the rating system has remained the same. Probably for commercial reasons, FIDE decided to assign a rating to every chess player and lowered the rating floor to 1200, a level at which one knows how the pieces move, but not much more.

With a pool that today amounts to more than 119,000 rated players, with enormous rating differences among them, it is only logical that the original rating system can no longer hold up on its own. FIDE tried to make some corrections, but it all boiled down to the inappropriate introduction of an artificial adjustment of the opponent's rating in order to avoid some apparent contradictions: a difference in rating of more than 400 points is counted for rating purposes as though it were a difference of 400 points.

Leaving mathematics aside, in this paper I try to present my standpoint and suggestions that could remedy certain shortcomings. All the details and calculations that support the theoretical propositions below can be found in [7].

The nature of the chess rating

The main problem of a chess rating system is to determine the function f, which maps a rating difference to a probability, i.e. to an expected result. Over time the original principle behind that calculation was lost and only the tables used by FIDE have remained. Arpad Elo is no longer with us, and only a few people can explain why the figures in these tables are what they are. That was the first problem I examined.

One possible hypothesis is that the nature of a rating system is to preserve established ratios. To make this easier to understand I will give an example; those keen on mental gymnastics may consult the original text [7] at any time. Assume that player 1 and player 2 have played a match of 10 games and that player 1 won by 6:4. Assume also that player 2 was not discouraged by this score and played another match of 10 games against player 3, this time more successfully, winning by 8:2 (i.e. a ratio of 4:1).

If we compose the ratios that we want to preserve, multiplying 6:4 (i.e. 3:2) by 4:1 gives 6:1, i.e. a projection that, under our assumption, the expected outcome of a match between player 1 and player 3 corresponds to the ratio 6:1 (meaning that over 10 games our best estimate of the result would be a win for player 1 by roughly 8.5:1.5).

If we apply the ratio-preservation hypothesis to our rating function, we obtain (for details see [7]) exactly the solution of which the FIDE Handbook states that it is a close approximation of the real (FIDE) values of f [4], while prescribing that only the FIDE tables are to be used for calculations [3]. The tables were created from calculations based on different formulas, but the discrepancies between corresponding values are rather small, on the order of 1%.
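To see this in numbers, here is a minimal Python sketch of the expectancy formula quoted in the Handbook, f(x) = 1/(1 + 10^(-x/400)); the helper names f and gap are mine, introduced only for illustration. Under the ratio-preservation hypothesis the odds f/(1 - f) multiply when rating gaps add, which reproduces the 6:1 projection of the example above.

```python
import math

def f(x):
    """Expected score at rating difference x (the formula quoted in [4])."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def gap(ratio):
    """Rating difference whose odds f/(1-f) equal the given score ratio."""
    return 400.0 * math.log10(ratio)

d12 = gap(6 / 4)        # player 1 beat player 2 by 6:4  -> about +70
d23 = gap(8 / 2)        # player 2 beat player 3 by 8:2  -> about +241
p13 = f(d12 + d23)      # implied expectation of player 1 against player 3
print(round(d12), round(d23), round(10 * p13, 1))
# ~70, ~241, and ~8.6 points out of 10 (the 8.5:1.5 of the example, up to rounding)
```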

Does the ratio-preservation law of our hypothesis adequately reflect the real state of things? I questioned it for the first time after reading the article by Sonas [5], where he proposed a linear scale as an alternative. For that reason I carried out several experiments using data from my website Perpetual Check [6], which covers almost all significant tournaments (from classical time control through rapid to blitz) organized in Serbia and Montenegro since 2006.

The results of these experiments have convinced me beyond doubt that ratio preservation is a natural law that must be inherent to any chess rating system. That is why, in my calculations, I prefer to use the direct formula from [4] (without rounding to two decimal places) instead of the tables from [3].

A previously unrated player

We cannot go any further without formulas, but do not be discouraged! For a small rating difference x between the players (say |x| below 100), a bit of calculus shows that the formula f(x) ≈ 0.5 + x/695 works very well. This confirms the fact commonly known among chess players that each percentage point over 50% is worth about 7 (more precisely 6.95) rating points. To illustrate, take the match between player 1 and player 2, in which the score 6:4 corresponds to 60% and 40%, i.e. to a difference of 10% measured from 50%.

This 10% difference corresponds to 7 × 10 = 70 rating points, which means that in a match between players with such a rating difference the expected score should match the 6:4 ratio, and vice versa: if the score of such a match corresponds to this proportion, the most probable difference between the two players is about 70 rating points, regardless of whether they play at grandmaster or amateur level.
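As a quick check of the rule of thumb, the short sketch below (using the same formula as before) shows where the constant 695 comes from: the slope of f at zero is ln(10)/1600, which is approximately 1/695.

```python
import math

def f(x):
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

# The slope of f at x = 0 is ln(10)/1600 ~ 1/695, hence f(x) ~ 0.5 + x/695
# for small |x|; "7 rating points per percent" follows directly.
slope = math.log(10) / 1600
print(1 / slope)                 # ~695 (more precisely 694.9)
print(f(70), 0.5 + 70 / 695)     # ~0.599 vs ~0.601: a 70-point edge is roughly 60%
```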

Since a player's results as a rule deviate from expectations, it is necessary to establish how many rating points are gained or lost in such situations. Even the least experienced chess player is familiar with the method: the expected score, as a percentage, is expressed in points, and the difference between this expected score and the achieved score is calculated. Multiplying this difference by the so-called “development coefficient” gives the rating gained or lost.

FIDE currently uses three development coefficients for rating calculations: K = 25 for a player new to the rating list, until he has completed events with at least 30 games; K = 10 for a player who at some point in his career has exceeded 2400 rating points; and K = 15 for all other players [3].
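The update rule described above is easy to put into code; the sketch below is a minimal illustration, with the rating figures chosen arbitrarily.

```python
def f(x):
    """Expected score for a rating edge of x points."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def updated_rating(rating, opp_rating, score, k):
    """Standard update: gain or lose K times (actual - expected) score."""
    return rating + k * (score - f(rating - opp_rating))

# A 2250-rated player (K = 15) draws against a 2350-rated opponent:
# expected score ~0.36, so the draw gains about 15 * (0.5 - 0.36) ~ +2.1 points.
print(updated_rating(2250, 2350, 0.5, 15))
```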

To put it simply, the changes of the development coefficient over a player's career represent FIDE's attempt to quickly provide the player, at the early stages (through a larger coefficient), with a rating that approximately reflects his real strength, and to ensure that after this “point of stabilization” changes no longer occur so turbulently. In the following paragraphs we will examine how well this works.

It is time for some mathematics, and an important conclusion can be drawn. If K stands for the player's development coefficient and N for the ordinal number of a game, I have demonstrated in [7] that KN ≈ 695, and also that KN > 695 should hold. In simple terms, the development coefficient and the ordinal number of a game against rated players are in a simple relationship (their product is always about 695), so the influence of the last game played is the same as if it were one among a total of 695/K games.

Now let us turn to the previously unrated player. According to the FIDE Handbook, once a player has played at least 9 games against rated opponents, his initial rating is calculated in one of two ways, depending on whether he scored less or more than 50%. For results up to 50% the player is assigned a performance rating of sorts as his initial rating, while if he scores more than 50% he gets the average rating of his opponents plus 12.5 rating points for each half point scored over 50% [3].
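The sketch below mirrors the over-50% branch of that rule; the under-50% branch, being a performance-type figure (taken, as I understand it, from FIDE's dp table), is left as a placeholder since the text does not spell it out. Function names are mine.

```python
def initial_rating(opp_ratings, score):
    """Sketch of the 2010 rule for a previously unrated player's first rating."""
    n = len(opp_ratings)
    avg = sum(opp_ratings) / n
    half_points_over_fifty = int(2 * (score - n / 2))
    if half_points_over_fifty > 0:
        return avg + 12.5 * half_points_over_fifty
    raise NotImplementedError("<= 50%: performance-type figure from the FIDE dp table")

# The 9-round example discussed further below: 8.5/9 against a 2000 average.
print(initial_rating([2000] * 9, 8.5))   # 2000 + 12.5 * 8 = 2100.0
```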

Now consider the appearance of a previously unrated player, who is assigned K = 25 after a 9-round tournament. As long as he has that coefficient (the first 30 games), his latest game is weighted as if it were one of 695/K ≈ 28 games, even though the actual number of games played is much smaller. This is easier to see if the games and the corresponding development coefficients are presented in the following table:

N        8    9   10   11   12   13   14   15   16   17   18   19   20
K       77   77   25   25   25   25   25   25   25   25   25   25   25
695/N   87   77   70   63   58   53   50   46   43   41   39   37   35

It becomes even more transparent when we realize that the first nine games are valued as if they had been played with a coefficient of about 77, only to be followed by a dramatic drop of the coefficient, by more than a factor of three, down to 25. An even greater injustice is imposed on players who initially scored more than 50% and gained only 12.5 rating points for each half point above it, because they earned their bonuses at coefficient 25 even though those games called for a coefficient three times larger.

It is therefore exceptionally important to choose one's first rated tournament well. For example, if you take part in a 9-round tournament against players whose average rating is 2000 and win almost all the games, your rating will not exceed 2100. On the other hand, a player who competes against a 2500 average and scores 25% (e.g. loses one game, draws the next, and so on) will end up with about 2300 rating points and immediately become a FIDE master.

We should also point out that this does no harm to the 2500-rated players, as their draws against the unrated player have no influence on their own rating. This is where various kinds of abuse may emerge: for example, a draw (like the one just mentioned) between a non-rated player and a player rated 2500 can be the result of a series of arranged draws; such draws can also follow from a team captain's orders at a team championship where, cunningly enough, the non-rated player is placed on the first board so as not to harm anyone else.

Consequently, we may conclude that FIDE's treatment of previously unrated players is not the best one, and it may have far-reaching consequences. If a player receives a rather unreasonable initial rating, it will take him a lot of time and many games to reach a level proportional to his strength, and in the meantime he will positively or negatively distort the ratings of his opponents.

To make it clearer: FIDE has tried (and mostly succeeded) to prevent the “catapulting” of new players onto the rating list with high ratings, but at too steep a price. Almost all players (especially if they are “tricked into” playing their first tournament against lower-rated opposition) appear on the rating list rated below their real playing strength, and then need far too long to reach the level they actually deserve.

With the current growth in the popularity of chess, the number of newcomers on the rating list keeps increasing, and as a rule these are young players who improve swiftly. The rising number of such players with unreasonably low ratings increasingly undermines the reliability of the rating list, and each of their games further disrupts the objectivity of the system.

It is therefore completely clear to any objective observer that well-conceived changes should be introduced and implemented as soon as possible. By no means should a rating depend on a player's ability to choose a tournament with sufficiently strong opponents, or on his ability to estimate his own strength and choose accordingly.

What I suggest is to give a player a sliding coefficient K = 695/N as soon as he obtains his initial rating, where N is the ordinal (chronological) number of the game he has played against rated opponents. According to the previous results this is a good approximation, while the relation KN > 695 shows that it is also the minimum that should be set.

The sliding coefficient should apply until it falls below the value K0 intended for players with a stable rating, i.e. while the number of games played is below 695/K0. Naturally, reaching a given rating for the purpose of obtaining an international title should only count once this has happened, i.e. once the rating has become stable, and results before that moment should not be taken into consideration.

There is also a more radical solution: at the start, all players entering the above procedure are assigned a so-called initial rating, with the coefficient K possibly reduced for extremely small values of N (primarily N = 1 and N = 2), for example as in the following table:

N      1    2    3    4    5    6    7    8    9   10   11   12
K    232  232  232  174  139  116   99   87   77   70   63   58

In that case, owing to the unreliability of the ratings of players with a small number of games, the opponent's coefficient should also be modified (especially if he already has a stable K0), so that he is not adversely affected either. I believe that when the two opponents have coefficients K and K1 (with K1 less than K), as described above, the second one (K1) should be replaced by K1 × (K1/K).
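Here is a minimal sketch of the proposed sliding coefficient and of the opponent adjustment just described; the function names and the cap of 232 (taken from the table above) are only illustrative.

```python
def sliding_k(n, cap=232.0):
    """Proposed development coefficient for a player's n-th rated game:
    K = 695/n, capped for the very first games as in the table above."""
    return min(cap, 695.0 / n)

def adjusted_opponent_k(k, k1):
    """When the opponents carry coefficients k > k1, the suggestion is to
    shrink the more stable player's coefficient to k1 * (k1 / k)."""
    return k1 * (k1 / k) if k1 < k else k1

print(round(sliding_k(1)), round(sliding_k(9)), round(sliding_k(20)))   # 232, 77, 35
print(round(adjusted_opponent_k(232, 15), 2))                           # ~0.97
```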

As for the best value of K0, a number of experiments I have carried out suggest that, in a well-conceived rating system, the coefficient K should be as small as possible. In my opinion the main problem is the badly conceived introduction of previously unrated players, and I think values between 10 and 16 are quite fair.

However, I can hardly endorse the introduction of larger development coefficients for players with a stable rating, as that would undermine the very essence of the rating system. In addition, I believe there should be no discrimination based on whether a player has ever crossed the 2400-point threshold in his career and thereby acquired a different coefficient.

Rating performance

In recent years the rating performance of a player has become an extremely important issue, as it is what brings grandmaster and international master norms, as well as medals for individual performance at the Chess Olympiad. Nevertheless, the way this performance is calculated is quite controversial: it is the opponents' average rating plus a table value that depends on the player's percentage score at the tournament.

For many years such a calculation was acceptable, but with the lowering of the rating floor it now requires certain changes. I ask for your patience in the following paragraphs, and I hope the accompanying examples will clear up any doubts.

It is a common occurrence at many tournaments that some opponents are rated so low that beating them actually decreases one's performance; hence, for norm purposes, the rating of the lowest-rated opponent is artificially raised so that such a game does less damage to the performance. A recent example is IM Miša Pap's performance at the Paleochora Open 2010, a tournament with grandmaster norms at stake.

In the first three rounds he won all his games (against lower-rated opponents), and over the following six rounds he scored 3.5 points against opposition with an average rating of 2576. The result from those last six rounds is equivalent to a FIDE performance of 2633, but with the three initial wins included, the performance (with the artificial increase of the weakest opponent's rating) fell to exactly 2600; without that increase it would have been 2536. Luckily, a year earlier the requirement for a grandmaster norm had been lowered from 2601 to 2600, and thus Pap managed to obtain it.

I suggest defining the performance as the rating for which the achieved result is at the same time the expected result. In other words, it is the rating with which a player could have entered the tournament in question and left it unchanged after the subsequent rating calculation. It can easily be computed iteratively, and conveniently the iterative calculation does not depend on the (small) development coefficient involved.

For these calculations I would not use the rounded values from the tables but the real values of f (i.e. the formula following from the theoretical considerations above), and certainly without artificial limits such as the 400-point rule that FIDE imposes today.

Whatever the case, winning a game can then certainly do no harm (as it actually did in the paradoxical example above), and some other things improve as well. For example, suppose a player scores 1.5 points in two games against opposition with an average rating of 2400. FIDE's calculation of such a performance gives 2591 (or 2593 according to the tables); my calculation gives the same value, but only if both opponents are rated exactly 2400. Rating is not linear, however, so if the opponents are rated 2300 and 2500 the performance is 2605, and if they are rated 2200 and 2600 it is 2648!
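The performance defined above is the fixed point at which the expected and achieved scores coincide. The sketch below solves for it by simple bisection (one of several ways to carry out the iteration) and reproduces the first two figures just quoted exactly, and the third to within a point, the small gap presumably coming from rounding conventions.

```python
def f(x):
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def performance(opp_ratings, score, lo=0.0, hi=4000.0):
    """Rating R at which the expected score against these opponents equals
    the achieved score, found by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        expected = sum(f(mid - r) for r in opp_ratings)
        if expected < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The 1.5-out-of-2 examples from the text:
print(round(performance([2400, 2400], 1.5)))   # ~2591
print(round(performance([2300, 2500], 1.5)))   # ~2605
print(round(performance([2200, 2600], 1.5)))   # ~2649 with this exact formula (2648 is quoted above)
```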

An even better illustration of the illogical FIDE calculation comes from a similar situation, i.e. a score of 1.5 points from two games against a given rating average. Take a case in which the stronger opponent's rating is greater than the calculated performance, while the weaker opponent's rating is such that the average stays the same as before (a good example is the above average of 2400 made up of opponents rated 2200 and 2600). If the 1.5 points come from a draw against the stronger opponent and a win against the weaker one, then, even if we disregard the win over the weaker opponent entirely, the performance logically cannot be lower than the rating of the stronger opponent; yet according to the FIDE calculation it is (2591 compared to 2600)! According to my calculation, Pap's performance at the tournament mentioned above is 2648.

There has recently been a discussion on the ChessBase website about rating performance [1, 2], where it was proposed to calculate the performance with an additional virtual draw against oneself. This concept, however, breaches the fundamental principle that a rating performance should not depend on the rating of the player whose performance is being calculated. The right aim is not to obtain more “realistic” numbers (closer to players' ratings), but simply to accept that in the case of all wins or all losses a rating performance cannot be calculated.

As for Navara's fantastic result of 8.5/9 at the Czech championship (FIDE performance 2963), in my opinion the appropriate performance is 2982. It simply means that if Navara had entered the tournament rated 2982 and played it the way he did, he would have neither gained nor lost any rating points.

I hope that with this paper I have managed to convey some of my considerations (and accompanying suggestions) to readers who do not wish to pore over the mathematical treatment of the problem, and I apologize if I have not always managed to strike the right balance.

The issue concerns all chess players who take part in competitions, and they should be given the opportunity to understand the rating system. Even more important is that the system be well conceived, so that it meets the needs of contemporary chess tournaments and thereby contributes to the popularity of chess.

It goes without saying that this is only one of many possible solutions, and I look forward to feedback from readers with the relevant mathematical knowledge as well as from laymen. I will be truly happy if this turns out to be a step towards a better and fundamentally fairer rating system.

“Chess Rating (for Laymen)” by Dr Vladica Andrejić is also available in PDF format. The author is deeply grateful to IM Ivan Markovic, who translated the original text into English.

Dr Vladica Andrejić is a chess player (peak Elo 2275), a chess arbiter and the owner of the Perpetual Check website. He holds a Ph.D. in Mathematics (Differential Geometry) and is a professor (docent) at the University of Belgrade, Faculty of Mathematics.

References

* [1] ChessBase, Navara with a 3241 performance at the Czech Championship, 2010.

* [2] ChessBase, Navara wins Czech Championship with 8.5/9 points, 2010.

* [3] FIDE Handbook, The working of the FIDE Rating System, 2010.

* [4] FIDE Handbook, Some comments on the Rating system, 2010.

* [5] Jeff Sonas, The Sonas Rating Formula – Better than Elo?, 2002.

* [6] Vladica Andrejic, Perpetual Check, 2010.

* [7] Vladica Andrejic, The Truth about Chess Rating, 2010.

Viswanathan Anand simul against mathematicians

Part of the International Congress of Mathematicians

For the first time ever, the Fields Medal — popularly known as the Nobel Prize for mathematics — will be announced from India. The country will be the host of the prestigious International Congress of Mathematicians 2010. The Congress, which was first held way back in 1897, will take place in Hyderabad from August 19-27. The Fields Medal is awarded to the world’s best mathematicians at the Congress, held once every four years.

The women's part of the congress will take place two days earlier. It will feature over 400 women mathematicians, 40 of whom will play chess against World Champion Viswanathan Anand.

This will be one of the first public events that Anand has scheduled after the World Chess Championship.


Chess and fascinating mathematics

chess, math, and mythology

Legend has it that the game was invented by a mathematician in India who received a huge reward for its creation. The King of India was so impressed with the game that he asked the mathematician to name a prize as reward. Not wishing to appear greedy, the mathematician asked for one grain of rice to be placed on the first square of the chess board, two grains on the second, four on the third and so on, with the number of grains of rice doubled each time.

The King thought that he’d got away lightly, but little did he realise the power of doubling to make things big very quickly. By the sixteenth square there was already a kilo of rice on the chess board. By the twentieth square his servant needed to bring in a wheelbarrow of rice. He never reached the 64th and last square on the board. By that point the rice on the board would have totalled a staggering 18,446,744,073,709,551,615 grains.
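The arithmetic of the legend is easy to reproduce; the grain mass used below is only a rough assumption.

```python
# Square n holds 2**(n-1) grains, so the full 64-square board holds 2**64 - 1.
grains = [2 ** (n - 1) for n in range(1, 65)]
print(sum(grains))   # 18446744073709551615

# Rough mass check, assuming about 25 mg per grain of rice (an assumption):
print(sum(grains[:16]) * 0.025 / 1000, "kg on the board after 16 squares")   # ~1.6 kg
```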

Playing chess has strong resonances with doing mathematics. There are simple rules for the way each chess piece moves but beyond these basic constraints, the pieces can roam freely across the board. Mathematics also proceeds by taking self-evident truths (called axioms) about properties of numbers and geometry and then by applying basic rules of logic you proceed to move mathematics from its starting point to deduce new statements about numbers and geometry. For example, using the moves allowed by mathematics the 18th-century mathematician Lagrange reached an endgame that showed that every number can be written as the sum of four square numbers, a far from obvious fact. For example, 310 = 17² + 4² + 2² + 1².

Some mathematicians have turned their analytic skills on the game of chess itself. A classic problem called the Knight’s Tour asks whether it is possible to use a knight to jump around the chess board visiting each square once only. The first examples were documented in a 9th-century Arabic manuscript. It is only within the past decade that mathematical techniques have been developed to count exactly how many such tours are possible.
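As an illustration of the Knight's Tour problem, here is a small Python sketch using Warnsdorff's rule (always jump to the square with the fewest onward moves); this classical heuristic usually finds a tour quickly on an 8×8 board, though it is not guaranteed to succeed from every starting square.

```python
JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_tour(n=8, start=(0, 0)):
    """Try to build a knight's tour on an n x n board with Warnsdorff's rule."""
    visited = {start}
    tour = [start]

    def moves(square):
        r, c = square
        return [(r + dr, c + dc) for dr, dc in JUMPS
                if 0 <= r + dr < n and 0 <= c + dc < n and (r + dr, c + dc) not in visited]

    current = start
    while len(tour) < n * n:
        candidates = moves(current)
        if not candidates:
            return None                  # heuristic got stuck (rare on 8x8)
        current = min(candidates, key=lambda sq: len(moves(sq)))
        visited.add(current)
        tour.append(current)
    return tour

tour = knight_tour()
print(len(tour) if tour else "stuck")    # expect 64 if the heuristic completes the tour
```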

It isn't just mathematicians and chess players who have been fascinated by the Knight's Tour. The highly stylized Sanskrit poem Kavyalankara presents the Knight's Tour in verse form. And in the 20th century, the French author Georges Perec's novel Life: A User's Manual describes an apartment building with 100 rooms arranged in a 10×10 grid, where the order in which the narrative visits the rooms is determined by a Knight's Tour on a 10×10 chessboard.

Mathematicians have also analysed just how many games of chess are possible. If you were to line up chessboards side by side, the number of them you would need to reach from one side of the observable universe to the other would require only 28 digits. Yet Claude Shannon, the mathematician credited as the father of the digital age, estimated that the number of unique games you could play was of the order of 10^120 (a 1 followed by 120 zeros). It's this level of complexity that makes chess such an attractive game and ensures that at the Olympiad in Russia in 2010, local spectators will witness games of chess never before seen by the human eye, even if the winning team turns out to have familiar names.

The effect of math and chess

The Chess Academy, Chicago, USA

Research studies have shown that chess can be used as an effective game-based teaching method. However, all past studies used chess as a separate instructional tool: the chess instruction contained no math content, and no integrated math and chess workbook was used. This study examined the effect on pupils' math scores of using a truly integrated math and chess workbook as an instructional practice workbook. The results show that the integrated workbook significantly increased pupils' math scores between pre-test and post-test among grade 1 to grade 8 pupils.

Introduction

Research papers have demonstrated that chess instruction improves analytical reasoning, problem-solving skills, and academic achievement (Christiaen & Verholfstadt, 1978; Frank & D'Hondt, 1979; Smith & Cage, 2000). Research conducted by Gaudreau (1992), however, showed no significant differences among the groups on basic calculations. These studies suggest that chess has a stronger effect on children's cognitive ability than on their arithmetic computation ability. When math and chess are taught as two separate subjects, children have no opportunity to practice basic arithmetic operations using their acquired chess knowledge, which may explain why playing chess does not significantly improve children's basic arithmetic computation ability.

How can the benefits of chess instruction be maximized so that chess supports not only children's cognitive development but also their computation ability? All past chess instruction studies used chess as an independent teaching tool, not truly integrated with math instruction. The author Frank Ho created a math and chess integrated workbook; the theoretical basis for how math and chess are integrated has been published by Ho (2006). We believe that with truly integrated math and chess workbooks, pupils will be able to increase their computation ability by working through them. This is particularly important for children who have no interest in playing chess, as they can still gain the benefits of chess instruction from the integrated workbooks.

No research has previously been done on the effects of using a math and chess integrated workbook. This study therefore compares pupils' math computation ability before and after using the workbook, to see whether there is a significant difference.

Method

One hundred and nineteen pupils, in grade 1 to grade 8, from five public elementary schools in Chicago, Illinois, USA, participated in an after-school program of 120 minutes, twice a week, for a total of 60 hours of instruction. None of the students possessed any substantial prior knowledge of chess. A pre-test was administered in the first week of the program, on 10/23/06, and a post-test at the end of the program, on 3/28/07. TONF tests were given to all pupils on both occasions, using the Compass Learning Explorer Online Diagnostic Tool, which meets the requirements of a valid and reliable criterion-referenced assessment. Each lesson consisted of lecturing, practice on math and chess integrated worksheets, and chess playing.

Results

A paired t-test was used to analyze the data. The results show a significant difference in math scores between pre-test and post-test for the grade 1 to grade 8 pupils, at the p < 0.01 level.

        Group One   Group Two
Mean        36.46       55.45
SD          15.82       19.37
SEM          1.45        1.78
N             119         119

t = 12.8729
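For readers who want to see the mechanics, the sketch below first checks that the standard errors in the table follow from SD/√N, and then shows how a paired t-test is run in Python. The per-pupil scores in it are purely hypothetical placeholders, since the study reports only summary statistics.

```python
import math
from scipy import stats   # third-party; assumed available for this sketch

# The standard errors in the table follow from SD / sqrt(N):
print(round(15.82 / math.sqrt(119), 2), round(19.37 / math.sqrt(119), 2))   # 1.45, 1.78

# A paired t-test needs the per-pupil pre/post scores, which the summary table
# does not give, so the lists below are purely hypothetical placeholders.
pre_scores = [30, 42, 25, 51, 38]     # hypothetical pre-test scores
post_scores = [48, 60, 41, 70, 55]    # hypothetical post-test scores
t_stat, p_value = stats.ttest_rel(post_scores, pre_scores)
print(t_stat, p_value)
```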

Discussion

The results of this study demonstrate that a truly integrated math and chess workbook can significantly improve pupils' math scores. Our observations also suggest that pupils find the integrated workbook more mentally engaging and more fun than traditional computation practice: they were able to sit and work longer on the math and chess integrated workbook than on traditional computation worksheets.

The result is particularly interesting for children who have little interest in playing chess, since the math and chess integrated workbook involves visualization, analysis, spatial relations and data processing, and such problems exercise higher-order cognitive skills. Without spending substantial time playing chess, we believe children can obtain similar cognitive benefits by working on the integrated workbooks. This would require further study.

Why do children prefer working on the math and chess integrated workbook to traditional computation worksheets? The integrated work contains visual images, chess symbols, directions, spatial relations and tables; all of these stimulate children and keep their interest high while they work on computation problems. It also gives them ample opportunity to think visually. Most of the time the computation questions are not presented ready-made: the children have to “create” them themselves by mapping them out from the directions, and they love doing so. Children learn best while having fun.


The number of Shannon

a simple illustration of how deep chess can be

Claude Shannon (photo: Wikipedia)

Claude Elwood Shannon (1916-2001) was a famous electrical engineer and mathematician, remembered as “the father of information theory”. He was fascinated by chess and was the first to estimate the game tree complexity of chess, i.e. the number of possible chess games. He based his calculation on the rough assumption that a typical game lasts about 40 moves and that at each move a player chooses among about 30 possible moves. That gives a total of about 10^120 possible games, a figure known as the Shannon number.
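Shannon's back-of-the-envelope estimate is easy to reproduce in a couple of lines (a sketch using his round figures):

```python
# Roughly 30 options per side, i.e. about 30 * 30 move pairs per full move,
# over a typical game length of about 40 moves.
estimate = (30 * 30) ** 40
print(len(str(estimate)))   # 119 digits; rounding 900 up to 10**3 gives the familiar 10**120
```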

Peterson reached a similar conclusion in 1996. An interesting comparison is the estimated total number of atoms in the universe, about 10^81. The number of legal positions in chess, according to him, is however only about 10^50.

All these calculations change slightly when new rules are applied to chess, such as the Sofia rule, or when the effect of en passant is estimated more carefully. However, the numbers are close enough to show how deep chess can be.

Other game tree complexities (log10 of the game tree size):

Tic tac toe: 5
Connect Four: 21
Othello: 58
Chess: 120
Backgammon: 140
Connect six: 140
Go: 766

Number of positions in chess after n moves

Chess mathematics can be fascinating. At first sight chess seems to be easy to calculate. It has logical patterns and finite board space. However, the simplest questions may require serious mathematical skills.

A good example is the number of possible positions after n moves, for n = 1, 2, 3, and so on. After the first move there are exactly 20 positions, and after the second there are 400: White has a choice of 20 first moves and Black has the same number of replies, making 400 possible positions after one move by each side. From there on it becomes difficult to keep counting, since the numbers grow rapidly. After the third move there are 5362 positions, and after the fourth the number is 71852: remarkably large figures for an 8×8 board!
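Such counts can be reproduced today with the third-party python-chess library (an assumption: installed via pip install chess); the sketch below walks the move tree to a fixed number of half-moves and collects the distinct positions in a set.

```python
import chess

def distinct_positions(plies):
    """Count distinct positions reachable after exactly `plies` half-moves."""
    seen = set()

    def walk(board, remaining):
        if remaining == 0:
            seen.add(board.epd())    # piece placement, side to move, castling, e.p.
            return
        for move in board.legal_moves:
            board.push(move)
            walk(board, remaining - 1)
            board.pop()

    walk(chess.Board(), plies)
    return len(seen)

print(distinct_positions(1), distinct_positions(2))   # 20 and 400, as stated above
# Deeper counts grow quickly and depend on exactly which details
# (such as en passant rights) are treated as part of the position.
```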

These numbers give a good sense of the complexity behind the Shannon number. In 1889 Cunningham came close to the count after the fourth move, stating it was 71782. Fabel came even closer in 1895, calculating 71870 possible positions. The first to find the correct number, 71852, was C. Flye St. Marie in 1903.

As far as the Chessdom team knows, there are estimates of the number of positions after the 5th and 6th moves: 809798 and 9132484 respectively. However, we would be glad to receive confirmation or corrections from our mathematician readers. Do not forget to include special moves like en passant.