Some additional theory of othello games
Even though I fell that I can’t be compared to those scientists who published their great achievements regarding search algorithms and views on othello programming, I would like to contribute to the intellectual debate and and property developed by these famous researchers who did their best to map out different areas of game programming.
First of all let me clarify one important issue for beginners: why is it so difficult to write a good othello program? Because:
- you need good and sophisticated search algorithms;
- you need to tune your coefficients;
- you need good samples;
- you must have a degree in mathematics (just consider how difficult it is to understand the Konjugate Gradient method).
The sensitiveness of the discs on the board
In addition to the difficulties listed above let me please clarify something very important regarding normal othello games which I haven’t found explained in detail in various documentation available on the internet: the perfect value of a board position is extremely sensitive to small changes on the board.
Have a look at the board in the next picture. The perfect outcome is -14 for black.
Now let’s change the colour of a disc on the board and see how the endgame result varies:
change G1 from white to black, the outcome is -20 ( 4.7% change in the result).
change C2 from black to white, the outcome is +4 ( 14% change in the result).
change B8 from black to white, the outcome is +8 !!! ( 17% change in the result).
Oops! With a small, 1.5% change on the board (you only changed one disc out of 64), you get a 17% change in the endgame result. 17% = abs(result difference) / max result difference’s range = abs( -14 -8) / (64+64). The result is even worse when you move a black or white disc into an empty square.
This is something you must definitely get used to and it has an important consequence: if you use the pattern based representation in your evaluation function, the dispersion of the evaluation function will be relatively high. In other words it is really difficult to find an adequate mathematical representation of the board evaluation in the normal othello game, because game positions are very sensitive to the variation of the squares’ colour. These “sensitive” squares vary from one board to another.
The distribution of possible endgame board values in the normal othello game
How often is a game draw in the normal othello? Is it rare that the outcome of the game is +64 or -64? I generated some samples to find the distribution of the results both in the case of normal and reversed games (see next graph). Showing a normal distribution form, in case of normal games results are spread between -64 and 64, with a pretty big dispersion.
In case of reversed games the distribution is narrower and the graph “ends” at about -42 and +42.
Distribution of the evaluation function accuracy
I calculated the accuracy of my evaluation function at different game stages (where it was possible to be calculated ). This depends on many aspects. You only need this parameter if you want to implement the MultiProbCut or Endgame ProbCut search methods, otherwise this will never bother you. To investigate the accuracy of the evaluation function you must compare value to the perfect endgame result. It turned out – at least in the case of Tothello – that the average dispersion of this difference (eval result-perfect result) also depends on the absolute value of the perfect result. The following graph shows this distribution for game positions having 8 empty squares.
Presumably this distribution derives from the samples used to tune the coefficients. Other samples may lead to a slightly different graph. As a result, Tothello now doesn’t use a constant empirical sigma value for the evaluation function’s dispersion, but an approximation table based on the perfect result & number of empty squares. See the approximated dispersion curve in the next picture for 13 empty squares (normal game mode).
A similar case is shown in the next picture (empties=23 and reversed game type).
The program is slowed down by this sigma approximation table, but this decrease in performance is not considerable.
Midgame sigma optimization
If you use MultiProbcut search algorithm in your program, you have to work with a given confidence level (it derives from t*sigma). How high does this confidence have to be? If you use different cut thresholds, your search time and the accuracy of the search will be affected. An optimal value of the cut threshold must exist (t).
In his groundbreaking dissertations, Michael Buro, author of the MPC search, underlined that the best empirical cut threshold was 1.5, which is equivalent to a 86.63% confidence level.
I investigated, on an empirical basis, the different dependencies among the confidence level (t*sigma used), search depth and search time. It turned out that the optimal value of the cut threshold can be determined or approximated empirically.
In the next diagram you will find an empirical graph showing the dependency of the search time and search depth at different confidence levels. The search time increases with the confidence level applied (which is normal).
For some calculations not only the total time of a given search is required, but the additional time (delta time) to achieve a deeper depth could also be important.
There is another dependency among the accuracy of the search (sigma), the search depth and the confidence level. Please note that the dispersion/accuracy at a given search depth does not change significantly if the confidence level varies. This is also a very important aspect of the othello games: the decrease of the confidence level does not affect the accuracy of the search so dramatically (as one would expect it to).
Finally, having calculated the above graphs , supposing there’s a constant search time, one can calculate the dependency between the accuracy of the search and the confidence level applied.
In my experience the optimal confidence level is about 83,8% for the midgame, which corresponds to a cut threshold t=1.4. Please note that this value highly depends on the implementation of the evaluation function and search algorithms. The value found is very close to the t=1.5 found by Michael Buro.