Before you start writing a whole chess engine I'd suggest you code up a couple of end-game combinations that are known wins so that you get a good idea of what works and what doesn't for data structures and such. Once you can win king, knight, bishop against king reliably you're ready for tougher stuff.
Some notes from past efforts:
- A simpler but much faster engine will beat a more complex engine any time because the advantage of another ply easily outweighs the advantage of some clever algo.
- A 10x10 board with an unreachable edge has some advantages when laid out as a linear array because it allows for much easier memory management (premature optimizations and all that...). On an 8x8 board laid out linearly the edges have no way of limiting piece movement, but on a 10x10 you get that for free by putting an 'impossible' value in the unreachable edge fields. Even a knight won't be able to jump that effective two-field-wide edge. This turns all legal moves for all pieces into simple offsets rather than a two-dimensional affair.
- Optimization of engine code is much harder than optimization of the yield function.
- It is much better to not generate 'bad' moves than it is to prune them away after a lot of extra work has been done. Colin Wright's law: 'You can't make computers faster, you can only make them do less work' clearly applies here.
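For concreteness, here's a minimal Python sketch of that sentinel-edge idea. Note that the classic variant actually uses 12 ranks by 10 files, since a plain 10x10 array can't fully contain the knight offsets; all names and values below are invented for the illustration.

```python
# Sketch of the sentinel-edge ("mailbox") board: a 120-element array,
# 12 ranks x 10 files, where unreachable cells hold OFF so every move
# is a plain offset check. EMPTY/OFF values are invented for the demo.
EMPTY, OFF = 0, -1

def make_board():
    board = [OFF] * 120
    for rank in range(2, 10):            # ranks 2..9 hold the real 8x8
        for file in range(1, 9):         # files 1..8
            board[rank * 10 + file] = EMPTY
    return board

# A knight's eight jumps become eight fixed offsets into the array.
KNIGHT_OFFSETS = (-21, -19, -12, -8, 8, 12, 19, 21)

def knight_targets(board, sq):
    """On-board squares a knight on `sq` can jump to (occupancy ignored)."""
    return [sq + d for d in KNIGHT_OFFSETS if board[sq + d] != OFF]

board = make_board()
center = 5 * 10 + 4     # mid-board square: all 8 jumps stay on the board
corner = 2 * 10 + 1     # corner of the real board: only 2 jumps survive
```

No bounds arithmetic per move: the sentinel cells absorb every off-board jump.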
Don't knights need at least a 12x12 board? You can overlap the 0th column with the 10th column of the previous line and go down to 10x12, but 10x10 would cause out of bounds accesses.
I am skeptical about the idea of starting with king, bishop and knight versus king. Winning this endgame from an arbitrary start position requires something like 25 moves / 50 ply. Brute-force minimax plus alpha-beta will struggle to do that in a reasonable amount of time without sophisticated pruning. And you can't progress gently toward the goal starting with simple 2 or 3 ply calculations, because mate is the only point of the calculation here; normal material and positional factors are irrelevant. It makes more sense to first make sure your engine can win material with a simple knight fork, say, and build from there to multi-move material-winning combinations, followed by the pursuit of positional goals when material gain is impossible (control the centre, seek scope for the pieces, etc.). The elementary mates are actually a problem area for engines because they require a rather atypical strategic approach: cutting off the king, bringing your king around, shrinking the box, eventually forcing mate.
I meant that example as the end stage of when the training wheels can come off. Before that, of course, you start with even simpler examples, king + knight vs king would be the best starting point.
I don't understand your point. King plus knight v King is completely pointless, no move for either side is any better than any other. In fact the game should simply be declared a draw and no more moves should be played. What possible value does it have? Test the ability to generate legal moves maybe?
Hadn't thought of that. Amusingly the classic chess algorithms even fail to mate with Q+K v K for reasons I discuss up thread. Eg Sargon 1978 version can't do it. It's basically a terrible idea however you slice it because these mates are a special skill, almost like a different game with different ideas to the rest of chess.
>- A 10x10 board with an unreachable edge has some advantages when laid out as a linear array because it allows for much easier memory management (premature optimizations and all that...). On an 8x8 board laid out linearly the edges have no way of limiting piece movement
But with an 8x8 board you can store all free fields in a single 64-bit integer.
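A tiny sketch of that idea, with invented helper names: each square maps to one bit of a Python int (which comfortably holds 64 bits).

```python
# One Python int as a 64-square occupancy set: bit (rank * 8 + file)
# marks a square (layout assumption: bit 0 = a1, bit 63 = h8).
def square_bit(file, rank):           # file, rank in 0..7
    return 1 << (rank * 8 + file)

occupied = 0
occupied |= square_bit(4, 0)          # say, a king on e1
occupied |= square_bit(4, 7)          # and a king on e8

def is_free(occupied, file, rank):
    return occupied & square_bit(file, rank) == 0
```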
- A simpler but much faster engine will beat a more complex engine any time because the advantage of another ply easily outweighs the advantage of some clever algo.
There is a bit of a difference between the likes of AlphaZero and what your average hobbyist programmer is going to cook up. Once your toolbox includes bits like 'custom hardware' the equations change. Also: that was still a wicked fast machine in terms of the number of positions evaluated by any normal standard.
Certainly AlphaZero was run on very fast hardware, and lots of it. Nevertheless, I think it shows that the days of "speed is all that matters" are over. It did substantially fewer evals than Stockfish and still crushed it. LC0 is the same way. Neural-network chess proved that a higher quality, much more expensive eval (with a very high quantity baseline, to be sure) is a viable strategy for a chess engine.
Tell that to Stockfish NNUE :P. A simpler neural network running at Stockfish speeds, all on CPU, crushes Lc0 (Lc0 is probably stronger than AlphaZero IMO).
But alphazero is still using some kind of min-max, right? It's not a net that you input a board into and get a move out, but a net you input a board into and get a score out. So in a stockfish context it's mostly replacing the heuristic/scoring function?
For what it's worth, there is a pretty solid conventional wisdom in this space.
> - A simpler but much faster engine will beat a more complex engine any time because the advantage of another ply easily outweighs the advantage of some clever algo.
Depends a lot on what you call "simple". Stockfish's search and evaluation subroutines are pretty complex and distinctly clever:
But I agree that early on, simple is the way to go. It's much easier to optimize a simple engine.
> - A 10x10 board with an unreachable edge has some advantages [...]
Maybe in theory, but FWIW I don't know of any modern engine that does this. A more popular variant of the 10x10 or 10x12 idea was, back in the day, the 0x88 layout:
If you're going to be writing an engine today from scratch, use the bitboard representation. It's easy to implement and has the most resources out there if you need help.
(Bitboards are just the idea of representing the game state as a bunch of 64-bit ints, with one u64 per white/black piece type.)
Your example use-case of knight positions is usually solved with bitboards using a 64-element array mapping squares to a bitboard of all the places a knight on that square can go. Here's where Stockfish does that:
Other commenters have noted that you can do a slightly fancier version of this lookup-table technique to solve for the sliding pieces (bishops and rooks; queens are just the bitwise OR of the two), because the raw data is low entropy, so finding a perfect hash function ("magic bitboard", in the lingo) is fast enough to do on startup:
It's existing techniques like this that make bitboards a good choice for beginners; if you get stuck, you can find resources online.
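A from-scratch sketch of that kind of precomputed per-square knight table (not Stockfish's actual code; it assumes a rank-major square index with bit 0 = a1):

```python
# Precomputed per-square knight attack masks, built once at startup.
# Assumes a rank-major square index (bit 0 = a1); not Stockfish's code.
def _knight_attacks(sq):
    rank, file = divmod(sq, 8)
    mask = 0
    for dr, df in ((-2, -1), (-2, 1), (-1, -2), (-1, 2),
                   (1, -2), (1, 2), (2, -1), (2, 1)):
        r, f = rank + dr, file + df
        if 0 <= r < 8 and 0 <= f < 8:     # bounds checked once, here
            mask |= 1 << (r * 8 + f)
    return mask

KNIGHT_ATTACKS = [_knight_attacks(sq) for sq in range(64)]

def knight_moves(sq, own_pieces):
    """Knight move bitboard from `sq`, minus squares we already occupy."""
    return KNIGHT_ATTACKS[sq] & ~own_pieces
```

All the bounds arithmetic happens once at startup; at search time a knight's moves are one array lookup and one AND.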
> - Optimization of engine code is much harder than optimization of the yield function.
Big +1 there! The yield function is also much easier to A/B test.
> - It is much better to not generate 'bad' moves than it is to prune them away after a lot of extra work has been done. Colin Wright's law: 'You can't make computers faster, you can only make them do less work' clearly applies here.
Yes, but it's pretty hard to reliably get your move generator to generate good moves first. The other side of making minimax search fast is quickly throwing out branches that aren't relevant.
For instance, delta pruning and null-move pruning do this by trying to detect if one of the sides has thrown the game with their last move; since minmax is about assuming optimal play on both sides, branches where someone makes a blunder aren't worth looking into.
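Both tricks feed the same underlying mechanism: the alpha-beta cutoff, which discards branches one side would never allow. A toy sketch over an explicit game tree (nested lists with integer leaves; all numbers invented):

```python
# Toy negamax with alpha-beta over an explicit game tree: inner lists
# are positions, integer leaves are scores from the side-to-move's view.
def negamax(node, alpha=-10**9, beta=10**9):
    if isinstance(node, int):
        return node                     # leaf: static evaluation
    best = -10**9
    for child in node:                  # ideally visited best-first
        best = max(best, -negamax(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:               # opponent would never allow this
            break                       # line: prune remaining siblings
    return best
```

In the tree `[[3, 5], [2, 9]]` the second subtree is cut off after its first leaf, because a reply worth -2 to us already refutes it.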
> - memoization can bring insane speed gains.
Yup! Also, a lot of engines have optimized routines that let you undo a move quickly, without having to keep a redundant stack of every position's state, which would involve a lot of memcpy-ing.
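A sketch of that make/unmake idea on a toy board, with all names invented: instead of copying the position at every node, push a small undo record and pop it on the way back up.

```python
# Toy make/unmake: rather than copying the whole board at every search
# node, push a small undo record and pop it on the way back up.
# (Board format and names are invented for the sketch; real engines also
# restore castling rights, en passant squares, hash keys, etc.)
class Position:
    def __init__(self):
        self.board = [None] * 64      # piece letter or None per square
        self.history = []             # stack of undo records

    def make_move(self, frm, to):
        self.history.append((frm, to, self.board[to]))  # remember capture
        self.board[to] = self.board[frm]
        self.board[frm] = None

    def unmake_move(self):
        frm, to, captured = self.history.pop()
        self.board[frm] = self.board[to]
        self.board[to] = captured     # restore captured piece, if any

p = Position()
p.board[0], p.board[8] = 'R', 'p'     # a rook and an enemy pawn
p.make_move(0, 8)                     # capture...
p.unmake_move()                       # ...and restore, no copying needed
```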
Have fun! I had to stop doing chess engines, they were becoming too much of a distraction for me. I'm kind of glad ML-AI is starting to beat human-crafted AI in chess, that way the space can go back to being a fun little hacker cottage industry.
Edit: Oh! When making a chess engine, start by implementing "perft":
That way you can compare how many legal moves your engine thinks there are in a position versus how many Stockfish or some other known-good engine thinks there are. It's the closest thing there is to unit tests in this space.
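A minimal perft sketch, written against a hypothetical engine interface: `moves` and `make` stand in for your own move generator and move applicator.

```python
# Perft: count the leaf nodes of the legal-move tree to a fixed depth.
# `moves(pos)` and `make(pos, m)` are hypothetical stand-ins for a real
# move generator; any engine exposing those two calls can be perft-ed.
def perft(pos, depth, moves, make):
    if depth == 0:
        return 1
    return sum(perft(make(pos, m), depth - 1, moves, make)
               for m in moves(pos))

# Stub game with a constant branching factor of 20 (chess happens to
# have exactly 20 legal first moves, so this mimics perft(2) = 400).
nodes = perft("start", 2, lambda p: range(20), lambda p, m: p)
```

Plugged into a real move generator, the counts can be checked square-by-square against published perft tables to localize bugs.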
"Claude Shannon calculated that there are around 10^120 possible games of chess in his seminal paper Programming a Computer for Playing Chess in 1950."
For people who'd like the joy of seeing their hobby engine face real people, Lichess has a nice system for verified bot accounts: https://lichess.org/team/lichess-bots
There was a recent thread about challenging projects every programmer should try, and I think writing a chess engine would be a nice addition to that list. Chess programming is a refreshing change from modern software development, because you don't need any complicated frameworks or libraries, just a basic grasp of the minimax algorithm for searching a game tree. I have fond memories of writing an engine as a fresh college grad - it was my first "big" project and I made every mistake imaginable, but it was great fun and sharpened my coding skills immensely. It's exhilarating to play against your own program and then try to improve it. Highly recommended!
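That basic grasp of minimax fits in a few lines. A bare sketch over an explicit game tree (nested lists; integer leaves are scores from the maximizer's point of view, all values invented):

```python
# Bare minimax over an explicit game tree: nested lists are positions,
# integer leaves are scores for the maximizing player (toy data only).
def minimax(node, maximizing=True):
    if isinstance(node, int):
        return node                   # leaf: evaluate
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Maximizer picks the branch whose worst case (opponent's best reply)
# is highest: here the left branch guarantees at least 3.
best = minimax([[3, 5], [2, 9]])
```

Everything else in a chess engine, from alpha-beta to transposition tables, is an optimization of this loop.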
You could expand this category to an AI for a number of similar games (like checkers and reversi) as well. I've written a double dummy solver for bridge--a program that computes the optimal result of the play of a bridge hand--as a side project and I think it's also a very good choice. A lot of the same tools as chess engine development apply: alpha-beta and the myriad variant search algorithms like MTD(f), transposition tables, heuristics like the killer move heuristic or the countermove heuristic. But you also need to apply some domain specific heuristics and tools, and since the whole thing is performance critical you need to worry about low level optimizations and good multithreading. In the bridge case it's not at all trivial to get something that can even finish at all in a reasonable timespan.
I wrote a chess program in college, and agree that it was fun. But isn’t the whole exercise a bit outdated? Building a chess program now surely would start with deep learning
There is plenty to learn about algorithms and data structures by coding up a 'conventional' chess program; you're likely not going to go up against world champion players or computers, but against a far inferior opponent: yourself.
When I was 17 or so I wrote a chess program. At the time I was pretty deep into chess, and color me surprised when, after a week or so, I could only beat my own program by really paying attention, because it would never mess up and I would mess up all the time. All it would take was a little mistake and then you'd be lost already.
The code was written in 6502 assembler, and it was a lot of fun to write. My buddy who wrote his own in Pascal was quite ticked off that my very straightforward 'dumb' program would beat his extremely elegant program hands down, simply because it would look on average two ply further ahead. That's a lot of extra moves to analyze, but the speed difference between machine language and Pascal on an 8-bitter made this an uneven match (at the time; today this would definitely no longer be the case with our much better optimizing compilers).
On the plus side: his code was far more readable than mine.
The top two chess programs, Stockfish and LC0, both use neural nets for evaluation, but Stockfish's is a very shallow CPU-based one with a clever way to avoid re-evaluating the whole position after each move (borrowed from Shogi). LC0 is modeled after AlphaZero and uses reinforcement learning.
For me personally such projects are also about having fun, and I just like implementing some of those elegant algorithms and maybe understanding them at a deeper level. I don't enjoy pushing data through some intangible neural network and tweaking it nearly as much, especially if it involves reading through the documentation of some library and the coding is just the boilerplate required to shovel data in.
That said, everyone has their own preferences, and as long as you are motivated to spend a long time with it you can learn something from it, even if the improvement isn't noticeable immediately.
I have to commend the author for taking on a beast of a task. It's not easy to be learning/relearning chess while also trying to program an engine!
I think programming a chess engine that can beat most club players (let's say sub-2000 rating) is not too hard. At that level, human players will make inaccurate moves (as well as blunders at lower ratings). It is however much harder to develop engines that are competing at GM level (~2500+).
Implementing minimax and comparable algorithms may be straightforward, but the actual evaluation of positions for traditional engines is a distillation of thousands of pieces of expert knowledge. See for example the parameters that can be tweaked in Fritz (http://help.chessbase.com/Fritz/16/Eng/index.html?000038.htm).
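To give a flavor of such hand-tuned terms, here is a deliberately crude sketch combining material with a mobility bonus. Every number, weight, and the input format is invented for illustration; real engines combine hundreds of far subtler, carefully tuned terms.

```python
# Crude sketch of hand-tuned evaluation terms: material plus a mobility
# bonus, in centipawns, positive = good for White. Values, weights, and
# the input format are all invented; real engines combine hundreds of
# much subtler, carefully tuned terms.
VALUES = {'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900}

def evaluate(white_pieces, black_pieces, white_moves, black_moves):
    """Pieces as letter lists (kings omitted), move counts per side."""
    material = (sum(VALUES.get(p, 0) for p in white_pieces)
                - sum(VALUES.get(p, 0) for p in black_pieces))
    mobility = 2 * (white_moves - black_moves)   # 2 cp per extra move
    return material + mobility
```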
If you look at the history of chess engines (https://www.youtube.com/watch?v=wljgxS7tZVE), by the 1990s and later most of the top chess engines have had masters, international masters, and grandmasters intimately involved with development. For example, Deep Blue had several consulting grandmasters. Rybka (the world's best engine 2007-2010) had IM Vasik Rajlich as primary author and GM Larry Kaufman closely involved with tweaking its evaluation functions. Kaufman also went on to write the (still) very strong engine Komodo (https://ccrl.chessdom.com/ccrl/4040/).
Traditional chess engines used to have glaring weaknesses like playing poorly in closed positions, being poor at avoiding disadvantageous endgames, etc. By the mid-2000s many of these weaknesses disappeared, but as recently as Fritz 9 there were well known opening sequences where engines could be tricked into playing losing lines.
An engine with a high search depth makes great moves most of the time, but one with a low search depth makes really weirdly bad moves because it misses trivial combinations that go a bit deeper than it looks. So low-difficulty chess programs tend to be very unsatisfying to play against.
Something I have always been interested in trying is writing a chess engine that makes human-like moves. So the evaluation function would try to emulate what a human would do, maybe with a few depths of evaluation, instead of playing accurate moves that never miss simple combinations.
I've always wanted to do something similar! I've thought about trying to consider the "obviousness" of certain moves, and then only explore those branches with a probability relative to its obviousness.
"Obviousness" would take into account things like the last move played ("ah, your pawn is attacking my bishop; I should move it"), and whether the move is a capture, check, or attacks another piece. A forward move for a knight is more obvious than a backward move, as is moving it towards the center of the board, versus moving it to the edge of board.
As the depth increases, the probability of exploring a branch decreases. I think it would be pretty easy to scale such a system to make it play better or worse by simply adjusting how much the probability of exploring a branch decreases as the depth increases.
Perhaps this could lead to more natural blunders where a line is simply missed?
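A hedged sketch of that scheme (formula, falloff constant, and names all invented): score each move's "obviousness" in [0, 1], then search its branch with a probability that shrinks with depth.

```python
import random

# Invented sketch: moves get an "obviousness" score in [0, 1] (captures,
# checks, and responses to threats would score high); a branch is then
# searched with a probability that decays with depth, so deep quiet
# lines get missed the way a human misses them. Falloff is made up.
def explore_probability(obviousness, depth, falloff=0.8):
    return min(1.0, obviousness * falloff ** depth)

def should_explore(obviousness, depth, rng=random.random):
    return rng() < explore_probability(obviousness, depth)
```

Tuning `falloff` down would make the imaginary player blunder in exactly the depth-dependent way described above.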
I've also wanted to come up with an evaluation system that scales better with skill level. A position may be +3, but only if you play the next 5 moves exactly correctly. In another position, one move might be +1 and another might be +0.5, but the +0.5 requires four moves of perfect play from your opponent to keep it even. Can these subtleties be expressed clearly? Maybe something like "turns until positional advantage is converted to a clear material advantage." When you're just starting out, engine evaluations often don't make any sense.
I've watched a lot of chess on Twitch lately, and it'd also be cool to have scores for all the relatively-subjective terms that the commentators use: "lots of attacking chances" (just pure count of checks possible in next N moves? percentage of lines that lead to checkmate in next N moves?), "very sharp position" (how many moves drastically change the evaluation?), etc.
> Something I have always been interested in trying is writing a chess engine that makes human-like moves.
This was one of the debates in computer chess in the ‘70s. Should a good engine spend a lot of time on evaluating each position like a human using many heuristics, or should it go for speed and depth? In a famous match Dartmouth CP[0] was able to tie Northwestern’s program (the reigning champ) using this approach. But ultimately it turned out search depth was much more important. Modern engines don’t need to compromise compared to the ‘70s. They can do a great deal of depth and also evaluate way more heuristics for each position than in the ‘70s. But there is of course still a trade off between how much you analyze each position and how deep you can get in a reasonable amount of time.
Yeah, I have no doubt that search depth is better, but that is relative to the goal of winning more. I instead want to optimize for the metric of a chess Turing test: being indistinguishable by humans from a player of a given rating.
I once wrote a "chess engine" when I was much younger; however, as I'm not much of a chess player, it wasn't very good. I encoded all the movement rules for each piece and then programmed the opponent to choose random valid moves. It would pick a random piece, pick a random square to move it to, and if the move was valid it would make it; otherwise it would pick another square. I think later I reprogrammed it to select 10 random valid moves and pick the one that allowed it to capture the highest-value piece.
The great thing was that it worked, and it could "play chess". The sad thing was, I am such a bad chess player that it could still actually beat me around 50% of the time by choosing random moves.
I wrote a chess game using Stockfish and had it play against another computer player that only made random choices. What surprised me is how long it would take Stockfish to beat the random player.
There is a funny chess variant on F-Droid called Open Chaos Chess <https://github.com/CorruptedArk/open-chaos-chess> where the players choose what piece should be played but the move is chosen randomly. The first player to lose all pieces loses the game (there are no checks and mates).
What's funny is that my strategy for chess is randomly moving pieces as well, given I play a little defense when things get close with the queen or rooks.
I’ve said a few times on here that I think studying chess engines is about the best computer science education you could get from a single hobby. Definitely worth trying, and evolving from simple methods to more advanced over time.
I do think there’s also a big gap in the market for chess _tooling_ that makes learning and match prep easier. I actually think simpler more interpretable engines can be more useful as tools for humans in many cases. For example instead of increasingly inscrutable neural networks, or explanations of moves that are merely “after 10 moves this happens and it’s bad”, even a simple regression over some human interpretable features can give you an idea why the engine is suggesting a positional move (king safety, piece activity etc). I also think just pure stats and algorithmic stuff would be useful, like finding transpositions between openings, trying to work out what kinds of middle game structures or endgames a player is good or bad at, and navigating a path towards that in your preparation.
Depending on your goals, it might be helpful to know that Apple has open sourced their Chess game (under a restrictive license): https://opensource.apple.com/source/Chess/
That note is not from Chess.app, but from "Sjeng", a 3rd party dependency vendored into Apple's Chess.app. That notice has been there in all the versions of Chess that are available (from Chess-2.0 to Chess-347). See: https://en.wikipedia.org/wiki/Sjeng_(software)
I wrote a toy chess engine a few weeks ago as well as an example for one of my open source libraries [1]. I was inspired by a video of someone writing one in half an hour [2], so I also assumed it would take a day or two at most. It took more like a week. But it was very satisfying to not be able to beat my engine anymore.
I highly recommend that all programmers with even a passing interest in chess spend some time writing an engine.
I started writing my own chess engine after watching The Queen's Gambit and missing chess. I realized I had never written an engine. Have you tried seeing how strong yours is?
I want to write a chess engine without using a search tree. Teaching the computer to look for tactics and specific patterns. I don't expect a super engine, but a decently good engine. Plus it would also be able to comment on the position, telling the tactics it sees. Is this possible?
It wouldn’t be very strong but you could use the evaluation rules that are used for move ordering (the ones that don’t require prior searching) https://www.chessprogramming.org/Move_Ordering
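One classic rule of that kind is MVV-LVA ("most valuable victim, least valuable attacker"): sort captures by victim value, breaking ties toward the cheaper attacker. A sketch using conventional piece values (the move format here is invented for the demo):

```python
# MVV-LVA move ordering sketch: rank captures by victim value with a
# small penalty for expensive attackers; quiet moves sort last.
# A "move" here is just (attacker_letter, victim_letter or None).
VALUE = {'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 20000}

def mvv_lva(move):
    attacker, victim = move
    if victim is None:
        return 0                       # quiet move: no capture score
    return VALUE[victim] * 10 - VALUE[attacker]

moves = [('Q', 'P'), ('P', 'Q'), ('N', 'R'), ('B', None)]
ordered = sorted(moves, key=mvv_lva, reverse=True)
```

Pawn-takes-queen sorts first, the quiet bishop move last, which is exactly the ordering an alpha-beta search wants.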
Absolutely. Just make a neural network that knows chess moves and let it play against itself. After ~2 trillion games it should be quite well trained and will kick anybody's ass. Sure, you'd be long dead by then, but that was not your question, yes?
The piece values are quite interesting; N=320, B=330 for example is a good heuristic to compensate for a lack of better knowledge. However, be aware that there are contradictions:
For what it’s worth, John Watson, in his book Modern Chess Strategy, lists the Q+N > Q+B thing as a belief that used to be common in the past, but is proven wrong by modern practice. I’m not nearly strong enough to have an opinion on the matter myself.
Thanks for writing this, I'm sketching out a chess engine for playing the Horde chess variant (https://lichess.org/variant/horde) and studying the variant more deeply. Standard Stockfish is pretty bad at playing this variant for both white and black, and can be defeated by most competent humans. Great link at the bottom as well for anyone interested in chess programming (https://www.chessprogramming.org/Main_Page)
One of the first books I read about BASIC had a checkers game in it. The whole point of the game was to show how recursive algorithms worked (that was the AI, I don't remember any more details on it).
One of the exercises at the end of the chapter was, "Improve the checkers program to play chess"
Not sure if the author was kidding or not.
I don't remember any strong BASIC chess engines, but there were quite a few 8-bit engines that were fairly decent in those days.
>> There are more possible chess games than the number of atoms in the universe.
>> This program throws a RecursionError and prints 59691 — the number of different positions the search tree contained when it crashed. All we need to fix this, is another universe to run the program in.
To be fair, the recursion limit is not the same thing as running out of memory. You can use a loop to have your program run further than this.
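To make that concrete: an explicit stack turns the recursive walk into a loop that sails past Python's recursion limit (toy chain-shaped tree, names invented):

```python
import sys

# A toy tree walk deeper than Python's recursion limit: the explicit
# stack replaces the call stack, so depth is bounded only by memory.
def count_nodes(depth):
    """Count nodes of a chain-shaped tree `depth` levels deep, no recursion."""
    count, stack = 0, [depth]
    while stack:
        d = stack.pop()
        count += 1
        if d > 0:
            stack.append(d - 1)       # single child: one level deeper
    return count

# Ten times the default recursion limit: a recursive version would
# raise RecursionError long before finishing.
deep = count_nodes(sys.getrecursionlimit() * 10)
```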
15 years ago I made a simple alpha-beta minimax chess engine for a high school Java project. 10 years later it was used in a Minecraft mod called MineChess: https://github.com/theronic/chessmate
- memoization can bring insane speed gains.
Good luck!