
Grandmasters do search! They think many moves ahead for most moves, and obviously Stockfish does search -- a lot of search, much more than a grandmaster.

I feel that the sort of structures we implicitly operate over during search can be usefully "flattened" in a higher dimensional space such that whole paths become individual steps. I feel that implicitly that's the sort of thing that the network must be doing.

If you watch videos of grandmasters talking about chess positions, often they'll talk about the "structure" of the position, and then confirm the structure via some amount of search. So, implicitly grandmasters probably do some kind of flattening too which connects features of the present state to future outcomes in a way requiring less mental effort.



>they'll talk about the "structure" of the position, and then confirm the structure via some amount of search

Yes, that's sort of the point of the approach. Imagine that for any board position you have a list of all possible moves, ranked by how good they are (i.e., how likely they are to lead to you winning). In order to make this ranking you've "flattened" the future outcomes of each move into a single number which is used to rank the list. Then, you can play optimally without search: simply pick the best move at every board position. (Richard Bellman showed that, for the right definition of "best", this gives the optimal strategy for playing entire games, not just single moves.)
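To make that concrete, here's a minimal sketch of greedy play over a Q-function. `q_value` and `legal_moves` are hypothetical stand-ins (a real system would use an engine's evaluation and move generation); the point is only the shape of the idea: no search, just an argmax over one number per move.

```python
# Greedy play over a Q-function: no search, just pick the argmax.
# `q_value` and `legal_moves` are illustrative stand-ins, not a real engine.

def q_value(position, move):
    # Stand-in: in reality this number "flattens" all future outcomes
    # of `move` from `position` into a single expected-win score.
    return hash((position, move)) % 100 / 100.0

def legal_moves(position):
    return ["e4", "d4", "Nf3", "c4"]  # placeholder move list

def best_move(position):
    # Optimal play reduces to ranking moves by Q and taking the top one.
    return max(legal_moves(position), key=lambda m: q_value(position, m))
```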

Now the question is, how does one actually calculate this sorted list of the best moves? The straightforward answer is essentially search, plus various techniques to make the search computationally tractable. DeepMind's approach is to use such a list computed by Stockfish (which explicitly searches) to compile a dataset, then train a network on it to approximate the function that produces this ranking (called the Q-function). But the point is that search is just an algorithmic strategy to estimate the Q-function; mathematically speaking, it is merely a function of the current board state and therefore could, in theory, be determined without any search. So, maybe a deep neural network could learn how to do this?
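The shape of that pipeline can be sketched in a few lines. Everything here is a toy: `features` and `stockfish_score` are invented stand-ins for a real board encoder and a real engine's searched evaluation, and the "network" is just a linear model fit by gradient descent. What matters is the structure: (position features, engine score) pairs, then supervised regression.

```python
import random

# Toy distillation sketch: learn to predict a "searched" score from
# position features alone. All names are illustrative stand-ins.
random.seed(0)

def features(position):
    # Stand-in featuriser: a real one would encode the board.
    return [float(ord(c) % 7) for c in position[:4].ljust(4)]

def stockfish_score(position):
    # Stand-in for an engine's searched evaluation (the training label).
    return sum(features(position)) / 10.0

dataset = [(features(p), stockfish_score(p))
           for p in ["e4e5", "d4d5", "c4e5", "g3g6"]]

w = [0.0] * 4
for _ in range(2000):                      # plain gradient descent
    for x, y in dataset:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        w = [wi - 0.01 * err * xi for wi, xi in zip(w, x)]
# After fitting, the model predicts the score with no search at all.
```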


I don't think you've understood what I meant by "flattening". I wanted to imply something like an embedding of search paths into a high dimensional space within which a point in the space represents a whole path, and the proximity of paths implies similar realisations.

I'd imagine one could directly realise this by training a network on chess games, with the distances between them -- by some useful path-related measure -- as inputs. I think the network would succeed in learning an embedding for chess games. I wouldn't be surprised if such an embedding could be fine-tuned to output win probability instead, for example.
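As a rough illustration of what "learning an embedding from pairwise distances" means, here's a toy stress-minimisation sketch. The games, the distance measure (fraction of differing moves), and the 2-D embedding are all invented for illustration; a real path measure and a real network would be far richer.

```python
import random

# Toy sketch: learn a 2-D embedding of whole games so that embedded
# distances match a supplied game-to-game distance. Illustrative only.
random.seed(1)

games = ["e4 e5 Nf3", "e4 e5 Bc4", "d4 d5 c4", "d4 Nf6 c4"]

def game_distance(a, b):
    # Toy path measure: fraction of differing moves.
    ma, mb = a.split(), b.split()
    return sum(x != y for x, y in zip(ma, mb)) / len(ma)

emb = [[random.uniform(-1, 1) for _ in range(2)] for _ in games]

def dist(p, q):
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) ** 0.5

for _ in range(3000):              # stress-style loss: match distances
    for i in range(len(games)):
        for j in range(i + 1, len(games)):
            d = dist(emb[i], emb[j])
            g = (d - game_distance(games[i], games[j])) / (d + 1e-9)
            for k in range(2):
                step = 0.05 * g * (emb[i][k] - emb[j][k])
                emb[i][k] -= step
                emb[j][k] += step
# Similar games (sharing moves) end up near each other in the embedding.
```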

I also wouldn't be surprised if you could train a network on an auto-completion task of one-move-ahead prediction, and then use it to assign win probabilities, much in the same way that LLMs are used for part-of-speech tagging.
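A crude count-based version of that analogy, with invented data: a "one-move-ahead" model (the LM-like part) whose statistics are then repurposed to estimate win probability for a given move, rather than to predict the continuation.

```python
from collections import Counter, defaultdict

# Toy sketch: next-move prediction repurposed for win-probability
# estimation. Games and labels are made up for illustration.
games = [("e4 e5 Nf3 Nc6", 1), ("e4 e5 Bc4 Nf6", 1),
         ("d4 d5 c4 e6", 0), ("d4 Nf6 c4 g6", 0)]  # (moves, white won?)

nxt = defaultdict(Counter)        # next-move counts (the "LM" part)
win = defaultdict(Counter)        # game outcomes per context move
for moves, result in games:
    seq = moves.split()
    for i in range(len(seq) - 1):
        nxt[seq[i]][seq[i + 1]] += 1
        win[seq[i]][result] += 1

def predict_next(move):
    # The auto-completion task: most likely continuation.
    return nxt[move].most_common(1)[0][0]

def win_probability(move):
    # The repurposed task: same statistics, different head.
    c = win[move]
    return c[1] / (c[0] + c[1])
```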


Right, and this approach is more or less what experienced chess players do. We don’t evaluate every possible move; we look at the board and use our intuition and experience to immediately discard most candidates, then choose a few options to focus on.

In many positions there’s surprisingly little actual computation going on.


I expect this is particularly borne out when there is no obviously good move, but many options. The grandmaster will "see" implications without calculating and then confirm them.

I think most strong chess players can "evaluate" a board and decide who is stronger to within a small error compared to an engine score. I think if you were to test this by giving strong players N seconds per position -- not enough time to calculate to any depth -- it would be shown that many are within a small (but consistent) margin of error compared to an engine.


Absolutely. A grandmaster will more or less instantly understand the core themes of any real position you put in front of them, and they usually can even guess with very high accuracy how the position was reached in the first place. The better the player, the more accurately they will perceive the nuances in a wider array of positions.

This process happens in much the same way as facial recognition. For most faces, we just “know” who the person is. We don’t consciously search a database of acquaintances, filtering by facial features, eye color, hairstyle, etc. We see and we recognize.



