My opinion is that view is very simplistic and unnecessarily offensive to a whol...

Der_Einzige · on June 10, 2021

Unsupervised learning is exactly the wrong way to approach chess or other games that MuZero solves. It's also worth noting that traditional alpha-beta pruning + heuristics are basically neck and neck with the very best of the neural-network-based techniques. I'll trust Stockfish over an AlphaZero or MuZero for a while longer if I'm trying to win a computer chess competition ...

hervature · on June 10, 2021

Sure, Stockfish just uses millions of years of evolution to build its heuristics and can't be transferred to any other game. The point remains: calling RL a cherry on the cake compared to unsupervised learning, when they are completely orthogonal and not mutually exclusive techniques, is simplistic and unnecessarily offensive.

unishark · on June 10, 2021

So it's bad to be the cake? I assume he means it's the foundation one falls back on when the more specialized categories of methods are not applicable.

You might not like my analogy either. I think of supervised and unsupervised learning as the majority of the genome of ML, while RL is that little Y chromosome sometimes tacked on to address a few high-profile tasks.

thom · on June 10, 2021

Not disputing your main point, but Stockfish now includes a neural network.