Interesting: a TD algorithm, developed by a Canadian AI researcher now working with DeepMind, was used in the early 1990s to beat expert players at backgammon, and it advanced human understanding of the game:

> TD-Lambda is a learning algorithm invented by Richard S. Sutton based on earlier work on temporal difference learning by Arthur Samuel. This algorithm was famously applied by Gerald Tesauro to create TD-Gammon, a program that learned to play the game of backgammon at the level of expert human players.

> TD-Gammon achieved a level of play just slightly below that of the top human backgammon players of the time. It explored strategies that humans had not pursued and led to advances in the theory of correct backgammon play.

https://www.wikiwand.com/en/TD-Gammon



TD-Gammon was taught to us as part of a classroom course on Reinforcement Learning (RL) in 2007. ML was known to only a small set of people back then and there weren't many jobs in the area (this is in India); even to many in this set, RL was either not known or not well known. It's interesting to see RL surge in popularity. In fact, just a couple of weeks back, I was talking to the professor who taught us that course, and it was fun comparing ML/RL-related awareness then to now :-)


Yes, me too. TD was also pretty much considered useless, until it was used for backgammon. And like many things in the AI/ML world, no one really knew exactly why it worked so well.

Backgammon is also interesting in that there is a non-deterministic element - the dice roll on every turn. This is where TD seems to shine.
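
For anyone who hasn't seen TD(lambda) written out, here is a minimal sketch of the tabular version on a toy problem. To be clear, this is not TD-Gammon (Tesauro used a neural network as the value function); it is a small random walk where a coin flip stands in for the dice roll. The function name, the 5-state chain, and the `alpha`, `lam`, and `gamma` defaults are all my own assumptions for illustration.

```python
import random


def td_lambda_random_walk(episodes=20000, alpha=0.005, lam=0.8,
                          gamma=1.0, n_states=5, seed=0):
    """Tabular TD(lambda) with accumulating eligibility traces.

    Toy setup (an assumption for illustration, not backgammon): a chain
    of n_states non-terminal states. Each step moves left or right at
    random. Falling off the right end pays reward 1, the left end 0.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states          # initial value estimates
    for _ in range(episodes):
        traces = [0.0] * n_states      # eligibility traces, reset per episode
        state = n_states // 2          # start in the middle
        while True:
            # The stochastic element: a coin flip in place of a dice roll.
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # TD error: the gap between successive predictions.
            td_error = reward + gamma * next_value - values[state]

            # Decay all traces, then mark the state just visited.
            traces = [gamma * lam * e for e in traces]
            traces[state] += 1.0

            # Every recently visited state shares in the correction.
            values = [v + alpha * td_error * e
                      for v, e in zip(values, traces)]

            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td_lambda_random_walk()])
```

For this chain the true value of state i (counting from 1) is i/6, so the estimates should land roughly near 1/6 through 5/6. The point is that the updates never need a model of the randomness: each step just nudges the current prediction toward the next one, and the noise averages out over many episodes.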



