This slide from David Silver's ICLR talk hints at Google DeepMind's Gorila, a parallel, large-scale actor-critic deep Q architecture.
There is some evidence that expert curricula can make learning much faster, although with game agents I don't know of anyone exploring this since Michie and Chambers' 1968 work, which compared expert training and self-play on tic-tac-toe and pole-balancing benchmarks.
http://aitopics.org/sites/default/files/classic/Machine_Inte...
Collobert, Weston and Bengio have explored evolving efficient curricula:
http://ronan.collobert.com/pub/matos/2009_curriculum_icml.pd...