So you think that basically recording human actions (possibly in some feature-extracted form) as part of the 'replay memory' for DQN would work well?
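To make the idea concrete, here is a minimal sketch of pre-filling a DQN replay memory with recorded human transitions before the agent's own experience is added. The `ReplayMemory` class and `human_demos` data are hypothetical illustrations, not from any specific library.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-capacity buffer of (state, action, reward, next_state, done) tuples."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling, as in the vanilla DQN setup.
        return random.sample(self.buffer, batch_size)

# Recorded human play, stored in the same transition format the agent uses.
# (Illustrative toy data; states would normally be game frames or features.)
human_demos = [
    ([0.0, 1.0], 1, 1.0, [0.1, 0.9], False),
    ([0.1, 0.9], 0, 0.0, [0.2, 0.8], True),
]

memory = ReplayMemory(capacity=10000)
for transition in human_demos:
    memory.add(*transition)

# During training the agent's own transitions are appended to the same
# buffer, so early minibatches mix human and agent experience.
batch = memory.sample(2)
```

Since the buffer is bounded, the human demonstrations would eventually be evicted as agent experience accumulates, which may or may not be what you want.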
I have also been reading DeepMind's recent survey on combining deep learning methods with actor-critic models.
What I also want to explore is the possibility of using evolution to evolve Q-functions, which shouldn't be so hard to do, rather than evolving policies directly (as in this game).
The possibility of evolving self-learning machines excites me, more than just evolving machines in a fixed state.
I can also explore whether Darwinian evolution (weights are randomized at birth) is better or worse than Lamarckian evolution, where weights are passed on to offspring.
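A minimal sketch of what evolving Q-functions (rather than policies) might look like, with a flag switching between Darwinian and Lamarckian inheritance. Everything here is illustrative: a tiny linear Q-function stands in for a network, and `fitness` stands in for episode return in the actual game.

```python
import random

N_ACTIONS, N_FEATURES = 2, 3

def random_weights():
    # One weight vector per action: Q(s, a) = w[a] . s
    return [[random.gauss(0, 1) for _ in range(N_FEATURES)]
            for _ in range(N_ACTIONS)]

def q_value(w, state, action):
    return sum(wi * si for wi, si in zip(w[action], state))

def greedy_action(w, state):
    # The evolved Q-function induces a policy by acting greedily.
    return max(range(N_ACTIONS), key=lambda a: q_value(w, state, a))

def mutate(w, sigma=0.1):
    return [[wi + random.gauss(0, sigma) for wi in row] for row in w]

def evolve(fitness, generations=10, pop_size=20, lamarckian=True):
    pop = [random_weights() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        for p in parents:
            if lamarckian:
                # Lamarckian: offspring inherit the parent's weights
                # (including any lifetime learning), plus mutation.
                children.append(mutate(p))
            else:
                # Darwinian: weights are re-randomized at birth; only
                # things like architecture would carry over here.
                children.append(random_weights())
        pop = parents + children
    return max(pop, key=fitness)

# Toy stand-in fitness: reward Q-functions whose greedy action on a
# fixed state is action 1. In the real setting this would be the score
# obtained by playing the game greedily with respect to Q.
def fitness(w):
    return 1.0 if greedy_action(w, [1.0, 0.5, -0.5]) == 1 else 0.0

best = evolve(fitness, lamarckian=True)
```

In the Darwinian variant, some within-lifetime learning (e.g. a few steps of Q-learning before evaluation) is what would make the comparison interesting; otherwise randomizing weights at birth just restarts the search each generation.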
I put some further thoughts and references in my post here.
This slide from David Silver's ICLR talk hints at Google DeepMind's Gorila parallel large-scale actor-critic deep Q architecture.
There is some evidence that expert curricula can make learning much faster, although with game agents I don't know of anyone exploring this since Michie and Chambers's 1968 work on tic-tac-toe and pole balancing, which compared expert training and self-play on these benchmarks.
Teaching Deep Convolutional Neural Networks to Play Go by Clark & Storkey http://arxiv.org/abs/1412.3409
Move Evaluation in Go Using Deep Convolutional Neural Networks by Maddison, Huang, Silver & Sutskever http://arxiv.org/abs/1412.6564
Tesauro's TD-Gammon http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node108.htm...
Playing Atari with Deep Reinforcement Learning by Mnih et al. https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf
Otoro's neuroevolution bibliography http://blog.otoro.net/2015/01/27/neuroevolution-algorithms/
Neuroevolution http://en.wikipedia.org/wiki/Neuroevolution