While Deep Q-Learning is powerful, it relies on backprop; your use of CNE genetic evolution rather than backprop may provide a more global search than Mnih's Deep Q.
The methods seem complementary; I have an inkling that the combination may be greater than the sum of its parts.
Another innovation in the Atari paper is experience replay: Mnih's deep-Q agent stores past episodes and trains more from replayed memory than from immediate experience. Of course, they are not training against a human player, but perhaps this memory training could be used with recorded human games to exploit expert knowledge.
The recent neural-net Go players from Edinburgh's Clark & Storkey group and DeepMind's Maddison & Huang both train on corpora of expert human play.
But the Go players are passive backprop learners.
Tesauro's TD-Gammon learned to expert level through self-play.
I am speculating that a reinforcement learner could combine the benefits of replay memory, expert corpora, and self-play.
This is a fascinating parallel area - two equal players rather than Mnih's single-player Atari - perhaps there are good two-player Atari games that could be added to the ALE benchmark - Joust?
So you think that recording human actions (possibly in some feature-extracted form) as part of the 'replay memory' for DQN would work well?
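To make the idea concrete, here is a minimal sketch of a replay buffer that mixes a fixed corpus of recorded human transitions into the agent's minibatches. All names are hypothetical and this is not Mnih's actual implementation, just one plausible way to wire expert demonstrations into DQN sampling:

```python
import random
from collections import deque


class MixedReplayBuffer:
    """Hypothetical replay buffer mixing agent experience with a fixed
    corpus of recorded human (expert) transitions."""

    def __init__(self, capacity=10000, human_fraction=0.25):
        self.agent_memory = deque(maxlen=capacity)  # agent's own experience
        self.human_memory = []                      # fixed expert corpus
        self.human_fraction = human_fraction        # share of each minibatch

    def add_agent(self, state, action, reward, next_state, done):
        self.agent_memory.append((state, action, reward, next_state, done))

    def add_human(self, state, action, reward, next_state, done):
        self.human_memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw a fixed fraction of each minibatch from the expert corpus,
        # the rest from the agent's own experience, then shuffle together.
        n_human = int(batch_size * self.human_fraction) if self.human_memory else 0
        n_agent = batch_size - n_human
        batch = random.sample(self.human_memory, min(n_human, len(self.human_memory)))
        batch += random.sample(self.agent_memory, min(n_agent, len(self.agent_memory)))
        random.shuffle(batch)
        return batch
```

One open design question is whether `human_fraction` should decay over training, so the agent leans on expert data early and on its own experience later.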
I have also been reading about DeepMind's recent survey of combining deep learning methods with actor-critic models.
What I also want to explore is the possibility of using evolution to evolve Q-functions rather than evolving policies directly (as in this game), which shouldn't be too hard to do.
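A minimal sketch of what evolving a Q-function (rather than a policy) might look like, assuming a toy linear Q over hand-coded features; the function and parameter names here are illustrative, not from any of the papers discussed:

```python
import random

N_FEATURES, N_ACTIONS = 4, 2


def q_value(weights, features, action):
    # Linear Q-function: one weight vector per action.
    return sum(w * f for w, f in zip(weights[action], features))


def greedy_action(weights, features):
    # The policy is only implicit: act greedily w.r.t. the evolved Q.
    return max(range(N_ACTIONS), key=lambda a: q_value(weights, features, a))


def random_individual():
    return [[random.gauss(0, 1) for _ in range(N_FEATURES)]
            for _ in range(N_ACTIONS)]


def mutate(weights, sigma=0.1):
    return [[w + random.gauss(0, sigma) for w in row] for row in weights]


def evolve(fitness_fn, pop_size=20, generations=50):
    # Simple truncation-selection GA over Q-function weights; fitness_fn
    # would normally be episode return when acting greedily in the game.
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness_fn, reverse=True)
        elite = scored[:pop_size // 4]
        population = elite + [mutate(random.choice(elite))
                              for _ in range(pop_size - len(elite))]
    return max(population, key=fitness_fn)
```

The interesting difference from evolving a policy directly is that the evolved Q carries value estimates, which a within-lifetime learner (e.g. TD updates) could then refine.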
The possibility of evolving self-learning machines excites me, rather than just evolving machines with fixed behaviour. I can also explore whether Darwinian evolution (weights are randomised at birth) is better or worse than Lamarckian evolution, where learned weights are passed on to offspring.
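The distinction between the two inheritance schemes is small enough to sketch directly. This is a purely illustrative fragment (the weight-vector representation is a stand-in for whatever the learner actually uses):

```python
import random


def make_offspring(parent_innate, parent_learned, mode, sigma=0.1):
    """Illustrative sketch of the two inheritance schemes.

    Darwinian: offspring inherit only the innate genome; whatever the
    parent learned during its lifetime is discarded at birth.
    Lamarckian: the parent's learned weights are written back into the
    genome and passed on directly.
    """
    source = parent_innate if mode == "darwinian" else parent_learned
    # Mutation applies in either case.
    return [w + random.gauss(0, sigma) for w in source]
```

Under the Darwinian scheme, evolution can still reward learning indirectly (the Baldwin effect): genomes that learn faster within a lifetime score higher fitness, even though the learned weights themselves are never inherited.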
I put some further thoughts and references in my post here.
This slide from David Silver's ICLR talk hints at Google DeepMind's Gorila parallel large-scale actor-critic deep-Q architecture.
There is some evidence that expert curricula can make learning much faster, although with game agents I don't know of anyone exploring this since Michie and Chambers' 1968 work on tic-tac-toe and pole-balancing, which compared expert training and self-play on those benchmarks.