Evolutionary Function Approximation for Reinforcement Learning

Ha, David

April 8, 2015

Recently I have taken an interest in combining evolutionary techniques with learning. I think an interesting area of research, one that not many people are looking into, is to evolve neural network structures and apply learning techniques to optimise the weights. So far, all the experiments I have done are based on Darwinian evolution, where the next generation inherits nothing learned during the lifetime of the current generation of agents. But what if the agents could learn something in their lifetime to get better at their task, and share that learning, in the form of more finely tuned weight parameters, with their offspring?

It would be interesting to simulate Lamarckian evolution, where behaviour an individual acquires is passed on to its offspring, or to see whether the Baldwin effect within a Darwinian evolution framework allows agents to indirectly pass their traits to their offspring via natural selection. Hinton wrote about this back in 1987, using a toy example in which agents must learn before evolution can find the solution. I discovered another paper describing the possibility of using the NEAT algorithm to evolve the Q-value functions used in Q-learning, paving the way for agents to learn on top of an evolved network. I believe this may lead to more interesting results, even compared to DeepMind’s Atari-playing DQN, as the network structure is not hand-chosen but evolved.
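
To make the difference between the three inheritance schemes concrete, here is a minimal sketch in Python. It is only a toy under heavy assumptions: the "agent" is a plain weight vector, the task is a made-up quadratic loss, a few gradient-descent steps stand in for lifetime learning (not Q-learning), and nothing about the network structure is evolved, so this is not NEAT. All names (`TARGET`, `learn`, `evolve`, the `mode` strings) are hypothetical and chosen just for illustration.

```python
import random

TARGET = [0.5, -1.0, 2.0]  # hypothetical optimum of the toy task


def loss(w):
    """Squared distance to the target; lower is fitter."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))


def learn(w, steps=5, lr=0.1):
    """Lifetime learning: a few gradient-descent steps on the toy loss."""
    w = list(w)
    for _ in range(steps):
        w = [wi - lr * 2.0 * (wi - ti) for wi, ti in zip(w, TARGET)]
    return w


def evolve(mode, generations=30, pop_size=20, sigma=0.1, seed=0):
    """Run a simple truncation-selection loop and return the best final loss.

    mode = "darwinian":  no lifetime learning at all.
    mode = "baldwinian": agents learn, fitness is measured after learning,
                         but offspring inherit the unlearned genome.
    mode = "lamarckian": agents learn and offspring inherit the learned weights.
    """
    if mode not in ("darwinian", "baldwinian", "lamarckian"):
        raise ValueError("unknown mode: " + mode)
    rng = random.Random(seed)
    pop = [[rng.uniform(-3, 3) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        scored = []
        for genome in pop:
            learned = genome if mode == "darwinian" else learn(genome)
            # Only the Lamarckian scheme writes learned weights back to the genome.
            heritable = learned if mode == "lamarckian" else genome
            scored.append((loss(learned), heritable))
        scored.sort(key=lambda pair: pair[0])
        parents = [g for _, g in scored[: pop_size // 4]]
        # Offspring are mutated copies of randomly chosen parents.
        pop = [[x + rng.gauss(0, sigma) for x in rng.choice(parents)]
               for _ in range(pop_size)]
    return min(loss(g if mode == "darwinian" else learn(g)) for g in pop)


if __name__ == "__main__":
    for mode in ("darwinian", "baldwinian", "lamarckian"):
        print(mode, evolve(mode))
```

One caveat about this sketch: on such a smooth loss, learning shrinks every genome's error by the same factor, so the Baldwinian mode ends up ranking genomes essentially the same way as the Darwinian one. The Baldwin effect would only become visible on a rougher fitness landscape, where learning changes which genomes look promising.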