I haven’t had much luck tackling the pendulum balancing problem with Q-learning. The system has many states, and the output (motor speed) should really be a continuous variable, so it didn’t work well with a Q-learner that outputs discrete speeds, or even one that picks among discretised faster, stay, and slower actions.
Continue reading
I implemented a simple 2-D data fitter with the p5.js and convnet.js libraries, to play around with stochastic gradient descent fitting of simple data.
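To give a flavour of what stochastic gradient descent does on simple 2-D data, here is a minimal sketch in plain JavaScript. It does not use convnet.js or p5.js and is not the code from the demo: it fits a straight line rather than a neural network, and the names `makeLcg` and `sgdFitLine`, the learning rate, and the sample data are all assumptions made for illustration.

```javascript
// Small seeded random generator so the run is repeatable (hypothetical helper).
function makeLcg(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // uniform in [0, 1)
  };
}

// Fit y = w * x + b by SGD on squared error, one random sample per update.
function sgdFitLine(points, learningRate, steps, seed) {
  const rand = makeLcg(seed);
  let w = 0;
  let b = 0;
  for (let i = 0; i < steps; i++) {
    const p = points[Math.floor(rand() * points.length)];
    const err = (w * p.x + b) - p.y; // prediction error on this sample
    w -= learningRate * err * p.x;   // gradient of 0.5 * err^2 w.r.t. w
    b -= learningRate * err;         // gradient of 0.5 * err^2 w.r.t. b
  }
  return { w, b };
}

// Illustrative noise-free data lying on y = 2x + 1.
const samplePoints = [];
for (let i = 0; i <= 10; i++) {
  const x = i / 10;
  samplePoints.push({ x, y: 2 * x + 1 });
}
const fit = sgdFitLine(samplePoints, 0.1, 20000, 42);
```

Because the sample data has no noise, `fit.w` and `fit.b` should settle close to 2 and 1. With noisy data the estimates would keep jittering around the best fit unless the learning rate is decayed.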
Continue reading
“The greatest glory in living
lies not in never falling,
but in rising every time we fall.”
― Nelson Mandela
The pendulum will now swing back upright after it fails and falls down.
I’ve managed to incorporate a swing-up controller to complement the stabilising controller that runs when the pendulum is upright. The trick was simple: the PID controller works by stabilising an inherently unstable system. This is done with a combination of feedback of the error, integration of the error, and differentiation of the error, where the error is the difference between the angle and the target angle (0 degrees if totally upright). In control theory speak, the feedback works by moving the ‘poles’ of the system (which set its natural frequencies and growth or decay rates) from the right-hand side of the complex plane, where the real part is positive, to the left-hand side. In other words, the real parts of the poles become negative.
When the pendulum has fallen, the system becomes inherently stable, since for small moves the pendulum will just fall back down due to gravity and stay there. In other words, the poles of that system already have negative real parts. To destabilise the system, I just made some of the gains of a PID controller negative, specifically the integrator and differentiator gains. The trick is to keep the gain of the error term positive, which has the effect of keeping the pendulum near the center, rather than letting it run off the screen.
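The gain switching described above might look something like the following sketch. The function names `wrapAngle` and `selectGains`, the threshold that decides whether the pendulum counts as upright, and the gain values are all hypothetical, since the post does not say how the demo decides when to switch modes.

```javascript
// Wrap an angle to [-PI, PI) so that 0 always means upright.
function wrapAngle(angle) {
  const twoPi = 2 * Math.PI;
  return ((((angle + Math.PI) % twoPi) + twoPi) % twoPi) - Math.PI;
}

// Pick PID gains for the current mode.
// Near upright: use the stabilising gains unchanged.
// Fallen: flip the sign of the integral and derivative gains to
// destabilise the hanging pendulum, but keep the proportional gain positive.
function selectGains(angle, baseGains, uprightThreshold) {
  const fromUpright = Math.abs(wrapAngle(angle));
  if (fromUpright < uprightThreshold) {
    return { kp: baseGains.kp, ki: baseGains.ki, kd: baseGains.kd };
  }
  return { kp: baseGains.kp, ki: -baseGains.ki, kd: -baseGains.kd };
}

// Illustrative values only.
const baseGains = { kp: 2, ki: 0.5, kd: 0.1 };
const uprightGains = selectGains(0.1, baseGains, 0.5); // stabilising mode
const swingUpGains = selectGains(3.0, baseGains, 0.5); // swing-up mode
```

Keeping a single set of base gains and only flipping signs keeps the two modes easy to compare, though in practice the swing-up gains would probably need their own tuning.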
Link to demo here.
Continue reading
I’m trying to train a neural network using Q-learning to make a pendulum stand up again after it falls. Still much work to do to get it to work…
Continue reading
The first version of the inverted pendulum demo was just a hack, with messy code to show that the PID controller can work within the box2d and p5.js framework. I’ve rewritten the code using cleaner object-oriented patterns. In addition, rather than using pixel space, I’ve used box2d world space, and set up a camera object to project the box2d space onto the screen, so that the objects fill up the screen.
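A camera object of this kind can be quite small. The sketch below is one possible shape for it, not the class from the demo: the name `Camera`, its constructor parameters, and the assumption that world y points up while screen y points down are all mine.

```javascript
// Hypothetical camera mapping world coordinates (metres) to screen pixels.
class Camera {
  constructor(centerX, centerY, worldWidth, screenWidth, screenHeight) {
    this.centerX = centerX; // world point shown at the middle of the screen
    this.centerY = centerY;
    this.screenWidth = screenWidth;
    this.screenHeight = screenHeight;
    // Pixels per world unit, chosen so worldWidth spans the full screen width.
    this.scale = screenWidth / worldWidth;
  }

  // Project a world point to screen pixels.
  // Assumes world y grows upwards and screen y grows downwards.
  toScreen(x, y) {
    return {
      x: (x - this.centerX) * this.scale + this.screenWidth / 2,
      y: this.screenHeight / 2 - (y - this.centerY) * this.scale,
    };
  }

  // Convert a world length (e.g. a body's width) to pixels.
  toScreenLength(length) {
    return length * this.scale;
  }
}

// Example: a 10-unit-wide view of the world on an 800 x 600 canvas.
const camera = new Camera(0, 0, 10, 800, 600);
const origin = camera.toScreen(0, 0);    // { x: 400, y: 300 }
const rightEdge = camera.toScreen(5, 0); // { x: 800, y: 300 }
```

Keeping the physics in world units and converting only at draw time means the simulation does not need to change when the canvas size does.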
Continue reading