Model-free control performance improvement using virtual reference feedback tuning and reinforcement Q-learning
Taylor and Francis
School of Engineering
This paper proposes a new mixed VRFT-Q learning approach that combines two model-free controller tuning techniques: linear virtual reference feedback tuning (VRFT) and nonlinear state-feedback Q-learning. VRFT is first used to find a stabilising feedback controller from input-output experimental data collected from the process in a model reference tracking setting. Reinforcement Q-learning is then applied in the same setting, using input-state experimental data collected in closed loop with a perturbed VRFT controller to ensure good exploration. The Q-learning controller, learned with a batch fitted Q iteration algorithm, uses two neural networks: one for the Q-function estimator and one for the controller. The VRFT-Q learning approach is validated on position control of a two-degrees-of-motion, open-loop stable, multi-input multi-output (MIMO) aerodynamic system (AS). Extensive simulations for the two independent control channels of the MIMO AS show that the Q-learning controllers clearly improve performance over the VRFT controllers.
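To illustrate the first stage only, the following is a minimal sketch of linear VRFT on a toy single-input single-output problem. It is not the paper's aerodynamic system or its controller structure: the first-order plant (a, b), the first-order reference model with pole alpha, the PI controller parameterisation, and the function name vrft_pi are all assumptions chosen so that the result can be checked by hand. The plant is used only to generate noise-free data; the tuning step itself sees just the input-output samples.

```python
import numpy as np

def vrft_pi(u, y, alpha):
    """Tune a discrete PI controller by VRFT (illustrative sketch).

    Assumed reference model: y(k+1) = alpha*y(k) + (1 - alpha)*r(k).
    Assumed controller:      u(k) - u(k-1) = th1*e(k) + th2*e(k-1).
    u has N samples, y has N+1 samples (y[k+1] is the response to u[k]).
    """
    # Virtual reference: the r that would make the reference model output y.
    r_v = (y[1:] - alpha * y[:-1]) / (1.0 - alpha)
    # Virtual tracking error seen by the controller.
    e_v = r_v - y[:-1]
    # Least-squares fit of the controller so that it maps e_v to the recorded u.
    du = u[1:] - u[:-1]
    Phi = np.column_stack([e_v[1:], e_v[:-1]])
    theta, *_ = np.linalg.lstsq(Phi, du, rcond=None)
    return theta

# Hypothetical plant used only to generate data: y(k+1) = a*y(k) + b*u(k).
a, b, alpha = 0.9, 0.1, 0.8
rng = np.random.default_rng(0)
N = 200
u = rng.uniform(-1.0, 1.0, size=N)   # exciting open-loop input
y = np.zeros(N + 1)
for k in range(N):
    y[k + 1] = a * y[k] + b * u[k]

theta = vrft_pi(u, y, alpha)

# For this toy case the ideal controller is known in closed form:
# th1 = (1 - alpha)/b, th2 = -a*(1 - alpha)/b.
theta_ideal = np.array([(1 - alpha) / b, -a * (1 - alpha) / b])
print(theta, theta_ideal)
```

In this toy setting the PI structure can match the reference model exactly, so the least-squares estimate should coincide with the closed-form gains up to numerical precision. With noisy data or an under-parameterised controller, VRFT would normally also require a prefilter, which this sketch omits.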