Three-level hierarchical model-free learning approach to trajectory tracking control

Document Type

Journal Article


Elsevier Ltd


School of Engineering




Originally published as: Radac, M. B., & Precup, R. E. (2016). Three-level hierarchical model-free learning approach to trajectory tracking control. Engineering Applications of Artificial Intelligence, 55, 103-118.


This paper suggests a novel three-level model-free hierarchical learning approach that solves the reference trajectory tracking problem for control systems (CSs). The new approach consists of a low level, an intermediate level and a high level. It relies on previously memorized optimal input/output execution patterns and adaptively merges them using a similarity measure. The low-level feedback control is carried out in a novel model-free framework using a neural network (NN) controller tuned by Virtual Reference Feedback Tuning (VRFT) in order to linearize the closed-loop CS and to match a linear reference model. The NN controller is tuned in two phases, an offline one and an online one. Nonlinear Model Predictive Control (NMPC) is first employed in the novel offline tuning phase. The online tuning phase then uses only the process sign in the dynamic back-propagation mechanism that updates the NN parameters. After the NN controller is trained and the feedback CS is fixed, the optimal execution patterns (input/output patterns) are defined in terms of optimal control problems, which balance control accuracy and control effort. The input/output patterns are formulated over the feedback CS and are solved at the intermediate level in a model-free Iterative Learning Control (ILC) framework for linear time-invariant systems. Once the optimal executions are learned at the intermediate level over the feedback CS, they are stored in a database (DB). Then, each time a new trajectory is to be tracked as required by the high-level planner, similar execution patterns are looked up in the DB and merged as a weighted average of the most similar patterns, which are selected by a sort algorithm using a distance metric. The proposed approach is tested on the position control of a Single Input-Single Output nonlinear aerodynamic control system and shows improved trajectory tracking performance with respect to the case when no learnt experience is used. © 2016 Elsevier Ltd
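The high-level step described in the abstract (look up similar stored patterns, sort them by a distance metric, and merge the closest ones by a weighted average) can be illustrated with a minimal sketch. This is not the authors' implementation. The Euclidean distance, the inverse-distance weights, the number of neighbours `k`, and the names `distance` and `merge_similar_patterns` are all assumptions made here for illustration; the abstract does not specify them. Each DB entry is assumed to be a pair of a stored reference trajectory and the input learned for it at the intermediate level.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length trajectories (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def merge_similar_patterns(db, new_ref, k=2, eps=1e-9):
    """Merge the learned inputs of the k stored patterns closest to new_ref.

    db      : list of (stored_reference, learned_input) pairs (hypothetical layout)
    new_ref : the new reference trajectory requested by the planner
    k       : number of most similar patterns to merge (assumed parameter)
    eps     : small constant that avoids division by zero for exact matches
    """
    # Sort stored patterns by their distance to the new reference, keep the k closest.
    ranked = sorted(db, key=lambda pattern: distance(pattern[0], new_ref))[:k]

    # Inverse-distance weights, normalized to sum to one (assumed weighting rule).
    raw = [1.0 / (distance(ref, new_ref) + eps) for ref, _ in ranked]
    total = sum(raw)
    weights = [w / total for w in raw]

    # Weighted average of the learned inputs, sample by sample.
    n = len(ranked[0][1])
    return [
        sum(w * u[i] for w, (_, u) in zip(weights, ranked))
        for i in range(n)
    ]


# Toy database: three stored references with their learned inputs.
db = [
    ([0.0, 0.0], [0.0, 0.0]),
    ([2.0, 2.0], [4.0, 4.0]),
    ([10.0, 10.0], [20.0, 20.0]),
]
merged = merge_similar_patterns(db, [1.0, 1.0], k=2)
```

In this toy case the new reference is equidistant from the first two stored references, so their learned inputs are averaged with equal weights. Averaging inputs in this way is plausible only because the low-level controller is meant to make the closed loop behave approximately linearly, as the abstract states.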