Sample-and-Computation-Efficient Probabilistic Model Predictive Control in Model-based Reinforcement Learning: Application to Contact-Rich Manipulations
Cheng-Yu Kuo (1811416)
Model-based Reinforcement Learning (MBRL) with probabilistic Model Predictive Control (MPC) and Gaussian Process (GP) dynamics has demonstrated excellent sample efficiency in several previous works. However, the computational cost of GPs grows with the training sample size, creating a trade-off between MPC control frequency and model precision that hinders real-robot applications. To alleviate this trade-off, we propose SCP-MPC, which is both sample and computation efficient. SCP-MPC employs a linear Gaussian model with randomized Fourier features from Fastfood as an approximate GP dynamics model, which admits analytic uncertainty propagation for state prediction. As a result, SCP-MPC achieves a semi-fixed MPC control frequency that is independent of the training sample size. For contact-rich manipulation, we additionally introduce a safe exploration scheme with state-control dynamic constraints. Safe exploration reduces the risk of damage when executing unreliable controls arising from insufficient knowledge. Specifically, we limit acceleration according to the model uncertainty in multi-step-ahead predictions. We demonstrate both the sample and computation efficiency of SCP-MPC on simulated and real-robot contact-rich manipulation tasks.
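The core idea of approximating a GP with a linear Gaussian model over random Fourier features can be sketched as follows. This is a minimal illustration, not the thesis implementation: it draws plain Gaussian spectral samples for an RBF kernel instead of the structured Fastfood construction, and all variable names and hyperparameters are placeholders.

```python
import numpy as np

def rff_features(X, W, b):
    """Random Fourier feature map: phi(x) = sqrt(2/D) * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
D, d = 200, 1                       # number of features, input dimension (illustrative)
W = rng.normal(0.0, 1.0, (D, d))    # spectral samples for an RBF kernel (unit lengthscale);
                                    # Fastfood would build a structured version of this matrix
b = rng.uniform(0.0, 2 * np.pi, D)  # random phases

# Toy "dynamics" data: learn y = sin(x) from noisy samples
X = rng.uniform(-3.0, 3.0, (40, d))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=40)

# Bayesian linear regression in feature space (the linear Gaussian model)
Phi = rff_features(X, W, b)
noise = 0.1 ** 2
A = Phi.T @ Phi + noise * np.eye(D)        # posterior precision (up to scaling)
w_mean = np.linalg.solve(A, Phi.T @ y)     # posterior mean weights

# Prediction with analytic mean and variance, no kernel matrix over all samples
x_star = np.array([[0.5]])
phi_s = rff_features(x_star, W, b)
mu = phi_s @ w_mean                                        # predictive mean
var = noise * (1.0 + phi_s @ np.linalg.solve(A, phi_s.T))  # predictive variance
```

Because prediction only involves the fixed-size feature matrix `A`, its cost depends on the number of features `D`, not the number of training samples, which is what makes a sample-size-independent MPC control frequency possible.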