Colloquium B Presentations

Date and time: Thursday, September 17, 5th period (16:50-18:20)


Venue: L1

Chair: 原 崇徳
武田 悠佑 M, 2nd presentation Intelligent System Control 杉本 謙二, 安本 慶一, 小林 泰介
title: Selective Imitation Learning for Diverse Demonstrations
abstract: Imitation learning is a method of learning by imitating demonstration trajectories provided by an expert. However, when multiple actions can be selected during a demonstration, such as choosing a branch at a fork in the road, general imitation learning cannot be used. To solve this problem, we consider a new derivation of imitation learning based on the reverse Kullback-Leibler divergence, whose zero-forcing behavior yields only one of the components of the multi-modal true distribution, although it cannot be computed numerically. Therefore, we additionally introduce an inverse dynamics model, which estimates which action was performed in a state transition, to model the multi-modal true distribution even with a uni-modal model such as a normal distribution. The resulting method can be applied to cases where multiple trajectories exist. To verify the effectiveness of the proposed method, we performed a driving task on a track where the user can choose between a right turn and a left turn. In this presentation, I describe the proposed method and the experimental results.
language of the presentation: Japanese
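
The zero-forcing behavior of the reverse Kullback-Leibler divergence mentioned in the abstract can be illustrated with a small numerical sketch. This is not the presenter's method: it is a toy one-dimensional example under assumed settings (a two-component Gaussian mixture as the "true" distribution, a single normal distribution as the model, and a brute-force grid search instead of learning). All names and parameter values below are hypothetical.

```python
import numpy as np

def log_normal(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

# Integration grid (assumed wide enough to cover both modes).
x = np.linspace(-12.0, 12.0, 4801)
dx = x[1] - x[0]

# Assumed bimodal "true" distribution: equal mixture of N(-3, 0.5^2) and N(3, 0.5^2).
log_p = np.logaddexp(log_normal(x, -3.0, 0.5), log_normal(x, 3.0, 0.5)) + np.log(0.5)
p = np.exp(log_p)

def forward_kl(mu, sigma):
    """KL(p || q): penalizes q for missing mass of p (mass-covering)."""
    log_q = log_normal(x, mu, sigma)
    return np.sum(p * (log_p - log_q)) * dx

def reverse_kl(mu, sigma):
    """KL(q || p): penalizes q for placing mass where p is small (zero-forcing)."""
    log_q = log_normal(x, mu, sigma)
    q = np.exp(log_q)
    return np.sum(q * (log_q - log_p)) * dx

# Brute-force search over uni-modal candidates q = N(mu, sigma^2).
candidates = [(mu, sigma)
              for mu in np.linspace(-4.0, 4.0, 33)
              for sigma in np.linspace(0.25, 4.0, 16)]

best_forward = min(candidates, key=lambda c: forward_kl(*c))
best_reverse = min(candidates, key=lambda c: reverse_kl(*c))

print("forward KL minimizer (mu, sigma):", best_forward)
print("reverse KL minimizer (mu, sigma):", best_reverse)
```

In this toy setting the forward KL minimizer is a wide distribution centered between the two modes, whereas the reverse KL minimizer is a narrow distribution sitting on one of them. Which of the two modes it selects is arbitrary here, because the assumed mixture is symmetric.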