Assisting Human Action Learning through Novel View Synthesis and Augmented Reality

Fabian Lorenzo Dayrit (1461015)


Commonly, when people learn actions, such as dances or martial arts moves, they watch a teacher perform the action and then imitate it. However, a video may not show every view of the action, and in-person learning depends on the teacher's availability. We introduce reenactments, which are generated novel views of recorded human actions. The aim of this work is to develop systems that easily capture and display such reenactments in order to aid learners' comprehension of actions.

One problem with reenactments is that it is difficult to capture enough angles of a motion sequence to generate a novel view. We therefore restrict our subjects to humans and use a human body model for representation, which makes capture more robust and allows unseen areas to be predicted to some degree. In this work, we present three human body models used to represent a reenactment: a rough rigid-body-parts model, a voxel-carving-based rigid-body-parts model, and a statistical non-rigid body model.
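
As a rough illustration of the rigid-body-parts idea, the minimal sketch below (our own assumptions in Python, not the models implemented in this work) stores each body part as rest-pose geometry plus one rigid transform per captured frame; replaying the stored transforms and rendering the parts from an arbitrary camera then yields a novel view of the action.

    import numpy as np

    class RigidBodyPart:
        """One part of a reenactment: rest-pose points plus per-frame rigid poses."""

        def __init__(self, name, vertices):
            self.name = name
            self.vertices = np.asarray(vertices, dtype=float)  # (N, 3) rest-pose points
            self.poses = []                                     # one (R, t) pair per frame

        def add_frame(self, rotation, translation):
            self.poses.append((np.asarray(rotation, dtype=float),
                               np.asarray(translation, dtype=float)))

        def posed_vertices(self, frame):
            # Rigidly transform the rest-pose points with this frame's stored pose.
            rotation, translation = self.poses[frame]
            return self.vertices @ rotation.T + translation

    # Toy usage: a "forearm" segment rotated 90 degrees about the vertical axis.
    forearm = RigidBodyPart("forearm", [[0.0, 0.0, 0.0], [0.3, 0.0, 0.0]])
    rot_y = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
    forearm.add_frame(rot_y, [0.0, 1.0, 0.0])
    print(forearm.posed_vertices(0))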

Another problem with reenactments is how to display them intuitively. We use augmented reality (AR) for this, specifically the glasses metaphor, which lets a user "see through" a handheld device, and the mirror metaphor, which reflects the user's image. The handheld AR system is a mobile reenactment viewer that tracks the capture location, letting the user view reenactments at the same location where they were recorded, so they are seen in context. The mirror AR system is an action-learning system that overlays an image of the teacher on the user, facing the same way as the user, allowing easy comparison between the two. We evaluated the systems' effectiveness and output quality through user studies.
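
To illustrate the alignment step behind the mirror metaphor, the following minimal sketch (names and conventions are our own assumptions, not the system's actual code) translates the teacher's recorded joints onto the user's body position and rotates them about the vertical axis so that teacher and user face the same way, which is what makes direct overlaid comparison possible.

    import numpy as np

    def yaw_rotation(angle):
        """Rotation about the vertical (y) axis by `angle` radians."""
        c, s = np.cos(angle), np.sin(angle)
        return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

    def align_teacher_to_user(teacher_joints, teacher_root, teacher_yaw,
                              user_root, user_yaw):
        """Overlay the teacher's joints on the user: match root position and yaw."""
        rotation = yaw_rotation(user_yaw - teacher_yaw)
        centered = np.asarray(teacher_joints, dtype=float) - np.asarray(teacher_root, dtype=float)
        return centered @ rotation.T + np.asarray(user_root, dtype=float)

    # Toy usage: teacher recorded facing +z, user currently facing -z.
    teacher_joints = [[0.0, 1.0, 0.0], [0.0, 1.5, 0.2]]  # e.g. hip and head positions
    aligned = align_teacher_to_user(teacher_joints,
                                    teacher_root=[0.0, 1.0, 0.0], teacher_yaw=0.0,
                                    user_root=[0.5, 1.0, 2.0], user_yaw=np.pi)
    print(aligned)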