Free-viewpoint Image Generation-based Human Motion Reenactment from a Single RGB-D Video Stream

Fabian Lorenzo Dayrit (1251124)


In this research, we present a method to synthesize the appearance of a human subject in motion from an arbitrary viewpoint, as well as an augmented reality system that uses this method to increase comprehension of complicated actions.

The method, which we call reenactment, uses an RGB-D stream to generate a rough 3D model of the subject. The subject's motion is also captured from the stream. For each frame of the stream, the 3D model is fitted to the captured motion. It is then textured according to the viewer's viewpoint and the pose of the subject in that frame, with textures taken from the RGB frames. For example, if the viewer rotates the reenactment to view the subject's back, the system looks for RGB frames in which the subject's back faces the camera and uses them to texture the rough 3D model.
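As a rough illustration of this view-dependent texture selection, the following Python sketch (not taken from the thesis; the frame structure and field names are assumptions) scores each captured RGB frame by how closely the sensor's viewing direction at capture time matches the viewer's current viewing direction, both expressed relative to the subject, and picks the best match.

```python
import numpy as np

def select_texture_frame(desired_view_dir, captured_frames):
    """Pick the captured RGB frame whose view of the subject best matches
    the desired rendering viewpoint.

    desired_view_dir: unit vector from the virtual camera toward the subject,
                      expressed in the subject's local coordinate frame.
    captured_frames:  list of dicts with keys 'view_dir' (unit vector from the
                      real sensor toward the subject, in the subject's local
                      frame at capture time) and 'rgb' (the color image).
    """
    best_frame, best_score = None, -np.inf
    for frame in captured_frames:
        # Cosine similarity between viewing directions; 1.0 means the sensor
        # saw the subject from the same side the viewer is now looking from.
        score = float(np.dot(desired_view_dir, frame["view_dir"]))
        if score > best_score:
            best_frame, best_score = frame, score
    return best_frame

# Example (illustrative axes): the viewer looks at the subject's back (+z in
# the subject's local frame), so a frame captured from behind scores highest.
frames = [
    {"view_dir": np.array([0.0, 0.0, -1.0]), "rgb": "front_view.png"},
    {"view_dir": np.array([0.0, 0.0,  1.0]), "rgb": "back_view.png"},
]
print(select_texture_frame(np.array([0.0, 0.0, 1.0]), frames)["rgb"])
```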

The system consists of two stages: capturing and reenactment. In the capturing stage, a human subject's motion is captured using a single RGB-D sensor. The extrinsic parameters of the sensor are also tracked per frame, in order to express the captured motion in a common world coordinate system. In the reenactment stage, a viewer captures a continuous RGB stream. The same world coordinate system is used to generate a reenactment of the subject's motion and overlay it on the RGB stream.
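The following minimal Python sketch (illustrative only; the 4x4 camera-to-world representation and the function name are assumptions, not the thesis's implementation) shows how tracked per-frame extrinsics can map joints captured in sensor coordinates into the shared world coordinate system.

```python
import numpy as np

def joints_to_world(joints_cam, cam_to_world):
    """Express skeleton joints captured in sensor coordinates in the shared
    world coordinate system, given the sensor's tracked extrinsics.

    joints_cam:   (N, 3) array of joint positions in the sensor's frame.
    cam_to_world: (4, 4) homogeneous transform from sensor to world
                  coordinates for this frame (assumed representation).
    """
    n = joints_cam.shape[0]
    homogeneous = np.hstack([joints_cam, np.ones((n, 1))])   # (N, 4)
    return (cam_to_world @ homogeneous.T).T[:, :3]           # back to (N, 3)

# Example: a sensor translated 1 m along world x maps a joint at the sensor
# origin to (1, 0, 0) in world coordinates.
T = np.eye(4)
T[0, 3] = 1.0
print(joints_to_world(np.zeros((1, 3)), T))
```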

We have implemented a prototype of the system that runs on a mobile computer. We have conducted a user study to determine whether i) the system increases comprehension of three-dimensional motion, ii) the system allows users to compare the reenactment's motion with a real human's motion, and iii) the system has other real-world applications.