Colloquium A

Date: Tue. Jan. 22nd, 2019, 3rd Period (13:30--15:00)
Location: L1
Chair: Assoc. Prof. Sakriani Sakti
Presenter: Marc Delcroix (NTT Communication Science Laboratories)
Title: Deep learning based selective hearing with SpeakerBeam
Abstract: Automatic speech recognition (ASR) has progressed greatly in recent years with the advent of deep learning technologies, bringing more and more speech-driven applications into our everyday lives, such as voice search on smartphones and smart speakers. However, some fundamental limitations remain if we want to develop systems that understand natural human conversations. For example, humans are able to focus on listening to a desired person in an environment with noise and several people speaking at the same time (selective hearing). For current ASR systems, it is still challenging to recognize the speech of a desired speaker under such conditions. In this talk, we will briefly introduce the problems of noise-robust speech recognition and recent deep-learning-based approaches to handling them. We will then present our recent work on deep-learning-based selective hearing, which can extract the speech of a target speaker based on the characteristics of his/her voice.
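
As a rough illustration of the idea in the abstract, the following minimal sketch shows one common way such target speaker extraction can be set up: a speaker encoder turns a short enrollment recording of the target speaker into an embedding, and that embedding multiplicatively adapts an intermediate layer of a mask-estimation network applied to the mixture. The architecture, layer sizes, and names here are assumptions chosen for illustration, not the presenter's actual SpeakerBeam implementation.

# Minimal sketch of target speaker extraction conditioned on a speaker
# embedding (illustrative only; a simplified stand-in, not the SpeakerBeam model).
import torch
import torch.nn as nn


class SpeakerEncoder(nn.Module):
    """Maps enrollment features (B, T_enr, F) to a speaker embedding (B, H)."""

    def __init__(self, n_freq: int, hidden: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_freq, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))

    def forward(self, enroll: torch.Tensor) -> torch.Tensor:
        # Average over time to obtain a fixed-size "voice characteristic" vector.
        return self.net(enroll).mean(dim=1)


class TargetSpeakerExtractor(nn.Module):
    """Predicts a mask for the target speaker, conditioned on the embedding."""

    def __init__(self, n_freq: int, hidden: int):
        super().__init__()
        self.pre = nn.Sequential(nn.Linear(n_freq, hidden), nn.ReLU())
        self.post = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, n_freq), nn.Sigmoid())

    def forward(self, mixture: torch.Tensor, spk_emb: torch.Tensor) -> torch.Tensor:
        h = self.pre(mixture)             # (B, T, H)
        h = h * spk_emb.unsqueeze(1)      # multiplicative adaptation by the embedding
        mask = self.post(h)               # (B, T, F), values in [0, 1]
        return mask * mixture             # estimated target-speaker spectrogram


if __name__ == "__main__":
    B, T, F, H = 2, 100, 257, 128                 # toy magnitude-spectrogram shapes
    enc, ext = SpeakerEncoder(F, H), TargetSpeakerExtractor(F, H)
    enrollment = torch.rand(B, 30, F)             # clean utterance of the target speaker
    mixture = torch.rand(B, T, F)                 # overlapping-speech mixture
    target_est = ext(mixture, enc(enrollment))
    print(target_est.shape)                       # torch.Size([2, 100, 257])
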
Language: English
Introduction of Lecturer: Marc Delcroix is a senior research scientist at the Media Information Laboratory, Signal Processing Group, NTT Communication Science Laboratories, Kyoto, Japan. He received the M.Eng. degree from the Free University of Brussels, Brussels, Belgium, and the Ecole Centrale Paris, Paris, France, in 2003, and the Ph.D. degree from the Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan, in 2007. He was a research associate at NTT Communication Science Laboratories from 2007 to 2008 and from 2010 to 2012, and became a permanent research scientist at the same laboratory in 2012. He was also a visiting lecturer at the Faculty of Science and Engineering of Waseda University, Tokyo, Japan, from 2015 to 2018. His research interests include robust multi-microphone speech recognition, acoustic model adaptation, integration of speech enhancement front-ends and recognition back-ends, speech enhancement, and speech dereverberation. He was one of the organizers of the REVERB Challenge 2014 and of IEEE ASRU 2017. He is a member of the IEEE Signal Processing Society Speech and Language Processing Technical Committee.