Augmented Human Communication

Research Staff

  • Prof. Satoshi Nakamura

  • Assoc.Prof. Katsuhito Sudoh

  • Assist.Prof. Sakriani Sakti

  • Affiliate Assoc.Prof. Koichiro Yoshino

  • Assist.Prof. Hiroki Tanaka

  • Assist.Prof. Seitaro Shinagawa

E-mail { s-nakamura, sudoh, ssakti, koichiro, hiroki-tan, sei.shinagawa }[at]

Go Beyond the Communication Barrier

The AHC Laboratory pursues research on problems in human communication involving speech, language, paralanguage, and non-verbal information. By applying artificial intelligence technologies, including deep learning, we tackle tasks that could not previously be solved. We also draw on knowledge of human cognitive functions, and on new findings from brain measurement, to inform our research. In our research activities we emphasize not only theoretical aspects but also the applicability of technology, aiming to build and validate prototype systems. Our research areas are described below.

NAIST launched its big data analytics project in April 2014, followed by the NAIST Data Science Center (NAIST DSC) in 2017. NAIST DSC focuses on materials informatics, chemoinformatics, and social informatics by applying machine learning and artificial intelligence methodologies, and encourages close collaboration with industry.

Research Areas

Real-time simultaneous speech-to-speech translation

Our current research project focuses on human-like simultaneous interpretation of complex speech such as news and lectures, interpretation-support technology for conferences with multiple speakers of multiple languages, and multimodal interpretation technology (Fig. 1).
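
A core problem in simultaneous interpretation is deciding when to keep listening and when to start speaking. A common illustrative strategy in the literature is the "wait-k" policy: read k source words before emitting the first target word, then alternate reading and writing. The sketch below is a hypothetical toy, not the lab's actual system.

```python
# Toy sketch of a "wait-k" simultaneous decoding schedule (illustrative
# assumption; not the lab's actual interpretation system).

def wait_k_schedule(num_source_words, num_target_words, k):
    """Return the READ/WRITE action sequence for a wait-k policy:
    before emitting target word i, the model has read min(i + k, S)
    source words."""
    actions = []
    read, written = 0, 0
    while written < num_target_words:
        if read < min(written + k, num_source_words):
            actions.append("READ")   # consume one more source word
            read += 1
        else:
            actions.append("WRITE")  # emit one target word
            written += 1
    return actions
```

Smaller k lowers latency but gives the model less source context for each output word; larger k approaches conventional (non-simultaneous) translation.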

Natural language processing

Our research into natural language processing focuses on deep-learning-based machine translation and on natural language interfaces between humans and computers, allowing computers to understand natural language queries and commands so that they can answer questions and follow directions.
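
Modern neural machine translation builds on attention, which lets the decoder weight source positions when producing each target word. The NumPy sketch below shows scaled dot-product attention in isolation; shapes and values are illustrative assumptions, not details of the lab's models.

```python
import numpy as np

# Minimal sketch of scaled dot-product attention, the core operation in
# Transformer-style neural machine translation (illustrative only).

def attention(Q, K, V):
    """Q: (m, d) queries, K: (n, d) keys, V: (n, dv) values.
    Returns (m, dv) context vectors and the (m, n) attention weights."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # query-key similarity, scaled for stability
    # softmax over the key axis (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

Each row of the weight matrix is a distribution over source positions, which is what makes attention maps interpretable as soft word alignments.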

Multi-lingual statistical speech processing

Speech recognition and synthesis are fundamental technologies for realizing natural human-computer interaction. We study statistical methodologies such as hidden Markov models, Gaussian mixture models, deep neural networks, and recurrent neural networks, and we extend these models to emotional, spontaneous conversational, and multilingual speech.
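
As a concrete example of the statistical machinery involved, the forward algorithm computes the likelihood of an observation sequence under a hidden Markov model, the classical backbone of speech recognition. The parameters below are a made-up two-state example, not a model from our systems.

```python
import numpy as np

# Sketch of the HMM forward algorithm over discrete observations
# (toy parameters for illustration only).

def forward(obs, pi, A, B):
    """Likelihood P(obs) of an observation sequence under an HMM.
    pi: (S,) initial state probabilities
    A:  (S, S) state transition probabilities (rows sum to 1)
    B:  (S, O) emission probabilities (rows sum to 1)"""
    alpha = pi * B[:, obs[0]]          # joint prob. of state and first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate, then weight by emission
    return alpha.sum()                 # marginalize over final states
```

In practice this recursion is done in the log domain to avoid underflow on long utterances, and emissions come from GMMs or neural networks rather than a discrete table.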

Goal-oriented and chatbot-type dialog systems

We focus on new statistical dialogue models for natural dialogue using verbal information, intonation, emotion, and face and gesture information. Dialogue models are realized with machine learning techniques such as multi-modal integration and interactive transformation with deep learning.

Brain analysis for verbal and non-verbal communication

Our research on cognitive communication analyzes brain activity with electroencephalography (EEG) to detect communication difficulty in real time. We also research support for communication disabilities such as autism and dementia (Fig. 2).

Information distillation

This research summarizes information drawn from a variety of complex data sources and presents the summarized results to people in an understandable manner.

People flow analysis

We analyze large-scale people-flow data using sequential deep learning models, cluster people flows, and predict future changes (Fig. 3).

Fig.1 Speech-to-speech translation

Fig.2 An EEG measurement system

Fig.3 People flow analysis