Computational Linguistics

Research Staff

  • Prof. Yuji Matsumoto

  • Assoc.Prof. Masashi Shimbo

  • Assist.Prof. Hiroyuki Shindo

  • Assist.Prof. Hiroshi Noji

E-mail: { matsu, shimbo, shindo, noji } [at] is.naist.jp

Research Area

1. Making natural language processing resources publicly available

We believe that publicly available software and resources are important for the advancement of computational linguistics. We therefore carry out fundamental work on building essential resources such as dictionaries and annotated corpora, and we maintain several widely used software tools for core natural language analysis. Examples include:

Software: Japanese morphological analyzer ("ChaSen"), dependency parser ("CaboCha"), predicate-argument structure analyzer ("SynCha")

Resources: NAIST Text Corpus, NAIST Japanese/English/Chinese dictionaries

2. Learning-based natural language processing and knowledge acquisition

Machine learning approaches are investigated to acquire linguistic rules automatically from large-scale text data. This approach enables us to build highly accurate and robust statistical natural language taggers and parsers. We also perform research in lexical and expert knowledge acquisition from scientific documents.

3. Applications

We explore novel applications that are enabled by computer processing of natural language. For example, our work on language learning assistance studies how computers can help people learn second languages. Our machine translation work focuses on the automatic or semi-automatic construction of multilingual lexical or expression databases. We have also explored textual entailment, sentiment analysis, and information extraction.

Key Features

Natural languages are highly complex systems in which many exceptions and subtle linguistic phenomena coexist with beautiful grammatical rules. They are also systems for representing and describing our knowledge. Analyzing and interpreting languages computationally requires a variety of theories and tools. Our lab organizes many research projects and reading groups covering areas from fundamentals to applications. Each reading group presents surveys of cutting-edge research topics and reads books and journals, while each project holds meetings on the research progress of its members. We encourage people to participate in these reading groups and research projects so that they gain extensive knowledge of natural language processing that would be difficult to acquire otherwise.

Fig.1: Online demo of information extraction for restaurant reputations: extraction and summarization of positive and negative opinions from customer reviews

Fig.2: Discussion during a reading group session

Fig.3: Overview of corpus management and annotation tools
