Collocation Writing Assistant for Learners of Japanese as a Second Language

Lis Weiji Kanashiro Pereira (1361016)


Conventional word combinations, or collocations, have long been recognized as important in helping language learners communicate more efficiently and sound more like native speakers. However, studies have confirmed that collocations are challenging even for advanced second language learners. While native speakers already have a large number of collocations available in their mental lexicon, learners often struggle to find the right combination of words.

 

The goal of this thesis is to demonstrate the feasibility of using natural language processing techniques to build a writing assistant that suggests more appropriate collocations in Japanese. In particular, we address the problem of generating and ranking candidates for correcting potential collocation errors in learners' text. The system generates correction candidates from corrections extracted from a large Japanese learner corpus. This corpus is used both to investigate learners' tendencies to commit collocation errors and to produce a smaller, more realistic set of candidates. In addition, the system uses the Weighted Dice coefficient as an association measure to filter out inappropriate candidate pairs and to rank the remaining collocations.
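
As an illustration of the filtering and ranking step, the following is a minimal sketch in Python. It assumes the commonly cited formulation of the Weighted Dice coefficient, log2 f(n,v) * 2 f(n,v) / (f(n) + f(v)), which may differ in detail from the formulation used in the thesis. The function names, the threshold, and all frequency counts are hypothetical and serve only to show the mechanics.

```python
import math
from typing import Dict, List, Tuple


def weighted_dice(pair_freq: int, noun_freq: int, verb_freq: int) -> float:
    """Weighted Dice score of a noun-verb pair (assumed formulation).

    Pairs that never co-occur get 0.0; pairs seen exactly once also
    score 0.0 because log2(1) == 0.
    """
    if pair_freq <= 0 or noun_freq + verb_freq == 0:
        return 0.0
    return math.log2(pair_freq) * 2 * pair_freq / (noun_freq + verb_freq)


def rank_candidates(
    noun: str,
    candidate_verbs: List[str],
    pair_counts: Dict[Tuple[str, str], int],
    noun_counts: Dict[str, int],
    verb_counts: Dict[str, int],
    threshold: float = 0.0,  # hypothetical cut-off, not taken from the thesis
) -> List[Tuple[str, float]]:
    """Drop candidates at or below the threshold and sort the rest by score."""
    scored = []
    for verb in candidate_verbs:
        score = weighted_dice(
            pair_counts.get((noun, verb), 0),
            noun_counts.get(noun, 0),
            verb_counts.get(verb, 0),
        )
        if score > threshold:
            scored.append((verb, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy counts, invented for illustration only (not corpus statistics).
pair_counts = {("夢", "見る"): 8, ("夢", "持つ"): 2}
noun_counts = {"夢": 20}
verb_counts = {"見る": 12, "持つ": 20, "する": 100}

ranking = rank_candidates(
    "夢", ["する", "持つ", "見る"], pair_counts, noun_counts, verb_counts
)
print(ranking)
```

With these toy counts, the candidate する is filtered out because it never co-occurs with 夢, and 見る is ranked above 持つ. In the system described here, the candidate verbs would come from corrections observed in the learner corpus rather than from a hand-written list.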

 

We carried out experiments focusing on noun-verb constructions, which are one of the major types of collocation problems. We report a detailed evaluation and results on learner data. In addition, we show that our system outperforms existing approaches to collocation error correction, with statistically significant differences. Finally, we describe how this method can be used to develop a writing assistant in which learners apply the suggested collocations to revise their compositions.