Conventional word combinations, or collocations, have long been recognized as
important in helping language learners communicate more efficiently and sound
more like native speakers. However, studies have confirmed that collocations
are challenging, even for advanced second language learners. While native
speakers already have a large number of collocations available in their mental
lexicon, learners often struggle to find the right combination of words.
The goal of this thesis is to demonstrate the feasibility of using natural
language processing techniques to develop a writing assistance system that
suggests more appropriate collocations in Japanese. In particular, we address
the problem of generating and ranking candidates for correcting potential
collocation errors in learners’ text.
The system generates possible correction candidates based on corrections
extracted from a large Japanese learner corpus. This corpus is used to
investigate learners’ tendencies to make collocation errors and to produce
a smaller and more realistic set of candidates. In addition, the system uses
the Weighted Dice coefficient as the association measure to filter out
inappropriate candidate pairs and rank the proper collocations.
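As an illustration of the filtering and ranking step, the following minimal sketch scores noun-verb candidates with one common formulation of the Weighted Dice coefficient, log2 f(n,v) × 2 f(n,v) / (f(n) + f(v)). The function names, the threshold, and the toy frequency counts are assumptions made for this example only; they are not taken from the system or its corpora.

```python
import math

def weighted_dice(pair_freq, noun_freq, verb_freq):
    """Weighted Dice score: log2(f(n,v)) * 2*f(n,v) / (f(n) + f(v)).

    Returns 0.0 for unseen pairs or words so they can be filtered out.
    """
    if pair_freq <= 0 or noun_freq + verb_freq <= 0:
        return 0.0
    return math.log2(pair_freq) * 2 * pair_freq / (noun_freq + verb_freq)

def rank_candidates(candidates, pair_counts, word_counts, threshold=0.0):
    """Score (noun, verb) candidates, drop those at or below the threshold,
    and return the rest sorted from strongest to weakest association."""
    scored = []
    for noun, verb in candidates:
        score = weighted_dice(
            pair_counts.get((noun, verb), 0),
            word_counts.get(noun, 0),
            word_counts.get(verb, 0),
        )
        if score > threshold:
            scored.append(((noun, verb), score))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy counts invented for illustration (romanized Japanese words).
pair_counts = {("yume", "miru"): 64, ("yume", "suru"): 1, ("yume", "motsu"): 16}
word_counts = {"yume": 200, "miru": 1000, "suru": 5000, "motsu": 600}

candidates = [("yume", "suru"), ("yume", "motsu"), ("yume", "miru")]
ranked = rank_candidates(candidates, pair_counts, word_counts)
for (noun, verb), score in ranked:
    print(f"{noun} o {verb}: {score:.2f}")
```

With these toy counts, the pair seen only once receives a score of zero because of the logarithmic weight and is filtered out, while the remaining candidates are ordered by association strength.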
We carried out experiments focusing on noun-verb constructions, which are one
of the major types of collocation problems. We report a detailed evaluation and
results on learner data. In addition, we show that our system outperforms
existing approaches to collocation error correction with statistical
significance. Finally, we describe how this method can be used to develop a
writing assistant in which learners apply the suggested collocations to revise
their compositions.