Collocation Suggestion for Japanese Second Language Learners

Lis Weiji Kanashiro Pereira (1151128)


Native speakers of a language have extensive knowledge of which words should be used together, and they combine them accurately. For language learners, however, combining words appropriately in a foreign language can be very challenging.

This study addresses issues in Japanese language learning that concern word combinations (collocations), and proposes a method for helping language learners choose the right collocations. We analyze correct word combinations using different collocation measures and word similarity methods. Our analysis uses a large Japanese language learner corpus to generate collocation candidates, in order to build a system that is more sensitive to constructions that are difficult for learners. Results show that our proposed method achieves higher precision and recall than other methods that use only well-formed text.

We also compare the results from two large-scale corpora: Mainichi Shimbun, a corpus of newspaper articles, and BCCWJ, a balanced corpus of one hundred million words of contemporary written Japanese. We show that the data coverage of a corpus, more than its size, contributes to a better performance score in automatic collocation suggestion for Japanese second language learners.
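
The abstract does not name the specific collocation measures used. As an illustration only, the sketch below scores noun-verb pairs with pointwise mutual information (PMI), one commonly used collocation measure, and ranks verb candidates for a given noun. The function names `pmi_scores` and `suggest_verbs`, the romanized words, and the counts in `TOY_PAIRS` are all hypothetical and are not taken from the study or its corpora.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Return base-2 PMI for each distinct (noun, verb) pair in `pairs`."""
    pair_counts = Counter(pairs)
    noun_counts = Counter(noun for noun, _ in pairs)
    verb_counts = Counter(verb for _, verb in pairs)
    total = len(pairs)

    scores = {}
    for (noun, verb), count in pair_counts.items():
        p_pair = count / total
        p_noun = noun_counts[noun] / total
        p_verb = verb_counts[verb] / total
        # PMI compares the observed co-occurrence probability with the
        # probability expected if noun and verb were independent.
        scores[(noun, verb)] = math.log2(p_pair / (p_noun * p_verb))
    return scores


def suggest_verbs(noun, scores, top_k=3):
    """Rank the verbs observed with `noun` by descending PMI."""
    candidates = [(verb, s) for (n, verb), s in scores.items() if n == noun]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [verb for verb, _ in candidates[:top_k]]


# Hypothetical toy counts (not real corpus data): each tuple is one
# observed noun-verb co-occurrence.
TOY_PAIRS = (
    [("yume", "miru")] * 8
    + [("yume", "suru")] * 1
    + [("eiga", "miru")] * 4
    + [("benkyou", "suru")] * 7
)

scores = pmi_scores(TOY_PAIRS)
suggestions = suggest_verbs("yume", scores)
```

With these toy counts, `miru` is ranked above `suru` for `yume`, because the pair co-occurs more often than independence would predict. A real system would estimate the counts from a large corpus and, as the abstract describes, restrict the candidates using word similarity and learner data.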