Correcting Honorifics in Japanese with Anaphora Resolution

Miao Yu (1251131)


Japanese is an East Asian language spoken by about 125 million speakers, primarily in Japan, where it is the national language. Compared with other languages, Japanese places a heavy emphasis on establishing hierarchical relations among people and on showing respect. The language includes normal speech (常語) and honorific speech (敬語), with honorific speech playing an important role in establishing these relationships.

Japanese honorifics include different levels of respectful (尊敬語), humble (謙譲語), and polite (丁寧語) speech, which are frequently used in various social and business situations. The honorific system is complicated, and many non-native speakers, as well as members of the younger generations in Japan, have trouble mastering it. This situation has encouraged the study of automatic systems that identify the proper form of honorifics in Japanese, including systems that translate into honorifics automatically. However, merely translating an expression into an honorific form is not very useful, because Japanese has at least four ways to express the same meaning, each suited to a different situation. The appropriate honorific form depends on the relationship between the speaker and the subject of the sentence.

Japanese is a so-called "pro-drop" language, in which subject omission is common. Therefore, to transform regular (non-honorific) expressions into appropriate honorifics, automatic translation alone is not enough; we also need anaphora resolution to recover the omitted subjects. To address this problem, we incorporate anaphora resolution into a rule-based machine translation system that translates regular expressions into appropriate Japanese honorifics, and we examine the effectiveness of using correct subjects annotated by a human and subjects automatically predicted by anaphora resolution.
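
As a minimal sketch of why the subject matters when choosing an honorific form, the Python fragment below selects a respectful, humble, or polite variant of a verb depending on who the subject is. It is not the system described here: the lexicon, the sets `SUPERIORS` and `SPEAKER`, and the functions `choose_style` and `to_honorific` are hypothetical names introduced only for illustration, and the subject is assumed to be supplied by some upstream step (a human annotator or an anaphora resolver).

```python
from typing import Optional

# Toy lexicon (an assumption for illustration): plain verb -> honorific variants.
LEXICON = {
    "言う": {"respectful": "おっしゃる", "humble": "申す", "polite": "言います"},
    "行く": {"respectful": "いらっしゃる", "humble": "参る", "polite": "行きます"},
    "食べる": {"respectful": "召し上がる", "humble": "いただく", "polite": "食べます"},
}

# Hypothetical subject classes; a real system would derive these from context.
SUPERIORS = {"先生", "社長"}
SPEAKER = {"私"}


def choose_style(subject: Optional[str]) -> str:
    """Pick an honorific style from the (possibly unresolved) subject."""
    if subject in SUPERIORS:
        return "respectful"  # elevate the actions of a superior
    if subject in SPEAKER:
        return "humble"      # lower the speaker's own actions
    return "polite"          # fallback when the subject is unknown or neutral


def to_honorific(verb: str, subject: Optional[str] = None) -> str:
    """Rewrite a plain verb into the variant that fits the subject."""
    return LEXICON[verb][choose_style(subject)]


if __name__ == "__main__":
    print(to_honorific("言う", "先生"))  # subject resolved to a superior
    print(to_honorific("言う", "私"))    # subject resolved to the speaker
    print(to_honorific("言う"))          # subject omitted and unresolved
```

When the subject is left unresolved (`None`), this sketch can only fall back to the polite form, which illustrates in a simplified way why recovering omitted subjects precedes the choice of honorific.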

The experimental results demonstrate that almost all forms of the system achieve better performance than the original source. Comparing the accuracy of No Resolution, SynCha, and a non-native speaker, we confirm that resolving omitted subjects and then translating into appropriate Japanese honorifics is both useful and necessary.