Colloquium B Presentations

Date: Wednesday, September 19, 2nd period (11:00-12:30)

Venue: L1

Chair: 進藤 裕之
古川 智雅 M, 2nd presentation, 知能コミュニケーション 中村 哲, 松本 裕治, 吉野 幸一郎, 須藤 克仁
title: A dialogue system that deepens conversation through question generation using case frames
abstract: Question generation in dialogue not only makes the interaction between speakers bidirectional, but can also deepen the dialogue and extend its content. Traditional question generation tasks have studied information acquisition for a specific purpose, but they require the question generation to be designed manually for each purpose. More recently, response learning for non-task-oriented dialogue such as chat has been studied, and question generation that takes the dialogue context into account can also deepen such dialogue. In this study, we propose a task-agnostic dialogue system that uses case frames to acquire information and generate questions in open-domain dialogue.
language of the presentation: Japanese
title (in Japanese): A dialogue system that deepens conversation through question generation using case frames
abstract (in Japanese): Question generation in dialogue not only makes the exchange between speakers bidirectional, but can also deepen and develop the conversation. Conventional question generation tasks have studied information acquisition for a given purpose, but they require the question generation to be designed manually for each purpose. In recent years, learning of dialogue without a set goal, such as chat, has also been studied, and in this setting as well, asking questions suited to the context makes it possible to dig deeper into the topic. In this study, we therefore propose a dialogue system that deepens conversation by generating questions with case frames, performing question generation aimed at information acquisition regardless of whether the dialogue has a goal and regardless of its domain.
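As an illustration of the core idea, the following is a minimal sketch (not the authors' implementation) of case-frame-driven question generation: given the predicate of a user utterance and the case slots it already fills, the system asks about a missing slot. The toy frame lexicon, slot names, and question templates below are all assumptions made for illustration; a real system would draw on a large-scale case-frame resource.

    # Minimal sketch of case-frame-driven question generation (illustrative only;
    # the frame format and lexicon below are toy assumptions, not the actual system).

    # A toy case-frame lexicon: predicate -> case slots and question templates.
    CASE_FRAMES = {
        "eat": {
            "agent":  "Who ate it?",
            "object": "What did you eat?",
            "place":  "Where did you eat?",
        },
        "go": {
            "agent": "Who went?",
            "goal":  "Where did you go?",
            "time":  "When did you go?",
        },
    }

    def generate_question(predicate, filled_slots):
        """Return a follow-up question for the first unfilled case slot.

        predicate    -- main verb detected in the user utterance
        filled_slots -- set of case slots already instantiated by the utterance
        """
        frame = CASE_FRAMES.get(predicate)
        if frame is None:
            return None  # no frame: fall back to a generic response
        for slot, question in frame.items():
            if slot not in filled_slots:
                return question  # ask about missing information to dig deeper
        return None  # all slots filled: move on to another topic

    # Example: "I ate ramen" fills agent and object, so the system digs deeper.
    print(generate_question("eat", {"agent", "object"}))  # -> "Where did you eat?"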
 
LI MICHAEL WENTAO M, 2nd presentation, 自然言語処理学 松本 裕治, 中村 哲, 新保 仁, 進藤 裕之
title: Semantic Sentence Similarity for Machine Translation Evaluation
abstract: The automatic evaluation of machine translation, which may have arbitrarily many correct outputs for a given input, is a challenging task. We define a good translation as a sentence with the most similar meaning to the source sentence, and explore various semantic sentence similarity techniques to measure that similarity. Due to the lack of suitable training data for many of these techniques, we also investigate various methods of data augmentation, to encourage our evaluation systems to make fine-grained distinctions between translations of similar quality. Our results showed that while this approach has promise, it is still difficult to obtain a good correlation with human evaluation.
language of the presentation: English
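For concreteness, here is a toy sketch of one simple semantic-similarity technique of the kind explored in this work: scoring a translation by the cosine similarity of averaged word embeddings. The two-dimensional embedding table is a stand-in for real pre-trained vectors, and the comparison is made against a reference sentence for simplicity, whereas the definition above compares against the source sentence.

    import numpy as np

    # Illustrative baseline only: score a translation by the cosine similarity
    # of averaged word embeddings. This toy lexicon stands in for real
    # pre-trained embeddings.
    EMB = {
        "the": np.array([0.1, 0.3]), "cat": np.array([0.9, 0.2]),
        "a":   np.array([0.1, 0.3]), "dog": np.array([0.8, 0.4]),
    }

    def sentence_vector(tokens):
        vecs = [EMB[t] for t in tokens if t in EMB]
        return np.mean(vecs, axis=0) if vecs else np.zeros(2)

    def similarity_score(reference, hypothesis):
        r, h = sentence_vector(reference), sentence_vector(hypothesis)
        return float(r @ h / (np.linalg.norm(r) * np.linalg.norm(h) + 1e-9))

    print(similarity_score(["the", "cat"], ["a", "dog"]))  # higher = closer meaning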
 
LU YUXUN M, 2nd presentation, 自然言語処理学 松本 裕治, 中村 哲, 新保 仁, 進藤 裕之
title: Reducing Hubness in Knowledge Graph Embedding Models
abstract: Knowledge graph embedding models provide a simple way to manipulate knowledge graphs while preserving their inherent structure, by embedding the symbolic entities and relations of a knowledge graph in a continuous vector space. However, as an intrinsic property of high-dimensional data, the hubness phenomenon arises, and it can hurt the performance of knowledge graph embedding models, which answer a query (h, r, ?) or (?, r, t) by ranking all entities by the distance between their embeddings and the translated query embedding f(h, r) or f(r, t). This presentation introduces our analysis of hubness in a typical translation-based model, TransE, and our attempt to alleviate the problem.
language of the presentation: English
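The following sketch (with random vectors standing in for trained TransE embeddings) shows how the ranking described above can be checked for hubness, measured here by the skewness of the k-occurrence distribution, i.e., how often each entity appears in some query's top-k answer list. This statistic is a common hubness diagnostic, not necessarily the exact measure used in the presentation.

    import numpy as np

    rng = np.random.default_rng(0)
    n_entities, dim, k = 1000, 100, 10

    # Toy TransE-style embeddings (random stand-ins for trained vectors).
    E = rng.normal(size=(n_entities, dim))   # entity embeddings
    r = rng.normal(size=dim)                 # one relation embedding

    def answer_query(h_idx):
        """Rank all entities as answers to (h, r, ?) by ||h + r - t||, ascending."""
        q = E[h_idx] + r                     # the translated query embedding f(h, r)
        dists = np.linalg.norm(E - q, axis=1)
        return np.argsort(dists)[:k]

    # k-occurrence: how often each entity appears in some query's top-k list.
    # A heavily right-skewed distribution indicates hubness.
    counts = np.zeros(n_entities)
    for h in range(n_entities):
        counts[answer_query(h)] += 1

    mean, std = counts.mean(), counts.std()
    skewness = np.mean(((counts - mean) / std) ** 3)
    print(f"k-occurrence skewness: {skewness:.2f}")  # >> 0 means hubs exist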
 
和田 崇史 M, 2nd presentation, 自然言語処理学 松本 裕治, 中村 哲, 新保 仁, 進藤 裕之
title: Unsupervised Cross-lingual Word Embedding by Multilingual Neural Language Models
abstract: We propose an unsupervised method to obtain cross-lingual embeddings without any parallel data or pre-trained word embeddings. The proposed model, which we call a multilingual neural language model, takes sentences of multiple languages as input. The model contains bidirectional LSTMs that act as forward and backward language models, and these networks are shared among all the languages. The other parameters, i.e., the word embeddings and the linear transformations between hidden states and outputs, are specific to each language. The shared LSTMs can capture the common sentence structure among all languages. Accordingly, the word embeddings of each language are mapped into a common latent space, making it possible to measure the similarity of words across multiple languages. We evaluate the quality of the cross-lingual word embeddings on a word alignment task. Our experiments demonstrate that our model obtains cross-lingual embeddings of much higher quality than existing models when only a small amount of monolingual data (e.g., 50k sentences) is available, or when the domains of the monolingual data differ across languages.
language of the presentation: Japanese
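The following is a minimal PyTorch sketch of the architecture as described above: an LSTM shared across languages, with language-specific embeddings and output projections (simplified to the forward direction only; all sizes are placeholder hyperparameters, not those of the actual model).

    import torch
    import torch.nn as nn

    class MultilingualLM(nn.Module):
        """Sketch of the shared-LSTM multilingual language model (simplified to
        a forward LM only; the actual model also has a backward LSTM)."""

        def __init__(self, vocab_sizes, emb_dim=300, hid_dim=300):
            super().__init__()
            # Language-specific parameters: embeddings and output projections.
            self.embed = nn.ModuleDict(
                {lang: nn.Embedding(v, emb_dim) for lang, v in vocab_sizes.items()})
            self.out = nn.ModuleDict(
                {lang: nn.Linear(hid_dim, v) for lang, v in vocab_sizes.items()})
            # The LSTM is shared by all languages, so hidden representations
            # (and hence embeddings) are pushed into a common latent space.
            self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

        def forward(self, token_ids, lang):
            x = self.embed[lang](token_ids)    # (batch, seq, emb_dim)
            h, _ = self.lstm(x)                # shared across languages
            return self.out[lang](h)           # next-word logits for `lang`

    # Example: train on monolingual English and Japanese corpora with one model.
    model = MultilingualLM({"en": 10000, "ja": 12000})
    logits = model(torch.randint(0, 10000, (2, 7)), lang="en")
    print(logits.shape)  # torch.Size([2, 7, 10000])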