GUO ZHIYU | D, 中間発表 | 自然言語処理学 | 渡辺 太郎, 中村 哲, 進藤 裕之 |
title: Improving the Efficiency of Large Language Models on Handling Long Sequences
abstract: Remarkable advances in natural language processing have been achieved by scaling up the computational budget, training data, and model size. However, large language models are expensive to produce and incur high energy costs. Existing large language models are all based on the Transformer architecture, whose quadratic self-attention complexity makes it inefficient at handling long sequences. Many efficient long-range attention variants have recently been proposed, and previous work indicates that the simple local attention used in Longformer and BigBird is competitive on long-context tasks. Implementing local window attention efficiently is non-trivial; in practice, a blockwise implementation is used. We find that the complexity of blockwise local window attention does not scale linearly with the window length. By using a non-overlapping implementation, we significantly improve the efficiency of BigBird while closely matching its accuracy. language of the presentation: English | |||
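The non-overlapping blockwise idea can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: each query attends only to keys in its own block, so the cost is O(n · w) in the block (window) length w rather than the O(n²) of full self-attention. The sequence length is assumed to be a multiple of the block size.

```python
import numpy as np

def blockwise_local_attention(q, k, v, block_size):
    """Non-overlapping blockwise local attention (illustrative sketch).

    q, k, v: (n, d) arrays. Each query attends only to keys inside its
    own block of `block_size` positions, so the cost per block is
    O(block_size^2) and the total cost is O(n * block_size).
    Assumes n is a multiple of block_size.
    """
    n, d = q.shape
    assert n % block_size == 0, "sequence length must be a multiple of block_size"
    out = np.empty_like(v)
    for start in range(0, n, block_size):
        end = start + block_size
        # scaled dot-product attention restricted to one block
        scores = q[start:end] @ k[start:end].T / np.sqrt(d)       # (w, w)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)            # softmax
        out[start:end] = weights @ v[start:end]
    return out
```

When `block_size` equals the sequence length, the sketch reduces to ordinary full softmax attention, which makes the complexity trade-off easy to see: shrinking the block shrinks the quadratic factor to the window size only.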
LI HUAYANG | M, 2回目発表 | 自然言語処理学 | 渡辺 太郎, 中村 哲, 進藤 裕之 |
title: Visualizing the Relationship Between Encoded Linguistic Information and Task Performance
abstract: Probing is a popular way to analyze whether linguistic information is captured by a well-trained deep neural model, but it remains hard to answer how changes in the encoded linguistic information affect task performance. To this end, we study the dynamic relationship between encoded linguistic information and task performance from the viewpoint of Pareto optimality. The key idea is to obtain a set of models that are Pareto-optimal with respect to both objectives. From this viewpoint, we propose a method to obtain Pareto-optimal models by formalizing the problem as multi-objective optimization. We conduct experiments on two popular NLP tasks, i.e., machine translation and language modeling, and investigate the relationship between several kinds of linguistic information and task performance. Experimental results demonstrate that the proposed method outperforms a baseline method. Our empirical findings suggest that some syntactic information is helpful for NLP tasks, whereas encoding more syntactic information does not necessarily lead to better performance, because the model architecture is also an important factor. language of the presentation: English | |||
廣瀬 惟歩 | M, 2回目発表 | 自然言語処理学 | 渡辺 太郎, 中村 哲, 進藤 裕之 |
title: Detection of Idiomatic Expressions in English Sentences Using BIOE Tagging
abstract: In neural machine translation (NMT), the translation of idiomatic expressions is one of the open issues. Because NMT systems interpret source sentences literally, they often produce literal translation errors for idiomatic expressions, whose meanings cannot be inferred from their component words. Several automatic evaluation metrics for machine translation have been proposed, but they are not suitable for judging the translation quality of idiomatic expressions, since they cannot focus on specific parts of the translated sentence. Prior research proposed the "blacklist" method as an automatic evaluation metric for the translation of idiomatic expressions; it uses a list of words that would appear when an idiomatic expression is translated literally. Following this previous study, the purpose of our study is to develop a method that evaluates machine translation results by whether or not idiomatic expressions are present. In this presentation, as part of this effort, we describe the results of an idiom detection experiment using BIOE tagging on English sentences from Wiktionary. language of the presentation: Japanese 発表題目: BIOEタグを活用した英文中のイディオム表現の検出 発表概要: ニューラル機械翻訳 (NMT) において、イディオム表現の翻訳は課題の1つとして挙げられる。機械翻訳システムは原文を文字通りの意味に解釈して翻訳するため、構成される単語からは推測できない意味を有するイディオム表現に対しては、原文を直訳するエラーが度々生じる。また、翻訳が妥当であるか判別する自動評価指標はいくつか提案されているものの、これらは翻訳文を特定の箇所に絞って評価することができないため、イディオム表現の翻訳の評価には適さない。先行研究では、イディオム表現の翻訳に対する自動評価指標として、イディオム表現を直訳した際に生じるであろう単語をリストにまとめた “ブラックリスト” 法が提案されている。そこで本研究では、この先行研究に倣い、機械翻訳結果をイディオム表現の有無で評価する手法を検討することを目的とする。本発表ではその一環として、Wiktionaryに掲載されている英文を対象とした、BIOEタグを活用した文中のイディオム検出の実験結果について述べる。 | |||
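The BIOE scheme labels each token of a sentence by its position relative to an idiom span: B marks the beginning, I the inside, E the end, and O everything outside. A minimal sketch of producing such tags from known spans (the helper and its single-token convention are illustrative assumptions, not details from the abstract):

```python
def bioe_tags(tokens, idiom_spans):
    """Assign BIOE tags given idiom spans as (start, end) token indices,
    end exclusive. B = begin, I = inside, E = end, O = outside.

    Illustrative sketch: tagging a single-token idiom as just "B" is a
    design choice here; some schemes use a separate S tag instead.
    """
    tags = ["O"] * len(tokens)
    for start, end in idiom_spans:
        tags[start] = "B"
        for i in range(start + 1, end - 1):
            tags[i] = "I"
        if end - start > 1:
            tags[end - 1] = "E"
    return tags
```

For example, in "he will kick the bucket soon" the idiom "kick the bucket" (token indices 2–4) receives B I E while the surrounding tokens receive O; a sequence tagger trained on such labels can then recover idiom spans in unseen sentences.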
QU ZHI | M, 2回目発表 | 自然言語処理学 | 渡辺 太郎, 中村 哲, 進藤 裕之 |
title: Adapting to Non-Centered Languages for Zero-shot Multilingual Translation
abstract: Multilingual neural machine translation can translate language pairs unseen during training, i.e., zero-shot translation. However, zero-shot translation is often unstable. Although prior work attributed this instability to the domination of the central language, e.g., English, we supplement this viewpoint with the strict dependence of non-centered languages on the central language. In this work, we propose a simple, lightweight, yet effective language-specific modeling method that adapts to non-centered languages and combines shared information with language-specific information to counteract the instability of zero-shot translation. Experiments with the Transformer on the IWSLT17, Europarl, TED Talks, and OPUS-100 datasets show that our method not only outperforms strong baselines under centered data conditions but also easily fits non-centered data conditions. By further investigating layer attribution, we show that the proposed method disentangles the coupled representations in the correct direction. language of the presentation: English | |||
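The combination of shared and language-specific information can be sketched in a minimal form. This is an illustrative assumption, not the paper's actual parameterization: a single shared projection models information common to all languages, a small per-language projection adapts the representation, and a mixing weight `alpha` balances the two.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (arbitrary for illustration)

def make_linear(d_in, d_out):
    # scaled random weights standing in for trained parameters
    return rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)

W_shared = make_linear(d, d)
# one lightweight projection per non-centered language (assumed setup)
W_lang = {"de": make_linear(d, d), "fr": make_linear(d, d)}

def encode(x, lang, alpha=0.5):
    """Combine shared and language-specific information (illustrative).

    The shared projection captures what is common across languages; the
    per-language projection adapts the representation toward the given
    language. alpha trades off the two sources of information.
    """
    shared = np.tanh(x @ W_shared)
    specific = np.tanh(x @ W_lang[lang])
    return alpha * shared + (1 - alpha) * specific
```

Because only the small per-language matrices are language-specific, such a scheme stays lightweight while giving each non-centered language its own capacity, which matches the abstract's motivation of counteracting dependence on the central language.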