HANG JIANGNAN | M, 2nd presentation | Natural Language Processing | 渡辺 太郎 | 荒牧 英治 | 上垣外 英剛 | |
title: Enhanced Low-resource Language Document Machine Translation with One-To-One Translation by Large Language Model Approach
abstract: Document translation, which deals with real-world long texts, has increasingly become a key focus of academia. Current strategies face challenges: smaller Transformer models are limited by inadequate discourse modeling across context and insufficient capacity to handle long input sequences, while large language models struggle with opaque mechanisms and inaccuracies caused by hallucinations. To address these challenges, we propose a multi-stage solution that combines the advantages of both approaches. Our method leverages the flexibility and strong performance of small models on sentence-level translation, while using large language models to extract and restore contextual information. We focus on low-resource languages, and experimental results demonstrate that this design effectively improves translation performance for low-resource languages while enhancing the interpretability and controllability of the translation process. language of the presentation: English | ||||||
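As a rough illustration of this multi-stage design, the sketch below wires the stages together; the function names, prompts, and the llm / small_mt_model callables are illustrative placeholders, not the authors' implementation.

```python
# Hypothetical sketch of the multi-stage pipeline: an LLM extracts document
# context, a small MT model translates sentence by sentence, and the LLM
# restores cohesion. All names and prompts here are illustrative assumptions.

def extract_context(llm, sentences):
    """Ask the LLM for document-level context (entities, terminology, discourse links)."""
    prompt = "List the key entities, terminology, and discourse links in:\n" + "\n".join(sentences)
    return llm(prompt)

def translate_sentences(small_mt_model, sentences):
    """Sentence-level translation with a small supervised MT model."""
    return [small_mt_model(s) for s in sentences]

def restore_context(llm, translations, context):
    """Ask the LLM to repair pronouns, terminology, and cohesion using the extracted context."""
    prompt = (f"Context notes:\n{context}\n\n"
              "Revise the following sentence translations so they read as one coherent document:\n"
              + "\n".join(translations))
    return llm(prompt)

def translate_document(llm, small_mt_model, sentences):
    context = extract_context(llm, sentences)
    draft = translate_sentences(small_mt_model, sentences)
    return restore_context(llm, draft, context)
```

Keeping the sentence-level translation in the small model and confining the LLM to the extract/restore stages is what makes the intermediate steps inspectable.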
大南 英理 | M, 2nd presentation | Natural Language Processing | 渡辺 太郎 | 荒牧 英治 | 大内 啓樹 | |
title: JDocQA: Japanese Document Question Answering Dataset for Generative Language Models
abstract: Document question answering is the task of answering questions about given documents such as reports, slides, pamphlets, and websites, and it is in high demand because paper and electronic documents are ubiquitous in our society. We introduce Japanese Document Question Answering (JDocQA), a large-scale document-based QA dataset that essentially requires both visual and textual information to answer questions, comprising 5,504 documents in PDF format and 11,600 annotated question-and-answer instances in Japanese. We empirically evaluate the effectiveness of our dataset with text-based large language models (LLMs) and multimodal models. language of the presentation: Japanese | ||||||
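A minimal sketch of a text-only baseline on a JDocQA-style example, assuming a generic llm callable and pypdf text extraction; note that text extraction discards exactly the visual information the dataset is designed to require.

```python
# Minimal sketch of a text-only LLM baseline for document QA on a PDF.
# The `llm` callable and prompt wording are illustrative assumptions,
# not the JDocQA evaluation protocol.
from pypdf import PdfReader

def answer_from_pdf(llm, pdf_path, question):
    reader = PdfReader(pdf_path)
    # Concatenate extracted text from all pages; figures, tables, and layout
    # are lost, which is why multimodal models are also evaluated.
    document_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    prompt = f"文書:\n{document_text}\n\n質問: {question}\n回答:"
    return llm(prompt)
```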
片山 歩希 | M, 2nd presentation | Natural Language Processing | 渡辺 太郎 | 荒牧 英治 | 大内 啓樹 | 東山 翔平 |
title: Evaluating Language Models in Location Referring Expression Extraction from Early Modern and Contemporary Japanese Texts
abstract: Automatic extraction of geographic information, specifically Location Referring Expressions (LREs), can aid humanities research in analyzing large collections of historical texts. In this study, to investigate how accurately pretrained Transformer language models (LMs) can extract LREs from historical texts, we evaluate two representative types of LMs, namely masked language models and causal language models, using early modern and contemporary Japanese datasets. Our experimental results demonstrate the potential of contemporary LMs for historical texts, but also suggest the need for further model enhancement, such as pretraining on historical texts. language of the presentation: Japanese | ||||||
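For illustration, LRE extraction with a masked LM can be framed as BIO token classification as sketched below; the multilingual checkpoint, label set, and example sentence are assumptions, and the classification head shown is untrained rather than one of the evaluated models.

```python
# Sketch of LRE extraction framed as BIO token classification with a masked LM.
# Checkpoint, labels, and example text are illustrative; the classification
# head below is randomly initialized and would be fine-tuned on the datasets.
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-LRE", "I-LRE"]
model_name = "bert-base-multilingual-cased"  # stand-in for a Japanese masked LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=len(labels))

text = "江戸より京まで東海道を下る"
inputs = tokenizer(text, return_tensors="pt")
logits = model(**inputs).logits               # (1, seq_len, num_labels)
pred_ids = logits.argmax(-1)[0].tolist()      # per-token label ids (untrained here)
print([labels[i] for i in pred_ids])
```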
郷原 聖士 | M, 2nd presentation | Natural Language Processing | 渡辺 太郎 | 荒牧 英治 | 上垣外 英剛 | |
title: Do LLMs Implicitly Determine the Suitable Text Difficulty for Users?
abstract: Educational applications, including adaptive learning platforms and intelligent tutoring systems, need to provide personalized content with feedback in order to improve learners' skills. Understanding each learner's level is important for improving their comprehension in such applications. When large language models (LLMs) are used for these applications, leveraging their response generation capacity, they should be able to provide appropriate feedback to users. This work investigates how well LLMs can implicitly adjust the difficulty level of their responses to match the user input. We introduce a new dataset from Stack Overflow, consisting of question-answer pairs related to programming, and propose a method to analyze this alignment ability by measuring correlations with various text difficulty metrics. Experimental results on our Stack Overflow dataset show that LLMs can implicitly align the text difficulty of their generated responses with that of the user input. Similar trends were observed on the multi-turn English lesson dataset of the Teacher Student Chatroom Corpus (TSCC). We also observed that some instruction-tuned LLMs can surpass humans in handling varying text difficulty. language of the presentation: Japanese Presentation title: Do LLMs Implicitly Consider the Text Difficulty Suited to Users? Presentation abstract: Improving students' understanding requires instruction suited to each individual's learning level. In instruction such as language learning, it is important for teachers to grasp each student's level of understanding; however, time constraints make it difficult for teachers to tutor every student individually. One possible solution is to support students' question answering with large language models (LLMs). Because LLMs can answer questions across a wide range of fields, automating fine-grained instruction with LLMs is a promising prospect. However, even if LLMs can answer questions in place of an instructor, the limits of such fine-grained instruction remain unknown. In this study, to promote the use of LLMs in education, we focus on text difficulty and investigate the ability of LLMs to implicitly adjust difficulty to the user. | ||||||
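The correlation analysis can be pictured with a small sketch like the one below, which pairs a standard readability metric with a rank correlation over question-answer pairs; the metric (Flesch reading ease via textstat), the toy pairs, and the use of Spearman correlation are illustrative assumptions, not necessarily the metrics used in the study.

```python
# Sketch of the difficulty-alignment analysis: score each user question and
# each generated answer with a readability metric, then measure rank
# correlation. Metric choice and example pairs are illustrative assumptions.
import textstat
from scipy.stats import spearmanr

pairs = [
    ("How do I reverse a list in Python?",
     "Use reversed() or slice the list with [::-1]."),
    ("Why does my recursive function exceed the maximum recursion depth?",
     "Each call adds a stack frame; add a base case or convert the recursion to a loop."),
    ("What is the amortized cost of appending to a dynamic array?",
     "Appending is O(1) amortized because capacity grows geometrically, so resizes are rare."),
    ("How can I read a file line by line?",
     "Open the file and iterate over it with a for loop."),
]

question_scores = [textstat.flesch_reading_ease(q) for q, _ in pairs]
answer_scores = [textstat.flesch_reading_ease(a) for _, a in pairs]
rho, p_value = spearmanr(question_scores, answer_scores)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```

A high positive correlation under this kind of analysis would indicate that harder questions receive harder answers, i.e., implicit difficulty alignment.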
馬渕 航 | M, 2nd presentation | Software Design and Analysis | 飯田 元 | 松本 健一 | 市川 昊平 | 柏 祐太郎 |
title: Toward an Empirical Investigation into the Use of Dockerfile Preprocessor for Docker Image Management
abstract: Dockerfile development projects usually support various base images, architectures, or versions of services. Depending on what they support, the contents of the Dockerfiles are different even for the same product. To generate multiple Dockerfiles automatically, Dockerfile Preprocessors (DPP) are often developed. While many projects employ DPP, it is still not clear what the criteria and the benefits of introducing DPP are. To explore the characteristics of projects using DPP and the contribution of developing DPP, this study analyzes 146 repositories that develop Dockerfiles. Our empirical results demonstrate that the projects adopting DPP have approximately 4.0 more versions and tags than the projects that do not use DPP. Additionally, 62% of the projects using DPP increased the number of commits after employing DPP. language of the presentation: Japanese | ||||||
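The repository measurements behind these numbers could be sketched roughly as below, assuming plain git commands; the repository path and adoption date are placeholders, not the study's actual mining tooling.

```python
# Rough sketch of measuring a repository before/after DPP adoption:
# count tags and compare commit activity around a given date.
# Repository path and adoption date are placeholder assumptions.
import subprocess

def git(repo, *args):
    result = subprocess.run(["git", "-C", repo, *args],
                            capture_output=True, text=True, check=True)
    return result.stdout

def count_tags(repo):
    return len(git(repo, "tag").splitlines())

def commits_before(repo, date):
    return len(git(repo, "rev-list", "--until", date, "HEAD").splitlines())

def commits_after(repo, date):
    return len(git(repo, "rev-list", "--since", date, "HEAD").splitlines())

repo, dpp_adopted = "./some-docker-library-repo", "2021-01-01"  # placeholders
print(count_tags(repo), commits_before(repo, dpp_adopted), commits_after(repo, dpp_adopted))
```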
森川 靖仁 | M, 2nd presentation | Software Design and Analysis | 飯田 元 | 松本 健一 | 市川 昊平 | 柏 祐太郎 |
title: Token-Level Review Comment Recommendation Method Using CodeBERT
abstract: Code review is an important process for ensuring software quality. However, it has been reported that reviews require a significant amount of time. Recently, methods have been proposed that recommend the points to be addressed in a review so that reviews can be conducted more efficiently. However, existing methods recommend inline comments at the line level, which is coarse-grained and makes it difficult to identify the specific points within a line. Therefore, this study aims to make recommendations at the finer-grained token level. Specifically, we fine-tune CodeBERT, which is pre-trained on source code, and use tokens with high weights obtained from CodeBERT's attention layers for token-level recommendations. language of the presentation: Japanese | ||||||
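One way to picture attention-based token scoring is the sketch below, which ranks the tokens of a single changed line by the attention they receive from the [CLS] position in the public base CodeBERT checkpoint; the scoring rule and example line are assumptions, and the actual method fine-tunes the model first.

```python
# Sketch of ranking tokens in a changed line by attention mass from CodeBERT.
# The base checkpoint is used here without fine-tuning, and the CLS-row
# scoring rule is an illustrative assumption, not the study's exact method.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

line = "if (user = null) { return; }"  # a line a reviewer might comment on
inputs = tokenizer(line, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# Average, over the last layer's heads, the attention each token receives
# from the [CLS] position, and surface the highest-scoring tokens.
last_layer = outputs.attentions[-1]            # (1, heads, seq, seq)
scores = last_layer[0, :, 0, :].mean(dim=0)    # attention from CLS to every token
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
top = sorted(zip(tokens, scores.tolist()), key=lambda t: -t[1])[:5]
print(top)
```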
米倉 未樹 | M, 2nd presentation | Software Design and Analysis | 飯田 元 | 松本 健一 | 市川 昊平 | 柏 祐太郎 |
title: Context-aware Self-Admitted Technical Debt detector
abstract: Self-Admitted Technical Debt (SATD) refers to issues or tasks that still need to be resolved in the code and that the developer is aware of and deliberately records in the code. For example, developers can leave SATD comments in the code to inform the team that the current implementation is not optimal and will require future maintenance. In recent years, various SATD detection methods have been proposed to facilitate the analysis of SATD. However, there are numerous instances where comments that merely explain what the source code does are mistakenly detected as SATD. This study attempts to improve SATD detection accuracy and prevent such false positives by considering the context within the source code. Specifically, we use CodeBERT to learn from the comments together with the related source code. Compared to existing methods, this approach significantly reduces false detections of comments that describe the code's behavior, demonstrating that considering context enhances the model's performance. language of the presentation: Japanese | ||||||
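A minimal sketch of the context-aware formulation, assuming the comment and its surrounding code are encoded as a sentence pair for binary classification; the checkpoint below is the public base model with an untrained head, not the study's fine-tuned detector.

```python
# Sketch of context-aware SATD detection: encode the comment and its
# surrounding source code as a sentence pair and classify SATD vs. not.
# The classification head here is untrained; label meanings are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base", num_labels=2)  # 0 = not SATD, 1 = SATD (assumed labels)

comment = "// TODO: temporary hack, rewrite once the new API lands"
code_context = "int parse(String s) { return Integer.parseInt(s.trim()); }"

inputs = tokenizer(comment, code_context, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print("P(SATD) =", torch.softmax(logits, dim=-1)[0, 1].item())
```

Pairing the comment with its surrounding code is what lets the model tell a debt admission apart from a comment that simply describes the adjacent statements.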