KIM GAHEE | D, interim presentation | Robot Learning | 松原 崇充 | 池田 和司 | 柴田 一騎 | 鶴峯 義久 | 佐々木 光 |
title: Active Simulation-based Inference: Informative Action Design without Explicit Likelihood Models
abstract: In this talk, we introduce an active inference framework for estimating simulation parameters. Our framework provides a sequential active parameter estimation procedure that interleaves simulators with reality. When selecting the optimal action to maximize the information gain from the prior to the posterior, a common approach is to approximate a surrogate model using a real dataset. However, this approach is rarely applicable in complex real-world scenarios, since collecting a real dataset sufficient for training a dense surrogate model is difficult. Moreover, simplistic assumptions are needed for an analytical calculation of the posterior from the surrogate model. Building on recent advances in simulation technology, we propose a simulation-based active estimation framework that integrates simulators to learn posterior estimators and compute information gain. We compare the performance of our framework in obtaining the posterior from sequential observations using an intuitive numerical toy model and a sim2sim robot experiment. We also perform a real-robot experiment to illustrate its potential for practical use. language of the presentation: Japanese
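The information-gain criterion for action selection described above can be sketched in a minimal particle-based toy. Everything here is an illustrative assumption, not the authors' implementation: a hypothetical one-dimensional simulator `simulate(theta, action)` with Gaussian observation noise, a uniform particle prior over the unknown parameter, and a Monte-Carlo estimate of the expected KL divergence from prior to posterior for each candidate action.

```python
import math
import random

random.seed(0)

def simulate(theta, action):
    # hypothetical toy simulator: the observation scales theta by the action
    return theta * action + random.gauss(0.0, 0.1)

def likelihood(obs, theta, action, sigma=0.1):
    # Gaussian observation model matching the toy simulator above
    return math.exp(-0.5 * ((obs - theta * action) / sigma) ** 2)

def expected_info_gain(particles, weights, action, n_sim=50):
    """Monte-Carlo estimate of E_obs[ KL(posterior || prior) ] for one action."""
    eig = 0.0
    for _ in range(n_sim):
        # sample a plausible theta from the prior, then simulate an observation
        theta = random.choices(particles, weights)[0]
        obs = simulate(theta, action)
        # reweight the prior particles by the likelihood of that observation
        post = [w * likelihood(obs, p, action) for w, p in zip(weights, particles)]
        z = sum(post)
        if z == 0.0:
            continue
        post = [q / z for q in post]
        eig += sum(q * math.log(q / w) for q, w in zip(post, weights) if q > 0.0)
    return eig / n_sim

particles = [i / 20 for i in range(1, 21)]        # prior support over theta
weights = [1.0 / len(particles)] * len(particles)  # uniform prior
actions = [0.1, 0.5, 1.0, 2.0]
best = max(actions, key=lambda a: expected_info_gain(particles, weights, a))
print(best)
```

In this toy, larger actions amplify theta's effect relative to the fixed observation noise, so the information-gain criterion prefers them; in the framework above, the same criterion would be evaluated with learned posterior estimators rather than exact particle reweighting.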
竹原 晃多 | M, 2nd presentation | Mathematical Informatics | 池田 和司 | 松原 崇充 | 久保 孝富 | 日永田 智絵 | Li Yuzhe |
title: Mathematical model for self-efficacy and skill in VR Kendama task
abstract: Self-efficacy refers to one's self-evaluation of, or belief in, one's ability to accomplish a particular task or goal, and it is expected to improve even through artificially created success experiences. However, when skill improvement is sought at the same time, it is desirable to adjust the difficulty of the target task appropriately. In this study, to observe how self-efficacy and skill change with the difficulty of the target task, we proposed a model of the Kendama task in a VR environment and conducted simulation experiments using the model with multiple parameter patterns. The results divided the simulated learners into two groups: one for which failure left a strong impression and one for which success did. At the same difficulty level, one group improved its skill while the other did not, suggesting that the difficulty level needs to be adjusted appropriately for individual differences. language of the presentation: Japanese 発表題目: VR けん玉タスクにおける自己効力感と熟練度の数理モデル 発表概要: 自己効力感とは、自分自身が特定の課題や目標を達成する能力や信念に対する自己評価のことを示し、擬似的に作り上げた成功体験でも向上することが期待されている。しかし、同時にスキルも向上させることを求める場合は、対象課題の難易度を適切に調整することが望まれる。本研究では、対象課題の難易度による自己効力感とスキルの変化を観察するため、VR環境におけるけん玉タスクのモデルを提案し、複数のパラメータパターンで、モデルを使用したシミュレーション実験を行った。その結果、失敗を強く印象づけるグループと成功を強く印象づけるグループに分かれた。同じ難易度でも、一方はスキルが向上しているが、他方はスキルが向上していないことから、個人差に応じた適切な難易度調整が必要であることが示唆された。
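The kind of coupled self-efficacy/skill dynamics described above can be sketched as a toy simulation. All parameter names and update rules here are hypothetical stand-ins for the paper's model: success probability depends on skill versus difficulty, self-efficacy gates how much each success improves skill, and an `impression_bias` parameter controls whether failures or successes leave the stronger impression on self-efficacy.

```python
import math
import random

random.seed(1)

def simulate_learner(difficulty, impression_bias, trials=200,
                     lr_skill=0.02, lr_eff=0.05):
    """Toy dynamics: impression_bias > 0.5 means failures impress more strongly."""
    skill, efficacy = 0.1, 0.5
    for _ in range(trials):
        # success is more likely when skill exceeds the task difficulty
        p_success = 1.0 / (1.0 + math.exp(-(skill - difficulty) * 5.0))
        if random.random() < p_success:
            skill += lr_skill * efficacy            # efficacy gates learning
            efficacy += lr_eff * (1.0 - impression_bias)
        else:
            efficacy -= lr_eff * impression_bias
        efficacy = min(1.0, max(0.0, efficacy))     # keep efficacy in [0, 1]
    return skill, efficacy

# two simulated learners at the SAME difficulty, differing only in impression bias
s1, e1 = simulate_learner(difficulty=0.3, impression_bias=0.2)  # success-salient
s2, e2 = simulate_learner(difficulty=0.3, impression_bias=0.8)  # failure-salient
print(s1, e1, s2, e2)
```

Under these assumed parameters the failure-salient learner's self-efficacy collapses and skill growth stalls, while the success-salient learner improves at the same difficulty, mirroring the two-group outcome reported above.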
林 和樹 | M, 2nd presentation | Natural Language Processing | 渡辺 太郎 | 池田 和司 | 上垣外 英剛 | ||
title: Evaluating Image Review Ability of Vision Language Models
abstract: Large-scale Vision-Language Models (LVLMs) can process both images and text, demonstrating advanced capabilities in multimodal tasks such as image captioning and visual question answering (VQA). However, it remains unclear whether they can understand and evaluate images, particularly whether they capture the nuanced impressions and evaluations that reviews convey. To address this, we propose an image-review evaluation method based on rank correlation analysis. Our method asks a model to rank five review texts for an image; we then compare the model's ranking with human rankings to measure their correlation. This enables effective evaluation of review texts that have no single correct answer. We validate this approach with a benchmark dataset of images from 15 categories, each with five review texts and annotated rankings in English and Japanese, resulting in over 2,000 data instances. Our experiments show that LVLMs excel at distinguishing between high-quality and low-quality reviews. language of the presentation: Japanese
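The rank-correlation comparison described above can be illustrated with Spearman's rho, a standard choice for comparing two rankings (the abstract does not name the specific coefficient used, so this is an assumption). With five reviews per image and no ties, rho has the closed form 1 − 6·Σd²/(n(n²−1)):

```python
def spearman(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings of the same n items."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))

# hypothetical example: positions assigned to five review texts of one image
human = [1, 2, 3, 4, 5]
model = [1, 3, 2, 4, 5]   # the model swaps the 2nd and 3rd reviews
print(spearman(human, model))  # 1 - 6*2/(5*24) = 0.9
```

Averaging this score over all images gives a single agreement measure between the model and the human annotators, which is what makes ranking-based evaluation workable for review texts with no single correct answer.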
CAO ZHE | M, 2nd presentation | Natural Language Processing | 渡辺 太郎 | 池田 和司 | 上垣外 英剛 | ||
title: Exploring Intrinsic Language-specific Subspaces in Fine-tuning Multilingual Neural Machine Translation
abstract: Multilingual neural machine translation models support fine-tuning hundreds of languages simultaneously. However, fine-tuning all parameters alone is inefficient and can lead to negative interactions among languages. In this work, we demonstrate that fine-tuning for a language occurs in its intrinsic language-specific subspace, involving only a tiny fraction of the entire parameters. We therefore propose language-specific LoRA to isolate intrinsic language-specific subspaces. Furthermore, we propose architecture learning techniques and introduce a gradual pruning schedule during fine-tuning to exhaustively explore the optimal setting and the minimal intrinsic subspace for each language, resulting in a lightweight yet effective fine-tuning procedure. Experimental results on 12-language and 30-language subsets of FLORES-101 show that our methods not only outperform full-parameter fine-tuning by up to 2.25 spBLEU but also reduce trainable parameters to 0.4% for high- and medium-resource languages and 1.6% for low-resource ones. language of the presentation: English
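The core idea of language-specific LoRA can be sketched as a frozen base weight plus one low-rank adapter per language, where only the active language's adapter contributes to the effective weight. This is a minimal illustration under standard LoRA conventions (A random, B zero-initialized, scaling alpha/r with alpha = 1 assumed); the paper's architecture learning and gradual pruning schedule are not reproduced here:

```python
import random

random.seed(0)

def matmul(X, Y):
    """Naive matrix product for small list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

class LangSpecificLoRA:
    """Frozen base weight W plus a rank-r adapter (B @ A) per language."""
    def __init__(self, d_in, d_out, rank, languages):
        # shared base weight (d_out x d_in), frozen during fine-tuning
        self.W = [[random.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # per-language adapters: A (rank x d_in) random, B (d_out x rank) zero,
        # so every adapter starts as a zero delta (standard LoRA init)
        self.adapters = {
            lang: ([[random.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)],
                   [[0.0] * rank for _ in range(d_out)])
            for lang in languages
        }
        self.scale = 1.0 / rank  # alpha / r with alpha = 1 (an assumption)

    def effective_weight(self, lang):
        A, B = self.adapters[lang]
        delta = matmul(B, A)  # rank-limited update confined to lang's subspace
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.W, delta)]

layer = LangSpecificLoRA(d_in=4, d_out=4, rank=2, languages=["de", "ja"])
assert layer.effective_weight("de") == layer.W   # zero delta before fine-tuning
# "fine-tune" only the Japanese adapter; other languages stay untouched
_, B_ja = layer.adapters["ja"]
B_ja[0][0] = 0.5
assert layer.effective_weight("de") == layer.W
assert layer.effective_weight("ja") != layer.W
```

Each language trains only rank·(d_in + d_out) parameters against d_in·d_out for the full matrix, which is the mechanism behind the parameter reductions reported above; the pruning schedule would additionally shrink each adapter toward its minimal subspace.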