SUN HONGYU | D, interim presentation | Natural Language Processing | 渡辺 太郎, | 池田 和司, | 上垣外 英剛 | |
title: Enhancing the Extrapolation Capabilities of Language Models via Morlet Wavelet Positional Encoding
abstract: The extrapolation capability of language models is crucial for robust generalization across diverse context lengths, and multi-scale receptive fields are central to this ability. In this work, Morlet wavelet functions are integrated into the rotary position encoding framework to diversify receptive fields and improve the extrapolation performance of language models. Furthermore, to evaluate long-context understanding and reasoning capabilities, an automated pipeline has been developed to construct logically continuous instruction data, addressing both the scarcity of such data and the high annotation cost. This research therefore comprises two stages: (1) the creation of a dataset of logically continuous instructions and (2) the development of the Morlet positional encoding. language of the presentation: English | ||||||
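For reference, the two standard ingredients named in this abstract can be written as below. This is only a sketch of the textbook definitions of rotary position encoding and the Morlet wavelet. The abstract does not specify how the presented work combines them, and the symbols used here (position m, zero-based pair index i, model dimension d, center frequency \omega_0) are notation assumed for illustration.

```latex
% Rotary position encoding: the feature pair (x_{2i}, x_{2i+1}) of a token
% at position m is rotated by the angle m\theta_i.
\begin{pmatrix} x'_{2i} \\ x'_{2i+1} \end{pmatrix}
=
\begin{pmatrix}
  \cos m\theta_i & -\sin m\theta_i \\
  \sin m\theta_i & \cos m\theta_i
\end{pmatrix}
\begin{pmatrix} x_{2i} \\ x_{2i+1} \end{pmatrix},
\qquad
\theta_i = 10000^{-2i/d}, \quad i = 0, \dots, d/2 - 1

% Morlet wavelet (simplified form, adequate when \omega_0 is large enough
% that the zero-mean correction term is negligible):
\psi(t) = \pi^{-1/4} \, e^{i\omega_0 t} \, e^{-t^2/2}
```

The rotation uses a single fixed frequency per feature pair, whereas the Gaussian envelope of the Morlet wavelet localizes the oscillation around t = 0, which is one way to read the abstract's reference to multi-scale receptive fields.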
志子田 直輝 | M, second presentation | Natural Language Processing | 渡辺 太郎, | 池田 和司, | 上垣外 英剛 | |
title: Enhancing Vision-Language Structural Alignment via Syntactic Distance
abstract: This work investigates methods to strengthen structural alignment between vision and language in CLIP-like models by leveraging techniques from syntactic analysis and optimal transport. While large-scale vision-language models have shown impressive capabilities, they still face challenges in aligning complex spatial arrangements in images with semantic or grammatical structures in language. Our first proposal is to adapt Syntactic Distance, a technique originally used in natural language processing to represent hierarchical syntactic relationships, to both image and text representations. This enables the unsupervised extraction of structural features and their alignment during CLIP training, facilitating a more structure-aware representation learning process. language of the presentation: Japanese 発表題目: 構文距離による視覚と言語の整合性強化の検討 発表概要: 本研究では、視覚と言語のモダリティ間に存在する構造的な対応関係に着目し、それをCLIPの学習および評価に活用する手法を提案する。大規模画像言語モデルは多様なタスクで高精度を達成しているが、複雑な空間配置や文法構造を含む入力に対する整合性や解釈性には課題がある。自然言語処理で構文的な階層構造をとらえるために用いられてきた構文距離(Syntactic Distance)の手法を拡張し、画像とテキストの内部表現から教師なしで構造的特徴量を抽出・整合する学習方式である。これにより、CLIPが空間的・意味的な構造をより深く捉えることを目指した。 | ||||||
神野 倫行 | M, second presentation | Natural Language Processing | 渡辺 太郎, | 池田 和司, | 上垣外 英剛, | 坂井 優介 |
title: Cosine Similarity as Logits?: Few-shot Knowledge Graph Completion with Embedding Vectors of a Generative Pre-trained Language Model and its Application in Knowledge Probing
abstract: Text-based knowledge graph completion (KGC) models utilizing pre-trained language models (PLMs) have recently gained popularity but pose new challenges. Encoder-based methods are often incapable of in-context learning, while decoder-based methods generate tail entities autoregressively, resulting in slow inference. KGC is also used in knowledge probing to evaluate the factual knowledge retrieval capability of PLMs, but it faces similar limitations when evaluating generative PLMs. To address this, we propose DEcoder-Embedding based Relational KGC (DEER), an encoder-based KGC method that utilizes embedding vectors acquired from a generative PLM. DEER is capable of in-context learning while retaining the efficiency of encoder-based methods, enabling large-scale inference without task-specific training. We empirically show that DEER is well suited for predicting new relation types in KGs with few relation types and that it aligns with an existing knowledge probe, validating its use for knowledge probing. language of the presentation: English | ||||||
夏見 昂樹 | M, second presentation | Natural Language Processing | 渡辺 太郎, | 池田 和司, | 上垣外 英剛, | 坂井 優介 |
title: Probabilistic Minimum Bayes Risk Decoding with Robust Matrix Completion That Agrees with a Knowledge Distillation Model
abstract: In machine translation, minimum Bayes risk (MBR) decoding improves translation quality by scoring output candidates with an evaluation metric and using the results for decoding. However, MBR decoding is computationally expensive because it requires metric scores for every pair of a candidate sentence and a pseudo-reference sentence, where pseudo-references are substitute references generated by a machine translation model. Probabilistic MBR (PMBR) decoding speeds this up by computing only a subset of the scores between candidates and pseudo-references and approximating the scores of all candidate and pseudo-reference pairs through matrix completion. However, when the number of score computations is reduced to achieve a significant speed-up, the completed scores deviate substantially from the full score matrix produced by MBR decoding. To address this problem, this study uses not only a single evaluation metric model but also a knowledge distillation model, a lightweight version of that metric model. The original model is accurate but slow, whereas the distilled model is less accurate but fast; we therefore reduce the number of scores computed with the original model and make up for the reduction with scores from the distilled model. Matrix completion is then performed so that the two sets of scores agree with each other, and robust decoding is achieved.
language of the presentation: Japanese 発表題目: 知識蒸留モデルと合意をとる頑健な行列補完を用いた確率的最小ベイズリスク復号 発表概要: 機械翻訳タスクにおいて、最小ベイズリスク(minimum bayes risk; MBR)復号は出力候補間を評価指標を用いて評価し、この結果を復号に利用することで機械翻訳の品質向上を可能とする手法である。しかし、MBR復号は全候補文と機械翻訳モデルを用いて作成した擬似的な参照文である擬似参照文間に対して、評価指標によるスコアを算出する必要があり、高い計算コストを要する。MBR復号が低速である問題に対し、確率的最小ベイズリスク(Probabilistic MBR; PMBR)復号では、候補文と擬似参照文間の一部スコアから行列補完を行い、全候補・擬似参照文間のスコアを近似的に算出することで高速化した。しかし、PMBR復号では大幅な高速化のためスコアの算出回数を抑えた際、MBR復号が作成する全体のスコアと補完したスコアが著しく乖離する問題がある。これを改善するため本研究では、単一の評価指標モデルだけでなく、その評価指標モデルを軽量化した知識蒸留モデルの活用を試みる。元のモデルの高精度だが低速である特徴とその知識蒸留モデルの低精度だが高速の特徴に着目し、元のモデルでのスコアの計算量を少なく、知識蒸留モデルで削減した分の計算量を補う。そして互いに合意をとるように行列補完を行うことで頑健な復号を実現する。 | ||||||
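The cost that motivates this work can be illustrated with a minimal sketch of plain MBR decoding. It is not the presenter's implementation: the unigram F1 utility is a toy stand-in for a neural evaluation metric, and the names `unigram_f1` and `mbr_decode` are hypothetical. The full score matrix below needs one utility call per candidate and pseudo-reference pair, which is the part that PMBR decoding approximates by matrix completion (not shown here).

```python
from collections import Counter


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption): unigram F1 between whitespace-tokenized strings."""
    hyp_counts, ref_counts = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((hyp_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(candidates, pseudo_refs, utility=unigram_f1):
    """Return the candidate with the highest mean utility, plus the score matrix."""
    # Full matrix: len(candidates) * len(pseudo_refs) utility calls.
    scores = [[utility(c, r) for r in pseudo_refs] for c in candidates]
    expected = [sum(row) / len(row) for row in scores]
    best = max(range(len(candidates)), key=expected.__getitem__)
    return candidates[best], scores


# Illustrative samples; as is common in MBR decoding, the same sampled
# translations serve as both candidates and pseudo-references.
samples = ["the cat sat on the mat", "a cat sat on the mat", "dogs bark loudly"]
best, score_matrix = mbr_decode(samples, samples)
```

With N candidates and N pseudo-references the matrix has N × N entries, so halving the number of metric calls already requires estimating half of the entries from the rest.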
ALCANTARA TACORA SANDRO MANUEL | M, second presentation | Robot Learning | 松原 崇充, | 和田 隆広, | 柴田 一騎, | 鶴峯 義久, | 佐々木 光, | 角川 勇貴 |
title: Sample-Efficient Domain Randomized Reinforcement Learning through Scaling Law-Based Estimation of Required Samples
abstract: Training reinforcement learning agents in simulation for real-world deployment requires robust sim-to-real transfer. This is commonly addressed using Domain Randomization (DR), a technique that improves policy robustness by training across varied simulation parameters. Previous DR approaches expand domain parameters in a heuristic manner, but they ignore the sample efficiency of transferring knowledge to these new domains. This research aims to establish a quantitative method to evaluate sample efficiency when transferring to new domains. Our approach is to use a scaling law, which predicts transfer learning accuracy from the amount of training samples. We propose a method, Scaling Law-Based Estimation of Required Samples, to evaluate the learning efficiency of the next domain range. To evaluate our core ideas, we conduct a preliminary investigation into each domain's task difficulty and the transfer gap when a pre-trained policy is evaluated in different domains. Our analysis provides evidence that the domain transfer gap and its corresponding learning efficiency can be characterized before the transfer learning process. These findings lay the groundwork for future self-paced curriculum learning strategies that can select new domains based on a quantitative understanding of training efficiency and stability, rather than in a heuristic manner. language of the presentation: English | ||||||||
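The idea of estimating required samples from a scaling law can be sketched as follows. The abstract does not state the functional form, so this example assumes a simple power law, error = a * n^(-b), fitted by least squares in log-log space. The function names and the measurements are hypothetical and synthetic, not results from the presented work.

```python
import math


def fit_power_law(sample_counts, errors):
    """Fit log(error) = log(a) - b * log(n) by least squares; return (a, b)."""
    xs = [math.log(n) for n in sample_counts]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


def required_samples(a, b, target_error):
    """Invert error = a * n**(-b) to get the sample count for a target error."""
    return (a / target_error) ** (1.0 / b)


# Synthetic measurements generated from error = 2.0 * n**(-0.5) (assumption).
counts = [100, 400, 1600, 6400]
errs = [2.0 * n ** -0.5 for n in counts]
a, b = fit_power_law(counts, errs)
n_needed = required_samples(a, b, target_error=0.01)
```

Under this assumed form, a handful of small-sample training runs in a new domain would be enough to extrapolate how many samples that domain needs, which is the kind of quantity a curriculum could compare across candidate domains.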
金崎 知華 | M, second presentation | Interactive Media Design | 加藤 博一, | 和田 隆広, | 澤邊 太志, | Isidro Butaslac | ||
title: Designing Robot Behaviors with Consideration for Human Privacy and Evaluating User Impressions
abstract: This study aims to design robot behaviors that are considerate of users' privacy and to evaluate users' impressions of such behaviors. In recent years, robots that interact with people in homes and public spaces have become increasingly common, leading to more frequent encounters with users' sensitive or private information. For instance, the household robot Pepper can engage in conversations about users' health conditions and emotions, while the delivery robot BellaBot interacts with customers in public environments. In such contexts, the information handled by robots may include content that users prefer not to disclose to others. However, current discussions on privacy protection are largely limited to technical solutions such as encryption, data deletion, and transmission control, and do not sufficiently address the robot's actual behavior, particularly its non-verbal aspects. Therefore, this study investigates what kinds of robot behaviors are appropriate in situations requiring privacy consideration, how these behaviors should be adjusted depending on the context, and how they influence users' impressions, including trust, sense of security, and likability. As a first step, we implemented non-verbal privacy-considerate behaviors in a robot, such as covering its mouth with its hand, scanning the surroundings before speaking, and lowering its voice, and conducted a preliminary evaluation through user impression assessments. In the future, we plan to conduct stepwise dialogue experiments between humans and robots to further explore these issues.
language of the presentation: Japanese 発表題目: プライバシーに配慮したロボットの行動設計と印象評価 発表概要: この研究は、プライバシーに配慮したロボットの行動設計と、その印象評価を目的とする。近年、家庭内や公共空間で人と関わるロボットが増加し、ユーザの秘密情報に触れる機会が多くなってきた。例えば、家庭用ロボットPepperは体調や感情の相談に応じることができ、配膳ロボットBellaBotは公共空間で顧客と接する。このような状況で、ロボットが扱う情報には個人が他者に知られたくないプライバシー情報が含まれる可能性がある。 しかし、現状のプライバシー保護の議論は暗号化や情報削除、データ送信の制御など技術的処理に偏っており、ロボットの振る舞い方、特に非言語的な側面に関しては十分に議論されていない。そこで本研究では、プライバシー配慮が求められる状況でロボットがどのような振る舞いを取るべきか、状況に応じてどのように振る舞いを調整すべきか、そしてそれがユーザの信頼感や安心感、好感度といった印象にどのような影響を与えるかを調査する。 その第一歩として本研究では、手で口元を覆う、発話前に周囲を見渡す、声量を下げるといった非言語的プライバシー配慮行動をロボットに実装し、ユーザの印象評価を通じた予備的な検証を行った。今後は、ロボットと人の対話実験を段階的に実施し、さらなる知見を得ていく予定である。 | ||||||||
CHENG YUAN HAU | M, second presentation | Interactive Media Design | 加藤 博一, | 和田 隆広, | 澤邊 太志, | Isidro Butaslac | ||
title: Coins on the Road: Enhancing Control and Comfort through Motion-Synchronized VR Gaming in Autonomous Vehicles
abstract: With the advancement of autonomous driving, passengers are freed from driving tasks, opening up new possibilities for in-car productivity and entertainment. Virtual Reality (VR) stands out as a key technology to create immersive passenger experiences, breaking the physical constraints of the vehicle. However, this presents significant challenges, primarily motion sickness, which arises from the sensory conflict between visual inputs in the virtual world and the physical motions of the car. This research proposes a novel solution: a VR coin-collection game where the game's path is synchronized with the vehicle's real-world movements. The objective is to mitigate motion sickness and enhance the user's sense of control and immersion by aligning the passenger's visual perception with the vehicle's actual motion. A pre-experiment was conducted to evaluate the effects of visual cues (coins) and camera perspective (first-person vs. third-person). The results indicate that visual cues significantly improve path prediction and enhance the sense of presence, while perspective changes are only effective when such cues are available. This study suggests that motion-synchronized gaming is a promising approach to improving the in-car VR experience. language of the presentation: English | ||||||||
太田 裕紀 | D, interim presentation | Cybernetics and Reality Engineering | 清川 清, | 和田 隆広, | 内山 英昭, | Perusquia Hernandez Monica, | 平尾 悠太朗 | |
title: Rendering Diverse Tactile Sensations through Fingertip Contact Plane Tilt Manipulation
abstract: Achieving effective haptic feedback with ungrounded devices is a significant challenge due to constraints in volume and power output. In this study, we developed a handheld haptic device that renders shape, stiffness, and inertia primarily by manipulating the inclination of the contact plane at each fingertip. This approach eliminates the need to control fingertip position, which is a major bottleneck for device volume and output. Experiments confirmed that the developed device is capable of presenting multiple types of shapes and levels of stiffness. language of the presentation: Japanese 発表題目: 各指先接平面の傾き操作による多彩な触感提示 発表概要: 非接地型の触覚提示装置によって効果的な触覚提示を実現することは,限られた容積や出力の問題から困難な課題である.本研究では,デバイス容積と出力のボトルネックである指先位置を操作する機能を廃し,主として各指先接平面の傾きを操作することで形状・剛性・慣性を提示するハンドヘルド型触覚提示装置を開発した.実験では,開発した触覚提示装置が複数種類の形状,剛性を提示可能であることが確認された. | ||||||||