Colloquium B Presentations

Date: Friday, December 13, 3rd period (13:30-15:00)

Venue: L2

Chair: 柏祐太郎
曽田 涼介 M, 1st presentation, Robot Learning 松原 崇充 和田 隆広 柴田 一騎 鶴峯 義久 佐々木 光
title: Cooperative Grasping and Transportation of Diverse Objects with Multi-Agent Reinforcement Learning
abstract: This study explores multi-agent learning techniques that enable multiple robots to efficiently transport objects of diverse shapes. Existing studies simultaneously learn optimal grasping and control strategies for individual objects through trial and error. However, these methods often exhibit poor sample efficiency, leading to low transport performance when applied to diverse object shapes. The aim of this colloquium is to clarify the limitations of existing techniques through simulations of cooperative transportation of objects with diverse shapes. Specifically, we compare models trained on multiple object shapes with models trained on individual shapes, and analyze the impact on transportation performance. By evaluating grasping, lifting, and transportation success rates, we identify key strategies for shape generalization and propose directions for improvement.
language of the presentation: Japanese
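The evaluation described in the abstract tallies stage-wise success rates (grasp → lift → transport). A minimal sketch of such a tally is below; the `Trial` record type and the specific logged outcomes are hypothetical stand-ins, not the presenter's actual data:

```python
from dataclasses import dataclass

@dataclass
class Trial:
    grasped: bool
    lifted: bool
    transported: bool

def success_rates(trials):
    """Fraction of trials that reached each stage of the transport pipeline."""
    n = len(trials)
    return {
        "grasp": sum(t.grasped for t in trials) / n,
        "lift": sum(t.lifted for t in trials) / n,
        "transport": sum(t.transported for t in trials) / n,
    }

# Hypothetical outcomes for a policy evaluated on one object shape
multi_shape = [
    Trial(True, True, True),
    Trial(True, True, False),
    Trial(True, False, False),
    Trial(False, False, False),
]
print(success_rates(multi_shape))  # {'grasp': 0.75, 'lift': 0.5, 'transport': 0.25}
```

Comparing these per-stage rates between a model trained on many shapes and one trained per shape shows at which stage generalization breaks down.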
 
服部 舜 M, 1st presentation, Robot Learning 松原 崇充 和田 隆広 柴田 一騎 鶴峯 義久 佐々木 光
title: A Study on Harvesting Automation via Imitation Learning
abstract: This research aims to automate the harvesting of vegetables of various shapes and colors. Conventionally, the harvesting task, which consists of several steps, has relied on the experience and intuitive judgment of an expert worker. It is difficult to implement such an ambiguous, human-specific decision-making process into a robot using IF-THEN rules. In this study, a framework called imitation learning, which enables the robot to learn its policy directly from the expert’s data, is employed. In order to verify “the capability to learn patterns from diverse skilled workers’ trajectories and make consistent decisions,” which is one of the requirements for the automation of harvesting, an experimental environment was created using a pseudo-task with multi-modal behaviors. Then, an imitation learning algorithm capable of handling multi-modal actions, “Diffusion Policy,” was applied to this task, and its performance was evaluated to determine whether the learned policy satisfied the requirement.
language of the presentation: Japanese
発表題目: 模倣学習による作物の収穫自動化に関する検討
発表概要: 本研究は,形状や色彩が多様な作物の収穫作業の自動化を目指す.従来,複数の工程を経て進行する収穫作業は熟練者の経験や直感的判断に依存してきた.このような曖昧で人間特有の意思決定プロセスをIF-THENルールでロボットに組み込むことは困難である.そこで本研究では,熟練者の実作業データからロボットの方策を直接学習する枠組みである模倣学習を用いる.また,収穫の自動化に求められる要件の一つである,「多様な熟練者の教示データからパターンを学習し,一貫した意思決定を行う能力」を検証するために,行動の多峰性を有する疑似的なタスクを設定し,実機の実験環境を構築した.その上で行動の多峰性に対応可能な模倣学習手法Diffusion Policyを適用し,得られた方策が要件を満たすかどうか,その性能を検証した.
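The multi-modality issue motivating Diffusion Policy can be seen in a toy example: with mean-squared-error behavior cloning, demonstrations split between two valid actions are averaged into an action that matches neither mode, whereas a generative policy samples from the demonstrated distribution. A minimal sketch (the 1-D actions are hypothetical, not harvesting data):

```python
import numpy as np

# Hypothetical 1-D expert actions at the same state: half steer left, half steer right
actions = np.array([-1.0, -1.0, 1.0, 1.0])

# An MSE-trained regressor converges to the mean, which matches neither mode
mse_optimal = float(actions.mean())
print(mse_optimal)  # 0.0 -- an action no expert ever demonstrated

# A generative policy (e.g. Diffusion Policy) instead samples from the modeled
# action distribution; mimicked here by sampling the empirical demonstrations
sampled = float(np.random.default_rng(0).choice(actions))
print(sampled in (-1.0, 1.0))  # True -- always one of the demonstrated modes
```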
 
林 航平 M, 1st presentation, Robot Learning 松原 崇充 和田 隆広 柴田 一騎 鶴峯 義久 佐々木 光
title: Reinforcement Learning for Process-Specific Object Selection Based on an Image Segmentation Foundation Model
abstract: In both household tasks and professional work, humans operate within environments surrounded by multiple diverse objects. These environments are complex, as they include many objects unrelated to the current task, making robotic automation particularly challenging. In contrast, humans are thought to simplify their tasks by selectively shifting their focus to different objects as needed, thereby reducing the cognitive load of recognition and decision-making. In this study, we propose a method that learns which objects to attend to through trial and error by evaluating the effectiveness of object selection during task execution. Our proposed approach employs a segmentation foundation model to identify candidate objects of interest, and then refines these candidates using task and process information. This strategy enhances the efficiency of learning how to select the most relevant objects.
language of the presentation: Japanese
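The trial-and-error object selection described above can be sketched as a simple bandit over segmentation candidates: each candidate proposed by the foundation model is tried, and a running value estimate learns which one advances the task. The candidate count, reward function, and hyperparameters below are hypothetical stand-ins, not the presenter's method:

```python
import random

def learn_object_selection(n_candidates, reward_fn, steps=300, eps=0.1, seed=0):
    """Learn which candidate object to attend to from trial-and-error rewards."""
    rng = random.Random(seed)
    values = [0.0] * n_candidates   # running mean reward per candidate
    counts = [0] * n_candidates

    def update(a):
        r = reward_fn(a)
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]

    for a in range(n_candidates):   # try each candidate once first
        update(a)
    for _ in range(steps):          # then mostly exploit, occasionally explore
        if rng.random() < eps:
            a = rng.randrange(n_candidates)
        else:
            a = max(range(n_candidates), key=lambda i: values[i])
        update(a)
    return values

# Hypothetical: among 4 segmentation candidates, only object 2 advances the task
values = learn_object_selection(4, lambda a: 1.0 if a == 2 else 0.0)
print(max(range(4), key=lambda i: values[i]))  # 2
```

Filtering the candidate set with task and process information, as the abstract proposes, shrinks `n_candidates` and hence the number of trials needed.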
 
ALCANTARA TACORA SANDRO MANUEL M, 1st presentation, Robot Learning 松原 崇充 和田 隆広 柴田 一騎 鶴峯 義久 佐々木 光 角川 勇貴
title: Sample-Efficient Sim-to-Real Domain Randomization Reinforcement Learning by Scaling Law
abstract: The Sim-to-Real learning framework consists of learning a control policy (a neural-network function) in simulation and then transferring the policy to a real robot to achieve a target task. However, the learned policy often fails in the real environment because of the inaccuracy of the simulation. To overcome this issue, the domain randomization technique learns a control policy over wide ranges of physical simulation parameters. Previous works have focused on optimizing these domain ranges by conducting extensive simulation training with only a few real samples. In this research, we propose to use a scaling law, a curve-fitting equation observed in supervised learning for predicting the performance of neural networks. Our proposed approach incorporates the scaling law into the domain-range optimization stage. This technique evaluates policies trained with fewer simulation steps, which improves time efficiency and reduces training cost.
language of the presentation: English
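The scaling-law idea in the abstract amounts to a power-law fit in log-log space: measure performance at a few small simulation budgets, fit the curve, and extrapolate to a large budget without paying for the full training run. A minimal sketch, with entirely hypothetical data points:

```python
import numpy as np

# Hypothetical: task failure rate measured at a few cheap simulation-step budgets
steps = np.array([1e4, 3e4, 1e5, 3e5])
failure = np.array([0.80, 0.55, 0.32, 0.22])

# Scaling law: failure ~ a * steps**(-b), i.e. a straight line in log-log space
slope, log_a = np.polyfit(np.log(steps), np.log(failure), 1)

# Extrapolate to a 1e6-step budget without actually training that long
predicted = float(np.exp(log_a) * (1e6) ** slope)
print(0.0 < predicted < float(failure.min()))  # True: fit predicts further improvement
```

Ranking candidate domain ranges by such extrapolated performance, rather than by fully trained policies, is what makes the evaluation cheaper.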