Colloquium B Presentations

Date and time: Thursday, September 16, 2nd period (11:00-12:30)

Venue: L1

Chair: 品川政太朗

福本　晃汰	M, 2nd presentation	Intelligent System Control	杉本　謙二,	池田　和司,	小林　泰介
title: Sampling-Based Model Predictive Control Focusing on the Asymmetry of Kullback-Leibler Divergence
abstract: Model predictive control (MPC) is an effective control method for complex systems such as automated driving and robotics. Among MPC solvers, the cross-entropy method (CEM) is well known as the most flexible and general method. Although CEM can be applied to most systems, it requires a sufficient (theoretically infinite) number of samples and updates to converge, which results in extremely high computational cost. We therefore focus on the asymmetry of the Kullback-Leibler divergence used in the minimization problem of CEM and, by redefining that minimization problem, propose a new CEM algorithm called risk-aversion CEM (RA-CEM). RA-CEM allows the function that can be regarded as a weight on each sampled trajectory to take negative values, so that even with a small number of iterations the algorithm actively avoids trajectories with poor performance and prioritizes convergence to trajectories with good performance. In a highway driving simulation, RA-CEM achieved a higher success rate than standard CEM.
language of the presentation: Japanese
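For readers unfamiliar with the baseline, the following is a minimal sketch of the standard CEM loop that the abstract takes as its starting point. It is not RA-CEM: the negative weighting described in the abstract is not reproduced here. The function name `cem_minimize`, the toy quadratic cost, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def cem_minimize(cost, mean, std, n_samples=200, n_elite=20, n_iters=30, seed=0):
    """Standard cross-entropy method: repeatedly refit a Gaussian to the lowest-cost samples."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    for _ in range(n_iters):
        # Sample candidate solutions (e.g. control sequences) from the current Gaussian.
        samples = rng.normal(mean, std, size=(n_samples, mean.size))
        costs = np.array([cost(s) for s in samples])
        # Elite set: the best samples get weight 1, all others weight 0.
        elite = samples[np.argsort(costs)[:n_elite]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor keeps the distribution from collapsing
    return mean

# Hypothetical toy problem: find the vector closest to `target`.
target = np.array([1.0, -2.0, 0.5])
best = cem_minimize(lambda u: float(np.sum((u - target) ** 2)),
                    mean=np.zeros(3), std=2.0 * np.ones(3))
```

In this baseline every sample outside the elite set is simply ignored, which is the point the abstract contrasts with: RA-CEM is described as also using poor trajectories, through negative weights, to steer the distribution away from them.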

北村　俊徳	M, 2nd presentation	Intelligent System Control	杉本　謙二,	池田　和司,	松原　崇充
title: Cautious Actor-Critic: Stable off-policy deep reinforcement learning for continuous control
abstract: While recent off-policy actor-critic (AC) methods have demonstrated superior sample efficiency and performance in many challenging continuous control tasks, they often suffer from significant performance oscillation during learning due to the persistent errors induced by off-policy learning. In this paper, we propose a novel off-policy AC algorithm, cautious actor-critic (CAC), which achieves stable learning while maintaining the sample efficiency and performance of off-policy AC methods. The name "cautious" comes from the doubly conservative nature of the algorithm: a conservative actor that linearly interpolates two consecutive policies, and a conservative critic that prevents large changes between the consecutive policies. We compare CAC with state-of-the-art AC methods on a set of challenging continuous control problems and demonstrate that CAC achieves comparable performance while significantly stabilizing learning.
language of the presentation: Japanese
presentation title (translated from Japanese): Proposal of a Cautiously Learning Off-Policy Actor-Critic Method
presentation abstract (translated from Japanese): Recent off-policy actor-critic methods have shown excellent sample efficiency on a variety of continuous control tasks, but monotonic policy improvement cannot be guaranteed and learning becomes unstable. In this study, we propose Cautious Actor-Critic (CAC), a method that achieves more stable learning while retaining the sample efficiency of off-policy methods. CAC updates its policy more cautiously than conventional methods by combining an actor that is updated conservatively through a linear mixture of the current policy and a reference policy designed in advance, and a critic made conservative so that the mixed policies are brought closer together. In the evaluation experiments, we assessed its effectiveness on continuous control benchmark tasks with high-dimensional inputs. The results confirm that CAC can learn more stably while retaining sample efficiency comparable to that of the off-policy method Soft Actor-Critic (SAC).

西本　宏樹	D, interim presentation	Computing Architecture	中島　康彦,	池田　和司,	TRAN THI HONG,	張　任遠
title: ① GPGPU Implementation of Variational Gaussian Mixture Models ② Implementation of Resampling in Sequential Monte Carlo with Fixed-Point Operations
abstract:
① In this work, an efficient implementation strategy for speeding up high-quality clustering algorithms is developed on general-purpose graphics processing units (GPGPUs). Among various clustering algorithms, a sophisticated Gaussian mixture model (GMM) whose parameters are estimated through the variational Bayesian (VB) mechanism is adopted for its superior performance. Since the VB-GMM methodology is computation-hungry, the GPGPU is employed to carry out the massive matrix computations. To efficiently migrate the conventional CPU-oriented VB-GMM schemes onto GPGPU platforms, an entire migration flow with thirteen stages is presented in detail. A CPU-GPGPU co-operation scheme, execution reordering, and memory access optimization are proposed to optimize GPGPU utilization and maximize clustering speed. Five types of real-world applications, along with their data sets, are introduced for cross-validation. The experimental results verify the feasibility of implementing the VB-GMM algorithm on a GPGPU and show practical benefits. The proposed GPGPU migration achieves a maximum speedup of 192x. Furthermore, it succeeded in identifying the proper number of clusters, which is hardly ever done with the EM algorithm.
② This work presents a fixed-point implementation of the resampling algorithm in sequential Monte Carlo (SMC) and its evaluation. SMC is a powerful numerical algorithm that is widely applied to problems such as Bayesian inference and Kalman filtering. However, SMC is difficult to apply to huge data sets because of its heavy computational load. Previous works have already reported how to speed up SMC with GPGPUs and FPGAs. In this study, we implement the resampling algorithm, which is the major bottleneck of SMC, with fixed-point operations. We conclude that our method can increase execution speed and reduce the amount of FPGA hardware required. The experimental results validate the feasibility of the fixed-point resampling algorithm and show that it has practical benefits. The proposed fixed-point implementation uses 91% fewer LUTs and 85% fewer BRAMs than the conventional floating-point implementation, and is 1.99 times faster when the number of particles in SMC is 2^20.
language of the presentation: Japanese
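To illustrate what resampling with fixed-point operations can look like, here is a minimal sketch of systematic resampling carried out entirely in integer arithmetic. The abstract does not say which resampling scheme or number format the author used, so the choice of systematic resampling, the 16 fractional bits, and the helper names `to_fixed` and `systematic_resample_fixed` are all assumptions made for this example.

```python
def to_fixed(weights, frac_bits=16):
    """Quantize non-negative float weights to unsigned fixed-point integers."""
    scale = 1 << frac_bits
    return [int(round(w * scale)) for w in weights]

def systematic_resample_fixed(fixed_weights, offset):
    """Return ancestor indices using integer comparisons only.

    `offset` is an integer in [0, sum(fixed_weights)); in a real filter it
    would be drawn at random, here it is passed in to keep the example
    deterministic. Both sides of each comparison are scaled by n so that
    no division is needed.
    """
    n = len(fixed_weights)
    total = sum(fixed_weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    if not 0 <= offset < total:
        raise ValueError("offset must satisfy 0 <= offset < sum(weights)")
    indices = []
    j = 0
    cumulative = fixed_weights[0] * n  # running weight sum, scaled by n
    for i in range(n):
        threshold = offset + i * total  # i-th evenly spaced threshold, scaled by n
        # Advance to the first particle whose cumulative weight exceeds the threshold.
        while cumulative <= threshold:
            j += 1
            cumulative += fixed_weights[j] * n
        indices.append(j)
    return indices

fixed = to_fixed([0.1, 0.2, 0.3, 0.4])
ancestors = systematic_resample_fixed(fixed, offset=sum(fixed) // 2)
print(ancestors)  # [1, 2, 3, 3]
```

Because every operation is an integer addition, multiplication, or comparison, a loop of this shape needs no floating-point units, which is consistent with the kind of hardware saving the abstract reports, although the actual FPGA design may differ substantially.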

Venue: L2

Chair: 花田研太

勘場　大	D, interim presentation	Social Computing	荒牧　英治,	佐藤　嘉伸,	若宮　翔子
title: Natural Language Processing based Worries (Medical Needs) Extraction
presentation title (translated from Japanese): Extraction of People's Worries (Medical Needs and Social Issues) Using Natural Language Processing
abstract: People may have worries when they become ill. Because these worries sometimes contain medical needs and social issues, collecting them and processing them statistically should help organize those needs and issues. For example, if cancer patients worry that they cannot find information on side effects, then building a database of side-effect information and disseminating that information in a form anyone can understand become medical needs, and the result would be useful to cancer patients in treatment selection. In this study, we collect questions posted by breast cancer patients on Yahoo! Chiebukuro and extract their worries using natural language processing. We also introduce an application example based on those worries. In recent years, the Covid-19 epidemic has caused many difficulties, such as restrictions on behavior. If social issues can be identified and organized from these difficulties, they will be helpful information in the "With Corona" era. For Covid-19, we clarify what was restricted by counting the words that co-occur in Twitter posts containing the phrase "Corona's fault" (「コロナのせい」).
language of the presentation: Japanese

由井　朋子	D, interim presentation	Human Robotics	和田　隆広,	佐藤　嘉伸,	趙　崇貴,	佐藤　勇起
title: Development of a hand scaling simulator for dental hygienists
abstract: Hand scaling is one of the most important tasks of a dental hygienist. As a preventive measure against diseases such as periodontal disease, it is performed frequently in clinical practice, and dental hygienist training schools accordingly spend a lot of time teaching it. However, the technique can only be mastered through repeated training. To make the training more efficient, it is necessary to develop a simulator that can quantify the skills of dental hygienist students and provide real-time feedback for improving them. In this research, we fabricated a device that measures the force applied to the tooth surface with a force sensor and the motion of the hand-held instrument with an IMU sensor. We believe that the contact state between the tooth surface and the instrument is related to efficient and safe manipulation during hand scaling. We are therefore investigating whether differences in the contact state between the tooth surface and the instrument can be determined from the measured values. In the experiment, we collected data for two patterns in which only the contact state between the instrument and the tooth surface was changed, without changing the motion. In this presentation, we report whether such differences can be discriminated in the unstable hand-scaling motions of beginners, which are the data we ultimately want to classify.
language of the presentation: Japanese