Abstracts of Doctor Thesis 2007

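In recent years, intensifying international competition and growing awareness of environmental and resource problems have led the chemical industry to demand more stable and more efficient plant operation. Process control is likewise expected to handle not only conventional steady-state operation, in which operating conditions hardly change, but also stable operation under frequent changes in operating conditions and the automation of emergency response procedures. However, chemical processes generally have complex dynamics, including nonlinearity, so a conventional linear control system with fixed parameters, applied on its own, often cannot cope with operating conditions that change frequently over a wide range. This presentation therefore proposes a method of coordinating multiple controllers as a control system design technique that can handle changes in operating conditions over a wide range.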

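First, we report the results of a survey of previous research on the control of chemical processes whose operating conditions change over a wide range. We begin by explaining the multiple model approach as one way of modeling processes whose operating conditions change frequently. The multiple model approach builds a model covering a wide operating range by combining several local models, each valid in a local operating region, and we consider it an effective way of modeling chemical processes, which generally have complex dynamics. We then introduce two control system design methods that draw on the idea of the multiple model approach. The first applies the multiple model approach directly to build a global model and designs a control system for that global model. The second builds a controller for each of the local models and combines these controllers to perform control. Comparing the two, we concluded that the latter, which combines controllers, allows more flexible control system design and is better suited to controlling chemical processes whose operating conditions vary over a wide range.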

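We therefore propose two new control schemes based on the idea of combining multiple controllers. In the first scheme, an appropriate model-based controller is designed for each of several local models, and the controllers are arranged in parallel and suitably combined. We apply this method to temperature control of a chemical reactor whose operating conditions change drastically and show that it achieves better control performance than any local controller used alone. We also describe how the method outperforms previously proposed methods that combine multiple controllers, and how its performance is comparable to that of a control system using a model that represents the controlled process perfectly. In the second scheme, for the case where only one model is available, several controllers with different design parameters are designed for that single model, arranged in parallel, and suitably combined. We apply this control system to a temperature control problem for a chemical reactor whose set-point temperature changes greatly and show that it gives better control results than using a single controller.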
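
The abstract does not say how the parallel controllers are combined. As a minimal sketch of one common possibility, the Python below blends the outputs of local proportional controllers with normalized Gaussian validity weights computed from the current operating point. The weighting function, the proportional control law, and all names and numbers are assumptions made for illustration; they are not taken from the thesis.

```python
import math


def validity_weights(operating_point, centers, width):
    """Normalized Gaussian weights: one per local controller, summing to 1."""
    raw = [math.exp(-((operating_point - c) / width) ** 2) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def blended_control(error, operating_point, gains, centers, width):
    """Weighted sum of the outputs of local proportional controllers.

    Each local controller i outputs gains[i] * error and is trusted most
    when the operating point is close to centers[i].
    """
    weights = validity_weights(operating_point, centers, width)
    return sum(w * k * error for w, k in zip(weights, gains))


if __name__ == "__main__":
    # Hypothetical reactor: two local controllers tuned around 300 K and 350 K.
    for temperature in (300.0, 325.0, 350.0):
        u = blended_control(2.0, temperature, [1.0, 4.0], [300.0, 350.0], 10.0)
        print(temperature, round(u, 3))
```

Near the center of a local region the blended output is essentially that of the corresponding local controller, and between regions it moves smoothly from one to the other.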

0561201 浦西友樹
「All-Around Shape Measurement of Objects Using a Cylindrical Mirror」

This presentation proposes a method for measuring the all-around shape of an object with a simple system configuration.

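All-around three-dimensional models that can be viewed from any viewpoint are used in academic applications, such as virtual museums and the digital preservation and browsing of archaeological artifacts, and in industrial applications, such as computer-based prototyping of industrial products. Beyond these academic and industrial uses, such models are also expected to be used by end users: for example, as samples shown when listing items on Internet auctions, or, if the movements of animals can be recorded and reproduced, in illustrated guides to living things. For three-dimensional shape measurement by end users, and the use of the reconstructed models, to become widespread, an all-around shape measurement system with a simple configuration and measurement procedure is needed.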

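Stereo vision is widely used as an all-around shape measurement method that can also be applied to moving objects. Given multi-view images of an object taken from several viewpoints, it searches for corresponding points, that is, image points considered to show the same point on the object, and measures the object's three-dimensional shape by triangulation. However, to measure the all-around shape of an object in one shot with stereo vision, cameras must be placed so that every point on the object's surface is visible from more than one camera, which increases the number of cameras. To solve this problem, measurement systems called catadioptric stereo have been proposed, which replace cameras with optical devices such as mirrors and prisms and realize stereo vision by capturing multi-view images with a single camera combined with those devices. With conventional combinations of optical devices, however, all-around shape measurement unavoidably requires a system containing many optical devices and complicated preprocessing.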

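This research proposes an all-around shape measurement system using a cylindrical mirror, which has a simple configuration and measurement procedure and can also measure the all-around shape of moving objects. In the proposed system, the inside of the cylinder is mirror-coated, and the camera is mounted facing downward above the cylindrical mirror so that its optical axis coincides with the central axis of the cylinder. The camera is fitted with a fisheye lens. The object to be measured is placed inside the cylindrical mirror and photographed by the camera. The captured image contains both the image observed directly by the camera and images that reach the camera after reflection inside the cylindrical mirror. This is equivalent to observing any single point on the object simultaneously from multiple viewpoints with a real camera and virtual cameras, so the all-around shape of the object can be measured non-invasively with passive methods such as stereo vision. The proposed system also assumes that the camera is mounted so that its optical axis coincides with the central axis of the cylinder and that the optical axis passes through the center of the captured image. Consequently, the light entering the camera is limited to light that passes through the central axis of the cylinder, and such light is reflected between two points on the cylindrical mirror surface whose tangents are parallel. Hence the set of corresponding points formed by rays originating from the same point on the object always lies on a single straight line through the image center in the captured image. In the correspondence search required for stereo vision, the proposed system can therefore restrict the search range for corresponding points to a line, which is expected to reduce both the computational cost and the number of false matches.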
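
The line constraint described above can be sketched as a simple candidate filter: a candidate is kept only if it lies close to the straight line through the image center and the query point. The function names, the pixel tolerance, and the coordinates are hypothetical and serve only to illustrate the constraint; the thesis's actual correspondence search is not described in the abstract.

```python
import math


def radial_line_distance(point, candidate, center):
    """Perpendicular distance (pixels) from `candidate` to the line through
    `center` and `point`."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    cx, cy = candidate[0] - center[0], candidate[1] - center[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("query point coincides with the image center")
    # |cross product| / |direction| is the point-to-line distance.
    return abs(dx * cy - dy * cx) / norm


def candidates_on_radial_line(point, candidates, center, tol=1.0):
    """Keep only candidates within `tol` pixels of the line through the
    image center and the query point (both sides of the center)."""
    return [c for c in candidates
            if radial_line_distance(point, c, center) <= tol]


if __name__ == "__main__":
    center = (100.0, 100.0)
    query = (120.0, 100.0)
    found = candidates_on_radial_line(
        query, [(160.0, 100.0), (40.0, 100.0), (120.0, 130.0)], center)
    print(found)
```

Candidates on either side of the image center are kept, since the abstract states only that corresponding points share a line through the center.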

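This presentation surveys methods related to three-dimensional shape and describes the proposed system. It then reports all-around shape measurement experiments with simulated images and with a prototype system, showing that the proposed system can measure the all-around shape of an object.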

0561010 柿元健
「Evaluation of Prediction Methods in Software Development Management from the Viewpoint of Practical Use」

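Although many prediction methods for development effort and reliability have been proposed to support software development management, most of them are not used in development practice. This presentation evaluates prediction methods from the viewpoint of practical use in order to clarify the criteria for applying them and the benefits they provide, and describes ways to promote their adoption in the field.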

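First, to clarify the criteria that a dataset must meet for effort prediction, we evaluate two effort prediction methods, similarity-based effort prediction and stepwise multiple regression analysis, with respect to the missing values, the number of projects, and the number of metrics in the dataset. For missing values, we assume three mechanisms by which missing values arise, create many datasets with varying missing rates for each mechanism, and evaluate the methods experimentally by predicting effort with each dataset. The evaluation showed that similarity-based effort prediction is more robust than stepwise multiple regression analysis; that is, regardless of the missingness mechanism, its prediction accuracy did not drop greatly as the missing rate increased. For the number of projects and the number of metrics, we likewise create datasets in which each is varied and evaluate the methods experimentally by predicting effort with each dataset. The evaluation showed that, with 50 or more projects, similarity-based effort prediction is more accurate than stepwise multiple regression analysis, and that prediction accuracy improved as the number of metrics increased.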

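Next, for fault-prone module prediction, one approach to reliability prediction, we describe a model of the relationship among the presence or absence of faults in each module, the test effort allocated to it, and software reliability (the fault detection rate), built to clarify the benefit obtained from prediction. A model that gives the fault detection rate as a function of the test effort spent (the fault detection rate model) is constructed with reference to the exponential software reliability growth model (SRGM: Software Reliability Growth Model). Simulations with this model over the F1 value (one of the evaluation measures), the proportion of faulty modules, and the proportion of modules classified as fault-prone showed that the reliability of the whole project (the fault detection rate) is determined by these three quantities.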
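
The abstract does not give the form of the fault detection rate model. As a sketch under the assumption that it follows the usual exponential shape of an exponential SRGM, the Python below uses 1 - exp(-b * effort) as the probability that a fault in a module is found after a given test effort. The parameter `detectability` (b), the averaging over faulty modules, and the example numbers are all hypothetical.

```python
import math


def detection_probability(test_effort, detectability):
    """Assumed exponential form: probability that a fault in a module is
    found after `test_effort` units of testing."""
    if test_effort < 0 or detectability < 0:
        raise ValueError("effort and detectability must be non-negative")
    return 1.0 - math.exp(-detectability * test_effort)


def project_detection_rate(modules, detectability):
    """Expected fraction of faulty modules whose fault is found.

    `modules` is an iterable of (has_fault, test_effort) pairs.
    """
    efforts = [effort for has_fault, effort in modules if has_fault]
    if not efforts:
        raise ValueError("no faulty modules in the project")
    found = sum(detection_probability(e, detectability) for e in efforts)
    return found / len(efforts)


if __name__ == "__main__":
    # Same total effort (40 units), spread evenly or focused on faulty modules.
    even = [(True, 10.0), (False, 10.0), (True, 10.0), (False, 10.0)]
    focused = [(True, 20.0), (False, 0.0), (True, 20.0), (False, 0.0)]
    print(project_detection_rate(even, 0.1))
    print(project_detection_rate(focused, 0.1))
```

Under this assumed form, moving effort from fault-free modules to faulty ones raises the project-level detection rate, which is the kind of benefit a fault-prone classifier is meant to deliver.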

0461201 小野寺　博和
「Development of a Leukocyte Removal Column and a Flow Analysis System for Leukocyte Removal Columns」

0561210 Basel Alali
「Multi-input Multi-output Feedback Error Learning Control: Theory and Applications」

Recently, much attention has been paid to Feedback Error Learning (FEL) control, which greatly improves the tracking performance of a system by means of on-line learning, without a mathematical model of the object to be controlled (the plant). A remarkable feature of this scheme is that it uses a feedforward controller that is adjusted by a learning law driven by the feedback error signal. In this presentation, we show how to generalize and apply FEL to Multi-input Multi-output (MIMO) systems from the perspective of linear control theory.

First, learning control structures for MIMO systems using FEL are studied. By using a linear system parameterization as the function approximator of the feedforward controller, we derive a learning law to adjust the parameters of the inverse model of the plant. A theoretical treatment of how to generalize FEL to MIMO systems is given in the framework of adaptive control. Then, we propose a new method for closed-loop identification of MIMO plants. The learned feedforward controller gives a model of the plant, which is effective for re-designing the control system to improve performance.

Finally, we consider the problem of teaching robots to write characters in a real environment. In particular, one must design a feedforward controller for two-link manipulators that improves tracking performance despite limited knowledge of the surroundings. The basic idea of our experimental work is to obtain an approximate inverse of the plant adaptively by means of MIMO-FEL, using a linear parameterization instead of Neural Networks (NN), in order to improve both the tracking performance for each specific desired trajectory and the speed of parameter convergence. This contrasts with obtaining an exact inverse via precise system identification, which requires a huge amount of data and a sufficiently rich excitation input. With FEL, one can thus obtain an inverse model for a specific reference signal from a limited amount of data with a limited range of frequency components. In practice, we switch feedforward controllers depending on the target character to be written. This is in clear contrast to the precise identification approach, which uses a single general-purpose controller.

0561040 鄭育昌 (CHENG Yuchang)
「Constructing a Temporal Relation Identification System of Chinese based on Dependency Structure Analysis (依存構造に基づく中国語事象表現の時間関係同定システムの構築に関する研究)」

"Temporal information (Time)" has been a subject of study in many disciplines, particularly philosophy and physics, and is an important dimension of natural language processing. Temporal information includes temporal expressions, events, and temporal relations. Much research deals with temporal expressions and event expressions. However, research on temporal relation identification and on the construction of corpora annotated with temporal relations is still limited. There is a well-known temporal information annotation guideline for English, TimeML, but no comparable research has focused on Chinese. Our research is the first work on temporal relation identification between verbs in Chinese texts. In this research, we propose a temporal information annotation guideline for Chinese and a machine learning-based temporal relation identification method.

Our investigation shows that the distribution of events and temporal expressions is unbalanced. Temporal information processing comprises two independent tasks: anchoring temporal expressions on a timeline and arranging events in temporal order. Our research focuses on ordering events, that is, identifying the temporal relations between events. Because identifying nominal events is difficult, we limit events to the verbs in articles. The proposed annotation guideline is based on the TimeML language. We newly introduce dependency structure information to limit the target temporal relations. The proposed method reduces the manual effort of constructing the annotated corpus: annotating the temporal relations of all combinations of n events requires n(n-1)/2 manual judgments, whereas our method requires at most 3n. While the dependency-structure-based attributes reduce manual annotation costs, the limited relations preserve the majority of the temporal relations.
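
To make the annotation cost comparison concrete, the short Python sketch below evaluates the two counts quoted above, n(n-1)/2 for all event pairs and the upper bound 3n for the dependency-limited scheme. The function names and the sample values of n are illustrative only.

```python
def all_pairs_judgments(n):
    """Manual judgments needed to annotate every pair among n events."""
    return n * (n - 1) // 2


def dependency_limited_bound(n):
    """Upper bound on judgments quoted for the dependency-limited scheme."""
    return 3 * n


if __name__ == "__main__":
    for n in (5, 7, 10, 50, 200):
        print(n, all_pairs_judgments(n), dependency_limited_bound(n))
```

The bound 3n falls below n(n-1)/2 once n exceeds 7, and the gap grows quadratically: for 200 events the counts are 19,900 and at most 600.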

We use a syntactically parsed corpus, the Penn Chinese Treebank, as the original data for building a basic annotated corpus. To use dependency structure in temporal relation identification, we first construct a dependency analyzer for Chinese and integrate it into the temporal relation annotation system. The accuracy of the dependency analyzer is 88% for word dependency analysis, which is better than existing Chinese dependency analyzers. The process of temporal relation identification consists of the following steps: analyzing the dependency structure, analyzing the temporal relation attributes of events, and extending the relations using inference rules. We define events as those expressed by verbs and define temporal relation types for event pairs, which comprise adjacent event pairs, head-modifier event pairs, and sibling event pairs. These relations carry most of the meaningful information, and we extend them using the inference rules to acquire long-distance relations.

We train a machine learner on our temporal relation annotated corpus to construct the temporal relation identification system. An SVM is used as the machine learner. We survey the coverage of our system with a small corpus. The accuracy of the annotation experiments is 68% to 70% for the temporal relation attributes. The results show that our proposed system covers about 52% of the temporal relations of all possible event pairs.

0561029 MATSUURA Satoshi
「A study on geographical location based overlay networks for sharing global ubiquitous sensing data」

As the Internet reaches everywhere and the price of wireless sensors falls rapidly, a large number of heterogeneous sensor networks can be expected to develop around the globe and interconnect to share global-scale sensing data. Such data will have a great effect on our daily lives, on solutions to environmental problems, on business development, and on many other application fields.

Sensor network technologies have focused on data collaboration within a local area but have lacked support for sharing data among many sensor networks. A decentralized data management mechanism is one of the essential keys to sharing sensing data across the globe. Many decentralized systems, mainly overlay networks, have been studied over the years. However, when these overlay networks manage data tied to real space, some nodes have to store disproportionately large amounts of data, or the retrieval cost becomes extremely high. This is because these works do not consider the patterns of sensor data streams and user queries, or the geographical distribution of sensor networks, in a ubiquitous sensing environment, even though they tackle scalability and try to provide distributed, self-organized systems on an Internet-scale network.

This dissertation proposes a new overlay network that manages sensing data in terms of geographical location. The proposed overlay network, called Mill, constructs a decentralized data management system without destroying the locality of sensing data. On this overlay network, nodes are distributed by geographical location and manage the data of local areas. Mill supports both multi-scale geographical range queries and multiple-attribute queries while managing only a one-dimensional ID-space. This one-dimensional ID-space, built from latitude and longitude, makes the routing mechanism simple and fast. We also discuss an implementation design of Mill that takes the patterns of sensing data streams and user queries into account. This implementation optimizes the routing mechanism, so that almost all messages from users and sensing devices are sent directly to particular nodes without searching the overlay network each time. This feature greatly reduces the retrieval cost. A Mill network is evaluated against several criteria, including retrieval performance, management cost, and simultaneous connections, through simulation experiments and evaluations of the implementation. These evaluations clarify its scalability and flexibility as well as its limitations.
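
The abstract does not spell out how latitude and longitude are folded into a single ID. One common way to build a locality-preserving one-dimensional key from two coordinates is bit interleaving (a Z-order curve); the Python sketch below illustrates that idea under the assumption that something of this kind is meant. The function names, the bit width, and the quantization are hypothetical and should not be read as Mill's actual ID scheme.

```python
def quantize(value, low, high, bits):
    """Map a value in [low, high] to an integer cell index in [0, 2**bits - 1]."""
    if not low <= value <= high:
        raise ValueError("value out of range")
    cell = int((value - low) / (high - low) * (1 << bits))
    return min(cell, (1 << bits) - 1)  # the upper edge belongs to the last cell


def interleave(x, y, bits):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def location_id(latitude, longitude, bits=16):
    """One-dimensional ID for a geographic position (illustrative only)."""
    x = quantize(longitude, -180.0, 180.0, bits)
    y = quantize(latitude, -90.0, 90.0, bits)
    return interleave(x, y, bits)


if __name__ == "__main__":
    print(location_id(34.73, 135.73))
```

With such a key, every square cell at a given scale corresponds to a contiguous range of IDs, which is one way a one-dimensional ID-space can serve multi-scale geographical range queries.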

0461029 原一夫
「Text Mining Using Machine Learning: From Medical Information Extraction to Coordinate Structure Analysis」

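This research focuses on information extraction from clinical trial papers. It is closely related to evidence-based medicine (EBM), a concept that has become common in recent years. With the spread of EBM, clinical practice requires knowledge of the latest, accurate, and effective methods of diagnosis, prognosis, treatment, and prevention, yet systems that support this are currently built by hand. Our ultimate goal is a system that automatically summarizes the medical literature and presents the information needed for EBM to patients and physicians; this thesis discusses two tasks required as preprocessing for it, information extraction and coordinate structure identification.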

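For the information extraction task, we discuss how accurately important information can be extracted with existing natural language processing techniques. The extraction targets are the treatments compared in a clinical trial and its target patients. Our first finding is that segmenting the base noun phrases that denote treatments and patients is itself relatively easy. At the same time, it becomes clear that extracting only the treatments compared and the patients targeted in the trial at hand is not easy. We therefore attempt filtering by sentence classification, but using the syntactic structure of a sentence as a feature presupposes successful parsing. Clinical trial papers, which describe the results of comparisons, contain coordinate structures very frequently, and it is well known in natural language processing that coordinate structures make parsing difficult. Moreover, coordinate structures tend to contain information that is important for information extraction.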

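We therefore propose a new method for analyzing coordinate structures. Most previous methods rely on heuristically constructed rules. In contrast, our method regards coordinate structure identification as a sequence alignment problem on an upper-triangular edit graph and can learn the edit costs (feature weights) from training data without their being specified in advance. In experiments on the GENIA corpus, it identified coordinate structures better than previous methods. The proposed method can also be applied to text outside the biomedical domain and can serve as a basic component technology for natural language processing.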

0461003 新井イスマイル
「A Study on User-Oriented Acquisition of Activity Support Information Using Metadata」

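Advances in wireless communication technologies such as wireless LAN and mobile phones have made it possible for users to search for Web information anytime and anywhere, and users have come to want activity support information that reflects their situation, such as their location and schedule: for example, restaurant information or weather forecasts for their destination. The aim of this research is to let users obtain activity support information from the vast number of Web documents accurately and with little effort.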

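Current handheld devices make rapid information retrieval difficult because of user interface constraints such as a small screen and few buttons. Situation information such as the current location and schedule is also hard to describe accurately with keywords. In addition, search processing time must remain stable in the face of the explosive growth in content expected in the future.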

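To solve these problems, this research proposes a method of acquiring activity support information based on matching and scoring of metadata. The metadata comprise user profiles, which describe a user's attributes and preferences, and content metadata, which describe the intended audience and meaning of content. By comparing these metadata and computing the degree of match for each item individually, the method achieves more accurate retrieval than conventional keyword pattern matching in full-text search. This thesis organizes users' requirements for acquiring activity support information and designs the user profile and content metadata in detail. Using a user profile set up in advance reduces the effort of entering queries when searching for content and so achieves low-effort retrieval. In view of the expected explosive growth in content, we examine a method that analyzes unstructured Web documents and extracts content metadata automatically. Furthermore, so that search processing time scales with the growth in content, we design a content recommendation system in which agents on a P2P network provide the matching and scoring functions.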

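For automatic extraction of content metadata, we implemented and tested a method of structuring Web documents based on location and time, and it extracted the target information with favorable precision. We also built a content recommendation system designed for practical operation and ran a field trial in a shopping mall; the evaluation results were good with respect to acceptance of the service by a wide range of users and tolerance of growth in content.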

0561033 毛利寿志
「Constructing Efficient and Robust Infrastructure for Secure Communication in Autonomous Computer Networks」

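This presentation proposes efficient and robust security infrastructure technologies for conventional networks and for two new types of network (ad hoc networks and sensor networks), each tailored to the characteristics and constraints of the network concerned.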

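As the services provided over computer networks expand, system management infrastructure technologies aimed at security protection and similar goals are becoming ever more important. One example is cryptographic key sharing. Key sharing is indispensable for secure communication over untrusted networks and is a security infrastructure technology that has been studied extensively. For general networks, practical public-key cryptography and related technologies have already been studied, and the issue is how to build a security infrastructure that encompasses key sharing schemes. In this case, introducing security technology that assumes a trusted third party is the best choice. On the other hand, there are networks, such as ad hoc networks and sensor networks, in which building the security infrastructure is itself the problem. In ad hoc networks, it has been pointed out that existing public-key infrastructure cannot be used as is, because of constraints such as node mobility and the absence of a specific trusted third party. Here, a scheme is needed in which the parties within the network share keys among themselves without assuming a trusted third party. In sensor networks, it has been pointed out that public-key cryptography itself cannot be used because of the limited computation, communication, and memory resources of sensor nodes. A scheme that embeds cryptographic keys in each sensor node in advance, rather than running a key sharing protocol after the network is deployed, is therefore appropriate.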

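As the first result, we propose a trust management model that introduces the internal state of the system. Trust management is an access control technology based on PKI. We first propose a policy description language for defining the behavior of the system. We then define the verification problem as the problem of deciding whether a given policy satisfies a given verification property, and propose a method for solving it with model checking. Finally, for a concrete example, we propose an implementation in Prolog and report the time required for verification.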

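Next, we propose an efficient certificate chain discovery algorithm for the web-of-trust model on ad hoc networks. In research on key sharing in ad hoc networks, attention has focused on the web-of-trust model, in which the users participating in the network issue public-key certificates to one another. The proposed algorithm consists of a phase that searches for certificate chains in the trust model and a phase that collects the chains found. The search phase uses a distributed algorithm that constructs a spanning tree on the directed graph representing the web-of-trust model. We compare the communication cost of the proposed and existing methods by numerical analysis and computer simulation and show that the proposed method solves the problem at lower cost.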

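Finally, we propose a new key predistribution scheme for sensor networks based on algebraic geometry. Key predistribution has attracted attention in research on key sharing in sensor networks. In a key predistribution scheme, several keys are embedded in each sensor node before deployment, and the nodes are then distributed to form the sensor network. The new scheme assigns a distinct key to each lattice point of a finite two-dimensional plane and stores in each node all the keys lying on one straight line. We evaluate the efficiency and robustness of the proposed scheme by numerical analysis and show that it is more efficient and more robust than existing schemes.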
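
As a minimal sketch of the line-based idea, the Python below models the finite plane as the grid of points (x, y) with coordinates modulo a prime p and each node's key ring as the points on a line y = slope * x + intercept (mod p). The choice of a prime modulus, the omission of vertical lines, and the function names are assumptions made for illustration; the thesis's construction may differ in detail.

```python
def line_points(slope, intercept, p):
    """Grid points on the line y = slope * x + intercept over integers mod p.

    Each point stands for one distinct key, so this set is the key ring
    stored in one sensor node.
    """
    return {(x, (slope * x + intercept) % p) for x in range(p)}


def shared_key_points(line_a, line_b, p):
    """Points (keys) held in common by two nodes, each given as (slope, intercept)."""
    return line_points(*line_a, p) & line_points(*line_b, p)


if __name__ == "__main__":
    p = 7  # hypothetical small prime; the plane then holds p * p = 49 keys
    print(sorted(line_points(1, 0, p)))
    print(shared_key_points((1, 0), (2, 3), p))  # different slopes
    print(shared_key_points((1, 0), (1, 2), p))  # parallel lines
```

With a prime modulus, two lines with different slopes meet in exactly one point, so the corresponding nodes share exactly one key, while each node stores only p of the p * p keys; parallel lines share none.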

0561024 中里祐介
「A Position and Orientation Estimation System Using Invisible Markers and an Infrared Camera for Wearable Augmented Reality」

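Wearable augmented reality (AR) is a technology that overlays a virtual environment on the real environment using a wearable computer or mobile device worn by the user, and can thereby present information such as location-dependent information intuitively; practical use is expected in fields such as human navigation. In wearable AR, accurately measuring the user's position and orientation is a key issue, because the coordinate systems of the real and virtual worlds must be aligned. One previously proposed approach to indoor position and orientation estimation places many image markers in the real environment and determines the user's position and orientation by capturing them with a camera worn by the user. This approach is inexpensive and requires no power supply for the infrastructure, but because the markers spoil the appearance of the environment, it is difficult to use in wearable AR systems in real settings.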

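This research therefore aims to solve this problem and realize a system that can estimate the user's position and orientation accurately in indoor environments. For practical use, it should be possible to build the infrastructure for estimating user position and orientation easily and without spoiling the appearance of the environment. In the proposed system, wallpaper printed with invisible markers made of translucent retro-reflective material is installed in the environment, and a tool that calibrates the invisible markers from photographs taken with a digital camera is provided to reduce the labor of setting up the environment. Because many markers can thus be placed densely, capturing and recognizing them with an infrared camera with infrared LEDs worn by the user makes it possible to estimate the user's position and orientation accurately in real time without spoiling the appearance of the environment.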

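This presentation first surveys wearable AR, the technical issues in estimating the user's position and orientation for wearable AR, and previous research, and clarifies the purpose and significance of this research. It then describes in detail the environment setup for position and orientation estimation using invisible markers and an infrared camera, and the user position and orientation estimation system. Finally, it summarizes the research and discusses future prospects.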

0561004 石川智也
「A Study on Real-Time Networked Telepresence Using Multicast of Omnidirectional Video」

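This research concerns real-time networked telepresence systems using omnidirectional video, and addresses scalability to cope with a growing number of users and improved interactivity during viewing. Technology that reproduces the sensation of being at a remote place by presenting the remote scene with a high sense of presence is called telepresence. With recent improvements in computer performance and network speed, environments in which everything from capturing a scene to presenting the video can run in real time are becoming available, and real-time networked telepresence is attracting attention as a next-generation network medium that merges broadcasting and communication.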

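This research focuses on telepresence using omnidirectional video, which allows a remote scene to be viewed interactively in any viewing direction. Its aims are to achieve scalability with respect to the number of users in real-time networked telepresence and, for more interactive telepresence, to realize an image presentation technique in which not only the viewing direction but also the viewpoint position can be changed freely. Conventional real-time networked telepresence systems transmit video by unicast and therefore required network bandwidth and server processing cost proportional to the number of users. In contrast, we transmit video by multicast, exploiting the fact that omnidirectional video does not depend on each user's viewing direction, and thereby realized a scalable system that is independent of the number of users. This showed that, even over mobile wireless links with narrow bandwidth, several users can view video sent from a moving platform. In addition to this network extension, to improve interactivity during viewing, we used a free-viewpoint image generation technique based on video from a group of omnidirectional cameras placed at multiple locations in the captured environment, allowing users to change viewpoint position and viewing direction freely. To generate free-viewpoint images in real time even for a wide dynamic environment captured by several omnidirectional cameras, we proposed a fast image generation method based on morphing and visual hull. Experiments confirmed that several users can simultaneously view the remote site from free viewpoint positions and viewing directions, and a subject experiment confirmed the effectiveness of the approach.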

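This presentation first explains why systems using omnidirectional video are well suited to real-time networked telepresence and describes the issues with previously proposed real-time networked telepresence systems. It then proposes a telepresence system whose scalability can accommodate a growing number of users and, building on that system, a highly scalable free-viewpoint telepresence system in which the viewpoint position can also be changed freely. After presenting experiments with the proposed systems and a discussion of them, it summarizes the research.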

0561038 Khoirul Anwar
「Peak Power Reduction for Multicarrier Systems in Satellite and Radio Communications」

Multicarrier (MC) modulation has become popular in most wireless communication systems because of its high spectral efficiency and robustness against multipath fading. However, one of its major drawbacks is the high peak power that arises when the modulated data on the subcarriers add coherently. The ratio of this peak power to the average power, called the peak-to-average power ratio (PAPR), is large. A large back-off of the high power amplifier (HPA) is therefore required to avoid performance degradation and out-of-band (OOB) radiation.
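
To illustrate the PAPR definition, the Python sketch below builds one OFDM symbol with a plain inverse DFT and computes the ratio of peak to average power of its samples. It uses Nyquist-rate samples only (no oversampling), so it approximates the PAPR of the continuous-time signal from below; the function names and the 8-subcarrier examples are illustrative and not taken from the dissertation.

```python
import cmath
import math


def idft(symbols):
    """Inverse DFT: turns subcarrier symbols into time-domain samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * cmath.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr(samples):
    """Peak power divided by average power of a block of samples."""
    powers = [abs(x) ** 2 for x in samples]
    return max(powers) / (sum(powers) / len(powers))


if __name__ == "__main__":
    coherent = idft([1] * 8)  # all subcarriers carry the same symbol
    single = idft([1, 0, 0, 0, 0, 0, 0, 0])  # only one active subcarrier
    print(10 * math.log10(papr(coherent)), "dB")
    print(10 * math.log10(papr(single)), "dB")
```

When all 8 subcarriers carry the same symbol they add coherently at one sample and the PAPR equals the number of subcarriers (about 9 dB), whereas a single active subcarrier has a constant envelope and a PAPR of 1 (0 dB).
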
This dissertation formulates the PAPR problem of multicarrier systems in satellite and radio communications and proposes four new methods to improve the performance of both systems. The multicarrier systems considered are orthogonal frequency division multiplexing (OFDM) and multicarrier code division multiple access (MC-CDMA). One method is applied to satellite communications, while the other three are for radio communication systems.
The first method is a technique for transmitting digital television to areas not covered by the broadcasting system. It applies clipping and uses the constant envelope of frequency modulation (FM) to transmit the clipped OFDM signals. The problem of the nonlinear satellite channel can be avoided, while the FM gain can be increased by choosing an appropriate clipping level. The results demonstrate that the proposed method is more effective than transmitting OFDM signals directly to the satellite without FM.
The second method proposes a new large spreading code set that reduces PAPR in OFDM and doubles the user capacity of MC-CDMA systems. The low and uniform cross-correlation of the proposed code improves the bit-error-rate (BER) performance of OFDM and MC-CDMA systems, while the PAPR performance is comparable to that of the existing pseudo-orthogonal carrier interferometry (PO-CI) code.
The third method proposes a new design of carrier interferometry (CI) and PO-CI spreading codes based on the fast Fourier transform (FFT), called CI-FFT and PO-CI-FFT. The design significantly reduces computational complexity with no degradation in either PAPR or BER performance.
The last method combines spreading with iterative clipping (IC) in the CI-FFT/OFDM system to obtain lower out-of-band noise for OFDM after the nonlinear channel. Here, a solid-state power amplifier (SSPA) is considered. The results confirm that CI-FFT/OFDM with the proposed iteration offers advantages in the presence of SSPA nonlinearity.

0561009 奥村文洋
「A Study on Improving Geometric and Photometric Consistency in Augmented Reality with a Focus on Image Quality」

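Augmented reality is a technology that provides location-dependent information to the user by overlaying virtual objects on the real environment. In recent years it has been expected to find applications in fields where the realism of virtual objects matters, such as design simulation, in which photorealistic virtual objects are presented as location-dependent information, and there are many implementations based on video see-through augmented reality systems, which overlay virtual objects on images captured by a camera. In such applications, the key issues are not only geometric consistency, that is, registration between the real and virtual environments, but also photometric consistency, which concerns lighting conditions and image quality across the real and virtual environments, and the realism of the overlaid virtual objects. However, previous research has paid little attention to the image quality degradation that occurs in camera images, and the loss of geometric and photometric consistency caused by such degradation remains unsolved.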

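This presentation describes methods for improving the geometric and photometric consistency that is lost through image quality degradation. The proposed method estimates image blur in real time from image markers placed in the scene and improves geometric consistency by taking the estimated blur into account when estimating the camera's position and orientation. It also improves photometric consistency with respect to image quality between the real and virtual environments by reproducing the estimated blur on the virtual objects. Furthermore, it improves the realism of virtual objects by rendering reflections according to the roughness of the material. Finally, we introduce a prototype augmented reality system that can overlay photorealistic virtual objects and confirm the effectiveness of the proposed method.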

0561019 末永剛
「A Motion Parallax 3D Display Based on Non-Contact Measurement of Face Information」

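Three-dimensional displays are expected to provide a greater sense of presence and of sharing a space, and their use is being considered in various application fields such as robot operation, medicine, and amusement. However, the televisions and computer monitors used in everyday life are still used only two-dimensionally. Commercial autostereoscopic displays are based on binocular parallax and use special lenses or slits to present separate images to the left and right eyes. However, increasing the number of parallax views lowers the resolution, and the displays are too expensive for casual use.
This research therefore focuses on stereoscopic presentation using motion parallax, the change in the image obtained by moving the head, and proposes a motion parallax 3D display based on non-contact measurement of face information. The user's viewpoint position is measured with a stereo camera placed above the monitor, and an image corresponding to the viewpoint is presented so as to follow the movement of the head. The user can perceive depth by looking at an ordinary monitor while moving the head. We implemented a model-based system that uses CG models and an image-based system that uses real photographs, and verified their effectiveness. In a depth perception experiment with the model-based system, we confirmed that differences in the depth of the presented objects can be perceived. For the image-based system, we verified the consistency of the presented images and confirmed through a user questionnaire that a sense of depth is perceived.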

0661207 Manabu Hirano
「Studies on Authentication and Authorization Mechanism for Inter-device Communication on Wide Area Network Environment」

This dissertation presents a novel security mechanism for inter-device communication systems. Future ubiquitous networks will connect a large number of non-PC, Internet-ready devices, and these networked devices will be federated for a variety of purposes. Because the devices interact with each other, each must provide both client and server functions, so the system naturally forms a peer-to-peer network architecture. Although many security mechanisms have been developed for the client-server architecture, it is difficult to apply such conventional mechanisms directly to inter-device communication, which raises security problems of its own that need to be solved. This dissertation therefore identifies these problems and presents a novel security mechanism for inter-device communication. It first presents a method of extending proven network-layer security mechanisms, specifically the IPsec and IKE protocols, for user-level applications. This work shows the feasibility of network-layer security mechanisms in inter-device communication systems. Next, the dissertation presents a novel inter-device authentication and authorization framework. The multiple ownerships model is the main concept of the framework. The model emphasizes the importance of explicitly distinguishing and binding a device's identity and its ownerships. The framework employs PKI technology to guarantee the relation between a device's identity and its ownerships cryptographically. Both the device's identity and each ownership can be expressed and verified using public key certificates and attribute certificates. The prototype implementation employs the standard network-layer security mechanisms, the IPsec and IKE protocols specified by the IETF.
The prototype implementation also employs tamper-proof smart card technology to store the identity and the ownerships securely. The dissertation proposes novel smart card software for device authentication and ownership-based authorization, together with an initialization tool for manufacturers and a personalization tool for users. It reports the results of performance measurements and of demonstration experiments that show the usability of the proposal. The demonstration system consists of a micro server that works as a security proxy for the target appliances, the proposed smart card software, and middleware. Demonstration systems are shown for a TV device and a security camera. The dissertation discusses the improvements and contributions of the study, a comparison with conventional security mechanisms, and open issues.

0661021 樋口　宗明
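「Development of an Initial-State Observer for Self-Localization of Mobile Bodies: Theory and Application to Garage-Parking Control of a Two-Wheeled Vehicle Robot」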

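In recent years, many researchers have studied automatic control of mobile bodies, motivated by cost reduction and crew safety. Self-localization is a critically important technology for such control. Besides basic requirements such as accuracy and precision, a self-localization method must meet practical requirements such as ease of design and low computational cost. In response to these requirements, this presentation proposes a self-localization method using an initial-state observer.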

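This presentation first reviews existing self-localization methods and points out problems with sensor fusion based on the extended Kalman filter, which is widely used for self-localization. We show that this approach is laborious to implement, because determining the error model requires repeated experiments and trial-and-error parameter tuning, and that its accuracy degrades markedly under errors that cannot be modeled, such as side slip. We then describe the measurement principle of the self-localization method using an initial-state observer. The proposed method defines a global coordinate system, in which the current position is to be obtained, and a local coordinate system given by the initial state of the mobile body, and performs sensor fusion with an observer, named the initial-state observer, that estimates the positional relationship between the local and global coordinate systems. Because the proposed method is a kind of digital low-pass filter, it can be designed with existing digital filter design methods and is therefore easier to implement than the extended Kalman filter. We further show that, because its estimation principle is simpler, it requires less computation than the extended Kalman filter. Finally, through computer simulation and experiments on garage-parking control of a real two-wheeled vehicle robot, we show that the proposed method matches the extended Kalman filter in estimation accuracy and is more robust to the surrounding environment.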

0561039 CINCAREK Tobias
「Strategies for Cost-effective Development of Real-Environment Automatic Speech Recognition Applications」

The most natural user interface for human-machine interaction is speech. Moreover, there are many applications for automatic speech recognition (ASR) technology, e.g. dictation systems, car navigation systems, real-environment guidance systems, and dialogue robots. ASR systems have the difficulty that they are task- and domain-dependent. Consequently, constructing an ASR system with reasonable performance usually requires large amounts of human-transcribed speech data collected in the target environment. However, collecting and manually labeling speech data is expensive and impractical whenever a new system for a new environment has to be built. It is therefore imperative to investigate more cost-effective development strategies. In the literature, several approaches to reducing the cost of human labeling have been proposed, such as unsupervised, lightly supervised, and active learning. Although these approaches have been effective in many cases, they also have drawbacks. Unsupervised learning can be outperformed with relatively small amounts of transcribed data, lightly supervised training requires approximate labels that are not always available, and confidence-based data selection in unsupervised and active learning may result in a model bias. None of the three learning methods addresses task dependency sufficiently. Therefore, one purpose of this work is to develop a method for cost-effective, automatic construction of task-adapted acoustic models.

A selective training framework for reuse of existing speech databases is proposed. Furthermore, a selective training algorithm is developed which enables the automatic selection of task-specific speech data from a large data pool using only a small amount of human-transcribed development data from the target environment. The proposed selective training method is shown to be effective for constructing a preschool children acoustic model using school children speech and an elderly acoustic model using adult speech. Furthermore, in order to reduce the development costs of acoustic modeling for a speech-oriented guidance system, the proposed method is also applied for building adult and child-dependent models in case of an automatically transcribed data pool. The selective training algorithm effectively discards wrongly transcribed utterances, non-speech inputs and utterances from the wrong speaker group.

Moreover, a development simulation for the real-environment, speech-oriented guidance system Takemaru is conducted. The two major components of the guidance system are the ASR module and the Q&A module for example-based response generation. Since the task and domain of the system are determined by the user and the system's environment, real data is required for module construction. It is found empirically that about 40,000 valid, human-transcribed utterances are required before performance saturates. The period needed to collect this amount of data may depend on the speaker group and the environment. In order to reduce the development costs of future guidance systems for different environments, the portability of the Takemaru prototype system to the Kita environment, a local subway station, is investigated. A system can be considered portable if it is reusable and easily adaptable to a new task or domain. Experimental results show that the Takemaru ASR module is highly portable to the Kita environment. The reusability of the Q&A module is only moderate. Nevertheless, Q&A performance improved remarkably after adaptation with relatively small amounts of real-environment data. It appeared to be possible to reduce the system development period from long-term to medium-term, from medium-term to short-term, and from short-term development to no adaptation.

This work is an important contribution for more cost-effective ASR system development using existing speech data resources. A computationally feasible selective training algorithm has been proposed and applied successfully to construct task-adapted acoustic models. Furthermore, the data requirements to construct the prototype of a real-world dialogue system and for its adaptation to other environments have been investigated.

0561014 小林　寛和
「The entire organization of operons on the Bacillus subtilis genome」

In the post-genomic era, comprehension of cellular processes and systems requires global and non-targeted approaches to handle vast amounts of biological information. The present study predicts transcription units (TUs) in Bacillus subtilis, based on an integrated approach involving DNA sequence and transcriptome analyses. First, co-expressed gene clusters are predicted by calculating the Pearson correlation coefficients of adjacent genes within each series of genes that are transcribed in the same direction with no intervening gene transcribed in the opposite direction. Transcription factor (TF) binding sites are then predicted by detecting statistically significant TF binding sequences on the genome using a position weight matrix. This matrix is a convenient way to identify sites that are more highly conserved than others in the entire genome, because any sequence that differs from the consensus sequence has a lower score. We identify genes regulated by each of the TFs by comparing gene expression between wild-type and TF mutants using a one-sided test. By applying the integrated approach to 11 σ factors and 17 TFs of B. subtilis, we are able to identify fewer candidates for genes regulated by the TFs than were identified using any single approach, and also to detect the known TUs efficiently. This integrated approach is, therefore, an efficient tool for narrowing searches for candidate genes regulated by TFs, identifying TUs, and estimating the roles of the σ factors and TFs in cellular processes and the functions of genes composing the TUs. Using these TU data, I predicted the genome-wide operon structure of the B. subtilis genome by comparative genomic analysis of 55 Gram-positive bacteria. This taxonomic approach determined appropriate operon boundaries efficiently, and I identified some internal operons. Furthermore, I took another approach to operon prediction, using a support vector machine-based classification algorithm, and efficiently detected gene pairs composing operons in the B. subtilis genome.
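
The two scoring steps named above, Pearson correlation between adjacent genes and position weight matrix (PWM) scoring, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the pseudocount, the uniform background frequency, and the example binding sites are all hypothetical and are not taken from the thesis, which does not give its matrix construction or thresholds.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / math.sqrt(vx * vy)

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM from aligned sites (assumed uniform background)."""
    pwm = []
    for i in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2((counts[base] / total) / background)
                    for base in "ACGT"})
    return pwm

def score(pwm, seq):
    """Sum of per-position log-odds; the consensus gets the top score."""
    return sum(column[base] for column, base in zip(pwm, seq))

def best_hit(pwm, genome):
    """Return (score, start index) of the best-scoring window."""
    width = len(pwm)
    return max((score(pwm, genome[i:i + width]), i)
               for i in range(len(genome) - width + 1))

# Hypothetical aligned binding sites, for illustration only.
example_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
example_pwm = build_pwm(example_sites)
```

Any window that departs from the consensus loses score at the mismatched position, which is the property the abstract relies on when ranking candidate sites across the genome.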

0561027 本田　直樹
「Analysis on Biological Functions Controlled by Spatial and Stochastic Reaction Networks」

This thesis analyses and discusses principles of reaction networks for biological functions in unicellular and multicellular organisms with theories and computer simulations. Biological systems form complicated reaction networks which exert functions for the survival of organisms. How the reaction networks are controlled is one of the important questions for understanding these functions.

Throughout this thesis, I focus on spatial and/or stochastic biochemical processes to answer these questions, based on two physicochemical facts: 1) biochemical reactions are commonly compartmentalized in space and consist of slow- and fast-diffusing molecules, and 2) reactions are inevitably stochastic if the copy number of each molecular species is low. How do spatiality and stochasticity contribute to biological functions?

Four functions are discussed to examine spatial and stochastic effects on each. First, I present a biophysical model of synaptic plasticity with spatial regulation of local Ca2+ signaling involving molecular diffusion. Second, the logic of spontaneous cellular migration and its role in chemotaxis are proposed theoretically using stochastic chemical reactions. Third, I address a mathematical model of spontaneous neural polarization that involves both spatial and stochastic effects. Finally, I investigate the size control of somite formation in vertebrate development, introducing stochastic reactions and cell-cell interactions.

0261203 Fredrik Bissmarck
「Real-time constraints to learning and control of voluntary movement」

The plasticity and computational capacity of the human cerebral cortex offer great potential for planning, learning and execution of movements. Indeed, a large part of the cortex is recruited for motor control and learning. However, one limitation is the long latency of feedback from sensors and actuators of the peripheral nervous system, up to hundreds of milliseconds. This constraint makes it challenging to use the cerebral cortex for real-time motor control.

This thesis seeks to elucidate the real-time constraints of cortical feedback loops for motor control. A main theme of our study is the long-term learning of sequential, manual movement. We investigate how a series of planned movements is gradually integrated into a fast skillful movement, and how the recruitment of sensory feedback may be altered through stages of learning. We present two different studies on this theme. In the first, we take a computational approach. Proposing a framework with analogies to the basal ganglia-thalamocortical system, we address the problem of combining multiple feedback modalities of different latencies to learn joint-torque-controlled arm movements. In the second, we take an experimental approach, and study the long-term alteration of gaze strategies in a manual task.

We first review related work and important concepts: motor control and learning theory (Chapter 2), anatomy and function of the basal ganglia (Chapter 3) and motor sequence learning (Chapter 4). Then, we present the computational study (Chapter 5). We propose a general framework for combining modalities with different latencies. In a first simple implementation of a somatosensory reaching task, we assert our hypotheses that, given identical modules of different feedback latencies, 1) performance is limited by the latency of the faster module alone, and 2) the faster module becomes dominant over control. In a second implementation, we examine an example of visuomotor sequence learning, where a plastic, faster somatosensory module interacts with a preacquired, slower visual module. Here we find that the somatosensory module acquires an independent control policy with better performance than the visual module. The visual module displays differential roles: in the early learning stage, it acts as a guide for the somatosensory module, and in the late learning stage, it acts as a safeguard against perturbations.

In the following chapter (Chapter 6), we present the experimental study. We first introduce "the 1 x 20 task", our paradigm for investigating long-term behavioural change in a stereotyped, sequential button-pressing task. We present a Bayesian model of dynamic updating of spatial representation, with the potential to explain gaze behaviour in manual tasks. We then report our findings on changes of gaze: in early learning, subjects fixate each target button, but as manual execution speeds up, subjects fixate strategic points, biased towards the center of mass of clusters of targets. We also provide evidence that the Bayesian model can explain the gaze-dependence of manual accuracy.

Overall, our computational study provides a quantitative picture of the limitations of sensory feedback control. Further, it provides an alternative way of flexibly combining modalities without explicit gating, by reinforcing connections of useful feedback and optimal actions. Our experimental study shows that vision is important for control of mature skills, but that there is a limit to how fast gaze can be shifted for optimal feedback. This limit forces a change in gaze strategy as manual execution speeds up.

0561017 柴田　和久
「A Study of Selective Attention and Plasticity in the Visual System」

This presentation discusses two themes concerning the human visual system.

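The first theme uses noninvasive functional brain measurement to examine the effects of feature-selective attention in the human visual cortex. Selection of information by selective attention is a mechanism that lets the brain interact with the external world in real time under limited computational capacity. Previous studies have shown that selective attention to external visual information changes activity in the visual cortex. However, most of these findings came from a measurement technique with low temporal resolution (fMRI), and research on the temporal aspects lagged behind. Meanwhile, techniques with high temporal resolution (EEG, MEG) have low spatial resolution, which makes it difficult to know which cortical regions are active. I resolved this dilemma by introducing a novel method that combines fMRI and MEG information to estimate cortical currents. Examination of the cortical currents estimated by this method showed that activity selectively increased in the color-processing region of the visual cortex when subjects attended to the color feature, and in the motion-processing region when they attended to the motion feature. This effect was observed even during periods in which color or motion had not yet been presented as a visual stimulus. Furthermore, a detailed examination of its time course showed that the effect was temporally transient, even though subjects attended to the visual feature in a sustained manner.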

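The second theme uses psychophysical experiments and computational modeling to examine how the plasticity of the human visual system is controlled on the basis of externally provided feedback on task performance. In earlier educational and cognitive psychology, feedback was thought to influence a person's internal cognitive state, which in turn changes learning. Under this hypothesis, even feedback that is informationally incorrect might accelerate learning if it steers the subject's cognitive state, such as intrinsic motivation, in a favorable direction. I tested this using perceptual learning, in which intensive training on tasks such as pattern discrimination is known to raise subjects' perceptual sensitivity. When performance feedback was artificially manipulated and its effect on learning examined, informationally incorrect feedback was found in some cases to promote perceptual learning more than informationally correct feedback. Furthermore, this learning process was well predicted by a model that determines the plasticity of the visual system by Bayesian integration of external feedback and internal predictions.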

0461009 川脇大
「Elucidation of the Target-Motion Prediction Mechanism by Noninvasive Brain Function Measurement and Its Application to a Brain-Network Interface」

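Elucidating human brain function is a major goal of science and is also eagerly anticipated in medicine. In recent years in particular, research on interfaces that connect the brain to computers and robots has advanced rapidly, and some have begun to reach practical use. For example, it is reported to be possible to control a robot arm from neural activity recorded by electrodes implanted in a monkey's brain, or to control a cursor on a computer screen from EEG measured at the scalp. However, many issues remain before these technologies can be opened to researchers and general users, including safety, accuracy, and user convenience. To address these issues, there is a need for brain-network interface (BNI) technology that combines multiple safe noninvasive measurements, achieves measurement with high spatiotemporal resolution, and exploits information from specific regions on the basis of neuroscientific knowledge. The aim of this study is to elucidate the human brain's information processing for predicting the motion of a small visual target and to apply it to a BNI. First, research using fMRI, which has high spatial resolution, showed that neural activity in the anterior and superior parts of the lateral occipitotemporal cortex is involved in predicting target motion. Next, I tested whether brain activity information with high temporal resolution could be extracted, using the hierarchical Bayesian brain activity estimation method, from the specific local regions that this fMRI study identified as involved in target-motion prediction. The results indicated that information capable of reconstructing target velocity may be extractable from the estimated currents in the lateral occipitotemporal cortex. In addition, reconstruction from brain activity estimated in local regions of the cerebral neocortex by the hierarchical Bayesian method was more accurate than reconstruction directly from the MEG data.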

0561034 森村哲郎
「Efficient Task-independent Reinforcement Learning based on Policy Gradient」

This presentation discusses ways of making policy gradient reinforcement learning more efficient without requiring prior knowledge.

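Policy gradient reinforcement learning is a policy search method whose objective function is the average reward an agent obtains while interacting with its environment, and which aims to acquire a policy (rule of behavior) that locally maximizes this objective. It is realized by successively updating the policy parameters along the gradient of the objective function. Provided the policy is parameterized appropriately, the method can be applied to Markov decision processes (MDPs) and partially observable Markov decision processes without requiring knowledge about the agent or the environment. Policy gradient methods are therefore expected to find application in a variety of fields and have attracted attention in recent years. However, two problems remained to be solved before practical use:
  Problem 1) parameters set by the designer (meta-parameters) are difficult to set;
  Problem 2) the time required for learning tends to become enormous.
There are many prior studies on these problems, but most assumed a specific task and exploited prior knowledge of it, and so lacked generality. Improvements to policy gradient algorithms are therefore desired that leave the standard reinforcement learning framework unchanged, that is, that do not depend on the task. This work therefore takes a mathematical approach to finding efficient policy gradient reinforcement learning algorithms that address the above problems.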
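
The basic update described above, moving the policy parameters along an estimate of the gradient of the expected reward, can be sketched with a plain REINFORCE-style rule. This is a generic textbook-style illustration and not the algorithm proposed in the thesis: the single-state task, the softmax parameterization, the reward values, and the names `softmax`, `train`, and `alpha` are all assumptions made here. The step size `alpha` is one example of a designer-set meta-parameter of the kind Problem 1) refers to.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def train(rewards, steps=2000, alpha=0.1, seed=0):
    """REINFORCE on a single-state task (hypothetical toy problem).

    rewards[a] is the deterministic reward for action a.
    Returns the final softmax policy.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)          # policy parameters
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        r = rewards[action]
        for a in range(len(theta)):
            # d/d theta_a of log pi(action) for a softmax policy
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += alpha * r * grad_log
    return softmax(theta)

# Two actions: action 1 pays 1.0, action 0 pays nothing.
final_policy = train([0.0, 1.0])
```

Nothing in the update uses knowledge of the task beyond the sampled reward, which is the task-independence the abstract emphasizes; the cost is that learning speed depends heavily on how `alpha` is chosen.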

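For Problem 1), I studied the discount rate of the accumulated reward, a meta-parameter for which no effective tuning method had been proposed. The partial derivative of the average reward with respect to the policy parameters, as estimated by ordinary policy gradient methods, ignores the term involving the partial derivative of the stationary state distribution, because that derivative was difficult to compute. The resulting effect (bias in the estimate) shrinks as the discount rate approaches 1, but the variance grows. In other words, the discount rate posed a bias-variance trade-off. This study therefore derives a method for estimating the derivative of the stationary distribution by exploiting properties of the backward Markov chain, and proposes a new policy gradient method that does not depend on the discount rate. Numerical experiments on MDPs for which the discount rate is difficult to set demonstrate the usefulness of the proposed method.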
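
The term at issue can be made explicit by differentiating the average reward directly. The notation below is an assumption introduced for illustration (it is not given in the abstract): $d_\theta(s)$ is the stationary state distribution under the policy $\pi_\theta$, and $r(s,a)$ is the expected reward for taking action $a$ in state $s$.

```latex
\eta(\theta) = \sum_{s}\sum_{a} d_\theta(s)\,\pi_\theta(a \mid s)\, r(s,a)

\nabla_\theta \eta(\theta)
  = \sum_{s}\sum_{a} d_\theta(s)\,\pi_\theta(a \mid s)\,
    \bigl[\, \nabla_\theta \log d_\theta(s)
           + \nabla_\theta \log \pi_\theta(a \mid s) \,\bigr]\, r(s,a)
```

The second term in the brackets is available from the policy itself, while the first involves the derivative of the stationary distribution, which is the quantity the abstract describes as hard to compute.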

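For Problem 2), focusing in particular on plateaus (periods in which learning stagnates), I studied natural policy gradient (NPG) methods, which take into account how the sensitivity of the MDP's probability distributions differs across policy parameters and how those sensitivities are correlated. I examined why convergence to the optimal policy is slowed, in terms of the structure of the parameter space to be learned, analyzed the Riemannian metric matrix required by NPG, and derived a new natural policy gradient method. Whereas the Riemannian metric matrix of Kakade (2002) used so far considers only the change in the action distribution caused by perturbing the policy parameters, the metric used in the proposed NPG also takes into account the state distribution, which, like the action distribution, is affected by the policy. Numerical experiments show that the method works effectively without falling into plateaus, in particular even when the number of states is large.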
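
As a rough sketch of the distinction drawn above, a natural gradient step preconditions the gradient with a metric matrix $G(\theta)$. One natural way to write a metric that accounts for the state distribution as well is the Fisher information matrix of the joint state-action distribution; whether this matches the exact definition used in the thesis is an assumption. The symbols $d_\theta$, $\pi_\theta$, $\eta$ and the step size $\alpha$ are likewise notation assumed here.

```latex
\theta \leftarrow \theta + \alpha\, G(\theta)^{-1}\, \nabla_\theta \eta(\theta)

% Action-only metric (the form attributed to Kakade):
F_a(\theta) = \sum_{s} d_\theta(s) \sum_{a} \pi_\theta(a \mid s)\,
  \nabla_\theta \log \pi_\theta(a \mid s)\,
  \nabla_\theta \log \pi_\theta(a \mid s)^{\top}

% Metric on the joint distribution d_\theta(s)\,\pi_\theta(a \mid s):
G(\theta) = \sum_{s}\sum_{a} d_\theta(s)\,\pi_\theta(a \mid s)\,
  \nabla_\theta \log\bigl(d_\theta(s)\pi_\theta(a \mid s)\bigr)\,
  \nabla_\theta \log\bigl(d_\theta(s)\pi_\theta(a \mid s)\bigr)^{\top}
  = F_s(\theta) + F_a(\theta),
\qquad
F_s(\theta) = \sum_{s} d_\theta(s)\,
  \nabla_\theta \log d_\theta(s)\, \nabla_\theta \log d_\theta(s)^{\top}
```

The cross terms vanish because $\sum_a \pi_\theta(a \mid s)\,\nabla_\theta \log \pi_\theta(a \mid s) = 0$ for every state, so the joint metric is the action-only metric plus a term that measures how the state distribution moves with the policy parameters.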

Head of Department, Graduate School of Information Science

Academic Year 2007 (Heisei 19), Graduate School of Information Science: Abstracts of Doctoral Thesis Presentations
