Abstracts of Doctoral Theses 2014

Graduate School of Information Science: Abstracts of Doctoral Dissertation Presentations, Academic Year 2014 (Heisei 26)

Studies on Signal Processing for Local Ocean Wave Monitoring System

Maricris Cuison Marimon (1161206)

Ocean wave monitoring is an essential activity that provides valuable data for ocean engineering. The local ocean wave monitoring system considered in this study is a wireless network of sensors that gathers short time series data. Short time series are used because of the sensors' limited processing power, memory storage and battery power. As a result, the system cannot use conventional algorithms, which require longer data to generate significant wave height and period. The main motivation of this study is to formulate signal processing techniques that can accurately process short time series data in order to provide immediate information.
The first method is a thresholding technique that processes instantaneous data gathered by the sensors and judges the condition of the area by comparing the data with fixed threshold levels. This method is not effective if sensors are deployed in different areas; hence the second method, statistical analysis, is used. It uses higher-order statistics generated from the locations to evaluate data segments gathered by the sensors. The third technique uses Independent Component Analysis (ICA), which separates source signals from multiple sensors in an area. This is useful for estimating dominant patterns of source signals in noisy environments, which are typical of real data; however, wave data have certain characteristics that fail the requirements for ICA. The fourth technique uses spectral analysis, which decomposes the wave data into frequencies. This reveals the dominant wave frequency, which is used to identify wave conditions. Because this is effective, it is utilized in the fifth technique, which uses a Support Vector Machine to train a classification model for ocean wave conditions. Among the methods, the fifth effectively classifies short time series wave data, which makes immediate publishing of wave condition information possible.
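As a rough illustration of the spectral analysis step, the sketch below estimates the dominant frequency of a short series with a direct discrete Fourier transform. The function name `dominant_frequency`, the sampling rate and the synthetic signal are assumptions made for illustration only; the abstract does not describe the actual implementation, and the Support Vector Machine stage is not shown.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC component.

    Uses a direct O(n^2) DFT, which is acceptable for the short
    series a low-power sensor node would hold in memory.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):  # positive frequencies only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n


if __name__ == "__main__":
    # Hypothetical 16 s record sampled at 4 Hz containing a 0.25 Hz swell.
    record = [math.sin(2 * math.pi * 0.25 * t / 4.0) for t in range(64)]
    print(dominant_frequency(record, 4.0))
```

A classifier such as the one mentioned above could take this frequency, possibly with other spectral features, as input; which features the thesis actually uses is not stated here.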

Brain Robot Interfaces for a Lower Limb Exoskeleton

Giuseppe Lisi (1261021)

This doctoral dissertation presentation introduces a study that explores brain-robot interfaces towards the control of a lower limb exoskeleton and applications in gait rehabilitation.

The first part of the presentation focuses on the feasibility of decoding highly discriminable brain states, necessary for reliable device control, while subjects wear a moving lower limb assistive robot. In previous studies, a wheelchair was steered using a decoder based on event-related desynchronisation and synchronisation (ERD/ERS). Unlike a wheelchair, a lower limb exoskeleton perturbs the legs of the subject. According to previous evidence, this passive movement might induce sensorimotor rhythms that interfere with the ERD/ERS decoder. Results indicate that the decoding accuracies in the conditions where the leg is perturbed and not perturbed do not differ significantly. This establishes solid ground for future studies aimed at using motor imagery to control a moving lower limb exoskeleton.

The second part of the presentation discusses the possibility of decoding the brain activity functionally associated with gait, for applications in robotic gait rehabilitation. We designed a brain-machine interface that could discriminate the activity associated with constant-speed gait from that associated with gait speed changes (i.e. acceleration or deceleration). Feature visualisation indicates that the information used by the decoder is associated with increased activity in parietal areas during gait speed changes. This may not only find direct application in gait rehabilitation after stroke, but also represents encouraging evidence towards the development of a more natural approach to controlling a lower limb exoskeleton.

A Study on the Empirical Evaluation of the Impact of Refactoring on Software Quality

藤原賢二 (1261013)

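Refactoring is the removal of design problems in software without changing its external behavior. To develop high-quality software efficiently, it is important to apply refactoring at the right time and in the right places. This dissertation proposes and evaluates analysis methods for investigating the impact of refactoring on software quality, together with methods that support such analysis. Specifically, it makes the following three contributions.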

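Focusing on the relationship between refactoring and defect introduction, we proposed an analysis method for investigating this relationship. The method measures, from the development history, the frequency of refactoring, the frequency of defect introduction and the frequency of defect fixes. Applying the method to open source software confirmed that it can quantitatively evaluate the relationship between refactoring and defects.
We proposed and evaluated a method for quickly reconstructing the refactoring history from the development history. Conventional methods aim to detect the refactorings performed between two arbitrary versions, and were therefore not designed to detect refactorings between every pair of adjacent versions in a development history. The proposed method shortens computation time by incrementally analyzing the syntactic information needed for refactoring detection.
We developed a system that helps researchers and project analysts analyze refactoring histories easily. Users of the system can handle a project's development history and a syntactic information repository in an integrated manner. The system can also create a syntactic information repository from an existing development history. Using the system, we created and published syntactic information repositories for about 100 open source projects.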

Construction of a Large-Scale Metabolic Simulation Method Based on Stochastic Approaches

桂樹哲雄 (1161201)

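Chemical changes that occur in a chain within living organisms are called metabolic reactions, and the compounds involved in them are called metabolites. Because metabolic reactions sustain the life activities of organisms, knowing how they behave is very important for understanding life.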

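In recent years, advances in measurement technologies such as mass spectrometers have made it possible to measure metabolic reactions in vivo comprehensively and to examine their behavior. However, the metabolite concentrations obtained experimentally from mass spectrometers are usually relative values, and not all metabolites can be measured. Therefore, to quantify metabolic behavior, it is effective to use simulation, based on the obtained results, to clarify the behavior of individual metabolic reactions.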

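This study aims to develop a large-scale metabolic simulation system for plants and focuses on the following three issues: (1) the number of parameters required for simulation is enormous; (2) a deterministic formulation cannot describe the probability-dependent behavior of the system; and (3) the metabolite quantities obtained from mass spectrometry instruments are not absolute quantities and do not cover all metabolites. These issues are addressed, respectively, by (1) extracting only the subnetwork of interest, (2) using a stochastic simulation method, and (3) estimating parameters with a genetic algorithm and a distributed genetic algorithm. Three tools were developed. SS-mPMG extracts a subnetwork from the genome-wide metabolic network, reduces the complexity of the metabolic model, and automatically constructs a dynamic network simulator. SS-GA and SS-dGA estimate, with a genetic algorithm and a distributed genetic algorithm respectively, parameters that can reproduce experimentally observed metabolic behavior, and run stochastic simulations using the estimated parameters.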
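To make the idea of stochastic simulation concrete, here is a minimal sketch of Gillespie's direct method for a single first-order reaction A -> B. The abstract does not say which stochastic algorithm the SS tools use, so the choice of Gillespie's method, the function name `gillespie_first_order`, and the rate constant are all assumptions for illustration.

```python
import random


def gillespie_first_order(a0, rate, t_end, seed=0):
    """Simulate A -> B with propensity rate * A (Gillespie direct method).

    Returns a list of (time, count_A, count_B) tuples, one per event,
    starting with the initial state. This is an illustrative toy model,
    not the simulator described in the abstract.
    """
    rng = random.Random(seed)
    t, a, b = 0.0, a0, 0
    trajectory = [(t, a, b)]
    while a > 0:
        propensity = rate * a
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(propensity)
        if t > t_end:
            break
        a -= 1
        b += 1
        trajectory.append((t, a, b))
    return trajectory


if __name__ == "__main__":
    for time, count_a, count_b in gillespie_first_order(5, 1.0, 100.0):
        print(f"{time:.3f}\t{count_a}\t{count_b}")
```

Repeated runs with different seeds give different trajectories, which is the probability-dependent behavior that a deterministic rate equation averages away. A parameter estimator such as a genetic algorithm would then search for rate values whose simulated trajectories match observed data.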

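The presentation introduces the large-scale metabolic simulation method of this study and shows the results of using the developed tools to reproduce experimentally observed amino acid production in Arabidopsis thaliana.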

Decoding Language Information Represented in Brain Activity during Overt and Covert Speech

池田　純起 (1161001)

I have been studying which brain areas represent language information during overt and covert speech. Previous studies measured brain activity while subjects overtly or covertly spoke sentences or words, and revealed brain areas associated with overt and covert speech by estimating task responsivity for individual areas. However, it remains unclear what language information is represented in these brain areas. I focused on decoding analysis, which allows us to read out information from brain activity, and conducted two studies on reading out language information represented in brain activity during overt and covert speech.

(1) I used decoding analysis to investigate the brain areas important for representing information about single vowels during covert speech. My findings differed from those of a previous study that decoded vowels within the phonemic sequences of words from brain activity during overt and covert speech of words. This suggests that the neural basis for processing vowels within phonemic sequences differs from that for processing single vowels.

(2) Previous brain imaging studies reported that many of the brain areas activated during overt speech were similar to those activated during covert speech. However, it remains unclear whether overt and covert speech share common speech representations. To address this question, subjects' brain activity was measured under two speech conditions (overt and covert speech). When a decoder that discriminates syllables was trained and tested within each speech condition, statistically significant decoding performance was obtained in both conditions. Next, when a decoder was trained on the overt speech data and tested on the covert speech data (or vice versa), significant decoding performance was not obtained. The results showed that information about syllables was represented in brain activity during overt and covert speech, but could not show that overt and covert

A Study on Location-Dependent Information Delivery on the Internet

岡田和也 (0961005)

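With the spread of smartphones, location-dependent information delivery to user devices, such as transport disruption and disaster information or advertisements for nearby facilities, plays an important role. Knowing a user's location makes it possible to provide a wealth of information associated with that location. As the number of devices and sensors grows, more location-dependent information delivery is expected to be needed. However, existing location-based information delivery depends heavily on individual services, applications and communication networks. This study therefore proposes Location-Based Multicast (LBM) to realize a location-dependent information delivery infrastructure on the Internet. By performing location-dependent delivery at the network layer using IP multicast, LBM provides a delivery infrastructure that is independent of carriers and services. At the network layer, however, it is difficult to identify the location or region of a device, and routing control according to location and region is required.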

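To realize LBM, this study proposes GALMA as an identifier for locations and regions. GALMA converts a real-world location into a reflected binary (Gray) code and embeds it in a multicast address, which enables unique location identification at the network layer. In the evaluation, simulations based on real location data showed that GALMA is more effective than other address assignment schemes. Next, for multicast routing, we proposed a multicast routing method that can aggregate routes by exploiting GALMA's hierarchical address structure. In this method, routers aggregate multicast routes step by step, which prevents routing tables from growing excessively. An evaluation with a purpose-built simulator showed that the number of routes can be reduced compared with conventional routing methods. In addition, as an envisioned LBM application, we proposed a sensor information sharing method that stores data within mobile devices. Storing the sensor data collected every day within the devices reduces storage cost. Simulation experiments with real movement trajectory data showed that the proposed method can reduce the number of locations at which sensor information is acquired by more than 80%.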
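The reflected binary (Gray) code mentioned above can be sketched as follows. This is not the GALMA address format, which the abstract does not specify; the grid quantization, bit width and function names are assumptions chosen only to show the property that neighbouring grid cells differ in exactly one bit.

```python
def to_gray(n):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Invert to_gray by folding in successively shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def quantize(value, low, high, bits):
    """Map value in [low, high] to a grid cell index in [0, 2**bits - 1]."""
    cells = 1 << bits
    index = int((value - low) / (high - low) * cells)
    return min(max(index, 0), cells - 1)


def location_code(lat, lon, bits=8):
    """Hypothetical (latitude, longitude) -> pair of Gray-coded cell indices."""
    return (
        to_gray(quantize(lat, -90.0, 90.0, bits)),
        to_gray(quantize(lon, -180.0, 180.0, bits)),
    )


if __name__ == "__main__":
    lat_code, lon_code = location_code(34.73, 135.73)
    print(format(lat_code, "08b"), format(lon_code, "08b"))
```

Because adjacent cells differ in a single bit, a region made of neighbouring cells tends to share long address prefixes or bit patterns, which is the kind of structure a route aggregation scheme can exploit.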

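In this study, we designed the location and region identifier and the route-aggregating routing method required for LBM and verified their effectiveness by simulation. These results demonstrate the feasibility of a location-dependent information delivery infrastructure on the Internet.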

Flash Codes with Binary-Indexed, Resizable Clusters and Dual Mode of Encoding

Michael Joseph New Tan (1261025)

Flash memory is commonly used today. Memory cards, USB drives and solid state drives are some examples of current technology that uses flash memory. A flash memory is composed of cells grouped into blocks, where each cell can store some electron charge. The amount of charge stored within a cell is represented by an integer value, which is referred to as the level or value of the cell. The operations that increase and decrease the level of the cells are asymmetric, which causes the flash memory to have a limited lifespan. One line of study on extending the lifespan of flash memory is the construction of clever coding schemes that properly accommodate changes in the values of the data stored within the flash memory. These coding schemes are called flash codes and are the main focus of this presentation.

This presentation introduces three novel flash codes. The first flash code is called the binary-indexed flash code (BIFC). This flash code uses slices, which are small groups of cells used as an operational unit for encoding and decoding. The advantage of this flash code is that it uses fewer cells in a slice than previous slice-based flash codes. One benefit of smaller slices is the ability to store much more data in a block of flash memory cells. However, it also comes with an overhead cost that degrades the performance of the flash code. The next two flash codes improve on the first by limiting the overhead cost as much as possible. The second flash code, the resizable-cluster method (RCM), is able to limit the cause of the overhead cost in the worst-case scenario. The third flash code, the dual-mode flash code (DMFC), uses two modes of encoding to diminish the overhead cost in the average-case scenario. This presentation uses computer simulation to show the improved average performance of DMFC over BIFC.

Memorization and Design Support by Augmented Reality

藤本　雄一郎 (1261012)

Augmented reality (AR) is a technology that overlays information generated by a computer onto the real world and makes a user perceive the information as if it existed in the real world. Although AR has been gradually applied to entertainment applications such as gaming, advertising and some other fields for general users, AR is also intended to help workers. In particular, one of the most promising fields with active research is industrial task support. However, actual industrial tasks where support systems have been already employed are quite limited.

The objective of this study is to solve several problems that inhibit the application of AR and to expand its applicability. To investigate the reason for its limited use, we categorized AR task support systems into two types in terms of the features of the information to be perceived. The first type of information in AR enables a user to understand information intuitively through the association between a virtual object and a real object. The second type of information in AR provides high realism by replacing parts of a real object with a virtual object (e.g., a texture).

The first feature works well for assembly, inspection, maintenance, and some other tasks. One of the reasons why few systems have been employed in actual tasks is the lack of knowledge about effective use for each application. To find a new benefit of AR, I focus on memorization. AR is a technology that overlays information about a specific location on the user's real-world view. Users can perceive not only the information itself but also its location. It is also known that humans can easily memorize and retain information if it is associated with real-world locations. From these two facts, I hypothesize that "if information is displayed in relation to specific locations in the real world by AR, then users achieve better memorization task results than when the information is displayed in an unrelated location." If the effectiveness of AR in enhancing a user's memory can be proven, I could argue that AR support systems are effective not only for providing information related to tasks but also for facilitating the memorization of tasks. Through several experiments, I found several significant differences between the situations with AR and without AR, which support this hypothesis.

The second feature works well for design support tasks; however, the technique is not as mature as the first one. One of the most significant examples of design process support is rapid prototyping with AR. AR enables users to check a prototype of a product with various appearances within a short period of time by overlaying or projecting a texture onto it. One of the current problems in this area is the difficulty of handling deformable products such as clothes, which are easily deformed in a short time. This example shows that current technologies have a major limitation in rapid-prototyping systems. In this research, I developed a design support system for deformable objects and the technologies required for achieving it.

A Study on Construction and Planning Methods for Human-Environment Information Maps for Robot Services

鮫島　一平 (1361005)

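Unlike industrial robots that move along fixed paths, robots that share an environment with people are required to recognize their surroundings, avoid people and obstacles, and provide services safely. Two- and three-dimensional occupancy grid maps have been widely studied as maps for robots, but occupancy grid maps alone do not allow advanced tasks in environments such as human living spaces. One approach to this problem is to attach various kinds of information to the map, and various attempts have been made. However, previous studies handle each kind of information individually; there is no example of a map built in indoor and outdoor environments that is intended for service robots and combines multiple kinds of information derived from people and objects. This study therefore proposes a map for realizing safe and efficient movement and advanced service provision by service robots that share an environment with people. The map is built by focusing on human body dimensions as an index useful for identifying and classifying people, and on human movement trajectories and obstacle change information as information useful for robot movement.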

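By analyzing statistical data on human body dimensions, this study proposes a method for estimating a person's body dimensions from sensors that can be mounted on a robot, namely a range image sensor and a 3D lidar sensor. The method makes it possible to estimate 52 whole-body dimensions, including height, limb lengths and weight, without contact and in a short time. This enables robots to provide advanced services tailored to each individual according to the user's body dimensions.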

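Information derived from human movement trajectories is an important index for estimating human behavioral tendencies, which cannot be measured directly by robot-mounted sensors. Based on data measured with a pedestrian tracking method that uses cluster-based SJPDAFs with an LRF, this study calculated human movement speed, the probability distribution of human movement, the probability distribution of human convection, and the variance of human movement speed. Based on the calculated parameters, a method was also proposed for classifying passage and intersection regions according to human movement patterns.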

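Obstacle change information is an important index for taking dynamic obstacles into account, which conventional robot maps have difficulty reflecting. As a method that considers dynamic obstacles, this study built a map annotated with the presence frequency and change frequency of obstacles. It also describes the obstacle recognition method based on Delaunay triangulation used in this study and proposes a method for calculating the presence and change frequencies of obstacles from the recognition results.
The presentation introduces a system that estimates whole-body dimensions using robot-mounted sensors, and the results of building human-environment information maps in an indoor exhibition facility and on an outdoor promenade, environments where robot services are likely to be used in the future.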

Automated Social Skills Training through Affective Computing

田中宏季 (1261007)

Social communication skills are important factors influencing human life, and the number of people who have trouble with these skills has recently been increasing for a variety of reasons. This thesis proposes a computer-based training system to enhance human social communication skills. Computers have several advantages in social skills training, in that computerized environments are predictable, consistent, and free from social demands. One of the central psychological themes in communication difficulties is empathizing, which is a set of cognitive and affective components. In this presentation, I propose several computer-based methods to train both cognitive and affective skills. For the cognitive component, I developed an iPad application, "NOCOA+", that uses multiple modalities to help users recognize non-verbal behaviors. Using these systems, I confirmed a method for predicting autistic traits, examined the effect of modality differences, and evaluated the effectiveness of computer-based intervention. For the affective component, I attempt to automate the process of social skills training with a dialogue system named "Automated social skills trainer," which provides social skills training through human-computer interaction. The system includes a virtual avatar that recognizes user speech and language information and gives feedback to users to improve their social skills. Its design is based on conventional group or individual social skills training performed by human participants, including defining target skills, modeling, role-play, feedback, reinforcement, and homework.

Development of a Mouse Gait Analysis System Based on Underside Shape Recognition Using an Infrared Depth Sensor

中村　彰宏 (12610009)

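Mice are used as experimental animals in a wide range of fields and are among the most important model animals. In recent years, experiments focusing on mouse gait have been used in many fields to evaluate motor function and genetic traits. However, no system has yet been put to practical use that can automatically measure, without affecting mouse behavior, the hind limb joint angles and gait during natural movement in an open field, which are considered important in such research. Evaluation by human visual inspection is therefore the current standard, and the objectivity of measurement and its time cost are problems.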

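To solve this problem, this study developed a hind limb joint angle estimation system and a gait analysis system. Both measure the mouse from its ventral side with an infrared depth sensor, which avoids occlusion between the legs when acquiring limb shapes and achieves markerless three-dimensional motion analysis. In addition, using an infrared-pass filter as the floor prevents fear of heights from affecting behavior. This presentation covers the latter, the gait analysis system.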

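The proposed system targets the footprints and the trajectories of the limb tips of a walking mouse, and estimates each trajectory with an algorithm that extracts convex points from the three-dimensional shape. Experiments with real mice confirmed that, compared with tracking by human visual inspection, footprints can be tracked with a mean absolute error (MAE) of about 4.3 mm and limb tips with an MAE of about 4.7 mm. Furthermore, to show that more natural behavior can be measured than with existing systems, mouse behavior was compared between an infrared-pass filter floor and a transparent acrylic floor, confirming that the amount of activity and the proportion of time spent in the center of the measurement area decrease significantly on the transparent floor.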

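Practical use of this system is expected to improve the objectivity and reduce the time cost of behavioral analysis, which has so far depended on human visual inspection. Commercialization as a low-cost, easy-to-use system is currently under way as joint research with a company.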

Quantitative Evaluation of Non-Functional Requirements in the Very Early Phase of Software Development

齊藤康廣 (1061202)

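In the very early phase of an outsourced software development project, the quality of the request for proposal (RFP) prepared by the user (the ordering company) is extremely important for the success of software development. This presentation proposes three methods for quantitatively evaluating the quality of an RFP. The evaluation targets the non-functional requirements (NFRs) that should be stated in an RFP, and the evaluation criterion is the clarity of their description.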

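The first proposal restricts the evaluation target to 55 non-functional requirements concerning maintenance and operation, which are of high importance to users, and defines metrics for evaluating the clarity of requirement descriptions on a scale of up to five levels. A case study evaluating 29 RFPs from six domains (local governments, libraries, government offices, independent administrative agencies, universities and hospitals) confirmed that the metrics can identify insufficiently described requirements and, through comparison with baseline values, reveal characteristics that particularly need improvement.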

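The second proposal aims to automate the evaluation: it evaluates the clarity of each requirement description by supervised machine learning, based on the frequency of occurrence of keywords related to each NFR (NFR keywords) in the RFP. The proposed method extracts NFR keywords from RFPs written in natural language and maps them to the individual NFRs. It then models the relationship between the frequency of NFR keywords and the clarity of NFR descriptions with a random forest. When the method was used to evaluate the clarity of the descriptions of 26 types of non-functional requirements in 70 RFPs, the average agreement rate with manual evaluation was 69.8%, and the average agreement rate when an error of ±1 was tolerated (±1 agreement rate) was 97.2%.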
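The two agreement measures reported above can be computed as in the following sketch. The function name `agreement_rates` and the sample scores are hypothetical; only the definitions (exact agreement, and agreement within an error of ±1 level) come from the text.

```python
def agreement_rates(predicted, human, tolerance=1):
    """Return (exact agreement rate, agreement rate within +/- tolerance).

    predicted and human are equal-length sequences of integer clarity
    levels, one pair per evaluated requirement description.
    """
    if len(predicted) != len(human) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    exact = sum(p == h for p, h in zip(predicted, human))
    within = sum(abs(p - h) <= tolerance for p, h in zip(predicted, human))
    return exact / n, within / n


if __name__ == "__main__":
    # Hypothetical clarity levels for four requirements.
    model_scores = [1, 2, 3, 5]
    manual_scores = [1, 3, 3, 2]
    print(agreement_rates(model_scores, manual_scores))
```

The ±1 rate is always at least as large as the exact rate, which is why it reads as a more lenient view of the same predictions.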

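The third proposal aims to make training data unnecessary for automated evaluation: it assigns evaluation values to the related NFRs based only on the frequency of NFR keywords and weights applied to them. In an experiment with 161 RFPs, the correlation coefficients between the values given by the proposed method and those given by manual evaluation ranged from 0.22 to 0.43. This suggests that NFRs can be evaluated automatically with a certain degree of accuracy even when training data are difficult to provide.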

Supporting Effective Knowledge Sharing through Extracting Search Activities in a Community of Interest

Papon Yongpisanpop (1261028)

A Community of Interest (CoI) is a group of people who share a common interest or passion. The purpose of a CoI is to provide a place where people who share a common interest can go to exchange information, ask questions, and express their opinions about the topic. With today's technology, people can access information online by using web search engines, which allow them to discover and share knowledge within a CoI. High turnover in a CoI is one of the important causes that make it hard to capture and share knowledge. The question is whether we can create a system that captures community-wide knowledge in real time and makes it widely available to all members. This dissertation focuses on reducing knowledge-sharing effort while people search, and on reusing users' past search results to make future search results more relevant for a community of interest. It proposes the Search Activity Knowledge Extraction (SAKE) model, which extracts knowledge from search behavior and shares it through the CoI. A framework called the Adaptive Search Framework (ASF), based on the SAKE model, can collect the ten most used keywords and evaluate Top-5 and Top-10 search results using the standard Topic-Sensitive PageRank measures (OSim and KSim) in the CoI environment. A user experience study evaluating user satisfaction after two weeks of using the proposed search engine revealed that the search results were significantly improved compared with conventional search engines (Google and Bing). The SAKE model can reduce knowledge-sharing effort while searching for information in a CoI, and it can return more relevant results to searchers.
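For readers unfamiliar with OSim, the sketch below computes it under its usual definition: the overlap between the top-n results of two rankings, divided by n. The function name `osim` and the sample rankings are hypothetical, and KSim (a Kendall-tau-style measure of ordering agreement) is not shown.

```python
def osim(ranking_a, ranking_b, n):
    """Overlap similarity of the top-n items of two ranked result lists."""
    if n <= 0:
        raise ValueError("n must be positive")
    top_a = set(ranking_a[:n])
    top_b = set(ranking_b[:n])
    return len(top_a & top_b) / n


if __name__ == "__main__":
    # Hypothetical result lists from two search engines for one query.
    engine_a = ["a", "b", "c", "d", "e"]
    engine_b = ["c", "a", "x", "y", "z"]
    print(osim(engine_a, engine_b, 5))
```

OSim ignores the order of results within the top n, which is why it is usually reported together with an order-sensitive measure such as KSim.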

Human Activity and Environment Recognition on Mobile Devices

Yuki Maruno (1261015)

mHealth, the use of mobile devices and other wireless technology in health care and public health, is a rapidly expanding area of research and practice. mHealth applications are closely related to our daily life, and help people manage their own health and wellness, promote healthy living, and provide access to useful information whenever and wherever they need it.
In this dissertation, I consider an mHealth application that helps people track their daily activity. In building such applications, the following technical problems need to be solved: (1) recognition of human activity and (2) recognition of the user's environment.
Instead of relying on advanced mobile sensors, I use a single embedded sensor for each of these tasks. For human activity recognition, I use a three-axis accelerometer, with which almost any mobile device is equipped. To maintain high recognition accuracy at low computational cost, I employ the wavelet transform and singular value decomposition during feature extraction. For human environment recognition, on the other hand, I use a microphone to capture environmental sounds. To ensure sufficient location coverage, the data collection is designed around people's daily routines, which enables coverage of a wide range of environments including public transportation, offices, streets, and shopping malls. To classify the 17 environments, I make use of several audio features from the time domain and the frequency domain. In the experiments, the algorithm was able to classify user activities into walking, running, standing still and being in a moving train with an accuracy of over 90%. For human environment recognition, an accuracy of over 80% was achieved for the 17 environments.
The proposed method deals with the limitation in available electric power, thereby addressing an mHealth application issue.

Data-Driven System Identification of Signal Transduction from cGMP to Membrane Potential in the Neuronal Growth Cone

山田　達也 (1261018)

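The growth cone at the tip of a neurite senses the extracellular environment and guides the neurite to the appropriate target for synaptic connection. For example, when there is an extracellular concentration gradient of the guidance cue Sema3A, a growth cone in its normal state is repelled by Sema3A, accompanied by a decrease in membrane potential. However, when the cGMP concentration in the growth cone is increased, it is attracted, accompanied by an increase in membrane potential. This fact means that a biochemical signal carried by cGMP concentration is converted into an electrical signal carried by membrane potential. Nevertheless, the mechanism that realizes such signal conversion between different physical quantities remains unknown.
This study aims to identify the system that realizes signal conversion from cGMP to membrane potential in the growth cone. To this end, membrane potential time series, which are rich in data points, were represented by a deterministic mathematical model, and the variability across individual experiments and cells was expressed by probability distributions of the model parameters. The posterior distributions of the model parameters were estimated by applying the Markov chain Monte Carlo (MCMC) method to Bayes' theorem, and the intracellular system was identified by model comparison using the log evidence.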
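As a minimal illustration of posterior estimation by MCMC, the sketch below implements a random-walk Metropolis sampler for a single parameter. The abstract does not state which MCMC variant was used, so the Metropolis algorithm, the function name `metropolis`, and the toy Gaussian log-posterior are assumptions for illustration; the actual study samples the parameters of a Hodgkin-Huxley-type model.

```python
import math
import random


def metropolis(log_posterior, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional parameter.

    log_posterior(x) returns the log of the unnormalized posterior,
    i.e. log likelihood plus log prior.
    """
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


if __name__ == "__main__":
    # Toy target: unnormalized Gaussian posterior centered at 3 with unit variance.
    def toy_log_posterior(x):
        return -0.5 * (x - 3.0) ** 2

    chain = metropolis(toy_log_posterior, start=0.0, step=1.0, n_samples=20000)
    kept = chain[2000:]  # discard burn-in
    print(sum(kept) / len(kept))
```

The spread of the retained samples approximates the posterior distribution, which is how variability across cells and experiments can be expressed as a distribution over parameters rather than a single fitted value.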
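The Hodgkin-Huxley equations, which are established as a quantitative model, were applied to the membrane potential, and general biochemical reaction equations were used for the signal transduction of intracellular molecules. Applying quantitative data to these model equations allows quantitative discussion of the parameters of each equation as well. However, the model parameters contain variability due to cell individuality and observation noise, and unconstrained parameter estimation mixes these sources of variability. We therefore introduced prior distributions for the parameters in a Bayesian framework and constrained the range of the distributions by appropriately distinguishing the attributes of the parameters. The hyperparameters of the prior distributions were determined from phenomenological and physical considerations.
As a result, it was inferred that, in addition to known interactions, inhibition of chloride ion channels downstream of Protein Kinase G (PKG) is necessary. In validation, the inferred model quantitatively reproduced membrane potential time series under other experimental conditions, and the steady-state membrane potential also showed a cGMP dependence closely resembling the quantitative experimental data.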

Predicting learning plateau of working memory from whole-brain intrinsic network connectivity patterns

山下真寛 (1161027)

In Cybernetics, learning ability is regarded as a characteristic of organisms. Human learning has been linked with brain plasticity, which inspired machine learning in artificial intelligence. In the past 10 years, resting state brain activity has become a central topic of neuroscience research. Previous studies provided evidence that the brain works through spontaneous interaction between large-scale networks, and that network characteristics are predictive of learning performance.

Working memory refers to the short-term storage and manipulation of information, and is an important factor in human intelligence. Learning performance of working memory has been linked mainly with a single task-activated network (the fronto-parietal network, which increases its activity during working memory tasks). However, since the brain is essentially a hierarchical dynamic network system, differently functioning brain networks may each contribute uniquely to learning. Therefore, we hypothesized that individual learning performance is determined jointly by the task-activated network and less-activated networks (i.e., networks that do not increase or that decrease activity during working memory tasks).

Using a commonly used working memory task, subjects were trained for 25 sessions (80-90 min). Each individual's learning plateau of working memory was estimated by fitting a curve to their learning curve. On another day, the subjects underwent resting state functional magnetic resonance imaging (fMRI) for 5 min. From the resting state brain activity data, large-scale network connectivity was calculated across the entire brain. To predict the individual learning plateau from the network connectivity, we used sparse linear regression. As a result, the individual learning plateau of working memory was predicted with high accuracy (R² = 0.73, p = 0.003) from connectivity among brain-wide networks.

Latent Variable Models for Discrete Data and the Learning Methods

小西卓哉 (1261005)

Latent variable models are probabilistic models that include unobserved random variables and are widely applied in modern data analysis. Many studies have shown their applicability and efficiency on a variety of data; however, several questions remain unsolved. To obtain a desirable latent representation, a practical issue is how to design latent variable models according to the properties of the data. As a specific application, we focus on search queries on Web search engines. A search query consists of a combination of terms, and the number of possible combinations is enormous. However, search queries can be represented as common low-dimensional patterns. We propose a probabilistic topic model that extracts such patterns as pairs of latent topics. In the experiments, the proposed model obtains desirable patterns and is more effective in real applications than existing topic models. Another problem for latent variable models is how to learn the models when observed data are given. Exact inference in latent variable models is mostly intractable; thus, approximate methods are used instead. While efficient learning algorithms have been proposed for many models, several models have not been explored enough. Among such models, we study variational Bayesian (VB) inference methods for the infinite relational model (IRM) for network data. We derive collapsed variational Bayesian (CVB) inference, which is a special case of VB inference. CVB inference empirically outperforms standard VB inference on most real network datasets. The results imply that CVB inference performs even better on dense networks.

Real-time Neuroprostheses Control: Towards Practical BMI for Paralyzed Patients

福間　良平 (961019)

[Objective] Advancements in invasive Brain Machine Interface (BMI) research have enabled patients with motor dysfunction to control external devices. However, to bring BMIs out of the laboratory and into the real world to improve patients' quality of life, the invasiveness and clinical risks associated with contemporary BMIs need to be reduced. Although the electrocorticogram (ECoG) measures brain activity invasively, it does not penetrate the cortex and is already used in clinical diagnostics. Moreover, motor information, which is extracted from the brain and used to control a BMI, might be altered by paralysis. By developing an ECoG-based BMI and a non-invasive BMI and applying them to real patients, this work demonstrates the efficacy of ECoG-based BMI for reconstructing motor function and the potential of non-invasive BMI for clinical use.

[Methods] ECoG signals were measured when patients with different severities of motor dysfunction moved a limb and controlled a real-time ECoG-based prosthetic arm. The measured signals were decoded to infer the type of the limb movement and intention to move, showing the preservation of information about the movement type and movement onset. Magnetoencephalogram (MEG) signals from severely paralyzed patients during a hand movement task were also analyzed, and a prosthetic hand was controlled in real-time using MEG signals.

[Results] Even paralyzed patients could control an ECoG-based prosthetic arm, and the severely paralyzed patient could control an MEG-based prosthetic hand by attempting hand movements without actually moving their hand. Decoding accuracy for the movement type was found to deteriorate with paralysis, but detection accuracy for movement onset did not. Notably, the subjective movability of severely paralyzed patients' phantom limbs was shown to be related to the decoding accuracy for the movement type.

[Interpretation] It was found that the motor information of paralyzed patients depended on the severity of motor dysfunction, the subjective difficulty of moving their phantom limb, and the characteristics of the motor information, which affected the real-time performance of the BMI. The real-time ECoG-based BMI was shown to be useful for reconstructing the motor function of paralyzed patients. The MEG-based BMI was found to be likely controlled primarily by features that are characteristically used in a BMI using ECoG signals. This implies that the MEG-based BMI system should be suitable for evaluating and improving the ability to control a practical ECoG-based BMI.

Presentation language: Japanese

Bilingual Dictionary Extraction via Multilingual Topic Models

劉暁東 Xiadong Liu (1161205)

A machine-readable bilingual dictionary plays a crucial role in many natural language processing tasks, such as statistical machine translation and cross-language information retrieval. In this thesis, we propose a framework for extracting a bilingual dictionary from comparable corpora by exploiting a novel combination of topic modeling and word aligners, such as the IBM models. Using a multilingual topic model, we first convert a comparable document-aligned corpus into a parallel topic-aligned corpus. This novel topic-aligned corpus is similar in structure to the sentence-aligned corpus frequently employed in statistical machine translation and allows us to extract a bilingual dictionary using a word alignment model.
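One plausible reading of the conversion step is sketched below: tokens that a multilingual topic model assigned to the same topic are grouped per language, so that each topic yields one pair of "parallel" word sequences for a word aligner. The grouping rule, the function name `topic_aligned_corpus`, and the sample data are assumptions for illustration; the abstract does not give the exact construction, and the topic assignments are taken as given.

```python
from collections import defaultdict


def topic_aligned_corpus(tokens_src, tokens_tgt):
    """Group topic-labeled tokens into per-topic parallel pairs.

    tokens_src and tokens_tgt are iterables of (word, topic_id) pairs,
    as would be produced by a multilingual topic model (not shown).
    Returns a list of (topic_id, source_words, target_words) for the
    topics that occur in both languages.
    """
    src_by_topic = defaultdict(list)
    tgt_by_topic = defaultdict(list)
    for word, topic in tokens_src:
        src_by_topic[topic].append(word)
    for word, topic in tokens_tgt:
        tgt_by_topic[topic].append(word)
    shared_topics = sorted(set(src_by_topic) & set(tgt_by_topic))
    return [(t, src_by_topic[t], tgt_by_topic[t]) for t in shared_topics]


if __name__ == "__main__":
    # Hypothetical topic assignments for an English and a Japanese article.
    english = [("river", 0), ("water", 0), ("election", 1)]
    japanese = [("川", 0), ("水", 0), ("選挙", 1), ("音楽", 2)]
    for topic, src, tgt in topic_aligned_corpus(english, japanese):
        print(topic, src, tgt)
```

Each resulting pair plays the role that a sentence pair plays in a sentence-aligned corpus, so an off-the-shelf word alignment model can be run on it.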

The main advantages of our framework are that (1) no seed dictionary is necessary for bootstrapping the process, and (2) multilingual comparable corpora in more than two languages can also be exploited. In our experiments on a large-scale Wikipedia dataset, we demonstrate that our approach can extract higher-precision dictionaries than previous approaches, and that our method improves further as we add more languages to the dataset.

Automating Orthographic Normalization in the Construction of Historical Japanese Corpora Using Statistical Machine Learning

岡照晃 (1261003)

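In recent years, corpus-based research on Japanese has been increasing. In Japanese linguistics and national language studies (kokugogaku), however, historical research dealing with documents from earlier periods occupies a major position, and such historical materials have not been developed into corpora to the extent that modern-language materials have. One reason for the slow development of historical corpora is the high cost of orthographic normalization during corpus construction. Orthographic normalization can be performed only by experts, so it is difficult to recruit a large workforce. At the same time, the amount of material is vast, so the work takes a very long time to complete.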

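The ultimate goal of this study is therefore to automate the orthographic normalization of historical materials using statistical machine learning methods. This is expected to allow anyone to carry out orthographic normalization easily, at low cost and on a large scale. As a first step, this dissertation takes up the task of adding voiced sound marks (dakuten) from among the normalization tasks and works on automating it. Then, based on the findings obtained, it addresses the automation of the other normalization tasks.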

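The contributions of this dissertation are as follows.
1. We automated dakuten annotation, one of the orthographic normalization tasks, and achieved highly accurate automatic dakuten annotation with an F-measure above 96 on modern-era literary-style (bungo) expository texts.
2. We developed an automatic dakuten annotation application that implements the method of item 1.
3. We developed a method that performs orthographic normalization and morphological analysis simultaneously. This method makes it possible to automate all orthographic normalization items, not only dakuten annotation. It also improved automatic dakuten annotation performance and achieved a precision of 94% to 96% for the other normalization items.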
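To show what the dakuten annotation task operates on, the sketch below lists the characters in a string that could take a voiced sound mark, together with their voiced forms. It relies on Unicode canonical composition of a kana with the combining dakuten (U+3099). The function names are hypothetical, and deciding which candidates should actually be voiced is the job of the statistical model described above, which is not shown.

```python
import unicodedata

COMBINING_DAKUTEN = "\u3099"  # combining voiced sound mark


def voiced_form(char):
    """Return the precomposed voiced kana for char, or None if there is none."""
    composed = unicodedata.normalize("NFC", char + COMBINING_DAKUTEN)
    return composed if len(composed) == 1 else None


def dakuten_candidates(text):
    """List (index, original, voiced) for every character that can be voiced."""
    candidates = []
    for index, char in enumerate(text):
        voiced = voiced_form(char)
        if voiced is not None:
            candidates.append((index, char, voiced))
    return candidates


if __name__ == "__main__":
    # Hypothetical input written without dakuten.
    for index, original, voiced in dakuten_candidates("かくして"):
        print(index, original, voiced)
```

Each candidate position is then a binary decision (add the mark or leave the character as is), which is what makes the task suitable for a supervised classifier.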

Deputy Department Head, Graduate School of Information Science
