Doctoral Dissertation Abstracts, Graduate School of Information Science, Academic Year 2004 (Heisei 16)


0061204 倉立 尚明 「Talking Head Animation System Driven by Facial Motion Mapping and 3D Face Database (顔面運動マッピングと三次元顔形状データベースを用いたトーキングヘッドアニメーションシステム)」

Abstract
This thesis proposes a new method called facial motion mapping to create a perceptibly accurate talking head animation system that is applicable to most face-like objects. It also describes a new 3D face posture estimation method, based on a large-scale 3D face database, that makes it possible to create 3D talking head animations for people without 3D face data from only two photographs: a profile view and a front view. The first method is based on the statistical analysis of multiple face postures from each subject or computer animation character. Providing a small set of basic speech-related and non-speech-related face postures induces a similar result across subjects for the major components extracted by principal component analysis, and this similarity can be used to transfer face motion from one subject to any other. Faces are synthesized by linear estimation of subject-specific deformation parameters, which are derived from static facial postures using straightforward multi-linear analysis techniques. Input to the linear estimator consists of time-varying values for a few locations on the face, or the deformation of an entire face, and results in synchronized audiovisual output. The time-varying data for one face can easily be used to control the deformation parameters of other faces, including computer characters and cartoon and animal faces, and the resulting facial motion is quite realistic even when rendered for a 2D cartoon face. Although considerable computation time is required to extract the deformation parameters, they are small in number; thus, the linear estimation technique used to generate the animations runs in near real time without optimization. Like the first method, the second method uses the results of a principal component analysis, but from a large-scale 3D face database consisting of multiple face postures obtained from over 200 subjects. Describing faces in this principal component space decreases the redundancy of the data significantly, so a relatively small set of principal components is enough to visually reconstruct any face in the database. The same components can be used to parameterize almost any face not included in the database, and the face posture distributions of the registered faces can be applied to estimate face deformations for such unknown input faces in the principal component space. Multi-linear analysis techniques similar to those used in the first method are also used to estimate a 3D face for any unknown face from a small set of feature points extracted from a profile view photograph and a front view photograph, and face postures for the unknown face can likewise be estimated in the principal component space. The combination of these two methods makes it possible to create perceptibly correct talking head animations, using one subject's face motion, for any person who can provide just two photographs.
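The core of the first method, transferring motion through principal-component coordinates shared across subjects, can be illustrated compactly. The following Python sketch is a hypothetical illustration rather than the author's implementation: the posture matrices are random stand-ins, and the alignment of component signs and ordering across subjects (which the cross-subject similarity reported above makes plausible) is simply assumed.

```python
# Hypothetical sketch of facial motion mapping via PCA, not the author's code.
# Assumes each subject provides the same K basic postures as flattened 3D
# vertex arrays, so principal-component coordinates are comparable across subjects.
import numpy as np

def pca_basis(postures, n_components):
    """postures: (K, D) matrix of K face postures; returns mean and components."""
    mean = postures.mean(axis=0)
    centered = postures - mean
    # SVD of the centered posture matrix gives the principal components.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def map_motion(src_frames, src_postures, dst_postures, n_components=5):
    """Transfer time-varying face motion from a source subject to a target."""
    src_mean, src_pc = pca_basis(src_postures, n_components)
    dst_mean, dst_pc = pca_basis(dst_postures, n_components)
    # Project source frames into the source PC space...
    coords = (src_frames - src_mean) @ src_pc.T
    # ...and reconstruct in the target PC space with the same coordinates,
    # relying on the reported cross-subject similarity of major components.
    return coords @ dst_pc + dst_mean

# Toy usage with random stand-ins for posture data (K=10 postures, D=300 dims).
rng = np.random.default_rng(0)
src_post, dst_post = rng.normal(size=(10, 300)), rng.normal(size=(10, 300))
frames = rng.normal(size=(50, 300))          # 50 animation frames of the source
animated = map_motion(frames, src_post, dst_post)
print(animated.shape)                        # (50, 300): target-face frames
```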

0261025 松村 知子 「レガシーソフトウェアにおける暗黙的コード制約の形式化と潜在フォールトの検出 (Formulating Implicit Code Constraints and Detecting Potential Faults in Legacy Software)」

Abstract: In large-scale software that has undergone repeated modification and feature addition over many years (legacy software), the emergence of numerous "implicit code constraints" and the resulting injection of faults have become obstacles to maintenance. Implicit code constraints are unwritten coding constraints that arise incidentally during development and maintenance; violating them can introduce faults. Over a long maintenance period, however, personnel frequently change, and it is difficult to make every worker aware of all implicit code constraints. Workers unaware of a constraint repeatedly violate it, so the same kinds of faults are injected again and again. This thesis proposes, for legacy software, a method for formulating implicit code constraints and a system that automatically detects faults using the formulated constraints. In the proposed method, an experienced maintainer first extracts implicit code constraints from past fault reports and, among them, describes those that can be formalized as typical fault-inducing code patterns in a pattern description language (these are called "constraint violation patterns"). The detection system performs pattern matching between the constraint violation patterns and the attributed syntax tree produced by parsing the program, thereby detecting code that violates the constraints. An investigation of one legacy software system showed that 32.7% of the failures reported during the maintenance phase were injected through violations of implicit code constraints, and 39 implicit code constraints that existed at the start of maintenance were identified. The proposed pattern description method could describe code patterns violating 30 of these constraints. Furthermore, in an experiment with a prototype matching system, 772 locations matching the patterns were detected in the target software, of which 152 actually contained faults; 111 of these were previously unreported faults, confirming that the proposed method is effective for detecting latent faults.
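As an illustration of the pattern matching idea, the toy checker below uses Python's standard ast module as a stand-in for the attributed syntax trees and the pattern description language of the proposed system; the example pattern and all names are invented for illustration only.

```python
# Toy analogue of constraint-violation pattern matching, not the thesis system.
# The real system matches patterns written in a dedicated pattern description
# language against attributed syntax trees of legacy code; here a "pattern" is
# just a predicate over Python AST nodes.
import ast

def violates_missing_default(node):
    """Example (invented) pattern: an if statement with no else branch."""
    return isinstance(node, ast.If) and node.orelse == []

def detect(source, patterns):
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        for name, pred in patterns.items():
            if pred(node):
                hits.append((name, getattr(node, "lineno", None)))
    return hits

code = """
if mode == 1:
    handle_one()
"""
print(detect(code, {"if-without-else": violates_missing_default}))
# -> [('if-without-else', 2)]
```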

0261006 今村 賢治 「Automatic Construction of Translation Knowledge for Corpus-based Machine Translation (コーパスベース機械翻訳における翻訳知識自動構築法の研究)」

Abstract: As bilingual corpora have grown, machine translation methods that use knowledge automatically acquired from corpora have become increasingly popular; this approach is called corpus-based machine translation. This thesis focuses on the automatic construction of machine translation knowledge from bilingual corpora and addresses the following three issues:
1. Proposal of knowledge acquisition methods from bilingual corpora.
2. Application of the automatically acquired knowledge to a machine translation engine and measurement of translation quality.
3. Identification of problems peculiar to corpus-based machine translation that degrade translation quality, and proposal of solutions.
For the first issue, this thesis proposes hierarchical phrase alignment of bilingual sentences: a method that automatically extracts correspondences between phrases, called equivalent phrases, across a sentence pair. The proposed method uses syntactic parsing. Conventional phrase alignment methods that use parsing first fix the syntactic structures of the bilingual sentences and then extract correspondences. The method used in this thesis instead employs a structural similarity measure called the phrase correspondence score and determines the best syntactic structure and correspondences simultaneously from an agenda. Compared with conventional methods, this approach extracted about twice as many equivalent phrases while maintaining precision.
For the second issue, hierarchical phrase alignment is applied to a large bilingual corpus to automatically construct machine translation rules, which are then incorporated into the machine translation engine TDMT and evaluated for translation quality. During integration, it was found that the translation rules contained a large number of redundant rules that caused mistranslations and increased ambiguity. By eliminating the redundant rules, the automatically constructed translator achieved performance comparable to machine translation using manually created knowledge.
For the third issue, two approaches are proposed. The first focuses on problems originating in the bilingual corpus itself and resolves them as preprocessing before knowledge acquisition. A bilingual corpus does not consist solely of sentence pairs suitable for machine translation; it contains context- and situation-dependent translations, and multiple different translations of the same source sentence. Automatically constructing machine translation rules from such a corpus produces a large number of redundant rules. This thesis first discusses what kinds of translations are suitable for machine translation and focuses on literalness. The translation correspondence rate is defined as a score measuring literalness, and two knowledge construction methods are proposed: a bilingual corpus filtering method that constructs knowledge only from highly literal sentence pairs, and split construction, which divides each sentence pair into literal and free parts and applies different generalizations to the rules generated from each.
The second approach to the third issue is post-processing after automatic knowledge acquisition. Redundant rules arise not only from translation diversity but also from acquisition errors. For this problem, a rule selection method using automatic evaluation of translation quality (feedback cleaning) is proposed. Rule selection is treated as a combinatorial optimization problem: the score output by the automatic evaluation is regarded as the objective function, and rules are selected so as to maximize it. Feedback cleaning is further extended to N-fold cross cleaning, which performs cleaning using only the training corpus. As a result, translation quality was improved substantially compared with conventional post-processing methods.
Finally, recent topics in corpus-based machine translation are introduced and future directions are discussed.
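Feedback cleaning, described above as combinatorial optimization of rule selection, can be sketched as a simple greedy search. The code below is a hedged sketch: evaluate() is a hypothetical black-box that scores a rule set by automatically evaluating translations of a development corpus, and greedy one-rule removal is only one possible search strategy; the thesis's actual procedure and its N-fold cross cleaning extension are richer than this.

```python
# Hedged sketch of feedback cleaning as combinatorial optimization, assuming a
# black-box evaluate(rules) that translates a development corpus with the given
# rule set and returns an automatic quality score (e.g., BLEU).
def feedback_cleaning(rules, evaluate):
    selected = set(rules)
    best = evaluate(selected)
    improved = True
    while improved:
        improved = False
        for r in list(selected):
            score = evaluate(selected - {r})   # try dropping one rule
            if score > best:                   # keep the removal if it helps
                selected.discard(r)
                best = score
                improved = True
    return selected, best

# Toy objective: rules 'bad1'/'bad2' each hurt the score by 1 point.
rules = {"r1", "r2", "bad1", "bad2"}
score = lambda s: 10 - len(s & {"bad1", "bad2"})
print(feedback_cleaning(rules, score))   # ({'r1', 'r2'}, 10)
```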

0361040 Pang Yan-bin 「DESIGN AND ANALYSIS OF FIELDBUS CONTROL SYSTEMS」

Abstract: A Fieldbus Control System (FCS) is a distributed system composed of field devices and control and monitoring equipment, integrated into the physical environment of a plant or factory. The FCS is increasingly used in the automation arena because of the advantages of the fieldbus; more than 50 different fieldbuses have emerged. However, research on the FCS lags far behind current practice. This dissertation investigates several important issues of the FCS. In Chapter 1, differences between an FCS and a conventional control system are summarized, and related research is reviewed with a focus on timing analysis, evaluation, control algorithms, modeling and simulation, development, and applications. Chapter 2 presents models of different fieldbus protocols, from basic to complete ones, to lay the foundations for this study. The main fieldbuses are introduced briefly and their characteristics are discussed, as is the fieldbus medium access control mechanism. Accordingly, several classification methods for fieldbuses are proposed. In Chapter 3, the timing characteristics of the FCS are analyzed in detail. The control period of a control loop in an FCS is formulated and analyzed, and the stability condition for normal operation of the fieldbus control loop is derived. The analysis and experimental results show that the execution times and the margin time are dominant in a control period, whereas the communication time is secondary. It is also shown that the execution time of function blocks depends on the configuration of the application software. The effects of the communication time, the computation time, and the jitter of the control period on control performance are evaluated. In Chapter 4, a complete set of evaluation indices is proposed from the user's point of view by analyzing the requirements of data communication and installation environments for fieldbuses. As a case study, experimental results for Foundation Fieldbus (FF) are presented. A general procedure with a complete set of detailed indices for selecting a fieldbus system is also proposed. Chapter 5 considers how to overcome the adverse effects caused by the delays of communication and computation time and by the jitter of the control period. A modified PID control algorithm and a predictive control algorithm are applied, and the effectiveness of these algorithms is analyzed and verified by simulation. Chapter 6 proposes an object-oriented, hierarchical, and hybrid modeling approach for FCS development. Based on this approach, a simulation platform for the FCS, including fieldbus devices, fieldbus segments, and the plant, is developed in the Matlab environment. In this simulation, both the computation times of application software and the communication delays are considered at the same time, and both control and communication performance are evaluated through simulation runs. Chapter 7 illustrates the development of two fieldbus systems by describing both the hardware and software of each system. The contributions of this thesis are summarized with directions for future work in Chapter 8. Keywords: Fieldbus, Fieldbus control system (FCS), Timing analysis, Control algorithm, Modeling and Simulation, Evaluation

0161203 橘 拓至 「Studies on Performance Analysis of Network Architectures for Wavelength Division Multiplexing (波長分割多重方式におけるネットワークアーキテクチャの性能解析に関する研究)」

Abstract: Wavelength division multiplexing (WDM) is attractive as the infrastructure of the next-generation Internet, since it supports huge bandwidth by multiplexing several wavelengths onto a single optical fiber. In order to transmit Internet data over WDM networks, network architectures based on photonic technologies, such as wavelength routing, optical burst switching (OBS), and optical packet switching (OPS), have been studied and developed.
Currently, it is difficult to store data in the optical domain because optical random access memory is still in the development phase. It is also well known that wavelength conversion is quite expensive, while the conversion capability is quite limited. Therefore, it is important to develop WDM network architectures under these constraints, and the evaluation of performance measures such as loss probability, throughput, and wavelength utilization plays a crucial role in quantitatively characterizing the developed network architectures.
This dissertation focuses on two network architectures: the wavelength routing and OBS. The dissertation firstly considers the wavelength routing network with dynamic lightpath configuration where a lightpath supports multiple label switched paths (LSPs). In this network, lightpaths are established according to the congestion state of a node and are released after some holding time. For the performance evaluation, the system is modeled as a multiple queueing system in light and heavy traffic cases, and the connection loss probability and wavelength utilization factor are derived. Numerical examples show that the analytical models in both cases are effective for the performance evaluation in comparison with the simulation results, and show how the holding time affects the connection loss probability.
Secondly, to provide multiple service classes for the connection loss probability, QoS-guaranteed wavelength allocation and shared wavelength allocation are proposed. In the first method, a pre-determined number of wavelengths is allocated to each QoS class depending on the priority of its loss probability; here, the wavelength set for a QoS class is a proper subset of the sets for higher classes. In the second method, wavelengths are classified into multiple dedicated wavelength sets and a shared wavelength set that is utilized by all classes. Both methods can be utilized in wavelength routing networks with limited-range wavelength conversion; both are modeled and analyzed with queueing theory, and numerical examples show their effectiveness.
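As a point of reference for the loss probabilities analyzed here, the simplest special case, a single link with w wavelengths, Poisson connection arrivals, and no wavelength conversion constraints, reduces to the classical Erlang B formula. The sketch below computes it with the standard recursion; it is an illustration only, not the dissertation's multi-class queueing analysis.

```python
# Illustrative baseline, not the dissertation's analysis: for a single link with
# w wavelengths and Poisson connection arrivals (offered load a Erlangs, full
# wavelength conversion), the blocking probability is given by Erlang B.
def erlang_b(a, w):
    b = 1.0
    for n in range(1, w + 1):
        b = a * b / (n + a * b)   # stable recursion for B(n) from B(n-1)
    return b

# Example: 8 wavelengths, offered load 4 Erlangs.
print(f"{erlang_b(4.0, 8):.4f}")   # ~0.0304 blocking probability
```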
Finally, the dissertation considers timer-based burst assembly and slotted transmission scheduling for the OBS network. In this method, bursts are assembled in a round-robin manner and are transmitted according to slotted scheduling. A loss model with two independent arrival streams is constructed for the performance evaluation, and the burst loss probability, burst throughput, and data throughput are explicitly derived. The usefulness of the analysis is discussed with several numerical examples.
Currently, WDM networks are being deployed worldwide, and research toward the realization of an all-optical Internet has become more active than ever before. The author expects that the proposed methods and their performance analysis will be of significant use in constructing WDM networks for the next-generation Internet.
Key words: WDM Network, Queueing analysis, Wavelength routing, Optical burst switching, Performance evaluation, Wavelength conversion

0261014 下畑 光夫 「Acquiring Paraphrases from Corpora and Its Application to Machine Translation (コーパスからのパラフレーズの獲得とその機械翻訳への適用)」

Abstract: Natural language contains paraphrases: expressions that differ on the surface but share the same meaning. The ability to express the same meaning in various ways demonstrates the expressive richness of natural language, but it is also one factor that degrades the performance of natural language processing. Language resources such as thesauri have been constructed to cope with this, but because they describe general-purpose information they are not sufficiently effective for specific domains or applications.
The goal of this thesis is the acquisition of paraphrases from corpora and their application to machine translation. The thesis proposes two kinds of methods: lexical paraphrase acquisition and sentence-level paraphrase acquisition. Both methods are based on surface-level processing and require no resources other than corpora. The thesis covers three topics: analysis of manual paraphrasing, lexical paraphrase acquisition, and similar-sentence retrieval.
Regarding manual paraphrasing, two studies are reported. They examine (1) which types of paraphrases occur most often in manual paraphrasing, and (2) how much benefit is obtained when manually paraphrased input is given to a machine translation system.
Next, a method for acquiring lexical paraphrases from a parallel corpus is described. The proposed method has the following features: (1) it can acquire paraphrases related to function words; and (2) it can acquire paraphrases that are synonymous not only from a monolingual viewpoint but also from a bilingual viewpoint. By replacing the paraphrases found in a corpus with a single expression, the text in the corpus can be simplified, and this simplification improves the performance of corpus-based machine translation.
Finally, a method for retrieving similar sentences (sentence-level paraphrases) from a monolingual corpus is described. The proposed method uses monolingual corpora, which are easy to collect, and can acquire paraphrases beyond the lexical level. Comparative experiments were conducted on three major approaches, and a method based on common N-gram parts was adopted for similar-sentence retrieval. The method is applied to machine translation in two ways. One is pre-editing of input sentences: when a given input sentence cannot be translated, a translatable similar sentence is retrieved and passed to the machine translation system, making it possible to translate otherwise untranslatable sentences. The other is use in similarity computation for example-based translation. The method achieves retrieval that is robust to long sentences and differences in style, and achieves wider coverage than conventional methods.
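The adopted retrieval method scores sentences by their shared N-gram parts. The following minimal sketch uses my own simplified scoring, a Dice coefficient over word bigrams, not the thesis's exact formulation, to show the idea of retrieving a translatable similar sentence for a given input.

```python
# Minimal sketch (scoring is an assumption, not the thesis algorithm): rank
# corpus sentences by the overlap of word N-grams shared with the input.
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def similarity(a, b, n=2):
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))   # Dice over shared N-grams

def retrieve(query, corpus, n=2):
    scored = [(similarity(query.split(), s.split(), n), s) for s in corpus]
    return max(scored)   # most similar (score, sentence) pair

corpus = ["i would like to book a room",
          "please cancel my reservation",
          "i would like to reserve a table"]
print(retrieve("i would like to book a table", corpus))
```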

0261007 岩垣 剛 「Studies on Design for Delay Testability and Delay Test Generation for Sequential Circuits」

Abstract: VLSI (Very Large Scale Integration) circuits are basic components of today's complex digital systems. In order to realize dependable digital systems, VLSI circuits should be highly reliable, and VLSI testing plays an important role in satisfying this requirement. VLSI testing checks whether faults exist in a circuit, and it consists of two main phases: test generation and test application. In test generation, a test sequence, an input sequence to detect faults in a circuit, is generated. In test application, the generated test sequence is applied to the circuit. Conventional testing deals with stuck-at faults only. For today's high-speed VLSI circuits, testing for stuck-at faults is not sufficient to guarantee the timing correctness of the circuits. Delay testing, which checks whether delay faults exist in a circuit, is an important technology for guaranteeing timing correctness. Delay faults in a combinational circuit require two-pattern tests to detect; for delay faults in a sequential circuit, we need a test sequence whose length is two or more. Test generation for sequential circuits under simple fault models such as the single stuck-at fault model is generally a hard task, and delay test generation for sequential circuits is an even more challenging problem. For such sequential circuits, design for testability (DFT) is an important approach to reducing test generation complexity. Fully enhanced scan design has been proposed as a straightforward DFT method for delay faults. In this design, every flip-flop (FF) is replaced by an enhanced scan FF (ESFF), and each ESFF can store an arbitrary two bits to apply two consecutive vectors. For a sequential circuit designed with this technique, a combinational delay test generation algorithm can be used to generate a test sequence for the original circuit. Therefore, high fault efficiency, defined as the ratio of the sum of the number of detected faults and identified untestable faults to the total number of target faults under test generation, can be achieved with short test generation time. However, this method has the following disadvantages: (1) the hardware overhead caused by the extra memory elements of ESFFs is very high, (2) the test application time is long, and (3) test application at the rated speed of a given circuit, called at-speed testing, cannot be performed. In this dissertation, we try to overcome these disadvantages by using two different approaches: a partially enhanced scan approach and a non-scan one. We first propose a delay test generation method based on a partially enhanced scan technique. A new structure of sequential circuits, called the discontinuous reconvergence structure (DR-structure), is presented, which has easy testability for delay faults. We show that the delay fault test generation problem for sequential circuits with DR-structure can be reduced to that for their time-expansion models, which are combinational circuits. On the basis of this reducibility, we present a test generation method for delay faults in sequential circuits with DR-structure. Our method can be applied to several delay fault models. In order to apply our method to general sequential circuits, we use a partially enhanced scan technique. This method can contribute to alleviating disadvantages (1) and (2) above. Experimental results show that the proposed method is effective in terms of hardware overhead, test generation time, and fault efficiency.
Next, we propose a non-scan design scheme to enhance the delay testability of sequential circuits synthesized from state transition graphs (STGs). In this scheme, we utilize a given STG to test delay faults in its synthesized sequential circuit; the original behavior of the STG is used during test application. For faults that cannot be detected using the original behavior, we design extra logic, called an invalid test state and transition generator, to make those faults detectable. Our scheme achieves short test application time and at-speed testing, and it can contribute to alleviating disadvantages (2) and (3) above. We show the effectiveness of our method by experiments. Keywords: sequential circuit, delay fault, test generation, partially enhanced scan design, non-scan design, at-speed test

0361006 大杉 直樹 「A Framework for Software Function Recommendation based on Collaborative Filtering」

Abstract: High-Functionality Applications (HFAs) contain a large number of functions. However, most HFA users use only a few of them and are unaware of many useful ones. To let users discover useful but unknown functions efficiently, this dissertation proposes a framework for software function recommendation based on Collaborative Filtering (CF). The proposed framework includes an abstract design of a function recommendation system and an automated process for producing recommendations, as well as system implementation techniques and new CF algorithms. To produce a recommendation for a target HFA user, histories of software function executions (called usage histories) are first collected from many HFA users via the Internet. Next, similarities among users are calculated using each user's function execution frequencies. Then, the potential execution frequencies of the functions the target user is unaware of are predicted from the already known frequencies of similar users. Finally, a list of functions ranked by their potential frequency is given to the target user as a recommendation. Since this framework does not require any previously constructed "user model" to make a recommendation, it is easily applicable to many HFAs. Typically, a CF algorithm consists of a similarity computation algorithm and a prediction algorithm. This dissertation describes three simple CF algorithms (lacking similarity computation), ten similarity computation algorithms including two new ones, and seven prediction algorithms. The prediction accuracies of these CF algorithms were empirically evaluated using usage histories collected from 23 Microsoft Office application users. To evaluate recommendation accuracy, ARE (Average Relative Error) and NDPM (Normalized Distance-based Performance Measure) were used. The results showed that the average NDPM and average ARE of all the CF algorithms were better than those of randomly generated recommendations. In particular, the two proposed similarity computation algorithms (Rank Correlation and Magnitude Relation) and two conventional similarity computation algorithms (Adjusted Cosine Similarity with Average and Adjusted Cosine Similarity with Median) outperformed the other six algorithms. Also, the prediction algorithms Weighted Sum, Adjusted Weighted Sum with Average of a Column, Adjusted Weighted Sum with Median of a Column, Adjusted Weighted Sum with Average of Neighbors, and Adjusted Weighted Sum with Median of Neighbors outperformed the other prediction algorithms. Keywords: Recommender System, Filtering Algorithms, Usage History, HFA (High Functionality Application), and CSCW
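The recommendation pipeline described above, similarity from execution frequencies followed by a prediction step, can be sketched briefly. The code below uses plain cosine similarity and the Weighted Sum predictor; it is a minimal illustration under those assumptions, not the dissertation's implementation, which compares many similarity and prediction algorithms.

```python
# Minimal user-based CF sketch in the spirit of the framework: similarities from
# per-user function-execution frequencies, predictions by Weighted Sum. Plain
# cosine similarity is used here for brevity.
import numpy as np

def cosine_sims(freq, target):
    """freq: (users, functions) execution counts; target: target user's row."""
    norms = np.linalg.norm(freq, axis=1) * np.linalg.norm(freq[target])
    sims = freq @ freq[target] / np.where(norms == 0, 1, norms)
    sims[target] = 0.0          # exclude the target from their own neighbors
    return sims

def recommend(freq, target, top_k=3):
    sims = cosine_sims(freq, target)
    # Weighted Sum: predicted frequency = similarity-weighted mean over users.
    pred = sims @ freq / max(sims.sum(), 1e-12)
    unused = freq[target] == 0   # only recommend functions the user hasn't run
    ranked = np.argsort(-pred * unused)
    return ranked[:top_k], pred

usage = np.array([[9, 0, 3, 0],    # target user: never ran functions 1 and 3
                  [8, 5, 2, 0],
                  [7, 4, 3, 6]], dtype=float)
funcs, _ = recommend(usage, target=0, top_k=2)
print(funcs)   # function indices ranked by predicted execution frequency
```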

9761210 野本 忠司 「Machine Learning Approaches to Rhetorical Parsing and Open-Domain Text Summarization (機械学習による修辞構造解析および領域非限定文書要約)」

Abstract: This presentation examines several methods that apply machine learning to rhetorical structure analysis of Japanese and to automatic text summarization; owing to time constraints, the discussion focuses mainly on the latter.
In rhetorical structure analysis, we mainly address the identification of rhetorical dependency relations between sentences and of rhetorical types. Using Nikkei newspaper articles manually annotated with dependency relations and types as benchmark data, we introduce probabilistic decision trees and examine their predictive performance on both identification tasks. As an extension, we also briefly touch on committee-based active learning with randomly generated decision trees.
Text summarization can pursue effectiveness either in an extrinsic task not directly related to summarization or in an intrinsic task (that is, producing human-quality summaries). This work approaches the summarization task from each standpoint with different machine learning perspectives, including unsupervised and supervised learning.
As the extrinsic task we consider document retrieval. On this view, retrieval effectiveness determines the quality of a summary: rather than being natural as text, a summary need only preserve the main information of the original document. We propose a clustering-based summarization method slightly extended with MDL (hereafter DBS) and verify its effectiveness on BMIR-J2, a benchmark dataset for document retrieval.
For the intrinsic task, we collect human judgments of sentence importance (hereafter JFD) and model the judgment data. Specifically, we compare a method based on probabilistic decision trees that learns the judgment data directly with the clustering-based method (DBS), which does not consult the judgment data at all, and examine the relation between their accuracy and inter-judge agreement.
Furthermore, noting that the distribution of the subjects' judgments is domain-specific, we consider generating summaries by statistically modeling the distribution itself; this approach is characterized by using only the distributional information to generate summaries. Using JFD as benchmark data, we compare its performance with knowledge-intensive learning methods such as kNN, Naive Bayes, and decision trees. Finally, we show that this approach naturally expresses agreement and disagreement, which are salient characteristics of human summarization.

9961202 北村 美穂子 「Practical Techniques for Improving Machine Translation through Parallel Corpora (対訳コーパスを利用した実用的な機械翻訳の研究)」

Abstract: Improvements in machine translation performance are determined by the quality and quantity of the translation knowledge the system possesses. To build a high-performance machine translation system, therefore, two technologies are essential: technology for constructing translation knowledge mechanically or semi-mechanically, rather than through conventional manual dictionary registration, and machine translation technology that allows the knowledge so constructed to be easily added and extended.
This thesis addresses these two technical issues using translation knowledge acquisition from bilingual corpora and pattern-based machine translation.
The first proposal is a pattern-based machine translation system with flexible feature handling. Conventional pattern-based machine translation imposed many restrictions on the description of translation rules, making manual intervention difficult and tuning of translation results hard. In the proposed pattern-based machine translation, all translation knowledge is expressed in a pattern format that humans can easily understand. Patterns can be written by hand, and can also be created mechanically from bilingual corpora.
The second proposal is a method for automatically extracting the translation patterns used by the above pattern-based machine translation. Translation patterns are extracted automatically from a sentence-aligned bilingual corpus by aligning co-occurring word sequences in the source and target languages. The strength of correspondence between word-sequence pairs is computed statistically, and pairs are extracted in descending order of correspondence strength. By setting a frequency threshold and gradually lowering it while enlarging the set of candidates, the combinatorial explosion of word sequences can be suppressed and translation patterns can be extracted with high precision.
The third proposal builds on the second method and acquires translation patterns using a bilingual dictionary and the syntactic information of the sentences in the bilingual corpus. With practical use in mind, this method also introduces a procedure for humans to efficiently verify the extraction results. Compared with the second method, the third method extracts translation patterns with higher precision and higher coverage.
Using these two methods, translation patterns for technical terms and idiomatic expressions consisting of contiguous word sequences can be acquired. However, more complex translation patterns, such as patterns with variable parts inside contiguous word sequences or translation rules for word-sense selection, cannot be acquired this way. Finally, therefore, a method for automatically acquiring translation rules using statistical techniques and structural matching is proposed, where structural matching aligns bilingual sentences at the dependency-structure level.
By combining the above four technologies, it becomes possible to build a practical machine translation system with translation knowledge acquisition capability.
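The second proposal's statistical extraction loop, score co-occurring word-sequence pairs, extract the strongest first, and gradually lower the frequency threshold, might be sketched as follows. The Dice coefficient is my assumed correspondence measure, and single words stand in for word sequences; the thesis's actual procedure is more elaborate.

```python
# Hedged sketch of the statistical extraction idea (Dice coefficient and the
# greedy threshold-lowering loop are simplified readings of the description).
from collections import Counter

def extract_pairs(bitext, thresholds=(3, 2)):
    src_cnt, tgt_cnt, pair_cnt = Counter(), Counter(), Counter()
    for src, tgt in bitext:                       # sentence-aligned corpus
        for s in set(src.split()):
            src_cnt[s] += 1
        for t in set(tgt.split()):
            tgt_cnt[t] += 1
        for s in set(src.split()):
            for t in set(tgt.split()):
                pair_cnt[(s, t)] += 1
    extracted, used_s, used_t = [], set(), set()
    for th in thresholds:                         # gradually lower the threshold
        scored = sorted(
            ((2 * c / (src_cnt[s] + tgt_cnt[t]), s, t)
             for (s, t), c in pair_cnt.items() if c >= th),
            reverse=True)
        for dice, s, t in scored:                 # strongest correspondences first
            if s not in used_s and t not in used_t:
                extracted.append((s, t, round(dice, 2)))
                used_s.add(s); used_t.add(t)
    return extracted

bitext = [("red wine", "vin rouge"), ("red car", "voiture rouge"),
          ("white wine", "vin blanc")]
print(extract_pairs(bitext, thresholds=(2, 1)))
```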

0261010 栗田 雄一 「CPG-based Rhythmic Manipulation for a Multi-Fingered Hand from Human Observation (人の操作計測に基づく多指ハンドによるCPGベース型リズミックマニピュレーション)」

Abstract: This research proposes CPG-based manipulation founded on measurements of human rhythmic manipulation.
Humans can grasp unknown objects appropriately and, by coordinating their fingers, perform precise and complex manipulation. When we look to the movements of humans and animals, which adapt autonomously in the real world, as a model for realizing dexterous manipulation like that of the human hand, it is known that rhythm-generating mechanisms (central pattern generators, CPGs) are deeply involved in the periodic movements of living organisms. Recent studies report that rhythmic movements emerge in human finger manipulation as skill is acquired, so CPG-based control can be an effective approach to adaptive manipulation.
This thesis therefore aims to realize dexterous manipulation by applying CPG-based control to multi-fingered hand manipulation; it investigates the rhythmic properties of human finger movements and proposes and verifies CPG-based manipulation with a multi-fingered hand.
First, focusing on the grasp force planning performed prior to grasping movements, grip/load forces and finger electromyograms were measured simultaneously during grasping. Humans are said to move efficiently by computing the force required for a grasp before the movement; force planning is examined through measurements of subjects grasping objects of several weights.
Next, focusing on the rhythmic movements observed in human object manipulation, movement patterns in the task of rotating a cylindrical object in the air were analyzed from contact information between the fingers and the object. From the contact states, rhythmic movement patterns, which might be called finger gaits, are represented as typical contact patterns, and the rhythmic movements in human object rotation are examined on that basis.
In light of these results, the motion of each finger in the rotation task, formalized as typical contact patterns, is generated by a CPG. By issuing the movement start command for each finger at appropriate times according to the CPG-generated pattern, a multi-fingered hand can perform the rotation task. To confirm the effectiveness of the proposed method, a simulator was built, and it was confirmed that CPG-based control enables object rotation with a multi-fingered hand model whose range of motion is similar to a human hand's.
Furthermore, while the period, amplitude, and phase of the rhythmic CPG output are determined by its parameters, the output period can be adapted through entrainment to appropriate external inputs. By feeding joint angle information into the feedback term of the neural oscillator model, a method is proposed for adaptively changing the regrasping period from the changes in finger joint angles that accompany object manipulation. In addition, by mapping the output of the neural oscillator to the target rotational velocity of the object, the target rotational velocity can always be varied smoothly even when the object diameter is unknown and the end point of the rotation cannot be computed. Experiments with a multi-fingered hand system confirmed the effectiveness of the proposed method.
The findings of this research clarify the rhythmic properties of human finger movements and demonstrate the effectiveness of CPG-based manipulation, leading to one established approach to adaptive manipulation that responds to objects. These results are expected to provide useful guidelines toward the grand goal of robot hand research: realizing dexterous manipulation like that of human fingers with a multi-fingered hand.
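The CPG used for generating finger motion commands is typically realized by mutually inhibiting neural oscillators. The sketch below integrates a standard Matsuoka-style two-neuron oscillator with a feedback input hook; the model form is standard, but all parameter values and the feedback coupling are illustrative assumptions, not those of the thesis.

```python
# A standard Matsuoka-style two-neuron oscillator, integrated with forward
# Euler, as a generic stand-in for the CPG used in the thesis (parameters and
# the feedback term are illustrative assumptions, not measured values).
def matsuoka(steps=4000, dt=0.005, tau=0.05, tau_p=0.3, beta=2.5, w=2.0, s=1.0,
             feedback=lambda t: 0.0):
    u = [0.1, 0.0]   # membrane states (small asymmetry starts the oscillation)
    v = [0.0, 0.0]   # adaptation (fatigue) states
    out = []
    for k in range(steps):
        y = [max(0.0, ui) for ui in u]
        for i in range(2):
            j = 1 - i
            du = (-u[i] - w * y[j] - beta * v[i] + s
                  + (1 if i == 0 else -1) * feedback(k * dt)) / tau
            dv = (-v[i] + y[i]) / tau_p
            u[i] += du * dt
            v[i] += dv * dt
        out.append(y[0] - y[1])   # oscillator output drives one finger's phase
    return out

sig = matsuoka()
print(min(sig), max(sig))   # a sustained rhythmic signal around zero
```

Entraining the period through the feedback argument, as the thesis does with joint angle information, amounts to injecting a sensed periodic signal there and letting the oscillator lock onto it.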

0261033 櫻井 滋 「Structural basis for recruitment of flap endonuclease-1 to PCNA (フラップエンドヌクレアーゼ1とPCNAの相互作用の構造的基礎)」

Abstract: Flap endonuclease-1 (FEN1) plays crucial roles in the removal of RNA-DNA primers during DNA synthesis and in the removal of flap DNA during DNA repair. Proliferating cell nuclear antigen (PCNA), the DNA clamp protein, binds FEN1 and stimulates its endonuclease activity. The structural basis of the FEN1-PCNA interaction was revealed by analyzing the crystal structure of the complex between human FEN1 and PCNA. The main interface involves the C-terminal tail of FEN1, which forms two β-strands connected by a short helix, the βA-αA-βB motif, participating in β-β and hydrophobic interactions with PCNA. These interactions are similar to those previously observed for the p21CIP1/WAF1 peptide. However, this structure of the full-length enzyme has revealed additional interfaces involving the core domain. The interactions at these interfaces maintain the enzyme in an inactive orientation and might be utilized in rapid DNA tracking by preserving the central hole of PCNA for sliding along the DNA. A hinge region between the core domain and the tail may allow switching between FEN1 and other enzymes bound to the same PCNA ring. Key words: Flap endonuclease, DNA clamp, repair, replication, X-ray

0261003 井垣  宏 「On Developing Integrated Services of Networked Home Appliances」

Abstract
Background: Recent advances in processors and networks have made it possible to network various home electric appliances, including TVs, air conditioners, lights, DVD players, and refrigerators. A system consisting of such networked home appliances is generally called a Home Network System (HNS). A major HNS application is the integrated service of networked home appliances (hereafter simply HNS integrated service). An HNS integrated service orchestrates different home appliances to provide more comfortable and convenient living for users, and it is counted among the next-generation value-added services in the ubiquitous computing environment. This dissertation presents two main contributions to the effective development of HNS integrated services.
Design and Implementation with Service-Oriented Architecture: First, we propose a novel framework for designing and implementing HNS integrated services, characterized by extensive use of the Service-Oriented Architecture (SOA). The conventional approach to implementing HNS integrated services adopts the Server Centralized Architecture (SCA), where an intelligent home server controls all the networked appliances in a centralized manner: to achieve each integrated service, the home server sends control commands to the appliances in a certain order. The mechanism of an SCA-based HNS is thus quite intuitive; however, due to its centralized nature, it suffers from concentration of load as well as decline in reliability, interoperability, and system extensibility. To cope with this problem, we use the SOA to implement the HNS integrated services. In the proposed framework, the appliances export their own features as services. By introducing a new concept, the service layer, on top of the proprietary interface of each appliance, an appliance can autonomously execute the services exported by any other appliance through a standard procedure. The appliances are thus loosely coupled via the exported services, which enables more flexible, balanced, and reliable HNS integrated services. We present a framework for designing and implementing the integrated services based on SOA, then illustrate a prototype system developed with Web services. We also propose a graph-based method to evaluate an HNS from the viewpoints of reliability, workload, and coupling; with the proposed evaluation metrics, we conduct a comparative evaluation of the proposed and conventional systems.
Feature Interaction Analysis in HNS Integrated Services: The second contribution of the dissertation is to address the problem of feature interactions in HNS integrated services. A feature interaction generally refers to an inconsistent conflict between multiple services that is never expected from the single services' behaviors. This problem was originally studied in the area of telecommunication services, but it occurs in HNS integrated services as well, since multiple home users may activate different integrated services simultaneously. Feature interactions decrease the total quality of the services and can even bring the system down. Therefore, efficient detection and resolution of feature interactions are indispensable for guaranteeing the safe and comfortable living of home users. However, feature interactions in this emerging domain have not yet been well studied.
Our goal is to formulate the feature interaction problem for HNS integrated services and propose an efficient analysis method for it. In the formulation, we first model each appliance as an object consisting of properties and methods, where a property represents an internal state of the appliance and a method abstracts a feature. Each method consists of a pre-condition and a post-condition, and refers to or updates the values of some properties. Similarly, we construct a model of the home environment. Within this model, we formalize two types of feature interactions: device interactions and environment interactions. A device interaction is a direct conflict among features of the same appliance; it arises when multiple HNS integrated services simultaneously trigger methods that update an appliance property in different ways. An environment interaction, on the other hand, is an indirect conflict among different appliances via the HNS environment; it arises when multiple HNS integrated services simultaneously trigger different appliances so that they attempt inconsistent updates of an environment property. Based on this formulation, we implement a feature interaction detection system and conduct a case study of interaction detection for practical HNS integrated services. The proposed framework is quite generic and applicable to any HNS, including both SCA-based and SOA-based systems. We also discuss several resolution schemes for the detected interactions.
Overview of Dissertation: This dissertation is organized as follows. Chapter 1 summarizes the background and related topics and outlines the dissertation. Chapter 2 describes preliminaries for HNS integrated services with practical example scenarios. Chapter 3 proposes the SOA-based framework for the design, implementation, and evaluation of HNS integrated services. Chapter 4 discusses the feature interaction problem in the domain of HNS integrated services. Finally, Chapter 5 concludes the dissertation with a summary and future work. Keywords: Integrated services, Service Oriented Architecture, Feature Interaction and Home Network
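The two interaction types formalized above admit a compact illustration. In the sketch below, each service is reduced to the appliance and environment properties it updates, and a conflict is flagged when two services write different values to the same property; all service and property names are invented, and the thesis's pre/post-condition model is richer than this.

```python
# Hedged sketch of device and environment interaction detection; method and
# property names are invented examples, and the real system models pre/post
# conditions more richly than these simple "property -> value" updates.
def detect_interactions(services):
    """services: {name: {'device': {prop: value}, 'env': {prop: value}}}"""
    conflicts = []
    names = list(services)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for kind in ("device", "env"):   # device vs. environment interaction
                pa, pb = services[a][kind], services[b][kind]
                for prop in pa.keys() & pb.keys():
                    if pa[prop] != pb[prop]:  # inconsistent updates of one property
                        conflicts.append((kind, prop, a, b))
    return conflicts

services = {
    "DVD-Theater":  {"device": {"TV.input": "DVD"},  "env": {"brightness": "low"}},
    "Video-Phone":  {"device": {"TV.input": "CAM"},  "env": {}},
    "Reading-Mode": {"device": {},                   "env": {"brightness": "high"}},
}
for c in detect_interactions(services):
    print(c)
# ('device', 'TV.input', 'DVD-Theater', 'Video-Phone')
# ('env', 'brightness', 'DVD-Theater', 'Reading-Mode')
```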

0361004 内田 眞司 「Evaluating software maintainability based on code clone analysis and overhaul (コードクローン分析とオーバーホールによるソフトウェア保守性評価)」

Abstract: Maintenance of legacy, large-scale software is a quite expensive and difficult task, since many updates and extensions have been applied to the software. Restructuring such large-scale software is a promising approach to reducing future maintenance effort. However, restructuring requires criteria for whether the restructuring should be performed and whether its result is sufficient. To derive such reasonable criteria, it is necessary to quantitatively evaluate the maintainability of the software, including its understandability and modifiability. This dissertation presents two methods for the quantitative evaluation of software maintainability. In Chapter 2, I propose an evaluation method for software modifiability that makes extensive use of code clone analysis. A code clone is a pair of source code fragments in which one fragment is identical or quite similar to the other, and it is recognized as a major cause of decline in software modifiability. However, few empirical studies have clarified the quantitative characteristics of code clones, for instance, how many clones are usually contained in well-maintained systems, or why the clones are produced. In this chapter, I conduct an extensive analysis of code clones using 125 packages of open source software written in the C language. Based on the results, I present a guideline for the allowable production of code clones. In Chapter 3, I propose "software overhaul" for evaluating the understandability of software. In a software overhaul, the target software code is first de-constructed into pieces; then, given the de-constructed software, a programmer restores it to the original software. By analyzing the restoration process, issues with the understandability of the software are identified. In this chapter, I conduct an empirical study on software debugging in which subjects debug either overhauled or non-overhauled programs. The results show that the time taken to debug overhauled programs is significantly shorter than that for non-overhauled programs, which implies that the software overhaul can be used to find issues in software understandability. Keywords: Software maintainability, Software modifiability, Software understandability, Code clone analysis, Software overhaul
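For intuition about what a code clone analysis detects, the toy detector below normalizes identifiers and literals and reports repeated windows of normalized lines. This is my simplification for illustration; the study itself relies on established clone detection over 125 C packages, not this code.

```python
# Toy clone detector (a simplification, not the study's tooling): normalize
# identifiers and literals, then report windows of consecutive lines whose
# normalized form repeats elsewhere in the file.
import re
from collections import defaultdict

def normalize(line):
    line = re.sub(r'"[^"]*"|\d+', 'LIT', line)            # mask literals
    return re.sub(r'\b[A-Za-z_]\w*\b',
                  lambda m: m.group() if m.group() in {'if', 'for', 'return',
                  'while', 'LIT'} else 'ID', line.strip())

def find_clones(source, window=3):
    lines = [normalize(l) for l in source.splitlines() if l.strip()]
    seen, clones = defaultdict(list), []
    for i in range(len(lines) - window + 1):
        key = tuple(lines[i:i + window])
        if seen[key]:
            clones.append((seen[key][0], i))   # (earlier start, clone start)
        seen[key].append(i)
    return clones

code = """
total = 0
for x in items:
    total += x
subtotal = 0
for y in prices:
    subtotal += y
"""
print(find_clones(code))   # the two summation loops normalize identically
```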

0261201 菊地 高広 「インターネットにおけるライフライン通信の実現に関する研究 (A Study on the Realization of Lifeline Communications on the Internet)」

Summary: As the Internet has spread, its importance as an infrastructure for various communication services has continued to grow. Lifeline communications such as emergency calls are now expected to be usable wherever an Internet connection is available. Moreover, rich multimedia support on the Internet makes it possible to provide lifeline communication services that are highly convenient for many people.
However, realizing on the Internet the lifeline communication functions conventionally provided by telephony raises several challenges. The first is destination resolution for connecting appropriately to the nearest relevant organization: in telephony, numbers such as 110 and 119 provide users with a simple means of access together with special handling to ensure an appropriate connection. The second is the handling of caller identification, which must prevent prank calls and spoofing and enable call-back. The third is the handling of the caller's geographic location: with 110 or 119 calls, the receiving side can obtain the address of the calling location when necessary, enabling emergency response and dispatch to the scene.
This thesis discusses these challenges for realizing lifeline communications on the Internet and, on that basis, proposes an Internet lifeline communication system. It organizes the problems of existing technologies and methods and the requirements for realization, and then proposes three models that solve the respective challenges: a geographic-location-based ENUM scheme for destination resolution, a caller geographic location certificate scheme, and a user information certificate scheme.
A lifeline communication system designed on these three proposed models was implemented, and demonstration experiments and performance evaluations were conducted. The results confirmed that the system is not only preferable to other conceivable schemes in terms of scalability, security, and privacy, but also sufficiently practical.

0261011 小枝 正直 「拡張現実感を用いたビークルの操縦支援に関する研究 (A Study on Operation Support for Vehicles Using Augmented Reality)」

(Summary) Research on robots and vehicles can be classified into autonomous and teleoperated types. Many studies have addressed autonomous robots, which perceive their environment and decide their own actions, and they have produced solid results. Realizing an autonomous robot, however, requires environment recognition, action planning, and motion control, and of these only motion control is expected to reach human-level capability as an information processing system within the next 20 years. Information processing for recognizing the external world, including image recognition, and for appropriate action planning remains a hard problem and a major obstacle to realizing autonomous robots. At present, the practical way to have robots perform non-routine tasks is human teleoperation.
Operating a vehicle requires, at minimum, the vehicle's own position and attitude and navigation information toward the destination, so an interface that conveys this information efficiently is important. If the interface can also be shared across different kinds of vehicles, improvements in operability and convenience can be expected.
In this research, an API for building information presentation interfaces for vehicle operation was developed. Interfaces built with this API graphically present the vehicle's position and attitude, a map of the surroundings, the route to the destination, the speed, the operator's head pose, and information about surrounding structures. Of these, the route and surrounding buildings are presented using augmented reality: because the information is overlaid on real images, it is conveyed to the operator efficiently and intuitively. Using this API, operation of three vehicles with different position/attitude sensing methods and display devices (an unmanned helicopter, an electric wheelchair, and an automobile) was attempted.
For the unmanned helicopter, position and attitude were measured with GPS and a gyroscope, and a head-mounted display (HMD) was used as the display device. The helicopter carried an omnidirectional camera, whose images were transmitted to the ground via an analog video transmitter or wireless LAN. A memory-based method for real-time perspective projection of omnidirectional images was proposed, and an immersive teleoperation system was built that presents images of the helicopter's surroundings to the operator with little time delay. Experiments with a simulator demonstrated the effectiveness of the system; a public experiment with the actual helicopter at the Ikoma City disaster drill demonstrated its feasibility; and experiments at the Heijo Palace site using an information presentation interface built with the developed API led to an operation support system for the unmanned helicopter.
For the electric wheelchair, a laser range finder (LRF) was used for position and attitude measurement and a monocular HMD as the display device. An accuracy evaluation of the LRF-based measurement first showed position error within 6 cm and attitude error within 1 deg, suitable for use in an augmented reality system. Combining this with stereo-camera-based measurement of the rider's head pose, a rider-carrying guide robot for indoor environments was built, demonstrating applicability to augmented reality systems. Experiments guiding a rider along a specified route confirmed that accurate indoor guidance is possible.
For the automobile, with practicality in mind, GPS/gyro-based position and attitude measurement and a head-up display were used, yielding a system capable of fully contactless measurement and information presentation for the driver.
As described above, the API was used with a variety of vehicles, measurement methods, display devices, and environments, demonstrating its versatility and applicability.

町田 貴史 「Dense Estimation of Surface Reflectance Properties based on Inverse Rendering」

Abstract: In augmented virtuality, it is important to estimate object surface reflectance properties in order to render objects under arbitrary illumination conditions. A number of methods exist for densely estimating the reflectance properties of object surfaces. However, it has been difficult to faithfully estimate surface reflectance properties for objects with interreflections. This thesis describes three new methods for densely estimating the nonuniform surface reflectance properties of real objects composed of convex and concave surfaces. Specifically, we use registered range and surface color texture images obtained by a laser rangefinder. The proposed methods determine the positions of the light source for capturing color images so as to discriminate the diffuse and specular components of surface reflection. The first method densely estimates local surface reflectance properties by inverse local illumination rendering. The second and third methods densely estimate reflectance parameters in the presence of diffuse and specular interreflections based on inverse global illumination rendering. Experiments show the usefulness of the proposed methods through comparison with conventional methods. Keywords: Inverse Rendering, Surface Reflectometry, Interreflections, Mixed Reality, Augmented Virtuality

0361206 駒井 知央 「Studies on high efficiency antenna for the digital mobile communications (デジタル移動通信におけるアンテナの高効率化に関する研究)」

Abstract: In the mobile communication environment, the received signal is attenuated by shadowing and multipath fading in addition to path loss. In order to ensure transmission quality, mobile communication systems require various techniques to combat shadowing and multipath fading. Among these, the antenna is one of the key components for improving transmission performance. This paper deals with efficient antennas for mobile communication systems, focusing on two systems: a base station antenna for a micro-cellular system, whose cell size is several hundred meters, and a vehicle antenna for mobile reception of digital terrestrial television broadcasting. First, this paper proposes a V-shaped antenna composed of two collinear array antennas. In urban areas, a cross-polarization component arises due to multipath propagation. The conventional diversity antenna, composed of two vertically aligned parallel collinear array antennas, is not efficient because it cannot make use of the cross-polarization component. In order to improve the gain of the diversity antenna, this paper proposes a V-shaped diversity antenna as a base station antenna for the micro-cellular system: each of the two collinear-array diversity elements is slightly tilted to make efficient use of the cross-polarization component. A field test shows that the proposed diversity antenna provides extra diversity gain over the conventional one. Since the proposed antenna does not require a horizontal support beam for the two collinear antenna elements, it is cost-effective, and its appearance suits the cityscape. This paper then introduces an efficient antenna attached to the rear window of a car for mobile reception of digital terrestrial television broadcasting. Since the size and placement of antennas on a vehicle are strictly limited, a conventional antenna is usually attached to the rear window. However, the heating wires, which are also placed on the rear window, decrease the antenna gain, as does the body of the car. To solve this problem, the proposed antenna makes efficient use of the heating wires as a reflector element, improving the directional gain toward the horizontal direction. The thesis is organized as follows. Chapter 1 introduces the technical requirements and problems of antennas for mobile communication systems in order to clarify the objective of this study. Chapter 2 derives the space-domain fading correlation characteristics in a multipath propagation environment for evaluating the performance of the diversity antennas proposed in the following chapters; the analysis method for the antenna characteristics is shown, along with the measured reception characteristics of signals propagated from a handheld PC. In Chapter 3, I propose a V-shaped diversity antenna for the base station of the micro-cellular system; computer simulation shows that the proposed V-shaped antenna provides extra diversity gain over the conventional one in a multipath fading environment. Chapter 4 introduces a rear-window-attached antenna for mobile reception of digital terrestrial television broadcasting; computer simulation and a field test show that the proposed antenna improves the directive gain toward the horizontal direction. Chapter 5 summarizes the concluding remarks.

0261017 西川 剛樹 「A Study on Blind Source Separation Based on Multistage Independent Component Analysis」

Abstract: A hands-free speech recognition system is essential for realizing an intuitive, unconstrained, and stress-free human-machine interface. In real acoustic environments, however, speech recognition performance is significantly degraded because the user's speech cannot be detected with a high signal-to-noise ratio (SNR) owing to interference signals such as noise. In this thesis, we introduce blind source separation (BSS), an approach for estimating the original source signals using only the information of the mixed signals observed in each input channel. Many BSS methods based on independent component analysis (ICA) have been proposed for acoustic signal separation. However, their performance degrades seriously, particularly under heavily reverberant conditions. ICA-based BSS can be classified into two groups in terms of the processing domain: frequency-domain ICA (FDICA) and time-domain ICA (TDICA). Experimental study of conventional FDICA shows that the source-separation performance saturates because the independence assumption collapses in each narrow band. In TDICA, convergence degrades because the iterative learning rule becomes more complicated as the reverberation increases. To resolve these problems, we newly propose multistage ICA (MSICA), in which FDICA and TDICA are cascaded: the separated signals of FDICA are regarded as the input signals for TDICA, and the residual crosstalk components of FDICA are removed by TDICA. Experimental results on convolutive speech mixtures reveal that the separation performance of the proposed method is superior to those of TDICA- and FDICA-based BSS methods. The original MSICA assumed a specific mixing model in which the number of microphones equals the number of sources. However, additional microphones are required to achieve improved separation performance, which leads to further problems, e.g., a more complicated permutation problem. To solve them, we propose a new extended MSICA using subarray processing, where the number of microphones and the number of sources are set to be the same in every subarray. Experimental results reveal that the separation performance of the proposed MSICA using subarray processing improves as the number of microphones increases. A speech recognition system requires not only a high SNR but also high speech quality. For speech signals, TDICA with a nonholonomic constraint must be used to avoid the decorrelation effect caused by the holonomic constraint; however, stability cannot be guaranteed in the nonholonomic case. To solve this problem, linear predictors estimated from the signals roughly separated by FDICA are inserted before the holonomic TDICA as prewhitening, and dewhitening is performed after TDICA. The stability of the proposed algorithm is guaranteed by the holonomic constraint, and the pre/dewhitening processing prevents the decorrelation. Moreover, to achieve stable learning and low distortion in the model where the number of microphones is larger than the number of sources, an extended learning algorithm is newly proposed, in which the distortion components are estimated via the holonomic constraint and the sound quality is compensated using the estimated components. Experimental results reveal that the proposed algorithm provides higher stability and higher separation performance than the conventional MSICA.
Keywords: hands-free, microphone array, blind source separation, frequency-domain independent component analysis, time-domain independent component analysis, convolutive mixture
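For readers unfamiliar with ICA-based BSS, the sketch below demonstrates the basic principle on the easy case of an instantaneous 2x2 mixture, using whitening plus a kurtosis-based FastICA iteration with deflation. This is an illustration of ICA itself, not of MSICA: the thesis's contribution addresses the much harder convolutive (reverberant) case by cascading FDICA and TDICA.

```python
# Illustration of the ICA principle behind BSS, using kurtosis-based FastICA
# on an *instantaneous* 2x2 mixture; the convolutive case treated in the thesis
# is not handled by this sketch.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 8000)
s = np.vstack([np.sign(np.sin(2 * np.pi * 7 * t)),       # two independent,
               rng.uniform(-1, 1, t.size)])              # non-Gaussian sources
x = np.array([[1.0, 0.6], [0.4, 1.0]]) @ s               # mixed observations

# Whitening: decorrelate and normalize the observations.
x -= x.mean(axis=1, keepdims=True)
d, e = np.linalg.eigh(np.cov(x))
z = np.diag(d ** -0.5) @ e.T @ x

# FastICA (cubic nonlinearity) with deflation to extract both components.
w_all = []
for _ in range(2):
    w = rng.normal(size=2)
    for _ in range(100):
        w = (z * (w @ z) ** 3).mean(axis=1) - 3 * w      # fixed-point update
        for wp in w_all:                                 # deflate: stay orthogonal
            w -= (w @ wp) * wp
        w /= np.linalg.norm(w)
    w_all.append(w)
y = np.vstack(w_all) @ z                                 # separated signals
print(np.round(np.corrcoef(np.vstack([y, s]))[:2, 2:], 2))
# near +/-1 on one permutation entry per row: sources recovered up to
# the scale and permutation ambiguity inherent in ICA.
```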

0261032 PRASAD RAJKISHORE 「A Study on Independent Component Analysis based Speech Signal Separation and Enhancement」

Abstract: Speech signal separation and enhancement in a blind setup are challenging problems of practical importance. Good solutions to these problems are required for spoken communication between man and machine in the real world. The problem of speech separation arises in the presence of multiple speakers, and that of enhancement pertains to reducing the effect of noise and other interfering signals. In real-world applications these two problems often occur simultaneously, and their solutions are urgently required for the development of a full-fledged conversational interface; the aims and scope of this work lie in the same context. Recently, Blind Signal Separation (BSS) based on Independent Component Analysis (ICA) has emerged as a potential engineering solution to the speech separation problem. Such algorithms assume the statistical independence of the sources and estimate the original sources as the independent, or least dependent, components. This thesis addresses the development and application of an ICA-based algorithm for the blind separation of convolutive mixtures of speech, observed by a two-element linear microphone array, under the over-determined situation. The proposed ICA algorithm is based on non-Gaussianization, by negentropy maximization, of the Time-Frequency Series of Speech (TFSS). ICA by non-Gaussianization rests on the heuristic idea, following the Central Limit Theorem (CLT), that mixed speech signals are more Gaussian than the individual signals; by reversing the mixing through non-Gaussianization, the individual signals can be estimated up to arbitrary scale and permutation. In such a framework, a cost function is required to measure the degree of non-Gaussianity, and the maximally non-Gaussian signals are taken as the independent components, i.e., the original sources. There are various measures of non-Gaussianity, such as kurtosis, entropy, and negentropy; negentropy provides much better robustness to outliers and is widely used. However, directly measuring negentropy is cumbersome, so it is approximated in terms of cumulants or non-linear functions. This thesis investigates various approximations of the negentropy of the TFSS by higher-order statistics of non-linear, non-quadratic functions, together with their separation performance. The choice of non-linear function for approximating the negentropy of the data depends on the data's statistical characteristics. A detailed study of the probability density of the TFSS is presented to test the relative proximity of the underlying distribution of the TFSS to the Gaussian distribution, the Laplacian distribution, and the Generalized Gaussian Distribution (GGD). The results of different statistical tests, such as the moment test, the Chi-square test, and Quantile-Quantile (QQ) plots, favor the closeness of the distribution of the TFSS to the GGD. Accordingly, a GGD-based non-linear function is proposed for negentropy approximation and used in the ICA algorithm. It is found that the proposed non-linear function gives a smaller error in approximating the negentropy than the conventional functions. The separation performance of the conventional and proposed non-linear functions has also been studied with a fixed-point Frequency Domain Independent Component Analysis (FDICA) algorithm, and the GGD-based non-linear function is found to improve the convergence rate of the algorithm.
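The negentropy approximation used as the ICA cost can be illustrated with the conventional nonlinearity. The sketch below computes J(y) ~ (E[G(y)] - E[G(v)])^2, with G(u) = log cosh(u) and v a standard Gaussian reference; the thesis's GGD-derived nonlinearity, its actual contribution, is not reproduced here.

```python
# Sketch of the negentropy approximation J(y) ~ (E[G(y)] - E[G(v)])^2 used as
# an ICA cost, with the conventional G(u) = log cosh(u); the thesis proposes a
# GGD-derived nonlinearity in place of G, which this sketch does not implement.
import numpy as np

def negentropy_approx(y, rng=np.random.default_rng(0), n_ref=100000):
    y = (y - y.mean()) / y.std()               # compare at zero mean, unit variance
    g = lambda u: np.log(np.cosh(u))
    gauss_ref = g(rng.normal(size=n_ref)).mean()   # E[G(v)], v standard Gaussian
    return (g(y).mean() - gauss_ref) ** 2

rng = np.random.default_rng(2)
print(negentropy_approx(rng.normal(size=50000), rng))     # ~0 for Gaussian data
print(negentropy_approx(rng.laplace(size=50000), rng))    # > 0 for speech-like data
```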
The problem of speech enhancement is also addressed in the frequency domain, where enhancement is performed by manipulating the spectral components of the noisy signal in accordance with a noise suppression rule. This thesis proposes a noise suppression rule based on Maximum A Posteriori (MAP) estimation. The proposed MAP estimator uses flexible statistical models, based on the GGD, for the TFSS of both the speech and the noise signals. The noise suppression rule is thus adaptive to the statistics of the noise and can be used to reduce the effect of different types of noise, such as Gaussian and super-Gaussian or spiky signals; its noise suppression characteristics depend on the type of noise. In contrast, most conventional methods, such as the Wiener filter, show the same noise suppression characteristics for Gaussian and spiky noise signals. The statistics of the noise signal and the clean speech signal are also estimated from the noisy signal: first, the statistics of the noise are estimated from the noise-only segments of the noisy signal, and these are used to estimate the statistics of the clean signal from the higher-order statistics of the noisy and noise signals. In order to demarcate the noise-only portions of the noisy signal, a novel voice activity detector based on a measure of the organization of the spectral components is proposed; negentropy is again used as the measure of organization, which differs between noise-only frames and noisy speech frames. Experimental results on the enhancement of speech contaminated by different noise signals show its superiority over the conventional Wiener filter. The flexibility of the noise suppression characteristics of the proposed MAP estimator makes it suitable for post-processing of the speech signals separated by FDICA algorithms. This problem is difficult in the sense that the residual noise is also speech: the signals separated by an FDICA algorithm contain residual components of the undesired sources. Since these residuals are speech-like noise, they can be further reduced by the proposed MAP estimator, using one separated component as the target speech and the others as noise sources. This post-processing requires knowledge of the level of residual noise present in the target speech, which can be determined from information about the noise reduction achieved by the FDICA algorithm; the method is not blind, however, as it requires the original contribution of each source to each microphone. The experimental results show that post-processing with the MAP estimator gives appreciable improvements in noise reduction.

0261029 森岡 涼子 「Studies on time series analyses for gene expression based on statistical methods (統計的手法に基づく遺伝子発現の時系列解析に関する研究)」

Abstract
There are ongoing attempts at the molecular level to understand living organisms, conserve the environment, and advance medicine. Whole genome sequences of several organisms have been determined, and cellular systems for processes such as transcription, translation, and protein-protein interaction have been modeled. Microarray technology enables us to measure the mRNA expression of a whole cell and to study transcriptional regulation. Two circumstances make microarray data analysis difficult: the data include many genes of unknown function, and the data are noisy. This thesis presents statistical methods for microarray time series data that contain noise.
First, paying attention to the temporal patterns of change in gene expression, feature extraction by clustering methods is discussed. Although various clustering methods have been proposed, the variation in their results, due to dependence on initial parameters and on noise intensity, makes it difficult to reach a consensus on standard methods. Using a mixture of principal component analyzers trained by variational Bayes inference, results were obtained that are as good as those of conventional k-means and that are stable. In addition, there is a positive correlation between the free energy, a statistical criterion obtained by the variational Bayes method, and a biological criterion. These results imply that the free energy can be used as an objective criterion for selecting clustering results.
Next, system identification is performed on the microarray time series. In conventional transcriptional regulatory network analyses, the structure of the network is fixed. In this study, the structure is instead assumed to be flexible, depending on environmental conditions and time, and the times at which the structure changes are estimated. By combining a linear dynamical system with a self-organizing map, an objective estimate of the times at which the structure changes and an intuitive understanding of the observed data can be obtained simultaneously. As a result, known cellular transition points could be detected, along with additional transition points. By examining the genes that react in the vicinity of the transition points, it was found that the genes involved differ even between phenomena regarded as the same. The results also allow the transitions to be interpreted when many genes of unknown function are related to them. Information about the genes related to the transitions is applicable to the experimental design of studies on cellular adaptation to the environment.
In this thesis, I have shown the usefulness of the statistical approach to microarray time series data. The statistical criteria and the results are useful for estimating gene function, understanding cellular adaptive systems, and designing experiments.

0361032 原 良昭 「Study on Quantitative Evaluation Methods of Physical Activity (身体活動の定量的評価手法に関する研究)」

Abstract: In Japan, the proportion of elderly people in the population is increasing year by year; by 2015, elderly people are expected to account for 25% of the population. Prolonging individual healthy life expectancy, that is, the time a person can live without becoming bedridden or suffering from dementia, is important for improving the quality of life. The Ministry of Health, Labour and Welfare recommends habitual physical activity of suitable intensity in daily life to increase healthy life expectancy, so methods that evaluate physical activity quantitatively rather than qualitatively are needed. In this thesis, I developed quantitative evaluation methods of physical activity and examined those methods.

Quantitative evaluation of physical activity from the acceleration signal: The tilt angle of the waist is obtained from the DC component of the acceleration signal measured along the longitudinal axis of the body at the waist. The intensity of the AC component is conventionally indicated by the sum of its absolute values; a high correlation has been reported between oxygen uptake and the sum of the absolute values of the AC component per unit time, and many researchers have evaluated physical activity from acceleration signals using this relation. I proved that the intensity of the AC component can also be estimated by the standard deviation of the signal, and proposed evaluating physical activity by the standard deviation of the acceleration signal instead of the sum of absolute values. I measured the acceleration signal along the longitudinal axis of the body at the waist of a 24-year-old male subject in daily life, compared the evaluation by the standard deviation with that by the sum, and showed the superiority of the proposed method.

Quantitative evaluation of muscular activity from the electromyogram: Among the activities of daily living (ADL) of elderly people, abilities related to movement, such as walking, decline comparatively early, and such movements mainly use the muscles of the legs. By quantitatively evaluating the activity of each muscle of the lower limb in daily life, it becomes possible to estimate the amount of muscular activity required to maintain movement-related abilities. The electromyogram (EMG) expresses the electrical activity of a muscle. The integrated EMG (iEMG) is obtained by full-wave rectification and smoothing of the EMG, and the average amplitude of the iEMG reflects the muscle contraction rate and strength. In this thesis, therefore, muscular activity is evaluated by the sum of the iEMG per unit time. However, to compare the sums of the iEMG across muscles and individuals, normalization is necessary, so I propose normalizing by the sum of the iEMG during level walking; muscular activity in daily life is then expressed as a ratio to the muscular activity during level gait. I measured the iEMG of four thigh muscles of a 24-year-old healthy male: the vastus medialis and the rectus femoris, which extend the knee joint, and the semitendinosus and the biceps femoris, which flex it. The activity of each muscle was evaluated by the proposed method, and the results were compared with each other. The results proved that muscular activities differ even among muscles belonging to the same functional group, suggesting that the activity of each muscle needs to be evaluated individually.

Estimation of maximum voluntary contraction from the mechanomyogram: Maximum voluntary contraction (MVC) differs between individuals and between muscles, so a target load in rehabilitation is specified as a ratio of MVC (%MVC). However, because an enormous muscular force occurs, measuring MVC may injure muscles and tendons, and when a subject has an injured muscle, pain makes exact measurement of MVC difficult. A method of estimating MVC is therefore needed. Muscular force is controlled by the recruitment, rate coding, and type of motor units (MUs). MUs are classified into slow-type MUs (ST-MUs) and fast-type MUs (FT-MUs). As muscular force increases, the number of recruited MUs increases, and FT-MUs are recruited only after all ST-MUs have been recruited; the %MVC at which FT-MUs begin to be recruited is specific to each muscle. The mechanomyogram (MMG) indicates the mechanical activity of a muscle, and it is reported that the amplitude of the MMG increases drastically when FT-MUs are recruited. I propose a method that estimates MVC from this drastic increase: I find the muscular force at which FT-MUs are recruited from the drastic increase of the MMG amplitude, and estimate MVC from that force. I measured the MVC of the biceps brachii muscle and its MMG during isometric contraction in five male subjects (22.8±2.6 years old), and calculated the root mean square (RMS) of the MMG. The RMS increased drastically at 26±3.7 %MVC. From this result, the MVC of the biceps brachii muscle can be estimated by multiplying the muscular force at which the RMS of the MMG drastically increases by about four. I confirmed that the measured MVC agreed closely with the MVC estimated by the proposed method.

From the above results, physical activity can be evaluated by the standard deviation of the acceleration signal measured along the longitudinal axis of the body at the waist, the activity of each muscle can be evaluated by the EMG, and MVC can be estimated easily from the MMG by the proposed method. Keywords: physical activity, acceleration signal, electromyogram, mechanomyogram, maximum voluntary contraction
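As a minimal illustration of the acceleration-based measure (a sketch under assumed values: the sampling rate, window length, and synthetic data below are not from the thesis), the following Python code computes both the conventional sum of absolute AC values and the proposed standard deviation per window:

```python
import numpy as np

def activity_measures(accel, fs=50, window_s=60):
    """Two per-window activity measures of a waist acceleration signal.

    accel : 1-D array of acceleration samples (longitudinal axis)
    fs    : sampling rate [Hz] (assumed value)
    """
    n = fs * window_s
    sums, stds = [], []
    for start in range(0, len(accel) - n + 1, n):
        w = accel[start:start + n]
        ac = w - w.mean()              # remove DC component (posture/tilt)
        sums.append(np.abs(ac).sum())  # conventional: sum of |AC|
        stds.append(ac.std())          # proposed: standard deviation
    return np.array(sums), np.array(stds)

# Toy usage with synthetic data standing in for a real measurement
# (10 minutes at 50 Hz).
rng = np.random.default_rng(0)
signal = 9.8 + 0.5 * rng.standard_normal(50 * 60 * 10)
s, d = activity_measures(signal)
print(np.corrcoef(s, d)[0, 1])  # the two measures are closely related
```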

0261013 實廣 貴敏 「Automatic Model Generation for Speech Recognition (音声認識のためのモデル自動生成法)」

Abstract: 現在,一般に用いられる音声認識技術は確率モデルに基づいており,音響モデル,言語モデルとも,大量な学習データベースから各パラメータを推定する.このとき問題になるのが,学習データ量とパラメータサイズの関係である.パラメータサイズが小さすぎると性能を十分得られず,学習データを生かしきることができない.また,パラメータサイズが大きすぎると,学習データに強く依存したモデルになり,逆に精度を落とすことになる.これは一般に過学習と呼ばれる.学習データに対し,適切なパラメータサイズを選択することは重要な問題である.
音響モデルとして現在の主流技術である音素環境依存型隠れマルコフモデル(hidden Markov Model, HMM)は,状態共有構造を学習データから推定し,作成することが一般的になっている.音素カテゴリを利用した音素決定木クラスタリングが手法として広く用いられている.一般的にはゆう度最大化(Maximum Likelihood, ML)規準を用いてクラスタリングを行う.しかし,パラメータ数が増加するにつれ,ゆう度は一般に増加する.過学習を起こしやすく,ML規準のみでは停止条件として使うことができない.本研究では,はじめに音響モデルの適切なパラメータサイズを自動推定することを目標とする.
過学習の問題を避けるために,情報量規準,中でも最小記述長規準(Minimum Description Length, MDL)またはBayesian Information Criterion (BIC)を用いた音素決定木クラスタリングが提案されている.情報量規準を用いることで,パラメータ数および学習サンプル数を考慮することができ,分割規準だけでなく停止規準として用いることができ,自動的にパラメータ数の決定に用いることができる.本研究では,MDL規準をゆう度最大化規準逐次状態分割法(Maximum Likelihood Successive State Splitting algorithm, ML-SSS)に適用することで,決定木クラスタリングでは扱うことのできない時間方向の状態長を各異音モデルごとに自動推定できるアルゴリズムを実現する.評価実験により,提案手法MDL-SSS法は自動的に分割停止可能で,従来法ML-SSS法に比べ,より適切な状態共有構造を得ることができることがわかった.
また,情報量規準は,簡単なモデルにおいて大量データに対し導出されたもので,HMMのような複雑なモデルを理論的に扱うことは不可能で,また,少量データではうまく働かない.そこで,近年,機械学習で提案されている変分ベイズ法を利用する.ML-SSS法に変分ベイズ法を適用し,少量データにおいても効率的なモデルを自動生成できる手法を提案する.評価実験では,従来法よりパラメータ数の少ないモデルで同程度以上の性能が得られた.
次に,言語モデルの精度向上を目指して,構文木から単語パターンを抽出してモデル化した言語モデルを提案する.単語N-gramモデルが局所的な単語連鎖のみを考慮したものであるのに対して,近年,構文解析そのものを言語モデルに用い,文全体の構造情報を利用するモデルが提案されている.しかし,対話文のように比較的短い文では,むしろ文節レベルでの構造が重要である.そこで,まず,前処理として単語連鎖などから,より扱いやすいパターンに変形し,trigramモデルとして用いる.さらに,構文木から各単語に木構造内で関連のある単語パターンを抽出してモデルとして用いる.評価実験から,変形単語trigramモデルにより大きな改善が得られ,さらに単語パターンモデルで若干の改善が得られた.また,単語パターンモデルは長い文に対し特に効果的であることが分かった.最後に,提案手法であるMDL-SSS法による音響モデルと単語パターン言語モデルの組合せで評価を行い,効果が得られることを確認できた.
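以下は,MDL規準による分割停止判定の骨子を示す最小のPythonスケッチである.数値はすべて説明用の仮定であり,論文のMDL-SSS法の実装そのものではない:

```python
import math

def description_length(log_likelihood, n_params, n_samples):
    """MDL = -log L + (k/2) log n (第2項がモデル記述長のペナルティ)."""
    return -log_likelihood + 0.5 * n_params * math.log(n_samples)

def accept_split(ll_before, ll_after, k_before, k_after, n_samples):
    """分割前後の記述長を比較し,短くなる場合のみ分割を受理する.
    ゆう度は分割で必ず増えるが,パラメータ増加のペナルティと釣り合わ
    なければ分割は停止する(ML規準単独では得られない停止条件)."""
    before = description_length(ll_before, k_before, n_samples)
    after = description_length(ll_after, k_after, n_samples)
    return after < before

# 説明用の仮の値: ゆう度はわずかに上がるがパラメータ数が倍増する場合,
# 記述長が増えるため分割は棄却される(False が表示される).
print(accept_split(ll_before=-10500.0, ll_after=-10480.0,
                   k_before=78, k_after=156, n_samples=20000))
```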

9961008 大谷 朗 「Formal and Computational Aspects of Phrase Structure Grammar in Japanese (日本語における句構造文法の形式的・計算的諸相)」

Abstract (in Japanese) 本論文は、文解析に関する情報処理の一研究であり、理論的な検証と実装の並行によって形式的妥当性と計算的厳密性を併せ持つ体系として構築された「NAIST 日本語句構造文法(JPSG)」について、その理論的諸相を中心に論じるものである。JPSGは、主辞駆動句構造文法(HPSG)に基づき、理論の射程を正しく認識した上で言語分析をふまえた原理の修正・拡張を提案している点では文法理論である。一方、そうした理論に抵触しない形式化を行なうことで、計算機処理を指向したスキーマ等を追加している点では実用文法の計算モデルでもある。このような独自の拡張を行なったJPSGの一番の特徴は、真に文法ベースの解析を行っている点にあるといえる。
近年の文法理論は、あらゆる言語情報の形式化に関して、タイプ付き素性構造によって表現されたオントロジー、一律のデータ構造、そしてその数学的な厳密さに基づくことで、モジュールの区別を問わない静的な制約記述を可能にする。またそれらを基盤として記述された文法は、別段アドホックな処理を追加することなく素性構造の単一化のみで駆動する。
このような理論に基づく文解析の意義は、単にその厳密な基盤に立脚することだけでなく、本来さまざまな言語的制約に関する見通しのよい知見を情報処理に反映させることにある。しかしながら、言語の形式的特性に関する研究の多くは、工学的、そして言語学的にも専ら統語を中心に行なわれる現状にあり、HPSGに基づく研究もその例外ではない。
こうした背景をもとに、JPSGは、工学的処理に言語学の知見を積極的に導入する際のいくつかの問題点として、(i)日本語に特徴的な助詞・助動詞の後置における「隣接性」、(ii)修飾表現からみた被修飾表現に対する「依存性」、(iii)構文間にみられる意味的共通性に関する「一様性」、(iv)談話現象の背景にある言語情報間の複雑な関連を制約する「統一性」を取り上げた。そして、それらに対して音韻・形態・統語・意味・談話情報の制約を一律に形式化した言語学的な分析を明示し、なおかつ具体的記述を与えることで、文法理論の恩恵を最大限に享受した言語的制約情報の融合処理を目指した。
その結果、言語情報の性質を忠実に記述することを遵守したJPSGは、そうした情報のいくつかを捨象することで被覆の向上を意図した解析に対し、「率」の点で劣るものになったことは否めない。しかしながら、そのような解析においては常に問題とされるモジュールの連携やデータのドメイン依存の影響を一切受けない頑健な形式化を得ることになった文法は、「質」の点ではその見通しは遥かに明るいものであるといえる。
文法記述にかかる時間的コストも、開発環境と連携して構築されたJPSGでは大幅に削減されている。このことはグラマー・エンジニアリングの一つの成功例として、自然言語処理研究において位置付けることができる。
Keywords 日本語句構造文法(JPSG),主辞駆動句構造文法(HPSG),文法ベースの解析,言語的制約情報,頑健な形式化,グラマー・エンジニアリング
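参考として,タイプ付き素性構造の単一化の骨子を示す最小のPythonスケッチを挙げる.型階層や構造共有は省略し,素性構造を入れ子の辞書で近似した説明用の仮定であり,JPSGの実装そのものではない:

```python
def unify(a, b):
    """2つの素性構造(入れ子辞書で近似)を単一化する.
    矛盾する原子値があれば None(単一化失敗)を返す."""
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if a == b else None  # 原子値: 一致すれば成功
    result = dict(a)
    for feat, val in b.items():
        if feat in result:
            sub = unify(result[feat], val)
            if sub is None:
                return None           # 部分構造の衝突 → 全体が失敗
            result[feat] = sub
        else:
            result[feat] = val        # 片側にしかない素性は併合
    return result

# 主辞側と補語側の素性が矛盾なく併合される例(素性名は説明用の仮定)
head = {"POS": "V", "SUBCAT": {"CASE": "ga"}}
comp = {"SUBCAT": {"CASE": "ga", "SEM": "agent"}}
print(unify(head, comp))
# {'POS': 'V', 'SUBCAT': {'CASE': 'ga', 'SEM': 'agent'}}
```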

0261015 高橋 哲朗 「Computation of Semantic Equivalence for Question Answering (質問応答のための意味的等価性の評価)」

内容梗概 現在Webを中心に大量の電子テキストがあふれており,それらに対する効率的な情報アクセスの技術が求められている.その手段の一つとして質問応答がある.質問応答とは,自然言語で問われた質問に対して,文書集合を情報源として回答となる語句を返す技術であり,1999年から始まった評価型ワークショップの開催を背景に盛んに研究が行なわれている.本研究では質問応答タスクを,質問文とドキュメントにおける要素間の意味的関係の等価性を認識する問題ととらえ,この技術を追求することによって厳密な質問応答という課題に迫る.
等価性の評価方法として本研究では,(1) テキストの表層情報を意味表現の近似ととらえ,そのレベルで照合を行なう,(2) テキストに対して語彙・構文的な操作を行なうことにより表層的な差異を吸収し照合を行なう,(3) テキストから意味的な関係を抽出し,そのレベルにおいて照合を行なう,という3種類を考える.
(1) について,本研究では構文構造における照合を行なう.先行研究では,テキストを単語の列ととらえ単語間の類似度や近接性などを用いて類似度を求める手法が主流であったが,意味的類似性のより厳密な近似を行なうために構文構造を用いた.本研究ではCollinsの提案したTree Kernelを元に類似度計算のアルゴリズムを拡張し,ノード数MとNの構文木に対して計算量をO(MN)に抑えながら,対応ノード間の類似度の定量化やノードの飛び越えを許す照合を実現できるアルゴリズムを提案した.
(2) のために,まずテキストに対する語彙・構文的な操作を行なう汎用的な言い換えシステムを開発した.そして質問応答における言い換えの適用方法として,言い換えと照合を繰り返しながら回答を探索するモデルを提案し,質問応答システムを実装した.このシステムを用いて質問応答タスクの実験を行なった結果,現在の実装では我々の予想に反して構文的照合と言い換えの効果が小さいことが明らかとなった.また誤り分析を通して,(a) 照応・省略解析,(b) 複数の可能性を持たせた構文森における照合,(c) より大規模な言い換え知識の獲得,の3点を課題として認識することができた.
(3) の部分問題として上記の(c)の問題に着目し,質問応答に必要な言い換えのバリエーションや,解かなければならない問題点について調査した結果,質問応答に必要となった記号に関する言い換えや,語彙・構文的言い換えの種類を洗い出すことができた.これらの知識は質問応答における照合を行なうための知識として有用なものである.また,語彙・構文的な言い換えの範囲を越えた言い換えについても整理を行ない,それらの分析を通して,問題の一部については属性関係を抽出することによって解決できる見込みを得ることができた.
そこで本研究では質問応答のサブタスクとして,(対象物,属性名,属性値)という三つ組からなる属性関係を抽出するというタスクを設定した.このタスクはこれまでに研究が進められてきた情報抽出を一般化する新しい問題であるため,従来手法とは異なるアプローチが必要である.この問題に対して我々は,抽象化したパタンによりドメインを限定せずに三つ組の候補を抽出し,統計量を使ってそれらをフィルタリングする手法を提案した.実験の結果,この手法を用いることによりパタンのみで抽出した場合に比べ抽出精度をF値で6.1ポイント(10%)向上させられることを示した.
キーワード 質問応答,テキスト間の類似度,言い換え,属性関係,関係抽出
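参考として,Collins型Tree Kernelの基本形(全ノード対の共通部分木数を動的計画法で求める計算)を示す最小のPythonスケッチを挙げる.本研究の拡張(対応ノード間類似度の定量化やノードの飛び越え)は含まない説明用の骨子である:

```python
def nodes(t, acc=None):
    """木(ラベルと子のタプル)の全ノードを列挙する."""
    if acc is None:
        acc = []
    acc.append(t)
    for c in t[1:]:
        if isinstance(c, tuple):
            nodes(c, acc)
    return acc

def production(t):
    """ノードの生成規則: ラベルと子ラベルの列."""
    return (t[0], tuple(c[0] if isinstance(c, tuple) else c for c in t[1:]))

def tree_kernel(t1, t2, lam=0.5):
    """Collins-Duffy型木カーネル: ノード対ごとの共通部分木数 C を
    メモ化再帰で求め,その総和を返す(ノード対の数 O(MN) で計算)."""
    memo = {}

    def C(a, b):
        key = (id(a), id(b))
        if key not in memo:
            if production(a) != production(b):
                memo[key] = 0.0
            else:
                v = lam  # 減衰係数: 大きな部分木の寄与を抑える
                for ca, cb in zip(a[1:], b[1:]):
                    if isinstance(ca, tuple) and isinstance(cb, tuple):
                        v *= 1.0 + C(ca, cb)
                memo[key] = v
        return memo[key]

    return sum(C(a, b) for a in nodes(t1) for b in nodes(t2))

t1 = ("S", ("NP", ("N", "cats")), ("VP", ("V", "sleep")))
t2 = ("S", ("NP", ("N", "dogs")), ("VP", ("V", "sleep")))
print(tree_kernel(t1, t2))  # 共通部分木の重み付き個数(この例では 3.0625)
```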

0261023 藤田 篤 「An empirical approach to linguistically accurate paraphrase generation」

Abstract: Paraphrases are alternative linguistic expressions conveying the same information. Technology for automatically generating or recognizing paraphrases has been attracting increasing attention due to its potential in a wide range of natural language processing applications, e.g., machine translation, information retrieval, question answering, summarization, authoring and revision support, and reading assistance. In this thesis, we focus on lexical and structural paraphrasing in Japanese, such as lexical and phrasal replacement, verb alternation, and topicalization, which can be carried out without referring to communicative context. Analogously to the state-of-the-art technology of machine translation and language generation, we decompose the process of paraphrase generation into two subprocesses: generation of paraphrase candidates, followed by ranking of the candidates.

First, one of the major problems in paraphrase generation is the difficulty of specifying the applicability conditions of each paraphrasing rule. Since paraphrasing rules with inappropriate applicability conditions produce erroneous output, we need a robust method to detect and correct such transfer errors in the post-generation process. Second, certain classes of paraphrases exhibit a degree of productivity that allows them to be systematically explained based on the semantic properties of their constituent lexical items. Such systematic paraphrases include verb alternation and compound noun decomposition. To realize them, we need to explore principled ways of paraphrase generation based on frameworks of lexical semantics.

To handle errors occurring in paraphrase generation, we first examine what types of errors tend to occur in lexical and structural paraphrasing of Japanese sentences. On the basis of the observation that errors associated with case assignment form one of the major error types, we develop an error detection model for this type of error. The model effectively uses a large collection of positive examples and a small collection of negative examples by combining supervised and unsupervised machine learning methods. Experimental results show that our model significantly outperforms conventional models.

To capture the systematicity underlying paraphrases, we adopt the lexical conceptual structure (LCS), which represents verbs as semantic structures together with the relationships between their arguments and syntactic cases. Relying on this framework, we develop a paraphrasing model consisting of a handful of LCS transformation rules, focusing particularly on the paraphrasing of Japanese light-verb constructions (LVCs). Through an experiment, we show that our model generates paraphrases of LVCs accurately, gaining advantages over conventional approaches to this class of paraphrasing.

In this thesis we built up the following arguments. First, although various case studies of paraphrasing have been carried out, no comprehensive investigation of errors across different paraphrase classes had been made; our analysis of these errors therefore provides a useful perspective for further research on handling generation errors. We then addressed the most serious type of error occurring in paraphrase generation and demonstrated the feasibility of our over-generation plus filtering approach; the key issue here was how to use negative examples effectively. Second, we proposed a lexical-semantics-based account of a subset of lexical paraphrases. Since our model relies on a solid linguistic theory, the approach is promising for realizing such paraphrases. Keywords: automatic paraphrasing, paraphrase generation, paraphrase taxonomy, transfer errors, revision-based transfer, language model, lexical conceptual structure, paraphrase corpus
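As a rough illustration of the two-subprocess decomposition (candidate generation followed by ranking), the following Python sketch over-generates candidates with toy lexical rules and filters them with a stand-in scorer; the rules and the scoring function are hypothetical, not the thesis's actual paraphrasing rules or error-detection model:

```python
# Hypothetical lexical replacement rules (over-generation stage).
RULES = [("buy", "purchase"), ("big", "large")]

def generate_candidates(sentence):
    """Apply every rule wherever it matches, producing all candidates."""
    words = sentence.split()
    for i, w in enumerate(words):
        for src, dst in RULES:
            if w == src:
                yield " ".join(words[:i] + [dst] + words[i + 1:])

def score(candidate):
    """Stand-in ranker; a real system would use a language model or a
    trained transfer-error detector here."""
    return -len(candidate)  # toy preference for shorter output

best = max(generate_candidates("I buy a big house"), key=score, default=None)
print(best)  # the filtering stage keeps only the best-scoring candidate
```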

0361207 鈴木 潤 「Kernels for Application Tasks in Natural Language Processing」

Abstract: The recent success of statistical natural language processing (NLP) has made the development of challenging applications feasible. For example, text classification traditionally classified texts into `topics', but researchers now attempt to classify texts according to `sentiments', such as `intention' and `polarity'. Text summarization traditionally only extracted important sentences from single documents, but now strives to automatically generate an abstract from multiple source documents. These trends indicate that recent tasks increasingly demand that a text be interpreted semantically or contextually. In other words, solving recent tasks with higher performance requires methods that can handle richer types of linguistic information.

Conventionally, in the field of NLP, a set of words, called bag-of-words, is the most popular representation of text features. However, it is widely known that bag-of-words models lack many linguistic features found in texts, and it is widely accepted that this lack of structure leads to inadequate performance on recent tasks. Therefore, methods capable of handling richer linguistic information within texts are desirable. For these reasons, this dissertation proposes a methodology that is capable of handling richer structural information derived from syntactic and semantic analysis. I formalize all of my proposed methods within the framework of kernel methods. More specifically, the proposed methods are defined as kernel functions, a very general mathematical framework developed in the machine learning field. Since this formalization embeds the proposed methods in a generalized framework, I can apply them to many other tasks, and even to other research fields.

This dissertation discusses the following methodologies, which may lead to substantial improvements on recent tasks in application areas of NLP:
1. effectively handling different levels of word attributes, such as words, semantic information obtained from dictionaries, and parts of speech;
2. effectively handling richer structural information derived from integrated syntactic and semantic analysis;
3. the effect of statistical feature mining for structural features.
Chapters 3 to 5 address these three topics, respectively. Before turning to them, I first describe the present state of tasks in application areas of NLP as the background of my work in Chapter 1. In Chapter 2, I explain the concepts necessary for understanding the dissertation, namely kernel methods and kernels for discrete structures (discrete kernels), in theory and mathematical formalism, in order to pave the way for embedding the proposed methods in the framework of kernel functions. I also explain some specific examples of kernel methods and discrete kernels, i.e., Support Vector Machines, as well as sequence kernels and tree kernels.

Chapter 3 proposes a feature extraction method named Word Attribute N-gram. Compared to the bag-of-words representation, the proposed method can handle several levels of word information. Moreover, this method deals not only with a set of word attributes, but also with conjunctions, or N-grams, of word attributes, which are expected to capture some important linguistic expressions. I assume that these linguistic expressions are more effective than single words or attributes. Since both the word attribute N-gram method and sequence kernels are instances of discrete kernels, I show the relationship between them. In the experiments, I clarify the effect of the proposed method and verify the effect of each feature on the given tasks.

Chapter 4 proposes Hierarchically Structured Graph Kernels, a method dealing with integrated structural information that reflects the results of syntactic and semantic analysis of a text. I assume that this richer structural information gives much higher-quality clues for solving the target tasks. I define the proposed method as kernels on a certain class of graphs, called hierarchically structured graphs, which are graphs with a recursive hierarchical structure constructed from subgraphs and edges from vertices to subgraphs. Experiments demonstrate the performance of this method compared to other discrete kernels, which can handle only restricted syntactic and semantic information within a text.

Chapter 5 proposes a statistical feature mining method for discrete kernels, such as sequence and tree kernels. Unfortunately, some previous experiments have shown that in some cases there is a critical issue with discrete kernels, especially in NLP tasks: an overfitting problem arises due to the many sparse but redundant features that are processed implicitly in these kernels. As a result, the machine learning approach may not be trained efficiently. Conventionally, this issue is addressed by eliminating large substructures from the set of features used. However, the main reason for using discrete kernels is that we aim to use structural features easily and efficiently. Therefore, I propose a new approach based on statistical feature mining that avoids overfitting without discarding any statistically important features. Moreover, I embed the proposed feature selection method into the original kernel calculation process by using substructure mining algorithms, which allows for efficient computation. Experiments are undertaken on several sentiment classification tasks to confirm the problem with the conventional method and to evaluate the effect of the proposed method.

Finally, I summarize the dissertation and describe possible future directions in Chapter 6. Keywords: natural language processing, structured text, kernel methods, discrete kernels, hierarchically structured graph, feature mining
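As a small illustration of the word attribute N-gram idea of Chapter 3 (the tokens and attribute names below are illustrative assumptions, not data from the dissertation), the following Python sketch enumerates all mixed-level attribute N-grams that would serve as features:

```python
from itertools import product

# Tokens annotated with several attribute levels (word / POS / semantic
# tag); the tokens and tag names are illustrative assumptions.
tokens = [
    {"word": "movie", "pos": "NN", "sem": "ARTIFACT"},
    {"word": "was", "pos": "VBD", "sem": "BE"},
    {"word": "great", "pos": "JJ", "sem": "GOOD"},
]

def attribute_ngrams(tokens, n):
    """Enumerate all mixed-level attribute N-grams of length n:
    every window of n tokens contributes one feature per way of
    choosing a single attribute level at each position."""
    feats = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        for combo in product(*[sorted(tok.items()) for tok in window]):
            feats.append(tuple(f"{k}={v}" for k, v in combo))
    return feats

bigrams = attribute_ngrams(tokens, 2)
print(len(bigrams))  # 2 windows x 3^2 attribute choices = 18 features
print(bigrams[0])    # e.g. ('pos=NN', 'pos=VBD')
```

Such mixed-level N-grams generalize single-word features, which is exactly why they can be related to sequence kernels as instances of discrete kernels.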

0261018 野島 良 「Efficient Key Management Schemes in Broadcast Encryption」

Abstract: Recent developments in technology enable us to realize services that deliver digital content to users through a high-speed network or a large-capacity (and low-cost) storage medium such as DVD. In such services, it is essential to protect the content from malicious users and eavesdroppers who try to obtain it without paying. An important aspect of such services is that the delivery of the digital content can be regarded as ``broadcasting'': one center distributes identical information (possibly encrypted content) to all users. To protect digital content, we need to encrypt it so that only valid users can decrypt it. This kind of problem is sometimes called broadcast encryption. In broadcast encryption, there are two efficient methods, called the complete subtree (CS) method and the subset difference (SD) method. However, straightforward use of the key management parts of these methods causes a problem in practical implementations: when the number of users' terminals becomes huge, the size of the secret information which must be secretly stored by each terminal becomes non-negligibly large. In this thesis, we present two key management schemes for the CS method and two for the SD method. These four schemes share the same approach, which is to reduce the size of the secret information in each terminal while preserving security.

For the CS method, two key management schemes are proposed which reduce the secret information in each terminal from $O(\log N)$ to $O(1)$, where $N$ is the number of all terminals. The essential idea behind the proposed schemes is to use a trapdoor permutation. Using the trapdoor information, the key management center computes and assigns a key to each terminal so that the terminal can derive all the information necessary in the CS method. In the first scheme, two trapdoor permutations are used. We show that the permutations to be used need to satisfy a certain property which is similar to, but slightly different from, the claw-free property. The needed property, named the strongly semi-claw-free property, is formalized in terms of a probabilistic polynomial-time algorithm, and its relation to the claw-free property is discussed. It is also shown that if the permutations used fulfill the strongly semi-claw-free property, then the proposed scheme is secure against attacks by malicious users. Next, we show another scheme which uses a general one-way trapdoor permutation and a hash function. This scheme is efficient if we use an ``idealized'' hash function, but it can be proven secure even if we use a ``non-idealized'' hash function.

We also propose two secure and efficient key management schemes for the SD method under a reasonable assumption on the ability of malicious users. Under this assumption, it is possible to reduce the size of the secret information in each terminal from $O(\log^2 N)$ to $O(\log N)$ in the original SD method. This result remedies the main drawback of the SD method, namely that it requires users' terminals to keep a large amount of secret information. In this thesis, a detailed comparison between the proposed schemes and other similar schemes is presented, and we believe that the proposed schemes are especially suitable for practical implementations of broadcast encryption.
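The upward/downward key-derivation idea behind the CS-method schemes can be sketched with textbook RSA standing in for the two trapdoor permutations (an illustrative assumption with toy, insecure parameters; this is not the thesis's concrete construction). The center, holding the trapdoors, assigns each terminal a single leaf key; the terminal recomputes every ancestor key with the public permutations, so its secret storage is O(1):

```python
# Toy parameters (far too small to be secure; illustration only).
p, q = 1013, 2027
n = p * q
phi = (p - 1) * (q - 1)
E = [3, 5]                        # public exponents: f_b(x) = x^E[b] mod n
D = [pow(e, -1, phi) for e in E]  # trapdoor inverses, held by the center

def child_key(parent, b):
    """Center only: derive the key of child b using the trapdoor."""
    return pow(parent, D[b], n)

def parent_key(child, b):
    """Public: any terminal can move one level up the tree."""
    return pow(child, E[b], n)

root_key = 424242
path = [0, 1, 1]          # the terminal's leaf: left, right, right from root
k = root_key
for b in path:            # center derives the single leaf key downward
    k = child_key(k, b)

# The terminal, storing only k, re-derives all ancestor keys upward,
# which is the information it needs in the CS method.
for b in reversed(path):
    k = parent_key(k, b)
print(k == root_key)      # True: the derivation chain is consistent
```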

0261012 越田 高志 「Studies on Web Service Dynamic Invocation for CALS/EC (CALS/ECにおけるWebサービス動的実行手法に関する研究)」

Abstract: The old CALS/EC systems were static, like e-commerce fixed among specific companies. From now on, however, toward worldwide market cultivation and the expansion of business opportunities, companies will need a dynamic CALS/EC system that enables flexible commercial transactions according to user requests. As a new technology for realizing such a dynamic CALS/EC system, Web services based on the SOAP, UDDI, and WSDL technologies have been proposed, and research on their utilization is advancing. "Web service" is a general term for distributed processing software that communicates via the SOAP protocol, using XML as communication data. The technical information on a Web service is described in WSDL, and information on the contents and the provider of a Web service is registered in a UDDI registry; a user searches a UDDI registry to find and invoke the required Web service. Although Web services offer flexibility, such as XML-based interoperability and platform independence, their use has not spread in the business field. I believe this is due to three problems: (1) users cannot easily develop the stub program needed to execute Web services on the user side; (2) it is difficult for users to find useful and suitable Web services; (3) users cannot easily understand the functions of Web services and how to use them.

In this research, I proposed new means and concepts to solve these problems. Furthermore, I developed Web services that embody these means and concepts, implemented a B2B system that loosely couples those Web services, conducted an experiment applying the system to an actual business process, and verified its effectiveness. First, to verify the applicability of Web services to CALS/EC, I developed Web services for an actual business process along with the construction of a UDDI registry; I then implemented a goods procurement B2B system, applied it to a use case, and confirmed its availability. Next, as a means of solving problem (1), I proposed a technique for dynamically generating and analyzing JavaBeans for the complex output data types of Web services, and, unifying these means, developed a Web service dynamic invocation system that is independent of output data types. It is a stub-less system that discovers a Web service in a UDDI registry and executes it dynamically. Finally, I proposed the concept of primitive Web services as a technique for solving problems (2) and (3). Based on this concept, I developed primitive Web services and agents that coordinate and control them, and implemented them as a goods procurement B2B system. The system was applied to an actual use case, and I confirmed its effectiveness compared with a system using conventional Web services. Keywords: CALS/EC, Web service, UDDI registry, Business-to-business system, Dynamic invocation, Primitive Web service
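For illustration only, stub-less dynamic invocation in the spirit of the proposed system can be written with a present-day Python SOAP library (zeep); the WSDL URL and operation name below are placeholders for what a UDDI search would return, and the thesis's actual system was built on dynamic JavaBeans generation rather than this library:

```python
from zeep import Client

# Build a service client dynamically from a WSDL discovered at run time;
# the URL and operation are hypothetical placeholders.
wsdl_url = "http://example.com/service?wsdl"
client = Client(wsdl_url)

# Operations described in the WSDL become callable without any
# pre-generated stub code; complex input/output types are constructed
# dynamically from the WSDL's schema, which is the essence of the
# stub-less approach.
result = client.service.GetQuote(symbol="XYZ")
print(result)
```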

0261026 的野 晃整 「Studies on Storage and Retrieval for RDF Data Based on Path Expressions (RDF データのための経路式に基づいた格納と検索に関する研究)」

Abstract: Recently, the Semantic Web has emerged as a vision of the next generation of the World Wide Web. One of the key differences between the current Web and the Semantic Web lies in metadata. In the Semantic Web, metadata are described by the Resource Description Framework (RDF), a framework for describing data and their semantics. An RDF graph representing metadata is composed of a set of statements, each of which can be represented as a binary relationship among Web resources; the structure of an RDF graph is thus a directed graph. Additionally, RDF Schema is defined to describe the schematic information of RDF data. Today, it is becoming increasingly common to use RDF as a metadata format. One typical usage is to describe large-scale metadata: WordNet, an online dictionary, whose total size is 35 MB; Gene Ontology, a controlled vocabulary for functional annotation of gene products, whose size is 365 MB; and the Open Directory Project, which provides the largest Web directory, and whose size is over 2 GB. In the near future, such large RDF-based metadata are expected to increase rapidly as RDF comes into widespread use. In order to handle such data efficiently, RDF databases and indexing schemes that can manage massive RDF data are essential.

So far, several RDF databases have been proposed, but they suffer from two problems. The first problem is poor performance in processing path queries. The reason is that RDF data are decomposed into statements in most of the conventional RDF databases; therefore, a join operation must be performed for each path step. This results in performance degradation as the RDF data grow and/or the query length becomes longer. The second problem concerns the ability to handle RDF Schema. The conventional RDF databases can be classified into two groups: the first group is designed depending on RDF Schema, and the second group stores RDF data in terms of statements. The former cannot handle schemaless RDF data. The latter have to search a superfluous area, where both kinds of data are stored, because they make no distinction between schema and instance data.

In this study, we propose a scheme to store RDF data in relational databases and an indexing scheme for RDF data based on a suffix array of path expressions. Our proposed relational schema is designed to be independent of RDF schematic information and to make the distinction between schema and instance data. In our approach, we first construct subgraphs from RDF data; in this way, we resolve the above problem of handling schema information. We then store the subgraphs in distinct relational tables by applying appropriate techniques for representing each subgraph. In particular, we extract all reachable path expressions from a subgraph containing instance data. In addition, we apply an interval numbering scheme to subgraphs containing schematic information, enabling us to efficiently detect ancestor-descendant relationships between two nodes. Meanwhile, in our indexing scheme based on a suffix array, we generate a set of path expressions from each subgraph extracted from the RDF data, and then construct a suffix array from the path expressions. This indexing scheme can be used with the conventional RDF databases and decreases the number of join operations. In a series of experiments to evaluate the performance, interestingly, the processing times of the conventional approach increased as the path lengths grew, while the times of our approach decreased. We affirm that our approach can efficiently handle massive RDF data.
Keywords: RDF database, path expressions, directed graph, suffix array, numbering scheme
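As a toy illustration of the two ingredients (the triples are hypothetical, the graph is assumed acyclic, and a sorted array of path expressions stands in for the suffix array), the following Python sketch extracts all reachable path expressions from an RDF graph and answers a path query by binary search instead of per-step joins:

```python
from bisect import bisect_left

# Hypothetical RDF statements: (subject, predicate, object).
triples = [
    ("doc1", "author", "alice"),
    ("alice", "name", "Alice"),
    ("doc1", "cite", "doc2"),
    ("doc2", "author", "bob"),
]

def path_expressions(triples):
    """Enumerate predicate paths reachable from every node by DFS
    (assumes an acyclic toy graph)."""
    out = {}
    for s, p, o in triples:
        out.setdefault(s, []).append((p, o))
    paths = []

    def walk(node, prefix):
        for p, o in out.get(node, []):
            path = prefix + [p]
            paths.append("/".join(path))
            walk(o, path)

    for node in list(out):
        walk(node, [])
    return paths

paths = sorted(path_expressions(triples))
# All expressions starting with the query prefix lie in one contiguous
# range of the sorted array, so a single binary search answers the query.
query = "cite/author"
i = bisect_left(paths, query)
print([p for p in paths[i:] if p.startswith(query)])  # ['cite/author']
```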

0261028 目次 正一 「Two aspects of bioinformatics in different timescales: sampling problem on molecular dynamics simulation and perspective to the gliding mechanism of Mycoplasma mobile」

Abstract: Computational science has deeply penetrated molecular biology over the last few decades and has emerged as a new field called bioinformatics. Bioinformatics treats biological problems on extremely different timescales. One extreme is molecular dynamics (MD) simulation, which deals with the motion of protein molecules on the order of nanoseconds. Another extreme is the analysis of molecular evolution, which deals with events that take place over millions of years. Here, I report my approach to two biological problems on these two distinct timescales: the analysis of sampling problems in MD simulation of proteins, and the analysis of the gliding mechanism of Mycoplasma mobile.

MD simulation has become one of the general tools for estimating the thermal properties of proteins, and a simulation of nanoseconds of protein motion can be carried out. To derive the thermal properties of a molecule from an MD trajectory, one has to assume that the simulation has been long enough to reconstruct the ensemble of the molecular motion, yet there is no simple criterion for judging whether the simulation time was sufficiently long to estimate the physical properties of the molecule. Many physical quantities are known to be expressed as second moments of dynamical variables, from which vibrational modes can be directly deduced. Therefore, I investigated the effect of vibrational modes on the convergence of the components of the second moment matrix of atom positions. I found that the frequency modes higher than 68 cm-1 converge to the equilibrium value to within 10% error within 400 picoseconds. The difference between the eigenvalues of the second moment matrix of all modes and those of only the middle (10-68 cm-1) and low (below 10 cm-1) modes is very small. These results mean that the higher-frequency components of the second moment matrix converge very fast and hardly influence the convergence of the middle- and low-frequency components. The middle-frequency modes influence the convergence of the lower modes, but this influence is less significant than the effect of the low modes reported in a former study. In summary, the lower modes are dominant for the convergence of the second moment matrix.

Computational analysis of evolution is another important tool for clarifying the molecular mechanisms of biological machinery. Here I describe one such approach: the analysis of the movement mechanism of the bacterium Mycoplasma mobile. M. mobile has the ability to glide when attached to a solid surface. The mechanism of this motion is not yet understood; however, a few genes, including Gli349, have been experimentally shown to be related to the motion. The DNA sequence of the Gli349 gene has already been reported, but its three-dimensional structure has not been solved. To obtain a clue to the mechanism of its movement, I carried out an evolutionary analysis of the Gli349 sequence. As a result, I found 21 sequential repeats within Gli349. Each repeat consists of approximately 100 amino acids with a conserved sequence motif, "YxxxxxGF". No sequence homologous to Gli349 was found in the NCBI RefSeq non-redundant sequence database. In general, repeat sequences are known to associate with other proteins, which leads to the speculation that the repeat regions in Gli349 are also responsible for interactions with other proteins. Chymotrypsin breaks amide bonds where the sequence is not within a structural domain, such as a loop region between structural domains. The structural domain regions inferred from the chymotrypsin cleavage experiment agreed well with the repeats I discovered. I predict that one sequence repeat corresponds to one structural domain, and that the entire structure of Gli349 is composed of tandem domains. Electron microscopy shows that Gli349 takes a rod shape with a couple of kinks; the size and shape of the observed molecule fit very well with the predicted domain repeat structure. The immunoglobulin fold is known to form a rod structure by forming tandem repeats. However, the repeat sequences do not fit any known immunoglobulin fold structure. Therefore, I conclude that the repeat sequence responsible for the gliding of M. mobile forms a novel type of structural domain. Keywords: molecular dynamics, sampling, evolutionary analysis, Mycoplasma, gliding protein
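As a minimal illustration of the convergence check on the second moment matrix (synthetic random data stand in for an MD trajectory; the array shapes are assumptions, not the thesis's data), the following Python sketch compares eigenvalues computed from the full and half trajectories:

```python
import numpy as np

rng = np.random.default_rng(1)
traj = rng.standard_normal((4000, 30))  # 4000 frames, 10 atoms x 3 coords

def second_moment(x):
    """Second moment matrix of fluctuations around the mean structure."""
    d = x - x.mean(axis=0)
    return d.T @ d / len(d)

full = np.linalg.eigvalsh(second_moment(traj))
half = np.linalg.eigvalsh(second_moment(traj[:2000]))

# The relative change of the largest eigenvalues (the slow, low-frequency
# modes) indicates how well they have converged with half the sampling;
# in a real trajectory these are the slowest to converge.
print(np.abs(half[-3:] - full[-3:]) / full[-3:])
```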

0061024 持橋 大地 「A Particle Filter Approach to Statistical Language Modeling of Contexts (逐次モンテカルロ法による文脈の確率的言語モデル)」

Statistical language modeling is important not only for machine translation and speech recognition, but also as a foundation for text modeling, information retrieval, and human-machine interaction. Therefore, the sophistication of statistical language models is an important challenge for natural language processing and for the similar high-dimensional discrete data domains that we often encounter. Although research in statistical language modeling has largely concentrated on "n-gram" models that capture adjacent regularities, in a real linguistic environment, and to account for our adaptive linguistic behavior, we require sophisticated long-distance language models that reflect contextual knowledge appropriately. However, the long-distance language models proposed so far assume a single static "context" and adaptation to it, neglecting the topic shifts that are indispensable in modeling long texts and in the real-environment applications mentioned above. In the words of information theory, long-distance language models so far regard a text as a stationary information source, no matter how long and heterogeneous it is, and estimate its parameters as the data become available as "context".

In contrast, in this thesis we regard a topic shift as a latent stochastic process, giving an explicit probabilistic generative model that describes topic shifts within a text. Based on this model, we propose a novel long-distance language model that captures topic shifts and their rate automatically, in a principled Bayesian approach. The model builds on what is called the mean shift model in statistics, and is essentially a nonlinear HMM that cannot be decoded by the traditional Baum-Welch algorithm or Kalman filters. For this purpose, we used a multinomial particle filter, a sequential Monte Carlo method that has been used mainly in signal processing and robotics research, to estimate both its state and its parameters online. Essentially, this model is an extension of a DNA sequence model recently proposed in statistics by Chen and Lai (2003). Since a naive application of the original model to natural language raises problems due to the extremely large number of symbols (words) and the semantic correlations among them, we extended the multinomial particle filter with LDA and DM, two recently proposed Bayesian text models, to incorporate semantic relationships between words and to update the Dirichlet priors that are assumed known and fixed in the original model. As a result, we give two models, MSM-LDA and MSM-DM: the former tracks the change of the mixing distribution of a mixture model in a multinomial topic simplex, and the latter tracks unigram distributions directly in a word simplex. They recognize topic shifts and their rate sequentially in a Bayesian fashion, making an optimal prediction of the next word by a mixture of different context lengths that are sampled by each particle individually.

Experiments on the standard British National Corpus showed consistent perplexity improvements over the simple context models used thus far, giving a Bayesian context model with the lowest perplexity in the current state of the art. Although this model is a forward predictive language model, it can in principle be extended by a Monte Carlo forward-backward algorithm, or to a collection of documents, obviating the unit of "document" that has been assumed as a naive unit of semantic modeling in natural language processing.
Keywords: Language Model, Bayesian Learning, Unsupervised Learning, Sequential Monte Carlo, Change Point Detection, Time Series Analysis.
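A heavily simplified sketch of a multinomial particle filter for a mean-shift model is given below (vocabulary size, hyperparameters, the toy word stream, and the resampling rule are illustrative assumptions; the MSM-LDA/MSM-DM extensions are not included). Each particle carries the word counts of its hypothesized current segment and is reweighted by the predictive probability of each incoming word:

```python
import numpy as np

rng = np.random.default_rng(0)
V, P, ALPHA, RHO = 50, 100, 0.5, 0.05  # vocab, particles, prior, shift prob.

counts = np.zeros((P, V))   # per-particle counts since the last change point
weights = np.ones(P) / P

def step(word):
    """Advance the filter by one observed word (an int in [0, V))."""
    global counts, weights
    # 1. Propagate: each particle may start a new segment (a topic shift).
    shift = rng.random(P) < RHO
    counts[shift] = 0.0
    # 2. Reweight by the Polya-urn predictive probability of the word.
    pred = (counts[:, word] + ALPHA) / (counts.sum(axis=1) + V * ALPHA)
    weights *= pred
    weights /= weights.sum()
    # 3. Update sufficient statistics and resample when ESS is low.
    counts[:, word] += 1.0
    if 1.0 / (weights ** 2).sum() < P / 2:
        idx = rng.choice(P, size=P, p=weights)
        counts = counts[idx]
        weights = np.ones(P) / P

# Toy stream whose word distribution changes midway (two "topics").
stream = list(rng.integers(0, 10, 200)) + list(rng.integers(40, 50, 200))
for w in stream:
    step(w)
# Expected context length carried by the particles after the stream:
print(counts.sum(axis=1) @ weights)
```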

0361026 中島 淑貴 「NAMインターフェース・コミュニケーション」

　非可聴つぶやき(Non-Audible Murmur: NAM)は「気導音としては周りが聞き取れないほどの無声音のつぶやき」の「肉伝導音」であり,音響学的には「声帯振動ではなく気道の乱流雑音を音源とする無声呼気音が,発話器官の運動による音響的フィルタ特性変化により調音されて,人体頭部の主に軟部組織を伝導したもの」と定義する.音声の生成系である人体表面から直接NAMをサンプリングすることにより,高感度で聴取可能な音声信号として捉えることが可能となり,同時に気導外部雑音は人体にフィルタリングされて低減する.
　第一に聴診器接着型NAMマイクロフォンを開発して,肉伝導するNAMをサンプリングして認識するのに適した装着位置を見つけた.HMM音響モデルにEM学習や話者適応を行ってNAM音響モデルを作成し,大語彙連続認識実験を行い,いわゆる「無音声認識」(非可聴つぶやき認識)の実用可能性を見いだした.またこのNAMマイクロフォンによりサンプリングされる体内伝導通常音声(Body Transmitted Ordinary Speech: BTOS)によるBTOS認識も検討した.
　第二にNAM音の信号処理による通常音声化,いわゆる「無音声電話」などへの応用があるが,聴診器型NAMマイクロフォンによるNAMは2kHz以上にフォルマントが見られない.このため我々は皮膚の音響インピーダンスに近いソフトシリコーンを音媒体に用いた新型NAMマイクロフォンを開発し,NAM音の帯域の広範化,接触面感度や外部雑音耐性の上昇を得た.このソフトシリコーン型NAMマイクロフォンによりNAMやBTOSのHMMによる認識においても,聞き取り試験でも,聴診器型に比しその認識率が向上した.
　第三としてNAMマイクロフォンを同側で縦に2つアレイ化して装着することにより,ピッチ変動に伴う喉頭の上下動をパワー比により移動音源定位することでF0とは異なった視点でBTOSやNAM発話のピッチを推定できる可能性と,音声の研究において人体を肉伝導の音場と捉える考え方を紹介した.
　このNAMとその汎用音声入力インターフェースとしての利用価値の発見により,NAMを肉伝導の第二の音声言語として,その信号に既存の音声信号処理技術の蓄積を応用し,周囲環境に気兼ねせず影響も受けにくい,人対機械,人対人の新しい発話入力インターフェース・コミュニケーションが可能となる.

0261008 上岡 隆宏 「I’m Here! : Wearable Memory Augmentation System to Support Object Finding Task (I’m Here! : 物探しを支援するウェアラブル拡張記憶システム)」

本研究の目的は,物探しタスクを効率化するウェアラブル拡張記憶システムを設計・評価することである.物探しタスクとは,人が物体を見つけるために日常生活環境の中を探しまわるタスクであり,その対象は人が持ち運んで使用する把持物体である.
物探しタスクが発生する原因は,自分自身が把持物体を最後に置いた場所を思い出せなくなることにある.本研究で提案する物探し支援システムI'm Here!は,ユーザの視野映像を蓄積・検索するウェアラブル拡張記憶システムを応用し,対象物を最後に置いた場所を視野映像として蓄積・検索することによって,ユーザが物探しタスクのために浪費する時間を短縮する.I'm Here!は,単純なシステム構造,低い導入・維持コスト,簡単な操作を特長とする.
本研究では,I'm Here!の有意性・信頼性を示すため,I'm Here!を装着した被験者による物探し実験と,I'm Here!の提示する映像の妥当性を被験者が評価する実験を行った.その結果,I'm Here!によってユーザの物探しタスクが効率化されることが示され,ユーザの物探しを効果的に支援するために必要な視野映像の視野角条件と物体認識の精度に関して,具体的な指標が求められた.また,I'm Here!が視野映像を記録するために重要な役割を担うカメラデバイスを開発し,カメラデバイスを用いた物体認識手法の精度や環境適応性を調査した.その結果として,I'm Here!の機能は実現可能であることが示された.

0161017 小林 亮博 「Mediation Architecture of Personal Robots’ Applications Based on Communications’ Model (コミュニケーションモデルに基づいたパーソナルロボットのアプリケーション調停機構)」

本論文は,パーソナルロボットが個々に持つ親和的な挙動を損なうことなく様々な環境に特化したサービスを提供することを目指し,パーソナルロボットが訪問先の環境からアプリケーションを動的にロードし,ロボットに元から実装されているアプリケーションと共存させるソフトウェアアーキテクチャを提案する.近年,家庭やオフィスといった日常環境へのロボットの応用が着目されており,親しみやすい外見を持ったロボットが開発されてきている.これらのロボットはパーソナルロボットやホームロボットと呼ばれており,主に情報化環境における知的インタフェースの分野や,エンタテイメント分野での需要が見込まれている.近い将来パーソナルロボットは,便利なモバイル端末や親しみやすいパートナーとして,ユーザに同伴し様々な環境で利用されると考えられる.ロボットが外出先で,その環境に特化した情報サービスを動的にロードし実行できれば,ユーザの利用の幅が大きく広がる.本研究は,これらの情報サービスを動的にロードしロボット上で実行する機構を提案する.
本研究では,パーソナルロボット自身が持つ固有のアプリケーションを親和アプリケーション(Familiarity-oriented Application: FA),個々の環境で動的にロードされるアプリケーションを環境アプリケーション(Environment-oriented Application: EA)と呼ぶ.他の情報端末に対するパーソナルロボット最大の長所は,身体を用いた親しみやすいコミュニケーションの能力である.パーソナルロボットの親和性はFAが生成する動作に大きく依存しているため,ユーザがEAを利用する合間に,FAによるロボット固有の動作をはさみ,ユーザとの親和性を保つことが望ましい.そこで本研究はFAとEAの出力を調停することで1台のロボット上で同時に実行し,ユーザが双方のアプリケーションをフレキシブルに利用できるシステムを目指す.
これらのFAとEAは独立に開発されることが一般的に想定されるため,マルチタスクOSが持つような,デバイスの排他処理やスケジューリングの機能が必要となる.このとき通常のプロセススケジューリング等と異なり,人間―ロボット間のコミュニケーションに違和感の無いよう調停を行う必要がある.本研究では,以上をパーソナルロボットにとって最大の問題ととらえ,1台のロボット上で複数のアプリケーションを利用するユーザとロボットのコミュニケーションモデルを設計し,調停の基準とした.このコミュニケーションモデルに,談話解析やノンバーバルな人間―ロボットコミュニケーションの解析等の知見から,Information Unit, Communication Stream, Communication Channelの3種類のコミュニケーション単位を導入した.本論文では,以上のモデルから調停ルールと内部表現の設計を行い,アプリケーション開発における制限を述べている.最後に,上記モデルに沿ったロボットの動作をビデオに撮影し,被験者の主観によりロボットの動作に対するモデルの効果を検証した.
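参考として,Communication Channelを排他資源とみなした場合の調停規則の骨子を示す最小のPythonスケッチを挙げる.チャネル名・優先度規則は説明用の仮定であり,本論文のモデルそのものではない:

```python
from dataclasses import dataclass

@dataclass
class Request:
    app: str        # "FA" または "EA"
    channel: str    # 使用するコミュニケーションチャネル(排他資源)
    priority: int   # 大きいほど優先(説明用の仮定)
    action: str

def mediate(requests):
    """チャネルごとに最優先の要求だけを通す単純な調停規則."""
    granted = {}
    for r in sorted(requests, key=lambda r: -r.priority):
        if r.channel not in granted:
            granted[r.channel] = r
    return list(granted.values())

reqs = [
    Request("EA", "speech", 2, "店舗案内を読み上げる"),
    Request("FA", "speech", 1, "あいづちを打つ"),
    Request("FA", "gesture", 1, "尻尾を振る"),
]
for r in mediate(reqs):
    print(r.app, r.channel, r.action)
# EA が音声チャネルを獲得し,FA の身体動作は並行して実行される.
```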

0261016 田中 康 「ソフトウェア開発におけるプロジェクト管理とプロセス改善に適した成果物観点によるプロセスモデルPReP」

　ソフトウェア工学の技術的な進展の一方で,スケジュールの遅れ,ソフトウェア品質に基づく事故,そしてコストの見積もり間違いなど,ソフトウェア開発プロジェクトの進め方に起因する失敗が重要な課題となっている.このような問題に対して「ソフトウェアプロセス」という概念が登場し,ソフトウェアプロセスの観点から,ソフトウェア開発が抱える問題を解決するための様々な技術の提案や議論が行われてきたが,依然として,ソフトウェア開発プロジェクトの失敗原因は改善されていない.特に,失敗原因を改善するために定義されたソフトウェアプロセスが実際の開発プロジェクトで利用されないために,改善が実現しないといった問題を抱えている.
本論文では,定義されたソフトウェアプロセスが実プロジェクトで利用されない問題を,プロセスの記述方法,すなわちプロセスのモデル化方法の観点から分析し,プロセスのモデルが現実の開発活動を適切に表現していないことに根本的な原因があると予測した.つまり,従来のプロセスモデルには,高度に一般的で抽象的なプロセスの流れを示すことができるものや,あるいは特定のプロジェクトのプロセスの中身を非常に具体的に記述することのできるものは存在するが,これらは,計画策定や進捗管理に直接活用できないか,あるいは,複数のプロジェクト間で繰り返し活用できないといったように,具体性と一般性のいずれかに短所が存在する.プロセス改善のための参照モデルとして現在広く利用されているCMM(Capability Maturity Model)やCMMI(Capability Maturity Model Integration)では,定義したプロセスをプロジェクトが利用するために「制度化」というプロセス遵守のための規律を定義している.しかし,このような「制度化」につとめても,従来のプロセスモデル化手法を用いて定義したプロセスが現実の開発活動を適切に表現していないため,実際の開発プロジェクトで利用されなくなってしまい,さらに改善の効果を期待することも困難になる.そこで,現実の開発活動を適切に表現し,プロセスの改善および開発プロジェクトでの利用に適したプロセスモデルとしてPReP(Product Relationship Process)モデルを開発した.さらに,PRePモデルのプロジェクト管理およびプロセス改善への利用方法を定義し,そのための支援環境,および組織への導入モデルの定義を行った.
従来のモデル化の方法の多くは,タスク(作業)の時間的順序関係によってプロセスをモデル化している.一方,PRePモデルは,開発プロセスの中で作成され管理される実体である成果物に着目したプロセスのモデル化方法であり,成果物の入出力の関連構造に着目したモデル化方法を特色としている.モデル化の方法として,成果物と成果物間の関連の種別を分類し,その表記方法を定義した.
PRePモデルでは,最終成果物を作るために必要となる成果物の入出力の関連構造が,適用する技術と管理方法とによって一意に決まるため,同等のプロファイルを持つプロジェクト間では90〜100%の再利用率となり,定義したプロセスの再利用性が高いことが分かった.さらに,調査の結果,モデルの理解性,改善への適用性,プロジェクト管理への利用性において優れているとの評価を得た.また,PRePモデルでは,成果物を“作業者に割り当てられた作業の成果物”と定義することによって,モデルの粒度とプロジェクトの管理粒度が一致するために,プロジェクトの計画策定や進捗管理への適用性が高い.特に,プロジェクト管理のためのスケジュール作成に必要となる内容と構造とになるため,記述したモデルからプロジェクト管理で使用するガントチャートへの自動変換が可能であることを確認した.
現在,筆者の所属する企業の開発組織では,本論文で定義したプロジェクト管理とプロセス改善への利用方法と,組織への導入モデルに従い,PRePモデルの導入と適用を行っている.プロセスの記述と利用は順調に進んでおり,次の段階の課題として,最適なプロセスに関する議論が行われている.また,本論文で定義したプロセス中心型環境の基本アーキテクチャに基づいた支援環境の実現が今後の課題となっている.
キーワード ソフトウェアプロセス,プロセスモデル,プロジェクト管理,プロセス改善,プロセス再利用,プロセスパターン,プロセス中心型環境
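参考として,成果物の入出力関連構造からガントチャートの骨格(各成果物の最早開始日)を導く考え方を示す最小のPythonスケッチを挙げる.成果物名と工数は説明用の仮の値であり,PRePモデルの表記法そのものではない:

```python
from graphlib import TopologicalSorter

# 成果物の入出力関連(成果物 -> その作成に必要な成果物)と,
# 各成果物の作成工数(日).いずれも説明用の仮定である.
inputs = {
    "要求仕様書": [],
    "設計書": ["要求仕様書"],
    "ソースコード": ["設計書"],
    "テスト仕様書": ["要求仕様書"],
    "テスト報告書": ["ソースコード", "テスト仕様書"],
}
effort = {"要求仕様書": 5, "設計書": 10, "ソースコード": 15,
          "テスト仕様書": 5, "テスト報告書": 7}

# 依存関係を保ったまま各成果物の最早開始日を求める
# (ガントチャートへの自動変換の骨格に相当する).
start = {}
for prod in TopologicalSorter(inputs).static_order():
    start[prod] = max((start[d] + effort[d] for d in inputs[prod]), default=0)
    print(f"{prod}: {start[prod]}日目〜{start[prod] + effort[prod]}日目")
```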

0361012 坂田 宗之 「アクティブ赤外線タグを用いた屋内ユーザロケーションシステム」

情報機器の可搬化やネットワーク環境の拡大が進み,いつでも,どこでも計算機やネットワーク資源が利用できるようになりつつある.このような環境の中で,刻々と変化する自分や他のユーザの位置を把握し,現在位置に関連付けされた情報を提供したり,過去の位置情報を利用することで新しい利便性が生まれる.この考え方にはユーザや端末の位置を同定するシステム,ロケーションシステムが必要不可欠である.屋外においては,GPS(Global Positioning System)が最有力なロケーションシステムである.GPSは,条件が揃えば最高で1m程度の位置同定精度を実現しており,携帯電話にも搭載できるほど小型化されていたり,近年ではカーナビゲーションシステムだけでなく親が子供の現在の位置を遠隔地から確認できるサービスに利用されるなど,広く一般に普及している.オフィス環境などの屋内のロケーションシステムについても種々の研究がなされている.しかし,各手法は位置同定精度やユーザ識別能力の有無,コスト面など様々な特徴を有しており,決定的なシステムは存在していない.
本研究では,屋内におけるロケーションシステムに必要な条件を,ユーザの行動を制限しないこと,比較的高い位置同定精度,センサ数が少なく簡便なシステムであることの三点であると考える.ユーザの行動を制限しないためには,各ユーザが携帯するデバイスをできるだけ小さく,軽くする必要がある.また,位置の把握を行うためにユーザの意図的な動作を必要とせず,システムが自動的に位置情報を把握・管理し,ユーザの必要な時にアプリケーションとして提示可能なシステムであることが重要である.
本論文では,屋内においてこれらの条件を満たす位置同定,ユーザ識別の手法を提案した.本論文で提案する手法では,ユーザ端末に取り付けられたIR(赤外線)タグの発光を赤外広角カメラにより検出し,少ないセンサで安定した位置同定を行う.同時に,システムとユーザ端末の通信を利用してタグの発光を制御し,これを用いて対応付けを行うことで自動的に各ユーザを識別し追跡することを可能にする.次に,提案した手法を実現するため,ロケーションシステムのプロトタイプであるALTAIRを構築し,有効性を確かめる実験を行った.実験では,まずユーザの居る領域が無線ネットワークのアクセスポイントに関する情報を利用して検知可能であり,ユーザの領域間の移動を自動的に取得し,各領域におけるユーザ識別および位置同定がユーザの意図的な動作を必要とせずに始められることを確認した.次に,各領域において安定してユーザの識別および位置同定が自動かつ比較的高精度に可能であることを示した.さらに,各領域において取得されたユーザの位置情報をデータベースを用いて管理することで,基礎的な複数のアプリケーションに利用可能であることを示した.ALTAIRを利用すると,人の動きを時間的・空間的に把握可能であり,実現可能なアプリケーションは多岐にわたる.例えば,ある領域に注目して過去や現在の人の移動など状況の遷移を観察したり,あるユーザの足跡を時間をさかのぼって確認することなどができる.さらには,取得した情報をデータベースを利用して管理するため既存のシステムとの親和性が高く,位置情報を軸として他のメディアと連携するなど,ユビキタス時代のアプリケーションを実現するための中核を担うことが期待される.
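参考として,タグの発光パターン制御と画像中の輝点検出との対応付けによるユーザ識別の骨子を示す最小のPythonスケッチを挙げる.発光パターンと検出結果はすべて説明用の仮定であり,ALTAIRの実装そのものではない:

```python
# システムがユーザ端末ごとに割り当てた発光パターン(1=点灯)と,
# 赤外広角カメラの連続フレームから検出した各輝点の点滅系列とを照合して
# ユーザを識別する.値はすべて説明用の仮定である.
assigned = {
    "user_A": [1, 0, 1, 1, 0, 1],
    "user_B": [1, 1, 0, 0, 1, 1],
}
# 画像処理で得られた輝点ごとの点滅系列(キーは画像座標).
detected = {
    (120, 340): [1, 0, 1, 1, 0, 1],
    (480, 200): [1, 1, 0, 0, 1, 1],
}

def identify(detected, assigned):
    """点滅系列が割当パターンと一致する輝点をユーザに対応付ける."""
    located = {}
    for pos, blink in detected.items():
        for user, pattern in assigned.items():
            if blink == pattern:
                located[user] = pos
    return located

print(identify(detected, assigned))
# {'user_A': (120, 340), 'user_B': (480, 200)}
```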

情報科学研究科 専攻長