Colloquium B Presentations

Date and time: Wednesday, November 28, Period 1 (9:20-10:50)


Venue: L1

Chair: 田中 宏季
井上 剛 M, 2nd presentation Natural Language Processing 松本 裕治, 中村 哲, 新保 仁, 進藤 裕之
title: Joint Prediction of Morphosyntactic Categories for Fine-grained Arabic Part-of-Speech Tagging Exploiting Tag Dictionary Information
abstract: Part-of-speech (POS) tagging for morphologically rich languages such as Arabic is a challenging problem because of their enormous tag sets. One reason for this is that in the tagging scheme for such languages, a complete POS tag is formed by combining tags from multiple tag sets defined for each morphosyntactic category. Previous approaches in Arabic POS tagging applied one model for each morphosyntactic tagging task, without utilizing shared information between the tasks. In this paper, we propose an approach that utilizes this information by jointly modeling multiple morphosyntactic tagging tasks with a multi-task learning framework. We also propose a method of incorporating tag dictionary information into our neural models by combining word representations with representations of the sets of possible tags.
language of the presentation: Japanese
 
高橋 晴太郎 M, 2nd presentation Mathematical Informatics 池田 和司, 中村 哲, 吉野 幸一郎, 久保 孝富, 佐々木 博昭
 
北島 大夢 M, 1st presentation Optical Media Interface 向川 康博
title: A material classification method using time domain response to the illumination light
abstract: Material classification is an important technology in computer vision. However, it is difficult for a conventional RGB camera to distinguish different materials that look similar in the ordinary color domain. In this research, we focus on the differences in time-domain response caused by subsurface scattering in each material. Using a photon counting device, we measure the time-domain response with high temporal resolution, with the aim of classifying materials.
language of the presentation: Japanese
 

Venue: L2

Chair: 油谷 曉
若松 信孝 D, interim presentation Computational Systems Biology 金谷 重彦, 安本 慶一, MD.ALTAF-UL-AMIN, 小野 直亮, 黄 銘
title: An approach to function prediction of metabolites by clustering the 3D-chemical structural similarity based network
abstract: A number of studies have investigated the relations between the structures and functions of metabolites. It has been proposed that structural similarity between metabolites implies functional similarity between them. On this basis, we propose a method for function prediction of secondary metabolites based on the principle of association. First, we determined the structural similarity scores of all possible metabolite pairs using the COMPLIG algorithm and selected the metabolite pairs whose similarity score is greater than or equal to 0.95. To increase the likelihood of obtaining clusters rich in known metabolites, we then selected the metabolite pairs in which the function of at least one metabolite is known. The network of such metabolite pairs was clustered using the DPClusO algorithm. Statistically significant cluster-function pairs were selected using hypergeometric p-values and FDR. Finally, functions were predicted for the metabolites of unknown function included in statistically significant clusters.
language of the presentation: Japanese
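
The cluster-function significance step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the presenter's implementation: the function names, the one-sided enrichment test, and the choice of the Benjamini-Hochberg procedure for FDR control are all assumptions made here for illustration.

```python
from math import comb


def hypergeom_pvalue(population, annotated, cluster_size, hits):
    """One-sided enrichment p-value P(X >= hits).

    population   : metabolites in the whole network (assumed definition)
    annotated    : metabolites in the network known to have the function
    cluster_size : metabolites in the cluster being tested
    hits         : metabolites in the cluster known to have the function
    """
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjustment stays monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    p = hypergeom_pvalue(population=1000, annotated=50, cluster_size=20, hits=6)
    print(p, benjamini_hochberg([p, 0.2, 0.8]))
```

A cluster-function pair would then be kept when its adjusted p-value falls below a chosen FDR threshold; the threshold itself is not stated in the abstract.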
 
井上 雄貴 M, 2nd presentation Computational Systems Biology 金谷 重彦, 安本 慶一, 小野 直亮, 黄 銘, MD.ALTAF-UL-AMIN
title: Deep learning and gradient boosting for classification of high resolution tandem mass spectra
abstract: Compound identification in metabolomics is hampered by the large number of unknown compounds. Because many known compounds lack associated reference mass spectra, library search does not always succeed. Other approaches such as in-silico fragmentation, in-silico modelling of spectra, and compound identification methods based on machine learning can help to annotate unknown spectra. Here we introduce an aggregated approach that uses structure-based compound class annotations from the ClassyFire service to annotate potential unknown tandem mass spectra. The approach is a computational pipeline that assigns chemotaxonomic classes (superclasses, classes, subclasses) to unknown MS/MS spectra. We focused on the subclasses with more than 300 members (Terpene glycosides, Carbohydrates and carbohydrate conjugates, Diterpenoids, Terpene lactones), because they provide ample training data and shared chemical structural features. We trained our models using datasets of 19,972 tandem mass spectra at three collision energies, which are publicly available in the MoNA database (http://mona.fiehnlab.ucdavis.edu/). The data also contain multiple adducts and precursor ions for given compounds. The input data consist of lists of m/z values and intensities of identified peaks. After pre-processing, i.e. removal of missing values and deletion of duplicate structures, we obtained a total of 6,791 samples for positive ionization mode and 2,051 samples for negative ionization mode. We applied feature selection and dimensionality reduction to extract important features of the dataset in order to improve accuracy. For feature selection, we used variable importance based on random forest and gradient boosting. For dimensionality reduction, we applied an ensemble of several methods, including Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF) and the Recursive Feature Agglomeration algorithm (RFAA).
We implemented four machine learning algorithms: a deep neural network based on Keras, a random forest based on scikit-learn, and gradient boosting based on LightGBM and XGBoost. We also used an unweighted soft voting algorithm. Finally, we describe a method to annotate tandem mass spectra directly with compound class annotations. We assembled multiple ionization energies from a high-resolution orbital ion trap in positive and negative ionization modes. When trained on more than 300 tandem mass spectra per class, our model showed up to 90% prediction accuracy. When fewer spectra per class are available, however, prediction accuracy drops considerably.
language of the presentation: Japanese
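
As an illustration of the unweighted soft voting mentioned in the abstract above, the following minimal sketch averages the class probabilities produced by several models and picks the class with the highest mean. The function name and the example probabilities are hypothetical, and this is not the presenter's code.

```python
def soft_vote(model_probabilities):
    """Unweighted soft voting for one sample.

    model_probabilities: one list of class probabilities per model,
    all in the same class order. Returns (winning class index, mean
    probabilities), with every model given equal weight.
    """
    n_models = len(model_probabilities)
    n_classes = len(model_probabilities[0])
    mean = [
        sum(probs[c] for probs in model_probabilities) / n_models
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers for a two-class problem.
    label, mean = soft_vote([[0.6, 0.4], [0.2, 0.8], [0.5, 0.5]])
    print(label, mean)
```

Unlike hard (majority) voting, this lets a single confident model outweigh several models that are close to undecided.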
 
髙城 賢大 M, 2nd presentation Ubiquitous Computing Systems 安本 慶一, 清川 清, 荒川 豊, 諏訪 博彦, 黄 銘
title: Design and Implementation of Notification Information Survey System and Survey Results toward User-side Adaptive Notification Management
abstract: Many interrupt notification methods based on context recognition have been studied, but most existing research assumes that no application other than the target application controls its notification timing. However, if other applications control their notifications with the same timing, interruptions will concentrate at the same moments, and the effect of notification timing control may be lost. In addition, since the installed applications differ for each user, notification timing must be controlled in a way that takes into account the behavior of all the applications installed on the user's smartphone. In this research, we define notification timing control that considers the behavior of all installed applications as "Adaptive Notification Management", and conducted a survey of the diversity of notifications received by users. In this paper, we develop a system that acquires all notification information while excluding privacy-sensitive content. We report the results of an experiment in which data were collected through crowdsourcing, and discuss how to realize an application that implements adaptive notification management.
language of the presentation: Japanese
 
渡邉 洸 M, 2nd presentation Ubiquitous Computing Systems 安本 慶一, 藤川 和利, 荒川 豊, 藤本 まなと
title: Development of a system for advanced learning from new reading experiences
abstract: Studies have shown that reading is one of the most important elements of education. With the spread of computers and smartphones, digital documents are becoming a major reading source. Thanks to this trend, the experience of reading has expanded to include receiving information from dynamic content (movies and music). Our goal is to find out how we can advance learning from these new reading experiences. To do this, we plan the research in two steps. The first step is to create a toolkit that enables us to create a reader-dependent (intelligent interactive) document from an existing static document. The next step is to identify the best timing for showing additional content to each individual reader. In this presentation, we present our idea for the first step, “creation of the GUI toolkit for everyone to create intelligent interactive documents”.
language of the presentation: Japanese