Seminar Presentations

Date and time: Wednesday, September 30, 5th period (16:50-18:20)


Venue: L1

Chair: 南 裕樹
今井 優作 1451013: M, second presentation 松本 裕治, 池田 和司, 山田 武士, 新保 仁
title: Mixture of Topic Models for Analyzing Short Text Documents with User Information
abstract: We propose a mixture of topic models for analyzing short text documents with user information, such as Twitter data. Latent Dirichlet Allocation (LDA), a representative topic model, is widely used for analyzing text documents. In previous work applying LDA to Twitter data, all the tweets of the same user are aggregated into a single pseudo-document, since a single tweet is too short to infer topic proportions properly. However, this method cannot represent the differences between the tweet topics of a given user. The proposed model addresses this issue by clustering a user's tweets according to topic, where the tweets in each cluster share common topic proportions. By aggregating tweets adaptively, the proposed model can use more data for inferring topic proportions, while still allowing different topic proportions for different tweets of the same user. In the proposed model, the set of tweets of each user is modeled as a mixture of multiple topic proportions. By using Dirichlet processes, we automatically infer the number of clusters for each user. We develop inference procedures based on collapsed Gibbs sampling. We demonstrate the effectiveness of the proposed model with experiments using Twitter data.
language of the presentation: Japanese
 
椿 真史 1461006: D, interim presentation 松本 裕治, 池田 和司, 新保 仁, 進藤 裕之
title: Representation Learning for Natural Language and Biological Sequences
abstract: Many Natural Language Processing (NLP) applications rely on the existence of similarity measures over text data. Although word vector space models provide good similarity measures between words, phrasal and sentential similarities derived from the composition of individual words remain a difficult problem. Separately, protein structure prediction is an important challenge in bioinformatics. While the success of the prediction relies on the feature vectors, it is not trivial to represent proteins, which are given as amino acid sequences, in a vector space. In this paper, we propose a new method of non-linear similarity learning for semantic compositionality and protein structure prediction. In this method, sentence and protein representations are efficiently learned through similarity learning in a high-dimensional space with kernel functions. On the tasks of predicting the semantic similarity of two sentences and the contact map of proteins, our methods outperform baselines and feature engineering approaches, and achieve results competitive with neural networks and deep learning models.
language of the presentation: Japanese
 
下村 環太朗 1451058: M, second presentation 池田 和司, 笠原 正治, 久保 孝富
title: The influence of duplicated eigenvalues on the estimation of covariance matrices
abstract: The estimation of covariance matrices of multivariate normal distributions plays an important role in multivariate analysis. It is known that estimation performance is influenced by duplicated eigenvalues of the covariance matrix. A similar influence also arises when some eigenvalues of the covariance matrix are close to each other. In this presentation, we discuss how much influence such duplicated or nearly indistinguishable eigenvalues have on the estimation. To analyze the relationship between estimation performance and eigenvalues, we consider the learning coefficient of covariance estimation. The learning coefficient is a value that represents the generalization performance of the estimation. Analytical and numerical results will be presented to show that this value can be formulated in terms of the differences between eigenvalues of the covariance matrix.
language of the presentation: Japanese
Japanese title: 共分散行列の推定における固有値重複の影響
 
町田 宗丈 1451097: M, second presentation 池田 和司, 杉本 謙二, 川人 光男
title: Behavior model and clinical scale for Obsessive-Compulsive Disorder
abstract: Models of human decision-making are receiving attention as a way to explain, mathematically and scientifically, the abnormalities underlying various psychiatric disorders. Our research aims at application to treatment: by analyzing behavioral data from psychiatric patients and healthy controls, we seek to clarify the relationship between decision-making models and neurotransmitters in Obsessive-Compulsive Disorder. In this presentation, we report the results of a correlation analysis between clinical measures of Obsessive-Compulsive Disorder and behavioral models that focus on compulsivity. Finally, we discuss future plans concerning impulsivity in Obsessive-Compulsive Disorder.
language of the presentation: Japanese