Colloquium B Presentations

Date/Time: Friday, September 18, 2nd period (11:00-12:30)


Venue: L1

Chair: Kim Youngwoo
廣瀬 雄士 M, 2nd presentation, Natural Language Processing, 渡辺 太郎, 中村 哲, 進藤 裕之, 新保 仁 (Visiting)
title: Integrating Path Features into Embeddings for Knowledge Graph Completion
abstract: Knowledge graph completion (KGC) is the task of predicting missing relations in a knowledge graph. Several KGC methods, such as embedding models and path-feature models, have been proposed previously. More complex graph-based neural network models are the recent state of the art. In this research, we examine how much an embedding model improves when path features are integrated into it. Our current results suggest that path features can slightly increase the model's scores. In future work, we will explore how to improve the model further.
language of the presentation: Japanese
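As a rough illustration of the idea above (a sketch under assumed names; the presentation does not specify which embedding model is used), a TransE-style embedding score can be combined linearly with a precomputed path-feature score:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE-style plausibility score: -||h + r - t||.
    A score closer to 0 means the triple (h, r, t) is more plausible."""
    return -np.linalg.norm(h + r - t)

def combined_score(h, r, t, path_feature, w):
    """Hypothetical combination: embedding score plus a weighted,
    precomputed path-feature score for the same entity pair."""
    return transe_score(h, r, t) + w * path_feature

# toy 2-dimensional embeddings where h + r == t exactly
h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
t = np.array([1.0, 1.0])
print(transe_score(h, r, t))              # -> 0.0 (perfect match)
print(combined_score(h, r, t, 1.0, 0.5))  # -> 0.5 (path feature raises the score)
```

The weight `w` would in practice be learned jointly with the embeddings rather than fixed.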
 
芝原 隆善 D, interim presentation, Natural Language Processing, 渡辺 太郎, 中村 哲, 進藤 裕之
title: Term Extraction and Classification for Document Organization
abstract: Today's popular search systems (such as Google) are based on keyword search. However, to use keywords, users must know them in advance. Librarians have therefore traditionally used thesauri (hierarchical vocabularies) and indexed documents against the concepts of a thesaurus. Because such indexing is time-consuming, we attempt to identify terms in documents and map them into a thesaurus. In particular, we try to extract spans from documents and classify those spans into the upper layers of a thesaurus without annotated data. In this presentation, we review related work and our roadmap, and present the results of our distantly supervised NER.
language of the presentation: Japanese
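To illustrate the distant-supervision setup mentioned above (a simplified sketch; the lexicon, label names, and matching rule here are illustrative assumptions, not the presenter's method), thesaurus entries can be projected onto raw text as noisy BIO labels:

```python
def distant_labels(tokens, lexicon):
    """Assign BIO labels by exact longest-match lookup of lexicon terms.
    A crude stand-in for distant supervision: dictionary entries project
    noisy labels onto unannotated text, which then trains an NER model."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # try the longest candidate span starting at position i first
        for j in range(len(tokens), i, -1):
            span = " ".join(tokens[i:j])
            if span in lexicon:
                labels[i] = "B-" + lexicon[span]
                for k in range(i + 1, j):
                    labels[k] = "I-" + lexicon[span]
                i = j
                matched = True
                break
        if not matched:
            i += 1
    return labels

lexicon = {"knowledge graph": "CONCEPT", "thesaurus": "CONCEPT"}
print(distant_labels("a knowledge graph stores facts".split(), lexicon))
# -> ['O', 'B-CONCEPT', 'I-CONCEPT', 'O', 'O']
```

Labels produced this way are noisy (no disambiguation, no coverage of unseen surface forms), which is exactly the difficulty distant-supervision NER research addresses.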
 
東山 翔平 D, interim presentation, Natural Language Processing, 渡辺 太郎, 中村 哲, 進藤 裕之
title: Word Segmentation for Various Domains
abstract: Word segmentation (WS) is a fundamental step of natural language processing for unsegmented languages, including Japanese. Although extensive research efforts have achieved accurate segmentation in general domains such as news text, other domains pose specific problems. In this work, we categorize text domains into three types and propose WS methods suitable for each type. First, for general domains where a large amount of labeled data is available, we propose an effective method that combines complementary character and word information. Second, for specialized domains where no labeled data is available, we propose a domain adaptation method utilizing unlabeled data and lexicons in the target domains. Third, for user-generated text, where non-standard words often occur, we will address a joint WS and normalization task that converts non-standard words into standard words. Our research so far has demonstrated that the first and second methods achieve performance improvements on datasets of the respective domain types.
language of the presentation: Japanese
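As a toy illustration of lexicon-driven segmentation (not the proposed method, which combines character and word information in a learned model; the cost scheme here is an illustrative assumption), a minimal dynamic-programming segmenter can be sketched as:

```python
def segment(text, lexicon):
    """Minimal dictionary-based segmenter: dynamic programming that prefers
    fewer, longer known words; unknown characters fall back to
    single-character tokens at a higher cost."""
    n = len(text)
    max_len = max((len(w) for w in lexicon), default=1)
    best = [None] * (n + 1)   # best[i] = (cost, segmentation of text[:i])
    best[0] = (0, [])
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            if best[j] is None:
                continue
            word = text[j:i]
            if word in lexicon:
                cost = best[j][0] + 1    # known word: cheap
            elif i - j == 1:
                cost = best[j][0] + 2    # unknown single character: expensive
            else:
                continue
            if best[i] is None or cost < best[i][0]:
                best[i] = (cost, best[j][1] + [word])
    return best[n][1]

lexicon = {"word", "segmentation", "is", "fun"}
print(segment("wordsegmentationisfun", lexicon))
# -> ['word', 'segmentation', 'is', 'fun']
```

Real systems replace the hand-set costs with scores learned from labeled data, which is where the character- and word-level information mentioned above comes in.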
 

Venue: L2

Chair: 趙 崇貴
青谷 拓海 D, interim presentation, Intelligent System Control, 杉本 謙二, 小笠原 司, 松原 崇充, 小林 泰介
title: Multi-agent Reinforcement Learning by Reward Shaping to Classify Interests
abstract: A multi-agent system (MAS) is expected to be applied to various real-world problems where a single agent cannot accomplish given tasks. Due to the inherent complexity of real-world MAS, however, manually designing the group behaviors of agents is intractable. Multi-agent reinforcement learning (MARL), a framework in which multiple agents in the same environment adaptively learn their policies using reinforcement learning, is a promising methodology for handling this complexity. To acquire group behaviors by MARL, all agents must understand how to achieve their respective tasks cooperatively. In conventional MARL, although decentralization is essential for feasible learning, rewards for the agents are given by a centralized system (referred to as "top-down MARL"). In this research, we therefore propose "bottom-up MARL", a decentralized system for managing real, large-scale MARL, together with a reward-shaping algorithm that represents the group behaviors. The reward-shaping algorithm classifies interests (cooperative agents and competitive agents) by predicting the rewards of other agents, and generates a reward for the group behaviors. The interests are regarded as correlation coefficients derived from the agents' rewards, which are estimated numerically in an online manner. In both simulations and real experiments without prior knowledge of the interests between agents, the agents correctly estimated their interests, thereby deriving new rewards that represent feasible group behaviors. As a result, our extended algorithm succeeded in acquiring group behaviors ranging from cooperative to competitive tasks.
language of the presentation: Japanese
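The interest classification described above can be sketched as follows (a minimal batch version with an assumed threshold; the actual algorithm estimates the correlations online during learning):

```python
import numpy as np

def classify_interest(own_rewards, other_rewards, threshold=0.3):
    """Classify the relation to another agent from the correlation of
    reward histories: strongly positive correlation -> cooperative,
    strongly negative -> competitive, otherwise neutral.
    The threshold is an illustrative assumption."""
    rho = np.corrcoef(own_rewards, other_rewards)[0, 1]
    if rho > threshold:
        return "cooperative"
    if rho < -threshold:
        return "competitive"
    return "neutral"

print(classify_interest([1, 2, 3, 4], [2, 4, 6, 8]))  # rewards rise together -> cooperative
print(classify_interest([1, 2, 3, 4], [4, 3, 2, 1]))  # one's gain is the other's loss -> competitive
```

The estimated interest can then be used to shape each agent's reward, e.g. adding a cooperative partner's predicted reward and subtracting a competitor's.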
 
藤石 秀仁 M, 2nd presentation, Intelligent System Control, 杉本 謙二, 小笠原 司, 小林 泰介
title: Safe Imitation Learning Based on Anomaly Detection with a Variational Autoencoder
abstract: Behavioral cloning, one of the imitation learning methods, enables a robot to imitate an expert's policy from the expert's state and action demonstrations. In that case, the robot does not need to interact with the environment, thereby preventing robot failures. In general, however, it is difficult to obtain the expert's action information. Although behavioral cloning from observation allows the robot to learn the policy without action sequences, it requires a few interactions with the environment to infer the expert's actions, which leaves a risk of robot failures. Detecting whether encountered situations are safe or dangerous is an effective way to prevent such dangerous interactions. Assuming that the expert's demonstrations visit only safe states, this paper proposes a new outlier detector using a variational autoencoder trained on the expert's data. It can easily find unexperienced and dangerous scenes, since all the data used for training are mapped into a limited space. In real-robot experiments, the proposed method shows that the agent can avoid transitions to dangerous states while imitating the expert's policy. Furthermore, as an additional effect, the agent can learn the task with fewer interactions than behavioral cloning from observation by focusing on the trajectories experienced by the expert.
language of the presentation: Japanese
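The detection rule can be sketched as follows, with a fixed linear autoencoder standing in for the trained VAE (the encoder, decoder, and threshold here are illustrative assumptions, not the paper's model):

```python
import numpy as np

def reconstruction_error(x, encode, decode):
    """Anomaly score: states far from the expert's (safe) manifold
    reconstruct poorly through the autoencoder bottleneck."""
    return float(np.linalg.norm(x - decode(encode(x))))

def is_dangerous(x, encode, decode, threshold):
    """Flag a state as dangerous when its reconstruction error exceeds a
    threshold calibrated on the expert's demonstrations."""
    return reconstruction_error(x, encode, decode) > threshold

# stand-in "trained" model: expert states lie on the line y = x,
# so the 1-dimensional code is the mean of the two coordinates
encode = lambda x: np.array([x.mean()])
decode = lambda z: np.array([z[0], z[0]])

safe = np.array([1.0, 1.0])      # on the expert manifold
danger = np.array([1.0, -1.0])   # off the manifold
print(is_dangerous(safe, encode, decode, 0.1))    # -> False
print(is_dangerous(danger, encode, decode, 0.1))  # -> True
```

With a VAE, the reconstruction error (or the evidence lower bound itself) plays the same role, and the threshold is set from the error distribution on the expert's safe states.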
 
馬渕 俊弥 M, 2nd presentation, Intelligent System Control, 杉本 謙二, 小笠原 司, 小林 泰介
title: RGaM: Novel Recurrent Unit with Forget Gate and Memory Trace
abstract: Building recurrent neural networks (RNNs) with high memory capacity is a critical challenge. To improve the memory capacity of RNNs, gated structures have been introduced, as in Long Short-Term Memory (LSTM), and previous studies have shown that the forget gate is the most important gate in LSTM. In addition, a structure with only the forget gate is analogous to a leaky integrator, which can be made more sophisticated by deriving a memory trace from a fractional-order leaky integrator. We therefore propose a new RNN structure with a forget gate and a memory trace, called RGaM. This structure can hold information in its internal states for a long time while suppressing the divergence of information. The effectiveness of RGaM in terms of state-prediction accuracy is confirmed by comparison with several existing methods.
language of the presentation: Japanese
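One reading of the structure sketched above is the following single-step update (the exact RGaM equations are not given in the abstract; the gate and trace forms here, and the fixed decay rate replacing the fractional-order integrator, are illustrative assumptions):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def step(h, m, x, Wf, Wx, alpha=0.9):
    """One step of a hypothetical forget-gate-only unit with a memory trace.
    The forget gate acts as a learned leak rate (leaky-integrator view);
    the trace m is an extra, slowly decaying integrator of past states,
    standing in for the fractional-order formulation."""
    f = sigmoid(Wf @ np.concatenate([h, x]))      # forget gate in (0, 1)
    h_new = f * h + (1.0 - f) * np.tanh(Wx @ x)   # gated leaky integration
    m_new = alpha * m + (1.0 - alpha) * h_new     # memory trace of the state
    return h_new, m_new

rng = np.random.default_rng(0)
h = np.zeros(3); m = np.zeros(3)
x = rng.standard_normal(2)
Wf = rng.standard_normal((3, 5))   # gate weights over [h; x]
Wx = rng.standard_normal((3, 2))   # input weights
h, m = step(h, m, x, Wf, Wx)
print(h.shape, m.shape)  # -> (3,) (3,)
```

Because the gate and the tanh keep every state component bounded, the update holds information without letting it diverge, which matches the property claimed above.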
 
芳澤 健太 M, 2nd presentation, Intelligent System Control, 杉本 謙二, 小笠原 司, 小林 泰介
title: Deep Reinforcement Learning with Feedforward and Feedback Policies
abstract: In recent years, deep reinforcement learning (DRL) technology has developed dramatically and can now beat humans at board games. One of the next frontiers for DRL is robotics with complicated tasks. However, many robots controlled by DRL act relatively slowly because, in reinforcement learning, the action used to interact with the environment is decided according to the state sampled from sensors, which causes delays in the measurement and communication systems. We therefore propose a new DRL algorithm with feedforward and feedback policies. The additional feedforward policy compensates for the original feedback policy by deciding the optimal action without sensor data, using only the history of the robot's behavior. In this presentation, we describe the setup of a robot environment for DRL, in which a preliminary DRL task is successfully achieved.
language of the presentation: Japanese
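One simple way to combine the two policies described above is a weighted blend (a sketch under assumed names; the blend weight and the toy linear policies are illustrative, not the proposed algorithm):

```python
import numpy as np

def combined_action(state, history, fb_policy, ff_policy, beta=0.5):
    """Hypothetical blend of the two policies: the feedback policy uses the
    (possibly delayed) sensed state, while the feedforward policy predicts
    the next action from the recent action history alone, masking the
    sensing delay."""
    return beta * fb_policy(state) + (1.0 - beta) * ff_policy(history)

# toy linear stand-ins for learned policies
fb_policy = lambda s: -1.0 * s        # feedback: proportional control on the state
ff_policy = lambda hist: hist[-1]     # feedforward: repeat the last action

state = np.array([0.2])
history = [np.array([0.4])]
print(combined_action(state, history, fb_policy, ff_policy))  # -> [0.1]
```

In the full method both policies would be trained networks, and the feedforward branch lets the robot keep acting at a high rate between (delayed) sensor updates.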