Colloquium B Presentations

Date & Time: Thursday, July 26, 4th period (15:10~16:40)

Venue: L1

Chair: Nishanth Koganti
CHEN ZHENG M, 1st presentation, Interactive Media Design, 加藤 博一
title: Vertebrae Pose Estimation from a Single X-ray Image Based on 2D-3D Registration with CT Data
abstract: Image registration is a key step in medical image analysis, as it allows one or more images to be registered in a common world coordinate system so that detailed information can be acquired from a consistent 3D volume. Pose estimation in medical imaging is the problem of determining the transformation of an object in a 2D image that yields the corresponding 3D object. Our previous research used an image-similarity method to compare two image models and accomplish pose estimation after image registration. However, we need more accurate results than existing methods provide, and finding the globally optimal result remains a challenge. My research therefore aims to improve the accuracy of the current method. A related paper presents a regression approach that predicts the rotation and translation of arbitrary 2D image slices from 3D volumes, using convolutional neural networks (CNNs) to learn the highly complex regression function that maps 2D image slices to their correct position and orientation in 3D. Once we can predict the geometric transformation from a 2D image to the 3D volume, we can obtain the correct pose directly and accomplish 2D/3D image registration to acquire more information from the volume data. This paper may therefore be useful for optimizing the current method and for my research.
language of the presentation: English
 
SILVIYA HASANA M, 1st presentation, Interactive Media Design, 加藤 博一

title: Improving Color Discrimination for Color Vision Deficiency (CVD)


abstract: First described by Dalton in 1794, color blindness has been known for more than two centuries, and researchers have proposed several ways to assist people with color vision deficiency (CVD). Color blindness cannot be treated, but the quality of color vision can be compensated for. Previous work has focused on color optimization to increase color discrimination. In this work, we explore and investigate alternative approaches to compensating for CVD.

language of the presentation: English

 
梅津 𠮷雅 M, 1st presentation, Ubiquitous Computing Systems, 安本 慶一
Title: Context Recognition based on Power Output of Energy-harvesting Modules
Abstract: In sensing systems premised on battery operation, energy harvesting, which extracts electric power from natural physical phenomena, is expected to be a useful power source. The amount of power generated by a harvesting element depends on physical quantities, such as illuminance and vibration, that vary with the state of the element and its surrounding environment. In this presentation, we aim to use energy-harvesting elements not only for power supply but also as sensors, distinguishing behaviors by observing the amount of power generated. To this end, we first investigate what conditions the power output of each energy-harvesting element depends on. Specifically, we examine the power-generation characteristics, with respect to various physical quantities, of five types of solar cells, Peltier elements that generate electricity from temperature differences, and two kinds of vibration-based elements that generate electricity from vibration and pressure. Next, we investigate how the voltage waveform obtained from each element changes across various target behaviors and environments, and clarify which behaviors can be identified by which element. Based on this survey, we developed and experimented with a nameplate-type device equipped with a solar cell and a vibration-generating element. The results show clear differences in the generated voltage depending on the wearer's activity and location; that is, behavior can be recognized by the proposed method.

Language of the presentation: Japanese
 
LI FEIRAN M, 1st presentation, Robotics, 小笠原 司
title: Multi-view Inpainting for RGB-D Sequences
abstract: In this work we propose a novel approach to removing undesired objects from RGB-D sequences captured with freely moving cameras, which enables static 3D reconstruction. Our method jointly uses existing information from multiple frames and generates new information via inpainting techniques. We use balanced rules to select source frames, a local-homography-based image warping method for alignment, and a Markov random field (MRF)-based approach for combining existing information. For the remaining holes, we employ an exemplar-based multi-view inpainting method to complete the color image and coherently use it as guidance to complete the corresponding depth. We test our algorithms on open-source benchmarks and compare them with other methods. Results show that our approach is capable of removing the undesired objects and inpainting the holes.
language of the presentation: English
 

Venue: L2

Chair: 進藤 裕之
ANDROS TJANDRA D, interim presentation, Intelligent Communication, 中村 哲, 松本 裕治, Sakriani Sakti
title: Machine Speech Chain with Deep Learning
abstract: Despite the close relationship between speech perception and production, research in automatic speech recognition (ASR) and text-to-speech synthesis (TTS) has progressed more or less independently, without much mutual influence. In human communication, by contrast, a closed-loop speech chain mechanism with auditory feedback from the speaker's mouth to her ear is crucial. In this work, we take a step further and develop a closed-loop speech chain model based on deep learning. The sequence-to-sequence model in a closed-loop architecture allows us to train our model on the concatenation of both labeled and unlabeled data. While ASR transcribes the unlabeled speech features, TTS attempts to reconstruct the original speech waveform from the text produced by ASR. In the opposite direction, ASR also attempts to reconstruct the original text transcription given the synthesized speech. To the best of our knowledge, this is the first deep learning model that integrates human speech perception and production behaviors. Our experimental results show that the proposed approach significantly outperforms separate systems trained only with labeled data.
language of the presentation: English
 
NGUYEN VAN KHANH M, 2nd presentation, Intelligent Communication, 中村 哲, 安本 慶一, 鈴木 優
Title: The Impact of Weather on Tourism Recommender Systems
Abstract: A tourism recommender system (TRS) aims to give a list of points of interest (POIs) suitable for the user in a particular context. However, weather data, which common sense suggests affects travelers' check-in behavior, has so far received little attention with regard to its impact on TRSs. In this research, we analyze the impact of weather on a TRS by introducing a new recommender system that leverages the relation between weather and check-in data to improve recommendations. We address the problem by ranking prediction probability values for all POIs using classification models. The results show that the proposed system not only provides suitable items but also improves the quality of tourism recommendations.

Language of the presentation: English
 
榊原 宙 M, 2nd presentation, Intelligent Communication, 中村 哲, 松本 裕治, 須藤 克仁
 
JOHANES EFFENDI THE M, 2nd presentation, Intelligent Communication, 中村 哲, 松本 裕治, Sakriani Sakti, 須藤 克仁
Title: Elementary Operation Paraphrasing for Multi-expert Neural Caption Translation
Abstract: A common approach to the caption translation task is to use visual information as additional context by directly embedding image features and combining them with the source-language description within a multimodal system. However, results so far have shown that the additional image features contribute only slightly to system performance. In this research, we propose another direction: we use various visual descriptions, produced by paraphrasing, to describe an image without using the image itself in the translation process. The resulting paraphrased captions are then utilized within a multi-expert neural machine translation model. Our proposed approach outperformed the NMT baseline by a 2.4 BLEU margin, which is close to the top score achieved with a multimodal model.

Language of the presentation: English