Seminar Presentations

Date and time: Friday, September 30, 1st period (9:20-10:50)

Venue: L1

Chair: 侯亜飛

金子裕哉	1561009: D, interim presentation	Network Systems	岡田実,笠原正治,東野武史,侯亜飛,Duong Quang Thang
title: An Interference Suppression Scheme for Radio over Optical On-Off Keying
abstract: Radio frequency (RF) and optical on-off keying (OOK) signals can be transmitted simultaneously over a radio over fiber (RoF) link. In this link, the optical OOK signal serves as a carrier for the RF signal. However, the OOK signal interferes with the RF signal. In this paper, an interference suppression scheme using biased half-wave rectification is proposed, and a proof-of-concept experiment is conducted. The improvement in error vector magnitude (EVM) and dynamic range is evaluated by computer simulation.
language of the presentation: Japanese

小芝涼太	1551044: M, second presentation	Ubiquitous Computing Systems	安本慶一,岡田実,荒川豊,諏訪博彦,藤本まなと
title: User-to-user distance estimation method using BLE for the extraction of human relations in everyday life
abstract: In recent years, some companies have introduced a scheme called "shuffle lunch" that makes effective use of lunchtime, and eating lunch together is attracting attention from many enterprises as a way to revitalize offices and strengthen cooperation among workers. In existing systems, lunch groups are formed randomly and automatically by the system. This can create unexpected encounters, but it also causes problems: a formed group may include absentees, and attendees still need to decide separately when to actually go to lunch. In this study, we therefore consider a method that forms lunch groups of appropriate people at the right time. It uses sensors mounted on smartphones to measure both everyday user-to-user distances and current user-to-user distances, so that grouping reflects daily interaction as well as the current situation. In this presentation, we introduce an Android application developed toward realizing the proposed system, and we report that groups can be roughly extracted from the user-to-user distances measured with it.
language of the presentation: Japanese

守谷一希	1551109: M, second presentation	Ubiquitous Computing Systems	安本慶一,岡田実,荒川豊,諏訪博彦,藤本まなと
title: Proposal of Indoor Localization Based on a Distance-Illuminance Model and Switch-On Control of Lighting Devices
abstract: In this presentation, we propose an indoor localization method using a distance-illuminance model of lighting devices and trilateration. The method turns on three controllable lighting devices one at a time and estimates the distance from each device based on the illuminance at the target point. It then estimates the position of the target point by trilateration. The proposed method has two merits. First, it can be realized at low cost because only three lighting devices and an illuminance sensor are required. Second, it is robust against external light sources (or sunlight) because it measures the difference in illuminance before and after turning on a lighting device. We conducted experiments in a room in an ordinary home environment and confirmed that the proposed method could estimate the position of the illuminance sensor with an average error within 0.5 m.
language of the presentation: Japanese
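The localization steps described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the presenter's implementation: the abstract does not give the form of the distance-illuminance model, so an inverse-square law with a hypothetical calibration constant `k` is assumed, and the problem is treated in 2-D with the distances taken as horizontal distances. All function names are hypothetical.

```python
import math


def distance_from_illuminance(delta_lux, k):
    """Convert an illuminance increase into a distance.

    ASSUMPTION: inverse-square model delta_lux = k / d**2. The abstract
    does not state its model; k is a hypothetical calibration constant.
    """
    if delta_lux <= 0:
        raise ValueError("illuminance increase must be positive")
    return math.sqrt(k / delta_lux)


def trilaterate(lights, distances):
    """Estimate (x, y) from three light positions and three distances."""
    (x1, y1), (x2, y2), (x3, y3) = lights
    d1, d2, d3 = distances
    # Subtracting the circle around light 1 from the circles around
    # lights 2 and 3 leaves two linear equations in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("the three lights must not be collinear")
    # Solve the 2x2 system with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


def estimate_position(lights, readings, k):
    """readings holds one (lux_before, lux_after) pair per light.

    Using the difference cancels ambient light such as sunlight.
    """
    distances = [
        distance_from_illuminance(after - before, k)
        for before, after in readings
    ]
    return trilaterate(lights, distances)


# Hypothetical layout and readings: 300 lx of ambient light in every pair.
lights = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
readings = [(300.0, 350.0), (300.0, 310.0), (300.0, 320.0)]
position = estimate_position(lights, readings, k=100.0)
```

Because only the before/after difference enters the distance estimate, changing the ambient level in `readings` leaves `position` unchanged, which mirrors the robustness argument in the abstract.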

Venue: L2

Chair: 久保尋之

武原光	1561015: D, interim presentation	Visual Information Media	横矢直和,向川康博,佐藤智和,中島悠太
title: Non-rigid 3D reconstruction using a single RGB-D camera
abstract: Non-rigid 3D reconstruction techniques have been used for recording the motions of athletes or performers and replaying them from free viewpoints. To implement non-rigid 3D reconstruction with a single RGB-D camera, some methods use a 3D shape template of the non-rigid object. Generating the template automatically from a single RGB-D image sequence that captures the moving non-rigid object requires non-rigid registration among the point clouds obtained from the depth image sequence. A non-rigid ICP algorithm is usually used for this registration, but it often fails due to ambiguity in point correspondences. We developed a stable non-rigid registration method for generating the 3D shape template. To reduce the ambiguity in point correspondences, our method leverages point correspondences in the RGB images, which can be used to associate points in different point clouds. In the experiments, we evaluate the accuracy of our method for non-rigid registration and demonstrate its capability using deforming human bodies.
language of the presentation: Japanese

寺崎希	1551064: M, second presentation	Visual Information Media	横矢直和,向川康博,佐藤智和,河合紀彦,中島悠太
title: Reducing Texture Flicker and Inconsistency for Novel Viewpoint Stereo Video Generation
abstract: Novel view synthesis (NVS) is a technique that takes multiple input images of a scene captured from different positions and synthesizes the image seen from an arbitrary (virtual) viewpoint given by a user. It is expected to be used in virtual reality applications that let users virtually experience a remote scene. In such applications, immersion can be enhanced if users watch novel viewpoint stereo videos on a stereoscopic display. However, conventional NVS methods are designed for generating monocular images, so when the left and right images are generated independently, texture inconsistency can prevent users from fusing the stereo pair. In addition, in novel viewpoint videos with a moving viewpoint, flicker caused by texture switching often occurs between consecutive frames because each frame is generated independently. To improve the quality of novel viewpoint stereo videos, we propose a new generation method that considers the consistency of texture both between the left and right images and between frames.
language of the presentation: Japanese

南村敏弥	1551070: M, second presentation	Visual Information Media	横矢直和,向川康博,佐藤智和,中島悠太,河合紀彦
title: Image inpainting by using convolutional neural networks
abstract: Image inpainting is a technique for plausibly filling in missing regions of an image. Patch-based methods, which are currently the mainstream, perform inpainting by finding, for each patch containing part of the missing region, a similar patch in the non-missing part of the same image and copying it into the missing region. However, these methods sometimes produce unnatural results because no similar patch can be found. This work proposes a method for higher quality image inpainting using convolutional neural networks, which are trained on a vast number of images and are therefore expected to be able to infer the content of the missing region. In this presentation, I describe current progress and future work.
language of the presentation: Japanese

西諒一郎	1551071: M, second presentation	Visual Information Media	横矢直和,向川康博,佐藤智和,河合紀彦
title: Ultra-shallow DoF imaging using a pair of faced paraboloidal mirrors
abstract: We propose a new imaging method that achieves an ultra-shallow depth of field (DoF) to clearly visualize a particular depth in a 3-D scene. The key optical device consists of a pair of faced paraboloidal mirrors with holes around their vertices. In the device, a lens-less image sensor is placed at the hole on one side and an object is placed at the hole on the opposite side. The characteristic of the device is that the shape of the point spread function varies depending on the positions of both the target 3-D point and the image sensor. By leveraging this characteristic, we reconstruct a clear image for a particular depth by solving a linear system involving position-dependent point spread functions. In experiments, we demonstrate the effectiveness of the proposed method using both simulation and a prototype imaging system that we built.
language of the presentation: Japanese

Venue: L3

Chair: 能地宏

小田悠介	1561007: D, interim presentation	Intelligent Communication	中村哲,松本裕治,Graham Neubig,Sakriani Sakti
title: Neural Translation Models using Binary Code Prediction
abstract: Neural network-based machine translation (NMT) models, built on recurrent neural networks, are attracting attention as a new approach that achieves translation accuracy equal to or better than conventional translation models. However, because they repeatedly compute huge matrices, current NMT models usually require much more computation time and memory than conventional models. This study focuses on the output layer, which requires a particularly large amount of computation, and reduces that computation by replacing it with a different network structure. Specifically, whereas conventional models estimate the probability of each output word directly, the proposed method first converts each word into a unique binary code and makes the translation model predict its bit vector. As a result, the computation amount of the output layer can be reduced to roughly the logarithm of the original amount. In experiments, we show that the proposed model can perform translation similarly to the conventional models. In addition, we study what kinds of binary representations are appropriate for the proposed model.
language of the presentation: Japanese
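The word-to-code conversion described in this abstract can be illustrated with a short sketch. It is a hedged example, not the presenter's method: the abstract does not say how codes are assigned, so the plain binary representation of a word's vocabulary index is assumed here, and all function names are hypothetical. It shows only why the output layer needs about log2(V) outputs instead of V.

```python
def code_length(vocab_size):
    """Number of bits needed to give every word a unique code."""
    if vocab_size < 1:
        raise ValueError("vocabulary must not be empty")
    # (V - 1).bit_length() equals ceil(log2(V)) for V >= 2.
    return max(1, (vocab_size - 1).bit_length())


def word_to_bits(word_id, n_bits):
    """Training target: the bits of the word index, most significant first.

    ASSUMPTION: codes are plain binary indices; the abstract leaves the
    choice of binary representation open.
    """
    if not 0 <= word_id < 2**n_bits:
        raise ValueError("word_id does not fit in n_bits")
    return [(word_id >> i) & 1 for i in reversed(range(n_bits))]


def bits_to_word(bit_probs, threshold=0.5):
    """Decode per-bit probabilities from the output layer into a word index."""
    word_id = 0
    for p in bit_probs:
        word_id = (word_id << 1) | (1 if p >= threshold else 0)
    return word_id


# Hypothetical 65,536-word vocabulary: 16 output units instead of 65,536.
n_bits = code_length(65536)
target = word_to_bits(5, n_bits=3)          # bits of index 5 in 3 bits
decoded = bits_to_word([0.9, 0.2, 0.7])     # thresholded to 1, 0, 1
```

When the vocabulary size is not a power of two, some bit patterns correspond to no word, so a real decoder would need a rule for such outputs; this sketch does not handle that case.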

品川政太朗	1561011: D, interim presentation	Intelligent Communication	中村哲,松本裕治,Graham Neubig,吉野幸一郎,Sakriani Sakti
title: Dialog based Image Generation using Natural Language
abstract: We use natural language to exchange information efficiently. If primitive information such as images and audio could be manipulated symbolically through natural language, applications in many fields could be expected. This study focuses on the task of obtaining a desired image as output from natural language input. Because it is difficult to obtain an image that matches the user's intention in a single attempt, we study an approach that modifies images iteratively, which we call "dialog based image generation". Conventional image retrieval and image generation approaches have the critical problem that the output image is changed destructively at every dialog turn. To address this, the proposed method links natural language not to images themselves but to image-transition rules. In this presentation, I report current progress.
language of the presentation: Japanese

HECK MICHAEL	1561024: D, interim presentation	Intelligent Communication	中村哲,松本裕治,Sakriani Sakti,Graham Neubig,吉野幸一郎
title: Unsupervised Acoustic Unit Detection in the Zero Resource Setting
abstract: In the speech processing domain, we speak of a zero resource scenario when labeled training data and knowledge about the target language are not available. Current technology cannot yet imitate the natural human capacity to robustly learn acoustic and language models in an unsupervised way. Our research is concerned with answering the question: can we learn a whole language from scratch by deploying adaptive machine learning algorithms? The absence of supervision makes it difficult to apply the machine learning methods that are commonly used to build state-of-the-art speech processing systems on rich resources. In this research we exploit methods for supervised learning on several levels without the need for prior supervision. We automatically learn feature transformations on audio data only, in order to improve a Bayesian method for automatic sound unit detection on the same data. This is followed by full-fledged training of an acoustic unit recognizer that uses acoustic and language models trained with no supervision other than the automatic class labels from the sound unit detection step. Future work aims at extending our methods to more challenging tasks such as keyword spotting, speech recognition, or machine translation.
language of the presentation: English