Colloquium B Presentations

Date & Time: Wednesday, September 22, 2nd period (11:00–12:30)

Venue: L1

Chair: 花田 研太
高橋 舜 M, 2nd presentation, 知能コミュニケーション, 中村 哲, 渡辺 太郎, Sakriani Sakti
title: Unsupervised Neural-based Graph Clustering for Variable-Length Speech Representation Discovery in Zero-Resource Languages
abstract: Discovering symbolic units from unannotated speech data is fundamental in zero-resource speech technology. Previous studies focused on learning fixed-length frame units based on acoustic features. Although they achieve high quality, they also suffer from a high bit-rate due to time-frame encoding. In this work, to discover variable-length, low bit-rate speech representation from a limited amount of unannotated speech data, we propose an approach based on graph neural networks (GNNs), and we study the temporal closeness of salient speech features. Our approach is built upon vector-quantized neural networks (VQNNs), which learn discrete encoding by contrastive predictive coding (CPC). We exploit the predetermined finite set of embeddings (a codebook) used by VQNNs to encode input data. We consider a codebook a set of nodes in a directed graph, where each arc represents the transition from one feature to another. Subsequently, we extract and encode the topological features of nodes in the graph to cluster them using graph convolution. By this process, we can obtain coarsened speech representation. We evaluated our model on the English data set of the ZeroSpeech 2020 challenge on Track 2019. Our model successfully drops the bit rate while achieving high unit quality.
language of the presentation: English
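The codebook-as-directed-graph idea in the abstract can be illustrated in miniature. This is not the presented GNN/graph-convolution pipeline; it is a toy stand-in that builds the transition graph over codebook indices and merges frequently co-transitioning codes with a simple union-find, then collapses repeats into variable-length units (the code sequence, threshold, and merge rule are all illustrative assumptions):

```python
from collections import defaultdict

def transition_graph(codes):
    """Count arcs (a -> b) between consecutive codebook indices."""
    g = defaultdict(int)
    for a, b in zip(codes, codes[1:]):
        g[(a, b)] += 1
    return g

def cluster_codes(codes, g, threshold):
    """Toy clustering: union-find merge of codes joined by strong arcs
    (stand-in for the graph-convolution clustering in the abstract)."""
    parent = {c: c for c in set(codes)}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for (a, b), n in g.items():
        if a != b and n >= threshold:
            parent[find(a)] = find(b)
    return {c: find(c) for c in parent}

def coarsen(codes, mapping):
    """Re-encode with cluster ids and collapse runs into
    variable-length units, lowering the bit-rate."""
    units = []
    for c in codes:
        u = mapping[c]
        if not units or units[-1] != u:
            units.append(u)
    return units

codes = [0, 0, 1, 0, 1, 1, 2, 2, 3, 3, 2, 3]   # made-up code sequence
g = transition_graph(codes)
mapping = cluster_codes(codes, g, threshold=2)
print(coarsen(codes, mapping))
```

The coarsened sequence is much shorter than the frame-level code sequence, which is the sense in which clustering the codebook graph trades time-frame encoding for variable-length units.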
 
加納 保昌 M, 2nd presentation, 知能コミュニケーション, 中村 哲, 渡辺 太郎, 宮尾 知幸, 須藤 克仁
title: Simultaneous Neural Machine Translation with Constituent Label Prediction
abstract: Simultaneous Machine Translation is the task in which the translation process starts before the whole source sentence has been read. It has been difficult to translate language pairs with different word orders, such as English–Japanese, in this task. We propose simple segmentation rules based on syntactic constituent label prediction to decide when to start the translation process. Our proposed method outperformed the baselines in experiments on simultaneous translation from English to Japanese.
language of the presentation: English
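A minimal sketch of label-based segmentation, assuming a rule stub in place of the learned constituent-label predictor (the `predict_label` heuristic, the boundary label set, and the example sentence are all hypothetical, not taken from the paper):

```python
def predict_label(prefix):
    """Hypothetical stub for a constituent-label classifier: pretend a
    prefix ending in a verb or final punctuation closes a clause."""
    if prefix[-1] in {"arrived", "spoke", "."}:
        return "S"
    return "INCOMPLETE"

def segment(tokens, boundary_labels=frozenset({"S"})):
    """Emit variable-length segments; in simultaneous MT each segment
    would be sent to the decoder as soon as it is closed, before the
    full source sentence has been read."""
    segments, buf = [], []
    for tok in tokens:
        buf.append(tok)
        if predict_label(buf) in boundary_labels:
            segments.append(buf)
            buf = []
    if buf:
        segments.append(buf)
    return segments

print(segment("the delegation arrived and the chair spoke .".split()))
```

Segmenting at predicted constituent boundaries rather than at fixed token counts is what lets the rule delay output just long enough to cover English–Japanese reordering within each segment.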
 
土肥 康輔 M, 2nd presentation, 知能コミュニケーション, 中村 哲, 渡辺 太郎, 宮尾 知幸, 須藤 克仁, Sakriani Sakti
title: Improving Grammatical Error Correction Models Using Pseudo Data of Specific Error Categories
abstract: Grammatical error correction (GEC) is the task of automatically correcting grammatical errors in a text. Due to the lack of labeled data, various methods of augmenting the data by incorporating pseudo training data have been proposed. While previous studies use such data in pretraining, we synthesized errors in data for fine-tuning, focusing on an error category where existing GEC models perform poorly. The results showed that incorporating pseudo data with appropriate ratios improved the model performance.
In addition, I will discuss an attempt to apply GEC models to spoken language, which is difficult because of disfluencies, differences in acceptable grammar between written and spoken language, and the lack of speech data.
language of the presentation: Japanese
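The pseudo-data synthesis described above can be sketched for one error category, here prepositions; the preposition list, corruption rate, and pairing convention are illustrative assumptions, not details from the abstract:

```python
import random

# Synthesize preposition errors (a single error category) in clean text to
# build fine-tuning pairs (noisy source, clean target) for a GEC model.
PREPS = ["in", "on", "at", "for", "to"]

def corrupt_prepositions(sentence, rate=0.5, rng=None):
    """Replace each preposition with a wrong one at the given rate."""
    rng = rng or random.Random(0)   # seeded for reproducibility
    out = []
    for tok in sentence.split():
        if tok in PREPS and rng.random() < rate:
            tok = rng.choice([p for p in PREPS if p != tok])
        out.append(tok)
    return " ".join(out)

clean = "she arrived at the station in the morning"
noisy = corrupt_prepositions(clean, rate=1.0)
pair = (noisy, clean)   # (pseudo source with errors, gold correction)
print(pair)
```

Varying `rate`, and mixing such pairs into the fine-tuning set at different proportions, is one way to realize the "appropriate ratios" of category-specific pseudo data that the abstract reports as helpful.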