MS Thesis or Report No. NAIST-IS-MT0851139: Torres Rodriguez Rafael Antonio

SVM and Pboost-based Inquiry Classification Methods Applied to Speech-Oriented Guidance Systems

Torres Rodriguez Rafael Antonio (0851139)

One of the most important and natural means of social interaction among humans is speech. Improvements in automatic speech recognition (ASR) technologies have made it feasible to implement systems that interact with users through speech in real environments. Speech-oriented guidance systems use such interaction to provide information in a specific environment. To be effective, the system must respond appropriately to users' inquiries. To address this, example-based response selection methods map the ASR result of an input inquiry to a response, using a question and answer database (QADB) that contains example questions paired with specific responses. An advantage of these methods is that the system's coverage can be easily expanded by adding QA pairs to the database. A conventional 1-Nearest Neighbor (1-NN) based method performs this mapping by calculating a similarity score between the input inquiry and the examples contained in the QADB, and selecting the response paired with the most similar example. Since this selection considers every response in the QADB, one approach is to reduce the number of responses evaluated in the selection process by grouping responses that share a common topic. An input inquiry is then classified into a response category, and the response is selected only from the candidates included in the assigned category. In this work we propose the use of discriminative methods to train a model from the QADB and perform multi-class classification of input inquiries into response categories. We compare the performance on this task of a Support Vector Machine (SVM) based method and a PrefixSpan Boosting (pboost) based method against a conventional 1-NN-based method. We tested the methods using real data obtained from the speech-oriented guidance system Takemaru-kun.
Experimental results on the classification of ASR 10-best results show that, compared to the conventional method, the strictly correct classification accuracy improves from 89.0% to 92.3% when using an SVM with an RBF kernel function and probability estimates for the 1-vs-rest multi-class classification decision function.
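As an illustration of the example-based selection described above, the following is a minimal Python sketch of 1-NN response selection over a QADB, with an optional restriction to one response category. It is not the implementation used in this work: the toy database entries, the response and category labels, the function names, and the choice of cosine similarity over bag-of-words counts are all assumptions made for the example, since the abstract does not specify the similarity score.

```python
import math
from collections import Counter

# Hypothetical toy QADB: (example question, response id, response category).
QADB = [
    ("where is the toilet", "R_TOILET", "facility"),
    ("where is the library", "R_LIBRARY", "facility"),
    ("what is your name", "R_NAME", "agent"),
    ("how old are you", "R_AGE", "agent"),
]


def bag_of_words(text):
    """Count whitespace-separated, lowercased words."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_response(asr_result, qadb, category=None):
    """Return the response paired with the most similar example question.

    If `category` is given, only examples in that response category are
    evaluated, which mirrors selecting a response within an assigned category.
    Returns None when no candidate example is available.
    """
    query = bag_of_words(asr_result)
    candidates = [e for e in qadb if category is None or e[2] == category]
    if not candidates:
        return None
    best = max(candidates, key=lambda e: cosine(query, bag_of_words(e[0])))
    return best[1]
```

Calling `select_response("where is the toilet please", QADB)` compares the inquiry against all four examples, while passing `category="facility"` evaluates only the two examples in that category. In the approach proposed here, the category would come from a trained classifier rather than being supplied by hand.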