Colloquium B Presentations

Date and time: Friday, June 9, 3rd period (13:30-15:00)


Venue: L3

Chair: PERUSQUÍA-HERNÁNDEZ Monica
松田 裕貴 D, Interim presentation, Ubiquitous Computing Systems, 安本 慶一, 荒牧 英治, 諏訪 博彦, 松田 裕貴
title: Proposal of Integrated Method for Television Viewing Data Collected by Broadcasters and Data Analysis towards Visualizing Television Commercial Effects
abstract: We propose and implement an ID integration method for television viewing data collected separately by private broadcasters in the Kinki region. Each broadcaster issues its own unique ID to the same television set, so the same set cannot be identified even when the broadcasters' viewing data are aggregated. We propose a method to integrate these IDs and validate it on actual data in a joint technical experiment conducted among four private broadcasters in the Kinki region. As a result, out of approximately 3.76 million television sets, we successfully estimated a common ID for approximately 2.67 million sets (approximately 71.0%). Furthermore, to investigate whether viewers searched the internet for products or other items after watching commercials, we developed a system that acquires Google search values over the long term. By analyzing these data together with the viewing data, we estimate the contribution of commercials to searches, aiming to visualize the effectiveness of television commercials. Our results show that, depending on the product genre, watching commercials can influence subsequent search behavior.
language of the presentation: Japanese
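Presentation title: Proposal of an Integration Method for Television Viewing Data Collected by Broadcasters and Data Analysis toward Visualizing Television Commercial Effects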
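Presentation summary: We propose and put into practice an ID integration method for the television viewing data that Osaka-based private broadcasters each collect. Each broadcaster issues its own ID to the same television set, so the same set cannot be identified even when the viewing data are aggregated. We propose a method for integrating these IDs, clarify the issues with the proposed method by verifying it on real data in a joint technical experiment conducted among four Osaka-based private broadcasters, and carry out the ID integration. As a result, of the data for approximately 3.76 million sets, we succeeded in estimating matching television IDs for approximately 2.67 million sets (approximately 71.0%). In addition, to investigate whether viewers searched the internet for products and the like after watching commercials, we built a system that acquires Google search values over the long term. By analyzing the contribution of commercials to searches from these data and the viewing data, we visualize the effect of television commercials and show that, depending on the product genre, commercial viewing may influence search behavior.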
 
NOHEJL ADAM D, Interim presentation, Natural Language Processing, 渡辺 太郎, 荒牧 英治, 進藤 裕之, 大内 啓樹
title: Complex Vocabulary Suggestions and Lexical Simplification
abstract: Lexical Simplification (LS) is the task of replacing complex words with simpler ones to improve reading comprehension, often targeting language learners. We propose a new task, Complex Vocabulary Suggestions, aimed at assisting learners with writing (production) and vocabulary acquisition, and discuss its similarities to LS. We also propose a new method for LS that is more suitable for morphologically rich languages (such as Japanese) than the current state of the art. We evaluate it on the recent TSAR-2022 ST datasets.
language of the presentation: English
 
VINCENT MICHAEL SUTANTO M, Second presentation, Natural Language Processing, 渡辺 太郎, 荒牧 英治, 進藤 裕之

title: Improving the Performance of Mathematical Equation Extraction: Leveraging Multimodalities and Alignment-based Reranking

abstract: Recognizing mathematical equations in scientific documents is known to be a challenging problem: unlike ordinary documents, scientific papers contain mathematical equations whose structure must be carefully observed. In this study, we present a two-stage approach to recognizing LaTeX markup from mathematical equations in scientific documents. First, a multimodal Transformer model captures long-term dependencies and relations between two inputs: the equation’s image and the PDF-extracted, box-embedded characters. Next, a lightweight alignment-based reranker assesses which candidate most closely resembles the distribution of the input data. The first-stage multimodal Transformer model achieved a BLEU score of 87.48 and an image match rate of 84.84%. The second-stage alignment-based reranker significantly improved these results to 90.99 BLEU and an image match rate of 85.35%.

language of the presentation: English