Colloquium A

日時(Date) 2023年11月17日 (火) / Tue. Nov. 17th, 2023
4限 (15:10--16:40) / 4th period (15:10--16:40)
場所(Location) エーアイ大講義室, AI Inc. Seminar Hall (L1)
司会(Chair) Lis Weiji Kanashiro Pereira
講演者(Presenter) Jason Naradowsky (University of Tokyo / Square-Enix AI & Arts Alchemy)
題目(Title) Rethinking Offensive Text Detection as a Multi-Hop Reasoning Problem
概要(Abstract) In an increasingly social online world, content moderation and the detection of toxic speech are important problems. Existing sentence classifiers typically perform well on the task (around 90% or higher), but do so under one critical assumption: that we can decide what is offensive without considering who is reading it. We argue for reframing the task as one that considers subjective experience, a switch from "is this offensive?" to "how could this be offensive?". To evaluate this, we construct a "challenge dataset" of statements that may not be overtly offensive but can have offensive interpretations. We show that traditional methods perform poorly on this dataset, necessitating a change in the modeling approach. We propose classifying offensiveness by scoring the chain of reasoning underlying an offensive interpretation of the statement (i.e., a multi-hop prediction). We show that, when given chains of reasoning, the multi-hop approach can perform the task with high accuracy. We will also discuss follow-up experiments with ChatGPT.
講演言語(Language) English
講演者紹介(Introduction of Lecturer)