日時(Date) | 2023年11月17日 (火) / Tue. Nov. 17th, 2023 4限 (15:10--16:40) / 4th period (15:10--16:40) |
---|---|
場所(Location) | エーアイ大講義室, AI Inc. Seminar Hall (L1) |
司会(Chair) | Lis Weiji Kanashiro Pereira |
講演者(Presenter) | Jason Naradowsky (University of Tokyo / Square-Enix AI & Arts Alchemy) |
題目(Title) | Rethinking Offensive Text Detection as a Multi-Hop Reasoning Problem |
概要(Abstract) | In an increasingly social online world, content moderation and the detection of toxic speech are important problems. Existing sentence classifiers typically perform well on the task (~90%+), but do so under one critical assumption: that we can consider what is offensive without considering who is reading it. We argue for rephrasing the task as one that considers subjective experience, a switch from "is this offensive?" to "how could this be offensive?". To evaluate this, we construct a "challenge dataset" of statements that may not be overtly offensive but can have offensive interpretations. We show that traditional methods perform poorly on this dataset, necessitating a change in the modeling approach. We propose classifying offensiveness by scoring the chain of reasoning underlying an offensive interpretation of the statement (i.e., a multi-hop prediction). We show that, when given chains of reasoning, the multi-hop approach can perform the task with high accuracy. We will also discuss follow-up experiments with ChatGPT. |
講演言語(Language) | English |
講演者紹介(Introduction of Lecturer) |