Colloquium B Presentations

Date and time: June 13 (Thu), 3rd period (13:30-15:00)

Venue: L1

Chair: 藤村友貴

伊藤　元太	D, Interim presentation	Social Computing	荒牧　英治,	佐藤　嘉伸,	若宮　翔子,	矢田　竣太郎
title: Improving Clinical Predictions and Analyses through Text Mining of Electronic Medical Records abstract: In recent years, it has become common practice to store medical information in electronic medical records (EMRs). EMRs contain structured data such as clinical test values, but they also often include crucial information recorded as free-text entries. In clinical research utilizing EMR data, the challenge of processing text data often results in insufficient use of the information contained within the text. This study aims to extract medically meaningful information from the text recorded in EMRs to enable the prediction of clinically significant events, patient classification, and the search for similar patients. By addressing the challenges associated with text data, this research seeks to improve the utility of EMR data, facilitating more accurate and comprehensive clinical predictions and analyses. language of the presentation: Japanese

PANGAN ZACHARY SAMSON	D, Interim presentation	Social Computing	荒牧　英治,	清川　清,	若宮　翔子,	矢田　竣太郎,	Peng Shaowen
title: Bridging the API Wall: Using LLM-Agent Based Network Simulations to Study Online User Behavior abstract: With social media platforms becoming increasingly stringent with their data access, confronting the "API wall" is significant for progressing social media research. This study introduces Large Language Model (LLM)-Agent Based Network Simulations to break that "API wall" by replicating the complex social media interactions and network dynamics. This research integrates computational social science, natural language processing, and network theory to create environments where virtual agents, powered by LLMs, exhibit behaviors based on evolving profiles. A key aspect of this research is the incorporation of Graph Neural Network (GNN)-based recommender systems that are also enhanced by LLMs. These systems dynamically direct content, personalizing agent experiences based on behavior patterns and network interactions, mirroring complex, real-world social media engagement. The simulations enable exploration of scenarios that are impossible, impractical, or unethical in real settings, such as the spread of misinformation or the impact of digital marketing, offering valuable insights into digital communication patterns and their implications. Academically, the research provides novel tools for studying digital sociology and psychology, while industrially, it informs strategies for healthier digital interactions. Overall, this study advances computational social science and aids in understanding the digital landscape's societal impacts. language of the presentation: English

KIDANU HAFTAMU KAHSAY	M, 1st presentation	Social Computing	荒牧　英治,	渡辺　太郎,	若宮　翔子,	矢田　竣太郎,	Peng Shaowen
title: Identifying Targets of Hate Speech in Amharic YouTube Comments abstract: The spread of hate speech on social media, particularly targeting ethnicity, religion, race, gender, or disability, is a growing concern, especially in Ethiopia, a populous and ethnically diverse country. While some research has developed methods for automated hate speech detection in Amharic (the national language of Ethiopia) using binary or multi-class datasets, there is a limited body of research on identifying the specific target of the hate speech. In this work, we propose a multi-layer annotation schema for automatic hate speech detection to identify the target of hate speech in YouTube comments written in Amharic. language of the presentation: English

大槻　優佳	M, 1st presentation	Social Computing	荒牧　英治,	渡辺　太郎,	若宮　翔子,	矢田　竣太郎
title: Towards Clinical Research DX: Developing an Automated System for Structuring Risk Factors of Stroke abstract: Natural language processing (NLP) technology has been widely used in clinical research, contributing to the digital transformation (DX) of clinical studies. This is because NLP can extract patient history and test results from the text stored in electronic health records, and the extracted data can be applied to clinical research such as prognosis prediction. Previous studies have combined various NLP models, including named entity extraction and sentence classification, to extract medical items from text. However, sustainable maintenance has been a challenge. In this study, we focused on structuring text related to stroke risk factors to develop a method that can extract a large number of items using a single NLP model. The proposed method utilizes a single NLP model trained through multi-task learning of T5, a type of large language model. This model can solve both the task of extracting test values such as blood pressure and the task of discriminating whether a person smokes or drinks alcohol. In our experiment, we compared the effects of different multi-task learning methods and evaluated the performance for each risk factor item. The extraction task achieved practical performance, with an F1 score above 0.8. However, performance on the discrimination task remains low. We are continuing to improve the method while operating a text-structuring GUI application that incorporates it. Our aim is to use it for clinical research DX.
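language of the presentation: Japanese
presentation title: An Attempt at Clinical Research DX: Development of an Automated Structuring System for Stroke Risk Factors presentation abstract: If patient histories and test results can be extracted from the text accumulated in electronic medical records (text structuring), they can be applied to clinical research such as prognosis prediction. Natural language processing (NLP) technology has therefore been actively applied and is contributing to the digital transformation (DX) of clinical research. Previous studies combined different NLP models, such as named entity recognition and sentence classification, according to the nature of the medical items to be extracted from the text, which made sustainable maintenance a challenge. Aiming to develop a method that can extract many items with a single NLP model, this study addresses the text structuring of stroke risk factors. The proposed method is a single NLP model obtained by multi-task training of T5, a type of large language model, on a per-item basis. It can solve both extraction tasks for test values such as blood pressure and discrimination tasks such as the presence or absence of smoking and drinking. In the experiments, we compared the effects of different multi-task learning methods and evaluated the performance for each risk factor item. The extraction tasks achieved practical performance with an F1 score of 0.8 or higher, but the low performance of the discrimination tasks remains an issue. Aiming at clinical research DX, we are continuing to improve the method while operating a text-structuring GUI application that incorporates it.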

Venue: L2

Chair: 江口僚太

VALERIE MEGAN	M, 1st presentation	Cyber Resilience	門林　雄基,	笠原　正治,	妙中　雄三
title: AI-Generated Persona Detection abstract: Current definitions of AI-generated personas overlook the significant threat posed by advanced social bots, which can convincingly manipulate individuals through their human-like characteristics. Contrary to popular belief, these sophisticated AI personas are already present in various forms, such as virtual influencers. This research aims to redefine AI-generated personas by highlighting their current prevalence and potential impact. We propose exploring a novel detection algorithm that evaluates personas holistically, incorporating behavioral patterns, interaction styles, and other contextual factors, rather than focusing solely on content analysis. This approach addresses the research gap created by outdated definitions and the emphasis on enhancing their human-like attributes. By developing this advanced detection method, we aim to mitigate the threats posed by these deceptive AI personas. language of the presentation: English

AURNA NAHID FERDOUS	D, Interim presentation	Cyber Resilience	門林　雄基,	笠原　正治,	林　優一,	妙中　雄三,	HOSSAIN, Md Delwar
title: Financial Cyber Security: Leveraging Advanced, Transparent, and Secure AI abstract: In the rapidly evolving field of financial cyber security, enhancing the robustness, transparency, and security of federated learning (FL) models is crucial. Existing challenges include significant feature discrepancies among federated clients, data privacy concerns, and a lack of transparency in model decisions. To address these issues, we aim to introduce three innovative approaches. First, FedFusion, an adaptive weighted fusion method, balances local and global model contributions to handle client feature discrepancies. Second, personalized training and conditional updates ensure robust global model performance while protecting data privacy. Third, we enhance model transparency through SHAP explainability, providing clear insights into model decisions. This research aims to mitigate client drift, improve detection capabilities, and advance financial cyber security using FL, ensuring robust performance in dynamic environments. Moreover, tackling adversaries in the FL architecture could be crucial, and the evolving nature of threats in the financial domain may pose challenges for future research. language of the presentation: English

TONGIAM PACHARAWAN	M, 2nd presentation	Cyber Resilience	門林　雄基,	林　優一,	妙中　雄三,	HOSSAIN, Md Delwar
title: Federated Learning and Explainable AI for Privacy-Preserving and Robust Android Malware Detection abstract: The proliferation of Android malware presents significant challenges, necessitating robust, scalable, and privacy-preserving detection mechanisms. Traditional machine learning methods struggle with the dynamic and sophisticated nature of such threats, particularly when dealing with heterogeneous data sources across distributed environments. This research explores the integration of Federated Learning (FL) and Explainable Artificial Intelligence (XAI) to address these challenges. Our methodology leverages a federated learning framework employing an ensemble of neural network models, namely Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), trained across three distinct datasets: Drebin, Kronodroid, and CCCS-CIC-AndMal-2020. We investigate the framework's effectiveness, scalability, and robustness in handling heterogeneous data while ensuring privacy through decentralized data processing. Additionally, we employ XAI techniques, specifically SHAP and LIME, to enhance the interpretability of our models, thereby providing insights into the feature importance and decision-making processes of the models. Our results indicate that the federated ensemble approach not only achieves high accuracy in malware detection but also maintains consistent performance across varied client configurations, illustrating the potential of FL and XAI in enhancing cybersecurity measures. language of the presentation: English

OYEBODE OLUWATOBI OYEWALE	M, 1st presentation	Cyber Resilience	門林　雄基,	林　優一,	妙中　雄三,	HOSSAIN, Md Delwar
title: Adapting a Cell Phone for Voice Biometric Verification abstract: The banking sector emphasizes secure identity verification for online transactions, traditionally relying on passwords and PINs, which are vulnerable to theft and loss. This study investigates the feasibility and effectiveness of adapting cell phones for voice biometric verification, aiming to enhance security and user experience in mobile banking, particularly for users in developing countries who often lack high-end smartphones with biometric capabilities. Our objectives include developing a machine learning algorithm (MLA) to identify voices over cell phones and designing backend software for voice verification using the developed MLA. Existing studies on voice biometric verification have primarily focused on direct interaction with advanced devices, overlooking the challenges faced by users in developing regions who rely on basic cell phones. Our methodology involves collecting a voice dataset through various cell phones, training the system to recognize and assign voice owners, and developing backend software for voice verification. A self-collected dataset from university students incorporating recordings from different devices and environments was used to minimize voice variation. Machine learning techniques, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), were employed for voice training, with PostgreSQL used for storing the trained model and dataset. The expected outcome is a robust AI system capable of securely storing and comparing voice recordings, which significantly enhances the security of mobile banking. This research contributes to providing an accessible and reliable biometric verification method, addressing the limitations of current high-end systems, and improving e-banking security for a broader user base. language of the presentation: English