伊藤 和浩 | D, Interim Presentation | Social Computing | 荒牧 英治 | 安本 慶一 | 若宮 翔子 | 矢田 竣太郎 |
title: Measuring a group state through language and its effect on the psychology of individuals
abstract: In an era of increasing social mobility, creating a sense of belonging has become a critical challenge in many settings. To foster a comfortable community environment, it is essential to clarify how group conditions affect individual psychological well-being. This study proposes methods that use natural language processing to observe the influence of group conditions on individual psychology from two perspectives. The first perspective examines the relationship between how widely a group's state is shared and individual well-being. Specifically, we hypothesized that the more a team's condition is shared within the group, the higher the well-being of its members, and tested this hypothesis using survey data collected over two months from company employees. The second perspective estimates collective identity from innovative language use. In particular, we hypothesized that groups that use words in unique and distinctive ways exhibit a stronger collective identity, and tested this hypothesis by treating YouTube channels as groups. Future research will combine the two perspectives, well-being and innovative language use, in further experiments.
language of the presentation: Japanese
Presentation title (Japanese): Measuring collective states through language and their influence on individual psychology
Presentation abstract (Japanese): As social mobility keeps rising, creating places where people feel they belong has become a challenge everywhere. For a comfortable community, it is useful to clarify how the state of a group affects the psychology of its individual members. The presenter proposes methods that use natural language processing to observe this influence from two perspectives. The first is the relationship between the sharing of the group's state and individual well-being: using questionnaire data collected from company employees over two months, we tested the hypothesis that the more a team's state is shared within the team, the higher each member's well-being. The second is the estimation of collective identity from the degree of innovative language use: treating each YouTube channel as a group, we tested the hypothesis that groups that use words with distinctive senses have a stronger collective identity. Experiments that combine well-being with innovative language use are planned as future work.
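The second perspective above rests on quantifying how distinctively a group uses words. One simple, purely illustrative way to do this (not necessarily the presenter's method) is to compare a word's co-occurrence profile in a group's texts with its profile in a reference corpus; the larger the divergence, averaged over words, the more "innovative" the group's usage. The tokenization, window size, and scoring below are assumptions.

```python
from collections import Counter
import math

def cooccurrence_profile(texts, target, window=2):
    """Count words appearing within `window` tokens of `target` (whitespace tokenization)."""
    profile = Counter()
    for text in texts:
        tokens = text.lower().split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), i + window + 1
                profile.update(t for t in tokens[lo:hi] if t != target)
    return profile

def usage_divergence(group_texts, reference_texts, target):
    """1 minus the cosine similarity between the target word's co-occurrence profiles
    in the group corpus and the reference corpus (higher = more distinctive usage)."""
    g = cooccurrence_profile(group_texts, target)
    r = cooccurrence_profile(reference_texts, target)
    vocab = set(g) | set(r)
    if not vocab:
        return 0.0
    dot = sum(g[w] * r[w] for w in vocab)
    norm = math.sqrt(sum(v * v for v in g.values())) * math.sqrt(sum(v * v for v in r.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm

# Toy example: a channel that uses "grind" in its own sense vs. general usage.
channel = ["the daily grind stream starts now", "join the grind and level up"]
reference = ["coffee shops grind fresh beans", "they grind the beans every morning"]
print(usage_divergence(channel, reference, "grind"))
```

Averaging such divergences over a channel's frequent words would give one candidate group-level score to correlate with collective identity.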
山田 理 | D, Interim Presentation | Ubiquitous Computing Systems | 安本 慶一 | 荒牧 英治 | 諏訪 博彦 | 松田 裕貴 |
title: PoI-Level Congestion Prediction Method Through the Generation of People Flow Data Based on Synthetic Population Data and Its Assimilation with Actual Data
abstract: In the post-COVID-19 era, as people resume more active behavior, congestion has emerged as a significant social issue and causes a range of problems. To mitigate congestion, it is crucial to predict crowding in advance, support individuals in planning their activities, and encourage behavioral change through optimized people flow. Achieving this requires wide-area, highly accurate predictions at the Point of Interest (PoI) level. This study proposes a novel method for forecasting future people flow with high accuracy over wide areas at the PoI level, combining a people-flow simulation that accounts for individuals' stay information at PoIs with regional and societal factors. To enable accurate PoI-level prediction under diverse conditions, the proposed method consists of three approaches: (1) generating stay information at each PoI from the time-series location data of smartphone users; (2) calculating transition probabilities between PoIs; and (3) assimilating the simulation results with observed data using synthetic population data that reflects the attributes of the target area's population. Prediction performance was evaluated by computing the cosine similarity between the transition probabilities produced by the model and those obtained from actual measurements, and the results show that the proposed method enables highly accurate prediction. This presentation reports these findings and discusses further development of methods for predicting future people flow accurately and over wide areas at the PoI level.
language of the presentation: Japanese
Presentation title (Japanese): A PoI-level congestion prediction method based on generating people-flow data from synthetic population data and assimilating it with measured data
Presentation abstract (Japanese): As people's activity picks up again after COVID-19, congestion is causing a variety of social problems. To relieve congestion, it is necessary to predict it in advance, support people's activity planning, and encourage behavior change through people-flow optimization. Such measures require wide-area, high-accuracy prediction at the PoI (Point of Interest) level. Aiming at wide-area, high-accuracy prediction of future people flow at the PoI level, this study proposes a new prediction method that combines people's stay information at PoIs with a people-flow simulation that can take regional circumstances and social conditions into account. To realize PoI-level prediction over wide areas and under diverse conditions, the proposed method consists of three approaches: (1) generating the number of people staying at each PoI from the time-series location data of smartphone users; (2) computing transition probabilities between PoIs; and (3) assimilating the simulation with measured data using synthetic population data that reflects the attributes of people in the target area. Prediction performance was examined by computing the cosine similarity between the transition probabilities produced by the constructed predictor and those obtained from measured data, and the results show that the proposed method can predict with high accuracy. This presentation reports these findings and discusses how to predict future people flow at the PoI level over even wider areas and with even higher accuracy.
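The evaluation above compares model-derived and measured PoI transition probabilities using cosine similarity. The following is a minimal sketch of that comparison, assuming (hypothetically) that stays are available as per-person sequences of PoI indices; the function names and data layout are illustrative, not the authors' implementation.

```python
import numpy as np

def transition_matrix(stay_sequences, n_poi):
    """Estimate row-normalized PoI-to-PoI transition probabilities
    from per-person sequences of visited PoI indices."""
    counts = np.zeros((n_poi, n_poi))
    for seq in stay_sequences:
        for src, dst in zip(seq[:-1], seq[1:]):
            counts[src, dst] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def mean_cosine_similarity(pred, obs):
    """Average cosine similarity between corresponding rows of the
    predicted and observed transition matrices."""
    sims = []
    for p_row, o_row in zip(pred, obs):
        denom = np.linalg.norm(p_row) * np.linalg.norm(o_row)
        if denom > 0:
            sims.append(float(p_row @ o_row / denom))
    return float(np.mean(sims)) if sims else 0.0

# Toy example with 3 PoIs: simulated vs. observed visit sequences.
simulated = [[0, 1, 2, 1], [0, 2, 1]]
observed = [[0, 1, 2, 2, 1], [0, 2, 1, 0]]
pred = transition_matrix(simulated, n_poi=3)
obs = transition_matrix(observed, n_poi=3)
print(mean_cosine_similarity(pred, obs))
```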
藤川 直也 | M, 2nd Presentation | Social Computing | 荒牧 英治 | 渡辺 太郎 | 若宮 翔子 | 矢田 竣太郎 |
title: Loneliness Episodes: A Japanese Dataset for Loneliness Detection and Analysis
abstract: Loneliness, a significant public health concern, is closely connected to both physical and mental well-being, so detecting and intervening for individuals experiencing loneliness is crucial. Identifying loneliness in text is straightforward when it is stated explicitly but challenging when it is implicit: explicit loneliness can be detected with keywords, whereas detecting implicit loneliness requires a manually annotated dataset. However, no freely available dataset with clear annotation guidelines for implicit loneliness exists. In this study, we construct a freely accessible Japanese loneliness dataset with annotation guidelines grounded in the psychological definition of loneliness; the dataset covers loneliness intensity and the factors contributing to loneliness. We train two models, one to classify whether loneliness is expressed and one to classify its intensity. The loneliness-versus-non-loneliness model achieves an F1-score of 0.833, while the intensity model achieves a low F1-score of 0.400, likely because of label imbalance and a shortage of examples for one of the labels. We also evaluate performance in another domain, X (formerly Twitter), observe a performance drop, and propose suggestions for domain adaptation.
language of the presentation: Japanese
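The binary and intensity classifiers above are evaluated with F1-scores (0.833 vs. 0.400). The sketch below, assuming predictions come from some fine-tuned Japanese text classifier (the model itself is not shown), illustrates how macro-averaged F1 exposes the effect of label imbalance on the intensity task; the label set and example arrays are hypothetical.

```python
from sklearn.metrics import f1_score, classification_report

# Hypothetical gold and predicted intensity labels (0 = mild, 1 = moderate, 2 = severe).
# The rare "severe" class dominates the macro-F1 penalty when it is mostly missed.
gold = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2]
pred = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]

# Macro-F1 weights every class equally, so a class with few (and misclassified)
# examples drags the score down even when overall accuracy looks reasonable.
print("macro F1:", f1_score(gold, pred, average="macro", zero_division=0))
print(classification_report(gold, pred, zero_division=0))
```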
HAN PEITAO | M, 2nd Presentation | Social Computing | 荒牧 英治 | 渡辺 太郎 | 若宮 翔子 | 矢田 竣太郎 | Peng Shaowen
title: Semantic Structure Augmented Language Model for Relation Extraction
abstract: Large language models (LLMs) exhibit strong in-context learning (ICL) abilities across various NLP tasks. Building on this, we propose an Abstract Meaning Representation (AMR)-enhanced, retrieval-based ICL method for relation extraction (RE). Our method retrieves tailored in-context examples based on the semantic-structure similarity between each task input and the training samples. Evaluated on three standard English RE datasets, it outperforms GPT-based baselines and achieves competitive results on all datasets.
language of the presentation: English
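The retrieval step above selects demonstrations whose semantic structure is closest to the input. A minimal sketch of one way to do this follows, assuming each sentence has already been parsed into AMR-style (source, relation, target) triples by an external parser; the triple-overlap similarity, example data, and prompt format are illustrative assumptions, not the method's exact design.

```python
def triple_overlap(triples_a, triples_b):
    """Dice-style overlap between two sets of AMR triples,
    used as a rough semantic-structure similarity."""
    a, b = set(triples_a), set(triples_b)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

def retrieve_demonstrations(input_triples, train_pool, k=2):
    """Return the k training examples whose AMR triples overlap most with the input."""
    ranked = sorted(train_pool,
                    key=lambda ex: triple_overlap(input_triples, ex["triples"]),
                    reverse=True)
    return ranked[:k]

# Hypothetical training pool: each example carries its sentence, gold relation, and AMR triples.
train_pool = [
    {"sentence": "Marie Curie was born in Warsaw.",
     "relation": "place_of_birth",
     "triples": [("bear-02", ":ARG1", "person"), ("bear-02", ":location", "city")]},
    {"sentence": "Apple acquired Beats in 2014.",
     "relation": "acquired_by",
     "triples": [("acquire-01", ":ARG0", "company"), ("acquire-01", ":ARG1", "company")]},
]

input_sentence = "Alan Turing was born in London."
input_triples = [("bear-02", ":ARG1", "person"), ("bear-02", ":location", "city")]

demos = retrieve_demonstrations(input_triples, train_pool, k=1)
prompt = "".join(f"Sentence: {d['sentence']}\nRelation: {d['relation']}\n\n" for d in demos)
prompt += f"Sentence: {input_sentence}\nRelation:"
print(prompt)  # this prompt would then be sent to an LLM for ICL relation extraction
```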
LI KAIFAN | M, 2nd Presentation | Social Computing | 荒牧 英治 | 渡辺 太郎 | 若宮 翔子 | 矢田 竣太郎 | Peng Shaowen
title: Recurrent Memory Transformer for Incremental Summarisation of Extremely Long Text
abstract: Popular transformer-based language models are often constrained by a limited input length, resulting in subpar performance on long-text summarisation. Prior work has attempted to alter the model's internal architecture to accommodate longer inputs, but even when a model supports longer input texts, the limited memory of university-laboratory machines and edge devices prevents us from exploiting that length. We therefore explore long-text summarisation based on the Recurrent Memory Transformer (RMT), which provides an external memory for transformer-based models without modifying their internal structure, and propose RMT-Summ. To demonstrate the validity of RMT-Summ, we introduce an incremental summarisation task and build a dedicated dataset from PubMed medical articles containing structured abstracts. Our experimental results show that an RMT-Summ-powered BART model outperforms the original BART baseline by 1.24 points in ROUGE-1.
language of the presentation: English
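RMT attaches a small set of memory embeddings to each input segment and carries them across segments, so the base model itself is unchanged. The sketch below shows this memory-passing loop with Hugging Face BART; it is a simplified illustration of the RMT idea under assumed segment and memory sizes, not the RMT-Summ implementation.

```python
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
embed = model.get_input_embeddings()

MEM_LEN = 8  # assumed number of memory slots (zero-initialized here; learned in practice)
memory = torch.zeros(1, MEM_LEN, model.config.d_model)

# A long document split into segments that each fit the encoder (toy example).
segments = ["First part of a very long article ...",
            "Second part continuing the article ..."]
target = tok("A short reference summary.", return_tensors="pt").input_ids

for seg in segments:
    ids = tok(seg, return_tensors="pt", truncation=True,
              max_length=model.config.max_position_embeddings - MEM_LEN).input_ids
    seg_embeds = embed(ids)                               # (1, L, d_model)
    enc_embeds = torch.cat([memory, seg_embeds], dim=1)   # prepend memory slots
    attn = torch.ones(enc_embeds.shape[:2], dtype=torch.long)
    enc_out = model.get_encoder()(inputs_embeds=enc_embeds, attention_mask=attn)
    # Read the updated memory back from the encoder states and carry it forward.
    memory = enc_out.last_hidden_state[:, :MEM_LEN, :].detach()

# One teacher-forced step on the last segment plus accumulated memory;
# incremental training or generation over all segments follows the same pattern.
out = model(encoder_outputs=enc_out, attention_mask=attn, labels=target)
print(float(out.loss))
```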
西山 智弘 | D, Interim Presentation | Social Computing | 荒牧 英治 | 渡辺 太郎 | 若宮 翔子 | 矢田 竣太郎 |
title: Extraction of healthcare information from medical texts and social media texts
abstract: In this study, we report methods for extracting healthcare-related information from medical texts and social media texts using natural language processing. Specifically, we aimed to automatically extract information from both in-hospital documents and social media posts. For the in-hospital documents, we used medical records such as physicians' progress notes, pharmacists' progress notes, nursing records, RI reports, and radiology reports to extract data on the occurrence of peripheral neuropathy caused by taxane-based anticancer drugs, and we investigated the usefulness of using multiple document types simultaneously. For social media, we classified posts from X (formerly Twitter) describing inappropriate drug use to explore the potential of this approach.
language of the presentation: Japanese
Presentation title (Japanese): Extraction of healthcare information from medical texts and social media texts
Presentation abstract (Japanese): This study reports methods that use natural language processing to extract information from medical texts. Specifically, we attempted to automatically extract healthcare-related information from in-hospital documents and social media texts. As in-hospital documents, we used physicians' charts, pharmacists' records, nursing records, RI reports, and radiology reports to extract data on the onset of peripheral neuropathy caused by taxane-based anticancer drugs, and examined the usefulness of using these multiple document types simultaneously. As social media, we classified posts from X (formerly Twitter) concerning inappropriate drug use and investigated the feasibility of this approach.
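The study above pools several record types when looking for taxane-induced peripheral neuropathy. As a minimal, purely illustrative sketch (the cue list, document types, and rule-based screening are assumptions, not the study's actual extraction method), one could group a patient's documents by type and flag passages mentioning candidate symptom cues before any finer-grained extraction.

```python
import re

# Hypothetical symptom cues for taxane-induced peripheral neuropathy (illustrative only).
CUES = [r"しびれ", r"末梢神経障害", r"感覚(が|の)?鈍", r"numbness", r"peripheral neuropathy"]
PATTERN = re.compile("|".join(CUES))

def screen_patient_documents(docs_by_type):
    """Return, per document type, the passages that mention a neuropathy cue.
    `docs_by_type` maps a record type (progress note, nursing record, ...) to its texts."""
    hits = {}
    for doc_type, texts in docs_by_type.items():
        matched = [t for t in texts if PATTERN.search(t)]
        if matched:
            hits[doc_type] = matched
    return hits

# Toy patient record spanning several document types.
patient = {
    "physician_note": ["パクリタキセル投与後、手足のしびれを訴える。"],
    "nursing_record": ["夜間の睡眠は良好。"],
    "pharmacist_note": ["末梢神経障害の増悪なしと評価。"],
}
print(screen_patient_documents(patient))
```

Comparing which document types yield hits for the same patient is one simple way to probe the value of combining multiple record types.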