池田 遼太 | M, 2回目発表 | 大規模システム管理 | 笠原 正治, | 松本 健一, | 原 崇徳, | 中畑 裕 |
title: Collusion Resistance Analysis of DAO Voting Mechanisms Based on Bribery Cost Evaluation for Multiple Voting Rounds
abstract: A Decentralized Autonomous Organization (DAO) is a novel organizational structure that operates on a blockchain and, unlike traditional organizations, conducts decision-making through voting by all participants. However, concerns have been raised that decentralization is lost when large token holders dominate the voting process. Existing research has compared the collusion resistance of three voting mechanisms: Linear Voting, Quadratic Voting, and veToken. However, that research relies on unrealistic assumptions, such as restricting DAO voting to a single round and excluding large token holders from participation, and therefore fails to accurately reflect the real-world environment surrounding DAOs. To address these limitations, this research extends the existing models to scenarios where large token holders actively participate in voting and where voting occurs over multiple rounds. Using bribery cost as the key metric, it evaluates the collusion resistance of each voting mechanism. language of the presentation: Japanese 発表題目: 複数回投票を想定した賄賂コスト評価によるDAO投票メカニズムの談合耐性分析 発表概要: DAO (Decentralized Autonomous Organization) とはブロックチェーン上で運営される新しい組織形態を指し,従来型の組織と異なり参加者全員の投票により意思決定が行われるとして注目を集めている.しかし,大規模トークン保有者が投票を支配することで分散性が損なわれるという課題が指摘されている.既存研究では,Linear Voting,Quadratic Voting,veTokenの3つの投票メカニズムについて,談合耐性を比較している.しかし,DAOの投票が1回に限定されていることや,大規模トークン保有者は投票には不参加という非現実的な仮定が含まれており,実際のDAOを取り巻く環境を正確に反映しているとは言えない.本研究では,この課題を解決するため,大規模トークン保有者が投票に参加することや投票が複数回行われることを想定したモデルに拡張し,賄賂コストを指標として各投票メカニズムにおける談合耐性を評価する. | ||||||
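The role of bribery cost as a collusion-resistance metric can be illustrated with a minimal sketch. This is my own illustration under textbook assumptions, not the presented model: under Linear Voting one token buys one vote, while under Quadratic Voting casting v votes through a single voter costs v**2 tokens, so a briber can lower the total cost by splitting the votes across colluding voters, which is exactly the collusion such an analysis measures.

```python
# Minimal sketch (illustrative, not the presented model): token cost for an
# attacker to amass `target_votes` under two voting mechanisms.
# Linear Voting: 1 token = 1 vote, so bribery cost grows linearly.
# Quadratic Voting: casting v votes through one voter costs v**2 tokens,
# so paying n colluding voters to cast v votes each costs n * v**2.

def linear_cost(target_votes: int) -> int:
    """One token per vote: cost equals the number of votes bought."""
    return target_votes

def quadratic_cost(target_votes: int, n_bribed_voters: int) -> float:
    """Votes are split evenly; each bribed voter pays (votes cast)**2 tokens."""
    votes_each = target_votes / n_bribed_voters
    return n_bribed_voters * votes_each ** 2

# Buying 100 votes outright vs. spreading the bribe over 10 colluders:
print(linear_cost(100))          # 100 tokens
print(quadratic_cost(100, 1))    # 10000.0 tokens via a single voter
print(quadratic_cost(100, 10))   # 1000.0 tokens via 10 colluders
```

The sketch shows why quadratic voting penalizes concentrated power yet remains vulnerable to collusion: splitting the bribe across more voters drives the cost back down, which motivates comparing mechanisms by the cost of exactly such attacks.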
山田 純也 | M, 2回目発表 | 大規模システム管理 | 笠原 正治, | 松本 健一, | 原 崇徳, | 中畑 裕 |
title: Dynamic Optimization of the Number of Stake Pools in Proof-of-Stake Blockchains
abstract: Proof-of-Stake (PoS) blockchains achieve consensus through stake delegation, enabling energy-efficient and scalable decentralized systems. In representative democracy-based PoS mechanisms, such as that employed by Cardano, users delegate their tokens to stake pools operated by third parties, and rewards are distributed in accordance with the protocol’s reward-sharing scheme. One of the key parameters in this system is the ideal number of stake pools (k), which significantly influences network decentralization, the sustainability of node operation, and system stability. However, in current implementations, k is statically defined, making it difficult to adapt to changing network conditions. This study proposes a dynamic model to estimate the optimal value of k by evaluating the trade-offs among multiple factors, including the utility of decentralization, operational costs, and the effects of changes in pool reward caps on delegation incentives. The utility of decentralization is modeled using a logarithmic function, assuming diminishing marginal utility. Operational costs are assumed to scale proportionally with k. The model focuses on changes in delegation behavior driven by the reward saturation mechanism as k fluctuates. The proposed model aims to dynamically adjust k in each epoch to enhance fairness within the network. language of the presentation: Japanese | ||||||
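The stated trade-off admits a compact illustration. Assuming, as a sketch only, a logarithmic decentralization utility a*log(k) (diminishing marginal utility) and a linear operational cost c*k, with the parameter names a and c mine rather than the model's, the net objective peaks at k* = a/c:

```python
import math

# Minimal sketch (illustrative parameters, not the proposed model):
# decentralization utility is logarithmic in the number of pools k,
# operational cost grows linearly in k, and the net objective
# a*log(k) - c*k is maximized at the continuous optimum k* = a/c.

def net_utility(k: int, a: float, c: float) -> float:
    """Decentralization benefit minus aggregate pool operation cost."""
    return a * math.log(k) - c * k

def optimal_k(a: float, c: float, k_max: int = 5000) -> int:
    """Integer search for the k in [1, k_max] maximizing net utility."""
    return max(range(1, k_max + 1), key=lambda k: net_utility(k, a, c))

# With a = 500 utility units per log-pool and per-pool cost c = 1,
# the integer search recovers the continuous optimum a/c = 500:
print(optimal_k(a=500.0, c=1.0))  # 500
```

A dynamic scheme in this spirit would re-estimate a and c from observed network conditions each epoch and adjust k accordingly, rather than keeping k as a statically defined protocol parameter.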
HUANG BIN | M, 2回目発表 | 計算システムズ生物学 | 金谷 重彦, | 松本 健一, | 小野 直亮, | MD.Altaf-Ul-Amin |
title: Interpretable EEG Seizure Detection with a Learnable Gabor-CNN-LSTM Architecture
abstract: This study proposes an interpretable neural network architecture for detecting epileptic events from electroencephalogram (EEG) signals. A learnable multi-scale Gabor convolution layer is introduced as the first stage of the model to perform frequency-domain modeling. The Gabor kernels are optimized throughout training so that they come to approximate physiologically meaningful seizure-related waveforms, such as spikes and sharp waves. This design not only enhances the model’s ability to extract relevant features but also imposes biologically inspired constraints that improve interpretability. The extracted features are subsequently processed by a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network to capture both spatial and temporal dependencies. Furthermore, SHapley Additive exPlanations (SHAP) analysis is employed to visualize the contribution of each input feature, offering greater transparency into the model’s decision-making process. The proposed model is evaluated on collected EEG datasets, demonstrating strong classification performance and interpretability well suited to supporting clinical EEG research. language of the presentation: English | ||||||
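As a rough sketch of what such a layer's filters look like (my own illustration, not the authors' implementation), each Gabor kernel is a Gaussian envelope multiplied by a cosine carrier, and in a learnable Gabor layer the centre frequency and bandwidth per filter are the trainable parameters, so every kernel remains a well-formed band-pass waveform throughout training:

```python
import math

# Minimal sketch (illustrative only): a sampled 1-D Gabor kernel
# parameterized by centre frequency f (Hz) and bandwidth sigma (s).
# Constraining filters to this family keeps them interpretable as
# band-limited waveforms, e.g. spike- or sharp-wave-like shapes.

def gabor_kernel(f: float, sigma: float, length: int, fs: float) -> list:
    """Gaussian envelope times a cosine carrier, sampled at rate fs."""
    centre = (length - 1) / 2.0
    kernel = []
    for n in range(length):
        t = (n - centre) / fs          # time offset in seconds
        envelope = math.exp(-t * t / (2.0 * sigma * sigma))
        kernel.append(envelope * math.cos(2.0 * math.pi * f * t))
    return kernel

# A 3 Hz kernel at 256 Hz sampling, near spike-and-wave discharge rhythms:
k = gabor_kernel(f=3.0, sigma=0.1, length=129, fs=256.0)
print(len(k), max(k))  # 129 samples, unit peak at the kernel centre
```

In a deep-learning framework the same expression would be written over tensors with f and sigma as trainable scalars per filter; "multi-scale" then amounts to initializing the filter bank at several frequencies and bandwidths.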
中村 伊吹 | M, 2回目発表 | ソフトウェア設計学 | 飯田 元, | 松本 健一, | 柏 祐太郎, | Reid Brittany |
title: Towards an Empirical Investigation into Self-Admitted Technical Debt in Test Code
abstract: In software development, the additional effort incurred by choosing an incomplete but convenient implementation instead of an ideal implementation that requires more time is referred to as technical debt. Specifically, technical debt that is intentionally introduced by developers is known as Self-Admitted Technical Debt (SATD), and it is widely recognized that many developers incorporate it. In recent years, numerous studies have investigated the impact of SATD on software quality and other aspects. However, most of these studies focus on production code, treating SATD in test code as either outside the scope of research or as having similar characteristics to SATD in production code. Nevertheless, a significant amount of SATD exists in test code, and some types cannot be categorized under existing classifications. This study aims to conduct an empirical investigation to uncover the characteristics of SATD in test code. language of the presentation: Japanese 発表題目: テストコードに存在するSelf-Admitted Technical Debtの実証的調査に向けて 発表概要: ソフトウェア開発において,時間を要する理想的な実装を選択する代わりに,不完全であるが簡便な実装を選択することで生じる追加工数を技術的負債と呼ぶ.特に,開発者が意図的に導入する技術的負債はSelf-Admitted Technical Debt (SATD)と呼ばれ,多くの開発者が導入していることが知られている.近年では,多くの研究がSATDがソフトウェア品質などに与える影響を調査している.しかし,ほとんどの研究はプロダクションコードに焦点が当てられており,テストコードに存在するSATDは研究対象外あるいは類似した性質のSATDであると扱っている.しかしながら,実際にはテストコード内にも多くのSATDが存在しており,既存のカテゴリに分類できないようなSATDが存在している.そこで本研究では,テストコードに存在するSATDの実態を明らかにするために実証的な調査を行う. | ||||||
大羽 未悠 | D, 中間発表 | 自然言語処理学 | 渡辺 太郎, | 荒牧 英治, | 大内 啓樹 |
title: Can Language Models Induce Grammatical Knowledge from Indirect Evidence?
abstract: What kinds and amounts of data are necessary for language models to induce the grammatical knowledge needed to judge sentence acceptability? Recent language models still have much room for improvement in data efficiency compared to humans. This work investigates whether language models make efficient use of indirect data (indirect evidence) when inferring sentence acceptability. Humans, in contrast, are believed to use indirect evidence efficiently, and this ability is considered one of the inductive biases contributing to efficient language acquisition. To explore this question, we introduce the Wug InDirect Evidence Test (WIDET), a dataset consisting of training instances to be inserted into pre-training data and corresponding evaluation instances. We inject synthetic instances containing newly coined wug words into the pretraining data and examine the model’s behavior on evaluation data that assesses grammatical acceptability with respect to those words. The injected instances vary in their level of indirectness and in their quantity. Our experiments surprisingly show that, for certain linguistic phenomena, language models fail to induce grammatical knowledge even after repeated exposure to instances that share the structure of the evaluation instances and differ only in their lexical items. Our findings suggest a potential direction for future research: developing models that use latent indirect evidence to induce grammatical knowledge. language of the presentation: Japanese 発表題目: 言語モデルの間接証拠からの文法知識の獲得 発表概要: 言語モデルが文の容認性を正しく判断するには,どのような特徴を持つどれほどの量のデータを訓練する必要があるだろうか. 本研究では,昨今の言語モデルが人間と比べてデータ効率の面で改善の余地を残している点に着目し,言語モデルが間接的なデータ(間接証拠)から文の容認性を判断できるかを調査する. 人間は言語獲得において,間接証拠を効果的に活用していると考えられており,効率的な学習を可能にする帰納バイアスの一つと考えられている. 本研究では,Wug InDirect Evidence Test (WIDET) というデータセットを作成した. WIDETは,事前訓練データに挿入する訓練事例と,それに対応する評価事例から構成される. 未知の擬似語 (wug) を含む人工的な訓練事例を事前訓練データに挿入し,それらの語に関するモデルの文法知識を評価事例を用いて分析した. 訓練事例は,間接性の程度および頻度が様々な文から構成される. 実験の結果,ある言語現象において,訓練事例と評価事例が語彙のみが異なり構造は一致していたにもかかわらず,モデルは繰り返しの提示を経ても文法知識を獲得しないことが明らかとなった. 
この結果は,言語モデルが間接証拠を活用して文法知識を獲得する能力に潜在的な課題があることを示唆しており,より人間のように間接証拠を利用可能なモデルの設計が,データ効率に重要な方向性となる可能性を示している. | |||||
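The injection procedure described above can be sketched as follows. The templates and helper here are hypothetical, for illustration only, not the actual WIDET data: a nonce word is placed into carrier sentences whose directness varies in how plainly they reveal the property probed at evaluation, and a chosen number of copies is scattered through the pretraining corpus:

```python
import random

# Minimal sketch (hypothetical templates, not the WIDET dataset): synthetic
# instances coin a nonce word ("wug") and embed it in carrier sentences.
# "Indirectness" varies in how directly the carrier exposes the property
# probed at evaluation (here, that the noun pluralizes in -s); "quantity"
# is the number of copies injected into the pretraining data.

DIRECT = "the two wugs are on the table ."    # plural form shown directly
INDIRECT = "every wug likes another wug ."    # number never shown

def inject(pretraining: list, instance: str, n_copies: int, seed: int = 0) -> list:
    """Scatter n_copies of the instance at random positions in the corpus."""
    rng = random.Random(seed)
    corpus = list(pretraining)
    for _ in range(n_copies):
        corpus.insert(rng.randrange(len(corpus) + 1), instance)
    return corpus

corpus = inject(["the dog sleeps ."] * 5, DIRECT, n_copies=3)
print(len(corpus), corpus.count(DIRECT))  # 8 3
```

Evaluation then presents minimal pairs over the same nonce word (e.g. an acceptable plural against an unacceptable one) and checks whether the model assigns the acceptable variant a higher probability after pretraining on the augmented corpus.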
EUNIKE ANDRIANI KARDINATA | D, 中間発表 | 自然言語処理学 | 渡辺 太郎, | 荒牧 英治, | 大内 啓樹 |
title: Towards a Universal Disambiguation Framework
abstract: At different levels of Natural Language Processing (NLP), ambiguity occurs when a text admits multiple readings for various reasons. Resolving ambiguity is important because it advances complex natural language understanding, preventing misinterpretation and ensuring the reliability of NLP systems. Moreover, it also supports multilingual communication as people become increasingly interconnected. While the disambiguation task is not new, there are still limited resources for evaluating how different methods perform across various types of ambiguity and across languages. Even for English, with its abundant resources, disambiguation has yet to reach an optimal level. Coupled with the fact that it is almost impossible to isolate one language from its various influencing factors, the task becomes even more challenging. To address these problems, we aim to design a universal disambiguation framework with guided steps for resolving as many types of ambiguity as possible in different languages. Using this framework, we also plan to construct a benchmark for the task by applying the framework to authentically obtained multilingual data and aligning it with human preference. language of the presentation: English | |||||
VASSELLI JUSTIN RAY | D, 中間発表 | 自然言語処理学 | 渡辺 太郎, | 荒牧 英治, | 上垣外 英剛 |
title: Building Better AI Language Tutors
abstract: Large language models (LLMs) show promise as conversational partners for language practice, but their effectiveness as language teachers remains underexplored. In STEM fields, this question has seen growing attention, with the release of tutoring dialogue datasets and evaluation benchmarks focused on pedagogical quality. My research aims to build intelligent language tutoring systems using LLMs by focusing on two key areas: generating high-quality tutor–learner dialogues and evaluating their pedagogical effectiveness. I introduce a pipeline called Dialogue Act Scripts (DAS), which uses structured semantic representations of dialogues to create synthetic multilingual data. I will outline how DAS can be used to generate training and evaluation data to support the development of more effective AI language tutors. language of the presentation: English | |||||