Colloquium B Presentations

Date and time: Tuesday, September 10, 2nd period (11:00-12:30)

Venue: L1

Chair: Pham Hoai Luan

RUMMAN MAHFUJUL ISLAM	D, interim presentation	Computational Systems Biology	金谷　重彦	安本　慶一	小野　直亮	MD.Altaf-Ul-Amin
title: A latent diffusion-based generative model for analyzing histopathological images
abstract: Our research aims to build a conditional latent diffusion model. To this end, we pre-trained a Vector-Quantized Variational AutoEncoder (VQ-VAE) and used it to extract features from histopathological images. We then applied a Denoising Diffusion Probabilistic Model (DDPM) to the latent space produced by the encoder of the pre-trained VQ-VAE. Next, we plan to apply Information Maximization (IM), a deep learning-based clustering technique, to those latent vectors and use the clustering output as a conditioning input to the diffusion model. The conditioning input acts as guidance and determines the types of images the model generates. In future work, we will perform clustering with different numbers of clusters and determine the optimal number using cluster validation techniques, so that the generated synthetic histopathological images resemble our original dataset.
language of the presentation: English

橋本　沙知	M, 2nd presentation	Computational Systems Biology	金谷　重彦	安本　慶一	小野　直亮	MD.Altaf-Ul-Amin
title: Machine learning model based on MassBank to predict electron ionization mass spectra and build a library
abstract: Metabolomics analysis has become increasingly important in the life sciences in recent years, and methods for identifying compounds from mass spectrometry data, such as LC-MS, are widely used. However, interpreting mass spectrometry data requires specialized knowledge, so identification is often performed by referencing databases. The number of mass spectral databases is limited compared to the diversity of compounds, so general-purpose compound identification remains challenging. In this study, we developed a model that generates mass spectrum peaks from molecular graphs using databases such as MassBank. With this model, we aim to build a general-purpose predictive library of mass spectra to support molecular identification.
language of the presentation: Japanese

内田　翔一朗	M, 2nd presentation	Computational Systems Biology	金谷　重彦	西條　雄介	小野　直亮	MD.Altaf-Ul-Amin
title: Diversity of soil microorganisms in fruit-growing fields, omics analysis of crop metabolites, and soil feedback analysis
abstract: In recent years, fruit tree cultivation has increased, especially in tropical rainforests, and demand for research on fruit tree plantations has grown accordingly. In this study, we focus on how soil microbial diversity changes over time under long-term fruit tree cultivation. Specifically, we will conduct diversity and omics analyses using data from two soils that have been under fruit tree cultivation for 12 and 25 years, respectively, to determine how soil microbial diversity changes over time.
language of the presentation: Japanese

Venue: L2

Chair: 矢田竣太郎

工藤　拓斗	M, 2nd presentation	Software Engineering	松本　健一	飯田　元	Raula Gaikovina Kula	嶋利　一真
title: Evaluation of the Usability of Verified Hint Generation Function Using Large Language Models in Programming Exercises
abstract: In recent years, large language models have been widely studied as a way to support novices learning to program. When large language models are applied to programming education, using the model output as is can reveal the answer to the exercise, so that students cannot learn by themselves. To address this, a method has been proposed that generates and presents high-quality hints without directly presenting the answer. In this existing method, hints generated with the GPT-4 model are validated with the GPT-3.5 model before being presented. Our study extends this verified hint generation method and introduces it into a programming exercise class at the graduate university to which the authors belong, in order to measure its effectiveness. The results of introducing the hint generation function show that the proposed method is particularly useful for beginners with limited programming experience.
language of the presentation: Japanese

増井　太一	M, 2nd presentation	Software Engineering	松本　健一	飯田　元	Raula Gaikovina Kula	嶋利　一真
title: Analysis of Program Edits in Response to Errors in Python Programming Exercises
abstract: In programming exercises for novices, participants often encounter errors. Participants with limited programming experience may be unable to make appropriate corrections when they do. Existing research has analyzed the errors that novices commonly encounter and has developed support methods for error correction. However, the specific process by which novices respond to errors has not been analyzed in sufficient detail. This study therefore aims to clarify how programming novices attempt to resolve the errors they encounter. Specifically, using data from programming exercises conducted at NAIST, we analyze the edit differences between the code at the time an error is encountered and the code at the next execution. Based on these results, we show for which types of errors programming novices make edits that do not lead to error resolution.
language of the presentation: Japanese

山田　裕彌	M, 2nd presentation	Cyber Resilience	門林　雄基	飯田　元	妙中　雄三
title: Proposal of a method for deobfuscating API hashing based on static analysis utilizing memory access information
abstract: In malware analysis, static analysis (code analysis) is indispensable for examining the structure and functionality of malware in detail. However, static analysis is easily hindered by obfuscation, which often makes obfuscated malware difficult to analyze. In particular, malware that employs API hashing conceals the names of the Windows APIs it uses by hashing them, which complicates static analysis that starts from API names. A widely known countermeasure deobfuscates the malware by searching a database of API names and their corresponding hash values, but this approach cannot handle cases such as unknown hashing algorithms. To address this issue, this study proposes a method that uses memory access information from the malware to statically identify the embedded hash functions and automatically recover the hashed API names. The proposed method enables deobfuscation of malware that embeds unknown hashing algorithms.
language of the presentation: Japanese