Multifaceted analysis of sequence similarity-based network of allergens

Yasue Aoyama (1651001)


Allergy is one of the most common diseases in developed countries. Classifying allergens helps in understanding the immune system, and predicting allergens helps in food production. Allergens are conventionally classified into groups of evolutionarily related proteins. However, this conventional classification is insufficient for predicting allergen cross-reactivity. In addition, allergen prediction by machine learning becomes more time-consuming as the prediction method becomes more complex. To address these problems, we constructed allergen networks based on local sequence similarities. Using the DPClus algorithm, we classified the allergens into small clusters whose members are densely connected. This research shows that allergen prediction may be improved by (1) developing a more detailed classification method and (2) constructing a small allergen model set that represents allergen properties more precisely. Our clustering results show that most clusters contain either plant proteins only or animal proteins only. Each cluster can be characterized by a single protein function or domain. Based on these insights, we propose a new allergen prediction model built from the clusters.
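
As a rough illustration of the network-and-clustering idea described above, the following minimal Python sketch links proteins whose pairwise local similarity score meets a cutoff and then greedily grows densely connected clusters. It is not the DPClus algorithm itself: it is a simplified density-threshold growth swapped in for illustration only. The protein names, score values, the cutoff of 0.5, the density threshold of 0.7, and the function names build_network, density, and greedy_dense_clusters are all assumptions, not taken from this work.

```python
from itertools import combinations


def build_network(scores, threshold):
    """Build an adjacency dict; two proteins are linked if their score meets the threshold."""
    adj = {}
    for (a, b), s in scores.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if s >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def density(nodes, adj):
    """Fraction of possible node pairs in `nodes` that are actually connected."""
    n = len(nodes)
    if n < 2:
        return 1.0
    edges = sum(1 for u, v in combinations(sorted(nodes), 2) if v in adj[u])
    return edges / (n * (n - 1) / 2)


def greedy_dense_clusters(adj, min_density=0.7):
    """Simplified greedy clustering (NOT DPClus): grow each cluster from a
    high-degree seed while its density stays at or above min_density."""
    remaining = set(adj)
    clusters = []
    while remaining:
        # Seed with the highest-degree remaining node (ties broken by name).
        seed = max(sorted(remaining), key=lambda n: len(adj[n] & remaining))
        cluster = {seed}
        while True:
            neighbours = {n for m in cluster for n in adj[m]}
            candidates = (neighbours & remaining) - cluster
            scored = [(density(cluster | {c}, adj), c) for c in candidates]
            scored = [(d, c) for d, c in scored if d >= min_density]
            if not scored:
                break
            # Add the candidate that keeps the cluster densest (ties by name).
            scored.sort(key=lambda t: (-t[0], t[1]))
            cluster.add(scored[0][1])
        clusters.append(cluster)
        remaining -= cluster
    return clusters


# Hypothetical pairwise local similarity scores between five proteins.
scores = {
    ("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.85,
    ("C", "D"): 0.6, ("D", "E"): 0.9, ("A", "D"): 0.1,
}
adj = build_network(scores, threshold=0.5)
clusters = greedy_dense_clusters(adj, min_density=0.7)
print([sorted(c) for c in clusters])  # [['A', 'B', 'C'], ['D', 'E']]
```

In this toy example the weak C-D link joins the two groups in the network, but adding D to the A-B-C triangle would lower its density below the threshold, so the groups remain separate clusters.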