Popular traditional medicines from Indonesia are known as Jamu. A Jamu formula is composed of a single plant or a mixture of several plants. Jamu formulas have generally been developed from the empirical experience of users over decades or even hundreds of years. Therefore, the systematization of Jamu formulas is needed to meet the requirements of the Indonesian healthcare system.
This study aims to explore and identify interesting patterns in the formulation of Indonesian Jamu medicines using data-intensive science and machine learning approaches. First, we proposed a new method to predict plant-disease relations using network analysis and supervised clustering. Using the matching score of each cluster generated by the network clustering algorithm DPClusO, the dominant disease and the high-frequency plants associated with that cluster were determined. The plant-disease relations predicted by our method were evaluated against previously published results, and the method was found to produce very good predictions. Furthermore, we assessed the capability of binary similarity and dissimilarity equations to classify Jamu pairs as having matching or mismatching efficacies by using Receiver Operating Characteristic (ROC) analysis. This analysis indicates that the selection of binary similarity and dissimilarity measures for multivariate analysis of Jamu medicines is data dependent. Of the equations used over the last century, the Forbes-2 similarity measure is recommended for studying the relationships between Jamu formulas.
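The ROC-based assessment of a binary similarity measure can be illustrated with a minimal sketch. It assumes each Jamu formula is encoded as a 0/1 plant-presence vector and that a pair counts as a "match" when both formulas share the same efficacy label. The toy data, the function names, and the Jaccard baseline are hypothetical and serve only as illustration. The Forbes-2 expression follows the form commonly given in surveys of binary similarity measures; the exact equation used in this study is not stated here and should be checked against the original definition.

```python
from typing import Callable, Sequence, Tuple

Vector = Sequence[int]


def contingency(x: Vector, y: Vector) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d): plants in both, only in x, only in y, in neither."""
    a = sum(1 for i, j in zip(x, y) if i and j)
    b = sum(1 for i, j in zip(x, y) if i and not j)
    c = sum(1 for i, j in zip(x, y) if not i and j)
    d = len(x) - a - b - c
    return a, b, c, d


def jaccard(x: Vector, y: Vector) -> float:
    """Jaccard similarity, used here only as a familiar baseline."""
    a, b, c, _ = contingency(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0


def forbes2(x: Vector, y: Vector) -> float:
    """Forbes-2 similarity (assumed form; verify against the cited definition)."""
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    expected = (a + b) * (a + c)
    denom = n * min(a + b, a + c) - expected
    return (n * a - expected) / denom if denom else 0.0


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one match and one mismatch pair")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def evaluate(measure: Callable[[Vector, Vector], float],
             formulas: Sequence[Vector],
             efficacies: Sequence[str]) -> float:
    """Score every Jamu pair with `measure` and return the AUC for match vs. mismatch."""
    scores, labels = [], []
    for i in range(len(formulas)):
        for j in range(i + 1, len(formulas)):
            scores.append(measure(formulas[i], formulas[j]))
            labels.append(efficacies[i] == efficacies[j])
    return roc_auc(scores, labels)


# Hypothetical toy data: four formulas over six plants, two efficacy groups.
formulas = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]
efficacies = ["A", "A", "B", "B"]

auc_by_measure = {
    "jaccard": evaluate(jaccard, formulas, efficacies),
    "forbes2": evaluate(forbes2, formulas, efficacies),
}
```

Ranking the candidate measures by the resulting AUC on the dataset at hand is one way to make the data-dependent choice of measure explicit.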
In addition, we extended our analysis of plant-disease relations by including metabolite information from the plants used as Jamu ingredients, in order to predict Jamu efficacy and identify important metabolites. The Support Vector Machine (SVM) with a linear kernel and Random Forest (RF) produced good classification models when combined with filtering and feature selection techniques. We also identified significant metabolites associated with efficacy groups by applying the inTrees framework.
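The combination of filtering, feature selection, and classification can be sketched as follows, assuming scikit-learn and NumPy are available. The synthetic metabolite matrix, the variance filter, the chi-squared selector, and all parameter values are assumptions chosen for illustration; they are not the filtering and feature selection techniques or the data used in this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, VarianceThreshold, chi2
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in data: rows are Jamu formulas, columns are metabolites (0/1),
# and y holds three hypothetical efficacy groups.
rng = np.random.default_rng(0)
n_formulas, n_metabolites, n_groups = 120, 200, 3
y = rng.integers(0, n_groups, size=n_formulas)
X = (rng.random((n_formulas, n_metabolites)) < 0.1).astype(int)

# Plant five informative metabolites per efficacy group so there is signal to find.
for group in range(n_groups):
    rows = y == group
    cols = slice(group * 5, group * 5 + 5)
    X[rows, cols] = (rng.random((rows.sum(), 5)) < 0.8).astype(int)


def build(classifier):
    """Filter constant metabolites, keep the 20 best by chi-squared, then classify."""
    return make_pipeline(
        VarianceThreshold(threshold=0.0),
        SelectKBest(chi2, k=20),
        classifier,
    )


models = {
    "svm_linear": build(SVC(kernel="linear")),
    "random_forest": build(RandomForestClassifier(n_estimators=100, random_state=0)),
}

# Mean 5-fold cross-validated accuracy; selection is refit inside each fold.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}
```

Placing the filter and selector inside the pipeline keeps feature selection within each cross-validation fold, which avoids leaking information from the held-out formulas into the selected metabolite set.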