NAIST-IS-MT1151130: Kibinge Nelson Kipchirchir

Plasmodium phylogenetics and numeric modification of amino acid residues

Kibinge Nelson Kipchirchir (1151130)


Phylogenetics is one of the major subjects of bioinformatics. It has been used to formulate and evaluate testable hypotheses on taxonomy, epidemiology studies and inference of functional relationships among studies. In this presentation, I discuss the application of phylogenetics to inferring the functional association of malaria parasites (Plasmodium) and their primary hosts. The possibility of genome-shaping events such as lateral gene transfer and convergent evolution is presented as an explanation for the incongruent phylogenies obtained from the test dataset. I will also present a novel method for protein sequence phylogenetics. Proteins are made up of amino acid residues and are often chosen over DNA sequences for phylogenetic studies involving distantly related organisms. One category of molecular phylogenetic methods uses genetic distance matrices as input to tree-constructing algorithms. These distance methods are computationally fast and offer a quick and reliable estimate of phylogenies for large sequence datasets. Conventionally, pairwise distances between the sequences in a multiple sequence alignment (MSA) are derived from similarity or dissimilarity, together with one of various models that correct for multiple substitutions. The most common way of deriving genetic distance from protein sequences is to use substitution matrices such as the Dayhoff and BLOSUM matrices. There are two problems with this approach. First, information about site-specific variability is inherently lost when an MSA is converted to a distance matrix. Second, the accuracy of the distance between two proteins depends on the use of robust models that correct for multiple substitutions. In this talk, I will present an approach that modifies the amino acid residues in sequences to incorporate the attributes and properties that define them. This enables the site-specific variation in an MSA to be used in deriving distances. I will also talk about the performance of statistical distance methods relative to conventional models of deriving genetic distances.
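As an illustration of the two kinds of distance contrasted above, the following minimal Python sketch derives pairwise distance matrices from a toy alignment in two ways: a conventional p-distance with Poisson correction for multiple substitutions, and a distance computed after replacing each residue with a numeric property value. It is a sketch under stated assumptions and not the method presented in the talk: the toy sequences, the function names, the choice of the Kyte-Doolittle hydropathy index as the residue property, and the root-mean-square form of the property distance are all assumptions made here for illustration.

```python
import math

# Kyte-Doolittle hydropathy index. Chosen here only as an example property;
# the abstract does not say which residue attributes the author uses.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def comparable_sites(a, b):
    """Aligned residue pairs, skipping columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return pairs


def p_distance(a, b):
    """Proportion of comparable sites at which the two sequences differ."""
    pairs = comparable_sites(a, b)
    return sum(x != y for x, y in pairs) / len(pairs)


def poisson_distance(a, b):
    """Poisson-corrected distance d = -ln(1 - p) for multiple substitutions."""
    p = p_distance(a, b)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)


def property_distance(a, b, scale=HYDROPATHY):
    """Root-mean-square difference of per-site property values (hypothetical)."""
    pairs = comparable_sites(a, b)
    squared = sum((scale[x] - scale[y]) ** 2 for x, y in pairs)
    return math.sqrt(squared / len(pairs))


def distance_matrix(msa, dist):
    """Square matrix of pairwise distances, in the order of the MSA keys."""
    names = list(msa)
    return [[dist(msa[i], msa[j]) for j in names] for i in names]


# Toy alignment, invented for this example.
msa = {
    "seq1": "MKVLA-GT",
    "seq2": "MKILAEGT",
    "seq3": "MRVLSEGS",
}

poisson_matrix = distance_matrix(msa, poisson_distance)
property_matrix = distance_matrix(msa, property_distance)

for name, row in zip(msa, property_matrix):
    print(name, " ".join(f"{value:.3f}" for value in row))
```

Either matrix could then be passed to a distance-based tree-building algorithm such as neighbor joining. In this toy case the V/I difference between seq1 and seq2 counts as a full mismatch under the p-distance but contributes only a small amount to the property-based distance, since the two residues have similar hydropathy.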