Protein sequence modeling and transcription regulation network analysis towards big data biology
Kibinge Nelson Kipchirchir ( 1361017 )
Computer science plays a key role in the analysis of biological data. For example, to compare
genes (DNA) for similarity, the sequences of interest are mapped to each other using computational tools.
As the cost of instruments such as those used in DNA sequencing has fallen, there has been a corresponding
increase in available data, further strengthening the need for computer science in biology.
One of the recent trends in computational biology is the use of data from multiple sources
and across various levels of biological studies to understand integrated systems. This is especially true for
both RNA and protein studies.
In this work, we introduce two examples of integrated systems biology applications. We will first describe an innovative approach for representing protein sequences in computer applications and examine how this tool can be used to
incorporate the biological properties of amino acids into the computational analysis of proteins. We will also describe an integrated
pipeline for examining transcription regulation networks from gene expression data. This approach will demonstrate the usefulness of biological
objectivity in designing computer tools for analysing molecular data.