Approaches that use protein-protein interaction data are considered among the most powerful for inferring the functions of uncharacterized proteins, because these data represent physical interactions between proteins. However, interpreting comprehensive protein-protein interaction data to predict the functions of uncharacterized proteins has been challenging, because the data contain many false positives and false negatives.
To overcome this problem, I have developed a method that extracts functionally similar proteins with high confidence by integrating protein-protein interaction data and domain information. I used this method to analyze publicly available data from Saccharomyces cerevisiae. I identified 1,042 functional associations involving 765 proteins, of which 86 (11.2%) had no previously ascribed function. In addition, I identified seven clusters of uncharacterized proteins that potentially represent new functions.
My method extracts functionally similar protein pairs more accurately than conventional methods, and it allows the functions of previously uncharacterized proteins to be predicted with high confidence. I found that integrating protein-protein interaction data with domain information is a rational and reliable approach for predicting protein function. The method can be applied to protein-protein interaction data from any species.
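
As a rough illustration of what integrating interaction data with domain information can look like, the sketch below keeps only those interacting pairs whose domain sets overlap strongly. It is not the scoring scheme used in this work. The Jaccard similarity, the 0.5 threshold, the function names, and the toy protein and domain labels are all assumptions introduced for this example.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_interactions(interactions, domains, threshold=0.5):
    """Keep interacting pairs whose domain sets overlap at or above threshold.

    interactions: iterable of (protein, protein) pairs (hypothetical input format)
    domains: dict mapping protein -> set of domain names (hypothetical input format)
    threshold: assumed cutoff, chosen only for illustration
    """
    kept = []
    for p, q in interactions:
        # Proteins without domain annotation get an empty set and score 0.0.
        score = jaccard(domains.get(p, set()), domains.get(q, set()))
        if score >= threshold:
            kept.append((p, q, score))
    return kept


# Toy data invented for illustration; these are not real yeast annotations.
interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]
domains = {
    "P1": {"kinase", "SH3"},
    "P2": {"kinase", "SH3"},
    "P3": {"WD40"},
    "P4": {"kinase"},
}

associations = filter_interactions(interactions, domains)
print(associations)
```

In this toy case the P1-P3 interaction is discarded because the two proteins share no domain, which mirrors the idea of using domain information to screen out interactions that are likely to be false positives. A real analysis would need a scoring function and cutoff validated against known functional annotations.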