A Grammar-Based Approach to RNA Pseudoknotted Structure Prediction for Aligned Sequences
Nobuyoshi Mizoguchi (0951120)
RNA secondary structure prediction is one of the major topics in
bioinformatics.
A method based on a parsing algorithm for formal grammars is
a promising approach to the prediction problem.
In this thesis, I propose a prediction method that uses a formal grammar
called SMCFG (stochastic multiple context-free grammar).
This grammar can express pseudoknots, an important substructure found in
RNA.
The method is based on comparative sequence analysis: it takes a set of
aligned RNA sequences as input and predicts their common secondary structure.
Unlike an existing method that uses SMCFG, the proposed method can determine
the parameters of the grammar rules without training data, that is, without a
large set of RNA sequences annotated with their secondary structures.
Finally, I conducted experiments to compare the performance of the proposed
method with that of {\it hxmatch}, one of the best-known existing methods for
predicting pseudoknot structure by comparative sequence analysis.
The F-measures of the proposed method for eight RNA families are comparable
with those of hxmatch.
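
As an illustration of the evaluation terms used above, the following minimal sketch assumes the common base-pair-level definition of the F-measure (the harmonic mean of precision and recall over predicted base pairs) and the usual definition of a pseudoknot as two crossing base pairs. The exact definitions used in the experiments are not stated here, and the function names `f_measure` and `has_pseudoknot`, as well as the example pairs, are hypothetical.

```python
def f_measure(predicted, reference):
    """Harmonic mean of precision and recall over base pairs.

    Assumption: a base pair is a tuple (i, j) of sequence positions,
    and a predicted pair counts as correct only if it matches a
    reference pair exactly.
    """
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


def has_pseudoknot(pairs):
    """Return True if any two base pairs cross, i.e. i < k < j < l."""
    ordered = sorted(tuple(sorted(p)) for p in pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False


# Hypothetical example: the reference structure contains a pseudoknot,
# and the prediction recovers three of its four base pairs.
reference = [(0, 10), (1, 9), (5, 15), (6, 14)]
predicted = [(0, 10), (1, 9), (5, 15), (7, 13)]

print(f_measure(predicted, reference))
print(has_pseudoknot(reference))
```

In this example, the pairs (0, 10) and (5, 15) cross, which is the kind of structure a context-free grammar cannot express but a multiple context-free grammar can.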