Two aspects of bioinformatics at different timescales: the sampling problem in molecular dynamics simulation and a perspective on the gliding mechanism of Mycoplasma mobile

目次正一 (0261028)


Over the last few decades, computational science has penetrated deeply into molecular biology, giving rise to a new field called bioinformatics. Bioinformatics treats biological problems on extremely different timescales. At one extreme is molecular dynamics (MD) simulation, which deals with the motion of protein molecules on the order of nanoseconds. At the other extreme is the analysis of molecular evolution, which deals with events that take place over millions of years. Here, I report my approach to two biological problems on two distinct timescales. The first is an analysis of the sampling problem in MD simulation of proteins, and the second is an analysis of the gliding mechanism of Mycoplasma mobile.

MD simulation has become a general tool for estimating the thermal properties of proteins. Simulations covering nanoseconds of protein motion can be carried out. To derive the thermal properties of a molecule from an MD trajectory, one has to assume that the simulation is long enough to reconstruct the ensemble of the molecular motion. However, there is no simple criterion for judging whether the simulation time was sufficiently long to estimate the physical properties of the molecule. Many physical quantities are known to be expressed as second moments of dynamical variables, from which vibrational modes can be deduced directly. Therefore, I investigated the effect of vibrational modes on the convergence of the components of the second moment matrix of atomic positions. I found that the modes with frequencies higher than 68 cm^-1 converge to their equilibrium values, within 10% error, within 400 picoseconds. The difference between the eigenvalues of the second moment matrix built from all modes and those of the matrix built from only the middle (10 to 68 cm^-1) and low (up to 10 cm^-1) modes is very small. These results mean that the higher-frequency components of the second moment matrix converge very fast and hardly influence the convergence of the middle- and low-frequency components.
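As an illustration only, the convergence check described above can be sketched on a synthetic two-coordinate "trajectory": one fast and one slow oscillation plus a little noise. The function names (`second_moment_matrix`, `diagonal_relative_error`), the periods, and the frame counts are hypothetical choices made for this sketch; it is not the analysis pipeline used in this work.

```python
import numpy as np

def second_moment_matrix(traj):
    """Second moment (covariance) matrix of coordinates about their mean.

    traj has shape (n_frames, n_coords).
    """
    dx = traj - traj.mean(axis=0)
    return dx.T @ dx / traj.shape[0]

def diagonal_relative_error(traj, n_frames):
    """Relative error of each diagonal element estimated from the first
    n_frames frames, measured against the full-trajectory value."""
    full = np.diag(second_moment_matrix(traj))
    part = np.diag(second_moment_matrix(traj[:n_frames]))
    return np.abs(part - full) / full

# Synthetic data (an assumption for illustration, not MD output):
# coordinate 0 oscillates quickly, coordinate 1 completes one slow cycle.
rng = np.random.default_rng(0)
total_frames = 4000
t = np.arange(total_frames)
fast = np.sin(2 * np.pi * t / 20)
slow = np.sin(2 * np.pi * t / total_frames)
traj = np.column_stack([fast, slow])
traj = traj + 0.01 * rng.standard_normal(traj.shape)

matrix = second_moment_matrix(traj)
eigenvalues = np.linalg.eigvalsh(matrix)

# Error after a short window: small for the fast coordinate,
# large for the slow one.
fast_err, slow_err = diagonal_relative_error(traj, 500)
full_err = diagonal_relative_error(traj, total_frames)
```

In this toy setting the fast coordinate's second moment is already close to its final value after a short window, while the slow coordinate's is not, which mirrors the qualitative statement that low-frequency modes govern convergence.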
The middle-frequency modes influence the convergence of the lower modes, but this influence is less significant than the effect of the low modes obtained in a former study. In summary, the lower modes are dominant for the convergence of the second moment matrix.

Computational analysis of evolution is another important tool for clarifying the molecular mechanisms of biological machinery. Here I describe one such approach, applied to the movement mechanism of a bacterium, Mycoplasma mobile. M. mobile is able to glide when attached to a solid surface. The mechanism of this motion is not yet understood; however, a few genes, including Gli349, have been shown experimentally to be related to the motion. The DNA sequence of the Gli349 gene has already been reported, but the three-dimensional structure has not yet been solved. To obtain a clue to the mechanism of movement, I carried out an evolutionary analysis of the Gli349 sequence. As a result, I found 21 sequence repeats within Gli349. Each repeat consists of approximately 100 amino acids with a conserved sequence motif, “YxxxxxGF”. No sequence homologous to Gli349 was found in the NCBI RefSeq non-redundant sequence database. In general, repeat sequences are known to associate with other proteins. This leads to the speculation that the repeat regions in Gli349 are also responsible for interaction with other proteins. Chymotrypsin cleaves amide bonds in regions that do not lie within a structural domain, such as loop regions between structural domains. The structural domain regions inferred from the chymotrypsin cleavage experiment agreed well with the repeats I discovered. I predict that one sequence repeat corresponds to one structural domain and that the entire structure of Gli349 is composed of tandem domains. Electron microscopy shows that Gli349 takes a rod shape with a couple of kinks. The size and shape of the observed molecule fit very well with the predicted domain repeat structure.
The immunoglobulin fold is known to form a rod structure by forming tandem repeats. However, the repeat sequences do not fit any known immunoglobulin fold structure. Therefore, I conclude that the repeat sequence in M. mobile responsible for gliding forms a novel type of structural domain.
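For readers who want to see how a motif such as “YxxxxxGF” can be located in a protein sequence, the following is a minimal sketch. It assumes that "x" stands for any single residue, and it uses an invented toy sequence rather than the real Gli349 sequence; the function name `find_motif` is hypothetical. It only illustrates motif scanning and is not the evolutionary analysis carried out in this work.

```python
import re

# "YxxxxxGF": tyrosine, any five residues, then glycine and phenylalanine.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(Y[A-Z]{5}GF))")

def find_motif(sequence):
    """Return the 0-based start positions of every YxxxxxGF occurrence."""
    return [m.start() for m in MOTIF.finditer(sequence.upper())]

# Toy sequence invented for illustration (not Gli349).
toy_sequence = "AAYACDEKGFAAAYQQQQQGFAA"
positions = find_motif(toy_sequence)

# Distances between consecutive motif occurrences; in a real repeat
# analysis, roughly constant spacing would hint at a tandem repeat.
spacings = [b - a for a, b in zip(positions, positions[1:])]
```

A plain pattern scan like this only finds exact motif matches; detecting diverged repeats generally requires alignment-based or profile-based methods.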

Keywords: molecular dynamics, sampling, evolutionary analysis, Mycoplasma, gliding protein