Two aspects of bioinformatics in different timescales: sampling problem on molecular dynamics simulation and perspective to the gliding mechanism of Mycoplasma mobile
目次正一 (0261028)
Computational science has penetrated deeply into molecular biology over the last few decades,
giving rise to a new field called bioinformatics. Bioinformatics treats biological problems on
extremely different timescales. One extreme is molecular dynamics (MD) simulation, which
deals with the motion of protein molecules on the order of nanoseconds. The other extreme is the
analysis of molecular evolution, which deals with events that take place over millions of
years. Here, I report my approach to two biological problems on two distinct timescales. One is
the analysis of the sampling problem in MD simulation of proteins, and the other is the analysis
of the gliding mechanism of Mycoplasma mobile.
MD simulation has become a general tool for estimating the thermal properties of proteins.
Simulations of nanoseconds of protein motion can be carried out. To derive the thermal properties
of a molecule from an MD trajectory, one must assume that the simulation was long enough to
reconstruct the ensemble of the molecular motion. There is no simple criterion for judging
whether the simulation time was sufficiently long to estimate the physical properties
of the molecule. Many physical quantities are known to be expressed as a second moment of
dynamical variables, from which vibrational modes can be deduced directly. Therefore, I have
investigated the effect of vibrational modes on the convergence of the second moment matrix
components of atom positions. I found that modes with frequencies higher than 68 cm^-1
converge to their equilibrium values, to within 10% error, within 400 picoseconds. The
difference between the eigenvalues of the second moment matrix of all modes and those of only
the middle (10 to 68 cm^-1) and low (up to 10 cm^-1) modes is very small. These results mean
that the higher frequency components of the second moment matrix converge very fast and hardly
influence the convergence of the middle and low frequency components. The middle frequency
modes influence the convergence of the lower modes, but this influence is less significant than
the effect of the low modes found in a former study. In summary, the lower modes are
dominant for the convergence of the second moment matrix.
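The convergence test described above can be illustrated with a minimal sketch. It is not the
analysis code used in this work: the trajectories are synthetic cosines standing in for a fast
and a slow mode, the reference value is simply the second moment over the whole toy trajectory,
only diagonal elements are checked (not eigenvalues), and all function names are hypothetical.

```python
import math


def second_moment_matrix(frames):
    """Second moment (covariance) matrix of coordinate fluctuations.

    frames: list of frames, each a list of n coordinates.
    Element (i, j) is the average of (x_i - <x_i>) * (x_j - <x_j>) over frames.
    """
    n_frames = len(frames)
    n = len(frames[0])
    mean = [sum(f[i] for f in frames) / n_frames for i in range(n)]
    m = [[0.0] * n for _ in range(n)]
    for f in frames:
        d = [f[i] - mean[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                m[i][j] += d[i] * d[j]
    return [[v / n_frames for v in row] for row in m]


def first_length_within_tolerance(frames, reference, tol=0.10, step=100):
    """Shortest prefix (in frames, checked every `step`) whose diagonal second
    moments all lie within relative error `tol` of `reference`.

    Returns None if no checked prefix qualifies. This reports the first hit
    only; it does not verify that later prefixes stay within tolerance.
    """
    n = len(reference)
    for length in range(step, len(frames) + 1, step):
        m = second_moment_matrix(frames[:length])
        if all(abs(m[i][i] - reference[i][i]) <= tol * reference[i][i]
               for i in range(n)):
            return length
    return None


def oscillation(period, n_frames):
    """Toy one-coordinate trajectory: a cosine with the given period in frames."""
    return [[math.cos(2.0 * math.pi * t / period)] for t in range(n_frames)]


# Synthetic data (an assumption for illustration, not MD output).
N_FRAMES = 4000
fast = oscillation(5, N_FRAMES)     # stands in for a high-frequency mode
slow = oscillation(2000, N_FRAMES)  # stands in for a low-frequency mode

fast_len = first_length_within_tolerance(fast, second_moment_matrix(fast))
slow_len = first_length_within_tolerance(slow, second_moment_matrix(slow))
print("fast mode:", fast_len, "frames; slow mode:", slow_len, "frames")
```

In this toy setting the fast oscillation reaches the 10% tolerance much earlier than the slow
one, which mirrors qualitatively the statement that the lower modes govern convergence.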
Computational analysis of evolution is another important tool for clarifying the molecular
mechanisms of biological machinery. Here I describe one such approach, applied to the
movement mechanism of a bacterium, Mycoplasma mobile. M. mobile is able to glide
when attached to a solid surface. The mechanism of this motion is not yet understood; however,
a few genes, including Gli349, have been shown experimentally to be related to it. The DNA
sequence of the Gli349 gene has already been reported, but the three-dimensional structure of
the protein has not yet been solved. To obtain a clue to the mechanism of the movement, I carried out an
evolutionary analysis of the Gli349 sequence. As a result, I found 21 sequential repeats within
Gli349. Each repeat consists of approximately 100 amino acids and contains a conserved sequence
motif, “YxxxxxGF”. No sequence homologous to Gli349 was found in the NCBI RefSeq
non-redundant sequence database. In general, repeat sequences are known to associate with
other proteins. This leads to the speculation that the repeat regions in Gli349 are also
responsible for interactions with other proteins. Chymotrypsin tends to cleave the amide bond
where the sequence is not within a structural domain, such as in a loop region between structural domains. The
structural domain regions inferred from the chymotrypsin cleavage experiment agreed well
with the repeats I discovered. I predict that one sequence repeat corresponds to one structural
domain, and that the entire structure of Gli349 is composed of tandem domains. Electron
microscopy shows that Gli349 takes a rod shape with a couple of kinks. The observed size and
shape fit the predicted domain repeat structure very well.
The immunoglobulin fold is known to form a rod structure through tandem repeats. However, the
repeat sequences do not fit any of the known immunoglobulin fold structures. Therefore, I
conclude that the repeat sequence in M. mobile responsible for gliding forms a novel type of
structural domain.
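The conserved motif “YxxxxxGF” described above can be located in a protein sequence with a
short pattern search. The sketch below is only an illustration and is not the method used in
this work: it reads "x" as any single residue, the function names are hypothetical, and the toy
sequence is made up rather than taken from Gli349.

```python
import re

# "YxxxxxGF": Y, any five residues, then G and F. The zero-width lookahead
# lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=Y[A-Z]{5}GF)")


def motif_positions(sequence):
    """Return 0-based start positions of the YxxxxxGF motif in a protein sequence."""
    return [m.start() for m in MOTIF.finditer(sequence.upper())]


def repeat_spacings(positions):
    """Distances between consecutive motif starts, a rough proxy for repeat length."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Made-up toy sequence (an assumption for illustration), NOT the Gli349 sequence.
toy = "MKYAAAAAGFTTTTYCCCCCGFQQ"
positions = motif_positions(toy)
spacings = repeat_spacings(positions)
print(positions, spacings)
```

On a real sequence, roughly regular spacings of about 100 residues between motif starts would be
consistent with the repeat length reported here, although a motif search alone does not
establish the repeat boundaries.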
Keywords:
molecular dynamics, sampling, evolutionary analysis, Mycoplasma, gliding protein