Pub. Date:
Wiley, John & Sons, Incorporated
Reviews in Computational Chemistry

Reviews in Computational Chemistry

by Thomas R. Cundari, Donald B. Boyd


View All Available Formats & Editions
Current price is , Original price is $110.0. You
Select a Purchase Option
  • purchase options

Product Details

ISBN-13: 9781560815150
Publisher: Wiley, John & Sons, Incorporated
Publication date: 08/01/1991
Pages: 520
Product dimensions: 6.50(w) x 1.50(h) x 9.50(d)

About the Author

Kenny B. Lipkowitz and Raima Larter are Professors of Chemistry at Indiana University - Purdue University at Indianapolis. Tom Cundari is Professor of Chemistry at the University of Memphis.

Read an Excerpt

Reviews in Computational Chemistry, Volume 19

John Wiley & Sons

Copyright © 2003

Kenny B. Lipkowitz, Raima Larter, Thomas R. Cundari
All right reserved.

ISBN: 0-471-23585-7

Chapter One


This chapter is written for the reader who would like to learn how
Monte Carlo methods are used to calculate thermodynamic properties of systems
at the atomic level, or to determine which advanced Monte Carlo methods
might work best in their particular application. There are a number of
excellent books and review articles on Monte Carlo methods, which are generally
focused on condensed phases, biomolecules or electronic structure theory.
The purpose of this chapter is to explain and illustrate some of the
special techniques that we and our colleagues have found to be particularly
well suited for simulations of nanodimensional atomic and molecular clusters.
We want to help scientists and engineers who are doing their first work in this
area to get off on the right foot, and also provide a pedagogical chapter for
those who are doing experimental work. By including examples of simulations
of some simple, yet representative systems, we provide the reader with some
data for direct comparison when writing their own code from scratch.

Although a number of Monte Carlo methods in current use will be
reviewed, this chapter is not meant to be comprehensive in scope.Monte Carlo
is a remarkably flexible class of numerical methods. So many versions of the
basic algorithms have arisen that we believe a comprehensive review would be
of limited pedagogical value. Instead, we intend to provide our readers with
enough information and background to allow them to navigate successfully
through the many different Monte Carlo techniques in the literature. This
should help our readers use existing Monte Carlo codes knowledgably, adapt
existing codes to their own purposes, or even write their own programs. We
also provide a few general recommendations and guidelines for those who are
just getting started with Monte Carlo methods in teaching or in research.

This chapter has been written with the goal of describing methods that
are generally useful. However, many of our discussions focus on applications
to atomic and molecular clusters (nanodimensional aggregates of a finite number
of atoms and/or molecules). We do this for two reasons:

1. A great deal of our own research has focused on such systems,
particularly the phase transitions and other structural transformations induced by
changes in a cluster's temperature and size, keeping an eye on how various
properties approach their bulk limits. The precise determination of thermodynamic
properties (such as the heat capacity) of a cluster type as a function of
temperature and size presents challenges that must be addressed when using
Monte Carlo methods to study virtually any system. For example, analogous
structural transitions can also occur in phenomena as disparate as the denaturation
of proteins. The modeling of these transitions presents similar
computational challenges to those encountered in cluster studies.

2. Although cluster systems can present some unique challenges, their
study is unencumbered by many of the technical issues regarding periodic
boundary conditions that arise when solids, liquids, surface adsorbates, and
solvated biomolecules and polymers are studied. These issues are addressed
well elsewhere, and can be thoroughly appreciated and mastered once
a general background in Monte Carlo methods is obtained from this chapter.

It should be noted that "Monte Carlo" is a term used in many fields of
science, engineering, statistics, and mathematics to mean entirely different
things. The one (and only) thing that all Monte Carlo methods have in common
is that they all use random numbers to help calculate something. What we
mean by "Monte Carlo" in this chapter is the use of random-walk processes
to draw samples from a desired probability function, thereby allowing one to
calculate integrals of the form [integral] dqf(q) [rho](q). The quantity [rho](q) is a normalized
probability density function that spans the space of a many-dimensional
variable q, and f(q) is a function whose average is of thermodynamic importance
and interest. This integral, as well as all other integrals in this chapter,
should be understood to be a definite integral that spans the entire domain of
q. Finally, we note that the inclusion of quantum effects through path-integral
Monte Carlo methods is not discussed in this chapter. The reader interested in
including quantum effects in Monte Carlo thermodynamic calculations is
referred elsewhere.


Monte Carlo simulations are widely used in the fields of chemistry, biology,
physics, and engineering in order to determine the structural and thermodynamic
properties of complex systems at the atomic level. Thermodynamic
averages of molecular properties can be determined from Monte Carlo methods,
as can minimum-energy structures. Let (f) represent the average value
of some coordinate-dependent property f(x), with x representing the 3N
Cartesian coordinates needed to locate all of the N atoms. In the canonical
ensemble (fixed N, V and T, with V the volume and T the absolute temperature),
averages of molecular properties are given by an average of f(x) over the
Boltzmann distribution

(f) = [integral] dxf(x)exp[-[beta]U(x)]/[integral]dx exp[-[beta]U(x)] [1]

where U(x) is the potential energy of the system, [beta] = 1/[k.sub.B]T, and [k.sub.B] is the
Boltzmann constant. If one can compute the thermodynamic average of
f(x) it is then possible to calculate various thermodynamic properties. In
the canonical ensemble it is most common to calculate E, the internal energy,
and [C.sub.v] , the constant-volume heat capacity (although other properties can be
calculated as well). For example, if we average U(x) over all possible configurations
according to Eq. [1], then E and [C.sub.v] are given by

E = 3N[k.sub.B]T/2 + (U) [2]

[C.sub.v] = 3N[k.sub.B]/2 + ([U.sup.2]) - [(U).sup.2]/([k.sub.B][T.sup.2]) [3]

The first term in each equation represents the contribution of kinetic energy,
which is analytically integrable. In the harmonic (low-temperature) limit, E
given by Eq. [2] will be a linear function of temperature and [C.sub.v] from Eq. [3]
will be constant, in accordance with the Equipartition Theorem. For a small
cluster of, say, 6 atoms, the integrals implicit in the calculation of Eqs. [1]
and [2] are already of such high dimension that they cannot be effectively computed
using Simpson's rule or other basic quadrature methods. For
larger clusters, liquids, polymers or biological molecules the dimensionality
is obviously much higher, and one typically resorts to either Monte Carlo,
molecular dynamics, or other related algorithms.

To calculate the desired thermodynamic averages, it is necessary to have
some method available for computation of the potential energy, either explicitly
(in the form of a function representing the interaction potential as in
molecular mechanics) or implicitly (in the form of direct quantum-mechanical
calculations). Throughout this chapter we shall assume that U is known or can
be computed as needed, although this computation is typically the most computationally
expensive part of the procedure (because U may need to be computed
many, many times). For this reason, all possible measures should be
taken to assure the maximum efficiency of the method used in the computation
of U.

Also, it should be noted that constraining potentials (which keep the
cluster components from straying too far from a cluster's center of mass)
are sometimes used. At finite temperature, clusters have finite vapor pressures,
and particular cluster sizes are typically unstable to evaporation. Introducing
a constraining potential enables one to define clusters of desired sizes.
Because the constraining potential is artificial, the dependence of calculated
thermodynamic properties on the form and the radius of the constraining
potential must be investigated on a case-by-case basis. Rather than diverting
the discussion from our main focus (Monte Carlo methods), we refer the
interested reader elsewhere for more details and references on the use of con-straining

Random-Number Generation: A Few Notes

Because generalized Metropolis Monte Carlo methods are based on
"random" sampling from probability distribution functions, it is necessary
to use a high-quality random-number generator algorithm to obtain reliable
results. A review of such methods is beyond the scope of this chapter,
but a few general considerations merit discussion.

Random-number generators do not actually produce random numbers.
Rather, they use an integer "seed" to initialize a particular "pseudorandom"
sequence of real numbers that, taken as a group, have properties that leave
them nearly indistinguishable from truly random numbers. These are conventionally
floating-point numbers, distributed uniformly on the interval (0,1). If
there is a correlation between seeds, a correlation may be introduced between
the pseudorandom numbers produced by a particular generator. Thus, the
generator should ideally be initialized only once (at the beginning of the random
walk), and not re-initialized during the course of the walk. The seed
should be supplied either by the user or generated arbitrarily by the program
4 Strategies for Monte Carlo Thermodynamic Calculations
using, say, the number of seconds since midnight (or some other arcane formula).
One should be cautious about using the "built-in" random-number
generator functions that come with a compiler for Monte Carlo integration
work because some of them are known to be of very poor quality. The
reader should always be sure to consult the appropriate literature and obtain
(and test) a high-quality random-number generator before attempting to write
and debug a Monte Carlo program.

The Generalized Metropolis Monte Carlo Algorithm

The Metropolis Monte Carlo (MMC) algorithm is the single most widely
used method for computing thermodynamic averages. It was originally developed
by Metropolis et al. and used by them to simulate the freezing transition
for a two-dimensional hard-sphere fluid. However, Monte Carlo methods can
be used to estimate the values of multidimensional integrals in whatever context
they may arise. Although Metropolis et al. did not present their algorithm
as a general-utility method for numerical integration, it soon became
apparent that it could be generalized and applied to a variety of situations.
The core of the MMC algorithm is the way in which it draws samples from
a desired probability distribution function. The basic strategies used in
MMC can be generalized so as to apply to many kinds of probability functions
and in combination with many kinds of sampling strategies. Some authors
refer to the generalized MMC algorithm simply as "Metropolis sampling,"
while others have referred to it as the M(RT) method in honor of the five
authors of the original paper (Metropolis, the Rosenbluths, and the Tellers).
We choose to call this the generalized Metropolis Monte Carlo (gMMC) method,
and we will always use the term MMC to refer strictly to the combination
of methods originally presented by Metropolis et al.

In the literature of numerical analysis, gMMC is classified as an importance
sampling technique. Importance sampling methods generate configurations
that are distributed according to a desired probability function rather
than simply picking them at random from a uniform distribution. The probability
function is chosen so as to obtain improved convergence of the properties
of interest. gMMC is a special type of importance sampling method which
asymptotically (i.e., in the limit that the number of configurations becomes
large) generates states of a system according to the desired probability distribution.
This probability function is usually (but not always) the actual
probability distribution function for the physical system of interest. Nearly
all statistical-mechanical applications of Monte Carlo techniques require the
use of importance sampling, whether gMMC or another method is used (alternatively,
"stratified sampling" is sometimes an effective approach).
gMMC is certainly the most widely used importance sampling method.

In the gMMC algorithm successive configurations of the system are
generated to build up a special kind of random walk called a Markov
. The random walk visits successive configurations, where each
configuration's location depends on the configuration immediately preceding
it in the chain. The gMMC algorithm establishes how this can be done so as
to asymptotically generate a distribution of configurations corresponding to
the probability density function of interest, which we denote as [rho](q).

We define K([q.sub.j] -> [q.sub.j) to be the conditional probability that a
configuration at [q.sub.i] will be brought to [q.sub.j] in the next step of the random walk. This conditional
probability is sometimes called the "transition rate." The probability
of moving from q to [q'] (where q and [q'] are arbitrarily chosen configurations
somewhere in the available domain) is therefore given by P(q -> [q'):

P(q -> [q') = K(q -> [q')[rho](q) [4]

For the system to evolve toward a unique limiting distribution, we must place
a constraint on P(q -> [q']). The gMMC algorithm achieves the desired limiting
behavior by requiring that, on the average, a point is just as likely to move
from q to [q'] as it is to move in the reverse direction, namely, that
P(q -> [q') = P([q'] -> q). This likelihood can be achieved only if the walk is
ergodic (an ergodic walk eventually visits all configurations when started
from any given configuration) and if it is aperiodic (a situation in which no
single number of steps will generate a return to the initial configuration).
This latter requirement is known as the "detailed balance" or the "microscopic
reversibility" condition:

K(q -> [q'])[rho](q) = K([q'] -> q)[rho]([q']) [5]

Satisfying the detailed balance condition ensures that the configurations
generated by the gMMC algorithm will asymptotically be distributed according
to [rho](q).

The transition rate may be written as a product of a trial probability
and an acceptance probability A

K([q.sub.i] -> [q.sub.j]) = [PI]([q.sub.i] -> [q.sub.j])A([q.sub.i] -> [q.sub.j]) [6]

where [PI] can be taken to be any normalized distribution that asymptotically
spans the space of all possible configurations, and A is constructed so that
Eq. [5] is satisfied for a particular choice of [PI].


Excerpted from Reviews in Computational Chemistry, Volume 19

Copyright © 2003 by Kenny B. Lipkowitz, Raima Larter, Thomas R. Cundari.
Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

Protein Structure Classification   Patrice Koehl     1
Introduction     1
Classification and Biology     2
The Biomolecular Revolution     2
Basic Principles of Protein Structure     4
Visualization     4
Protein Building Blocks     5
Protein Structure Hierarchy     5
Three Types of Proteins     7
Geometry of Globular Proteins     9
Protein Domains     12
Resources on Protein Structures     13
Protein Structure Comparison     14
Automatic Identification of Protein Structural Domains     14
The Rigid-Body Transformation Problem     16
Protein Structure Superposition     23
cRMS: An Ambiguous Measure of Similarity     31
Differential Geometry and Protein Structure Comparison     33
Upcoming Challenges for Protein Structure Comparison     36
Protein Structure Classification     37
The Structure Classification of Proteins (SCOP)     40
The CATH Classification     42
The DALI Domain Dictionary (DDD)     43
Comparing SCOP, CATH, and DDD     43
Conclusions     45
Acknowledgments     46
Appendix     47
References     48
Comparative Protein Modeling   Emilio Xavier Esposito   Dror Tobi   Jeffry D. Madura     57
Introduction     57
Anatomy of a Comparative Model     60
Searching for Related Sequences and Structures     61
Expert Protein Analysis System (ExPASy)     62
Blast and PSI-Blast     65
Protein Data Bank (PDB)     68
Sequence Alignment and Modeling System with Hidden Markov Models     70
Threading     73
Threader     78
Example: Finding Related Sequences and 3-D Structures     80
Sequence Alignment     84
Preparing the Sequences     87
Alignment Basics     90
Similarity Matrices     91
Clustal     95
Tree-Based Consistency Objective Function for Alignment Evaluation (T-Coffee)     99
Divide-and-Conquer Alignment (DCA)     100
Example: Aligning Sequences     101
Selecting Templates and Improving Alignments     104
Selecting Templates     104
Improving Sequence Alignments With Primary and Secondary Structure Analysis     107
Example: Aligning the Target to the Selected Template     111
Constructing Protein Models     111
Satisfaction of Spatial Restraints     113
Segment Match Modeling     115
Multiple Template Method     118
3D-Jigsaw     119
Overall Protein Model Construction Methods     121
Example: Constructing a Protein Model     122
Refinement of Protein Models     124
Side-Chains with Rotamer Library (SCWRL)     125
Energy Minimization     132
Molecular Dynamics     133
Molecular Dynamics with Simulated Annealing     135
Evaluating Protein Models     138
Procheck     138
Verify3D     140
Errat     141
Protein Structure Analysis (ProSa)     142
Protein Volume Evaluation (PROVE)     144
Model Clustering Analysis     146
Example: Evaluation of Protein Models     148
Conclusions     154
References     155
Simulations of Protein Folding   Joan-Emma Shea   Miriam R. Friedel   Andrij Baumketner     169
Introduction     169
Theoretical Framework     172
Energy Landscape Theory      172
Thermodynamics and Kinetics of Folding: Two-State and Multistate Folders     175
Protein Models     179
Introduction and General Simulation Techniques     179
Coarse-Grained Protein Models     181
Fully Atomic Simulations     190
Advanced Topics: The Transition State Ensemble for Folding     201
Transition State and Two-State Kinetics     202
Methods for Identifying the TSE     204
Conclusions and Future Directions     219
Acknowledgments     219
References     220
The Simulation of Ionic Charge Transport in Biological Ion Channels: An Introduction to Numerical Methods   Marco Saraniti   Shela Aboud   Robert Eisenberg     229
Introduction     229
System Components     231
Time and Space Scale     241
Experiments     242
Electrostatics     243
Long-Range Interaction     245
Short-Range Interaction     258
Boundary Conditions     261
Particle-Based Simulation     263
Implicit Solvent: Brownian Dynamics     264
Explicit Solvent: Molecular Dynamics     267
Flux-Based Simulation      273
Nernst-Planck Equation     274
The Poisson-Nernst-Planck (NP) Method     278
Hierarchical Simulation Schemes     282
Future Directions and Concluding Remarks     283
References     284
Wavelets in Chemistry and Chemoinformatics   C. Matthew Sundling   Nagamani Sukumar   Hongmei Zhang   Mark J. Embrechts   Curt M. Breneman     295
Preface     295
Introduction to Wavelets     296
Fourier Transform     297
Continuous Fourier Transform     297
Short-Time Fourier Transformation     298
Wavelet Transform     300
Continuous Wavelet Transform     301
Discrete Wavelet Transform     303
Wavelet Packet Transform     307
Wavelets vs. Fourier Transforms: A Summary     308
Application of Wavelets in Chemistry     309
Smoothing and Denoising     309
Signal Feature Isolation     312
Signal Compression     313
Quantum Chemistry     314
Classification, Regression, and QSAR/QSPR     316
Summary     321
References     321
Author Index      331
Subject Index     349

What People are Saying About This

From the Publisher

"...I recommend this book. Anyone interest in computational chemistry should browse through it and may well end up reading most of it." (Journal of the American Chemical Society, Vol. 125, No. 22, 2003)

Customer Reviews

Most Helpful Customer Reviews

See All Customer Reviews