Field of Science

Showing posts with label force fields. Show all posts
Showing posts with label force fields. Show all posts

So what exactly are force fields good for?

Image Hosted by ImageShack.us

Sue Storm tries hard to use her favorite force field to counter the 1 kcal/mol barrier

ResearchBlogging.org

Every once in a while there is a study asking what method X (X = docking, free energy calculations, molecular dynamics, force fields etc.) is good for. Such studies can be useful to take stock of a particular paradigm. So the question that Jonathan Goodman and his group ask in this paper is "Are force fields good for reproducing non-bonded interactions, especially hydrogen bonding, pi-stacking and dispersion?". He and his group compare very high-level quantum chemical ab initio data with data obtained from the most commonly used force fields, namely MM2*, MM3*, MMFFs, OPLS-2005 etc. The ab initio data used is from Pavel Hobza who has almost consummately published on these methods. The question is; how well do the force fields do compared to the gold standard? The answer is necessarily incomplete and complex and again raises many interesting questions about the enigmatic role of hydrogen bonding in chemical and biological systems.

The complexes studied include purely pi-stacked complexes, purely hydrogen bonded complexes and mixed complexes where both interactions play roles. Typical examples include alcohol-amide complexes, water oligomers and of course, the classic stacked and hydrogen bonded DNA nucleoside bases. The parameters that the authors looked at were geometries and energies, both of optimized complexes as well as crystal structures.

The results are perhaps not too surprising; the more recent OPLS-2005 and MMFFs are probably the best in reproducing known geometries and energies while MM2* and MM3* don't perform that well in general. As noted in some other studies, at least some of the results for MMFFs and OPLS compare with those obtained with high-level ab initio calculations, thus indicating the value of these cost-effective methods for geometry optimization and energy determination (let's ignore for a moment that solvation models in ab initio methods make even these less than perfect).

What is more important though is that all the force fields are generally not good for reproducing hydrogen bonded systems compared to systems where dispersion, stacking etc. are the key players. This is partly an indication of the tricky events including long-range solvation which play an important role in h-bond formation. But what is interesting is that the methods underestimate the energetics of hydrogen bonds. While I am a little puzzled by this, one of the explanations that comes to my mind regarding this curious fact is that in real systems, h-bonding is a cooperative interaction. An h-bond can pay for loss of entropy, thus making the overall free energy of the next h-bond more favourable. Of course force fields don't calculate free energy, but to a first approximation we can probably assume that the enthalpy and free energy are similar for these simple systems. To be honest, because of the complex nature of long-range dispersion interactions I would have assumed that the force fields would be worse in modeling these. I frankly don't understand why they work better for such interactions but it's an interesting observation.

But now for some general thoughts; it's always worth remembering that for molecules like proteins which are stabilized by h-bonds, the h-bonds when formed are simply swapped for similar bonds with water, thus making a relatively insubstantial contribution to protein stability. It is the large number of such interactions that can tip the balance for a protein, but the real driving force is now universally recognized as the hydrophobic effect and the burial of non-polar groups. Calculations such as those above indicate that because of the fine-tuning of h-bonds that proteins often use to achieve stability, force fields have some way to go in predicting tiny energy differences. It is still a great challenge to model the sub-angstrom geometry optimization of h-bonds that biopolymers achieve. But force fields are hardly unique in not being able to do this; so are other methods which are still trying to break the 1 kcal/mol barrier. Ironically in this study, the mean unsigned error when the hydrogen-bonded complexes are included is about 1 kcal/mol.

So are force fields good for anything at all? The short answer is yes, exemplified by the massive number of publications that regularly use force fields as well as the substantial number of people in academia and industry studying them. Obviously people think they are important, otherwise so many common programs doing everything from protein folding to drug-protein interactions would not have relied on them. I have had reasonable experience with force fields and have always kept in mind a couple of things about them that are worth reiterating:

1. Force fields are usually good at reproducing geometries, and best for reproducing sterics.
2. Force fields are usually not so good at reproducing energies since energy estimation is a function of the special parameterization and convergence criteria unique to every force field (As the Zen master says, "What the answer is depends on what question you ask"). However, relative conformational energies using a single force field for instance may be useful.
3. As a corollary, force fields can be pretty poor for dealing with molecules having a large number of polar functional groups. While this means that peptides are hard to model, modeling of peptides has also been mitigated by the fact that unlike small molecules, the chemistry to be parameterized is limited.
3. Many times the real problem is not with force fields per se but with the accompanying implicit solvation models. Admirable effort has been expended in developing these models but to be honest we still don't understand enough about that enigmatic solvent named water to do a satisfactory job. We are just scratching the surface when it comes to modeling things like solvent entropy for instance.

If you are following the field's developments, you also see an engaging and ongoing debate that pits the "science first" camp against the "parameterization first" camp. The science first camp disapproves of the other camp's efforts to improve their force fields simply by adding more parameters and optimizing against experiment; to them it is much more important to meticulously improve the methodology by incorporating as much real science as possible. The parameterization first camp argues that statistical methods have their honored place in the annals of science and that getting results fast and efficiently is important for application-oriented scientists like drug discovery people. I believe that as in other matters, both sides are right. It is an uncomfortable feeling when you don't truly understand the science behind a method and yet the method works, but at the same time it is important to have a well-parameterized and tested model that could help you in a practical sense, even if incompletely understood.

As with everything else, finally it is an astute application of force fields that takes into account their strengths and limitations which will lead to productive results. One of the most interesting things about doing science involves weighing the pros and cons of methods, techniques and algorithms and deciding what judicious combination would provide the best answer and why. It may not always work, but it could keep us from getting seduced by the dark side of the force (field)

Paton, R., & Goodman, J. (2009). Hydrogen Bonding and π-Stacking: How Reliable are Force Fields? A Critical Evaluation of Force Field Descriptions of Nonbonded Interactions Journal of Chemical Information and Modeling, 49 (4), 944-955 DOI: 10.1021/ci900009f

Strain Energies in Ligand Binding: Round Two- Fight!

Or why to be wary of ligands in the PDB, force field energies, and anybody who tells you not to be wary of these two

ResearchBlogging.org

One of the longstanding questions in protein-ligand binding has been; what is the energy penalty that a protein has to pay in order to bind a ligand? Another question is; what is the strain energy that a protein pays in order to bind the ligand? Contrary to what one might initially think, the two questions are not the same. Strain energy is the price paid to twist the conformation of the ligand into the binding conformation. Free energy of binding is the energy that the protein has to pay in addition to the strain energy in order to bind the ligand.

A few years ago, this question shot into the limelight because of a publication in J. Med. Chem. by Perola et al. from Vertex. The authors did a meticulous study of hundreds of ligands in their protein-bound complexes, some from the PDB and others proprietary. They used force fields to estimate the difference between the energy of the bound conformation of the ligands and the nearest local energy minimum conformation- the strain energy penalty. For most ligands, they obtained strain energies ranging from 2-5 kcal/mol. But what raised eyebrows was that for a rather significant minority of ligands, the strain energies seemed to be more than 10 kcal/mol, and for some they seemed to be up to 20 kcal/mol.

These are extremely high numbers. To understand why this is so, consider a fact that I have frequently emphasized on this blog; the concentration of a particular conformation in solution is virtually negligible if the free energy difference between it and a stable conformation is about only 3 kcal/mol. For a conformation to pay that much of an energy penalty in order to transform itself into the bound conformation would already be a stretch, considering its low concentration. For a conformation to pay an energy penalty of 20 kcal/mol does not make sense at all in this light, since such a conformation should be non-existent. Plus, think about the fact that hydrogen bonds usually contribute about 5 kcal/mol and that energy at room temperature is itself about 20 kcal/mol- significantly greater than the rotational barriers in most molecules- and this number for the strain energy penalty starts looking humungous. Where exactly would it come from?

Perola's paper generated a lot of buzz- a good thing. It was discussed by speakers at a conference in March last year that I attended. Now, a paper in J. Comp. Chem. seems to clear up the air a little. In a nutshell, the authors conclude that the strain energies they have measured seldom, if ever, surpass 2 kcal/mol. Needless to say, this is a huge difference compared to the earlier studies.

Why such a startling difference? It seems that as always, the answer strongly depends on the method and the data.

First of all, the PDB is not as flawless as people assume it is. Most people who are crystallizing protein-ligand complexes are first and foremost interested in the structure of the protein. They often do a poor job of fitting ligands to the electron density; Gerard Kleywegt of the University of Uppsala has done some marvelous work on detecting errors in PDB ligands, and his review on this should be a must-read for all scientists even marginally connected with crystallography. Because of poor fits, conformations of ligands in the electron densities in the PDB can be completely unrealistic and at the very least, brutally strained. Amides can be cis or non-planar, and more rarely planar aromatic rings can be deformed. There can be severe steric clashes which are not easily apparent. Quite naturally, such conformations when refined would lead to huge drops in energy. Therein lies the first source of the unrealistically large strain energy differences.

The second factor has to do with the vagaries and inadequacies of force fields, often unknown to crystallographers but known to experienced computational chemists. Force fields are quite poor at determining energies and their results are especially skewed by an overemphasis on electrostatic interactions which the force fields are ill-equipped to damp. Now consider what happens when a ligand in a PDB that has a positively and negatively charged group in it is optimized. If you relax it to the nearest local energy minimum, these two groups would instantly snap together and form a very strong ionic bond. This would lead to a huge overstabilization of the conformation, thus again giving the illusion of a large strain energy difference between the PDB conformation and the local minimum.

Finally, the devil is in the details. In doing the initial refinement of the conformation, the earlier study used a constraint called the flat-bottom potential in optimizing the PDB ligands in their bound state. However the flat-bottom potential, which extracts no penalties for atomic movement within a certain short distance and suddenly ramps up the penalty, is not physically realistic. A better method might be to use a harmonic potential which continuously and smoothy extracts a penalty proportional to atomic displacement.

The present study takes all these factors into account and also substitutes the force field results with some well-established quantum chemical energy determinations at the B3LYP/6-31G* level. They use this method to calculate the energies of bound and local energy minimum conformations. Secondly, they use a well-established continuum solvation model (PCM) as incorporated in the latest version of the Gaussian program to incorporate damping effects due to solvation. Thirdly as indicated above, they use the harmonic potential for optimization. Fourthly and most importantly, for the cases where the strain energy seems unusually high (and even there they set the bar quite high- anything greater than 2 kcal/mol), the authors closely investigate the relevant PDB entries and find that indeed, the ligands were not fit well into the electron density and had unrealistically strained conformations.

Once they tackled these problems, the strain energies all fell down to between 0.5 and 2 kcal/mol, which seems to be a realistic penalty that a conformation with a respectable concentration in solution could pay. There is now a second question; what is the maximum strain energy penalty that a ligand can pay to be transformed into the bound conformation? The authors are working on this question, and we will await their answer.

But this study reiterates two important lessons that should be remembered by anyone dealing with structure at all times:
1. Don't trust the PDB
2. Don't trust force field energies

Better still, as old Fox Mulder said, trust no one and nothing.

References:
1. Keith T. Butler, F. Javier Luque, Xavier Barril (2009). Toward accurate relative energy predictions of the bioactive conformation of drugs Journal of Computational Chemistry, 30 (4), 601-610 DOI: 10.1002/jcc.21087

2. Emanuele Perola, Paul S. Charifson (2004). Conformational Analysis of Drug-Like Molecules Bound to Proteins: An Extensive Study of Ligand Reorganization upon Binding Journal of Medicinal Chemistry, 47 (10), 2499-2510 DOI: 10.1021/jm030563w

3. A Davis, S Stgallay, G Kleywegt (2008). Limitations and lessons in the use of X-ray structural information in drug design Drug Discovery Today, 13 (19-20), 831-841 DOI: 10.1016/j.drudis.2008.06.006